With reasonably tight storm windows, the empty sash weight pockets and lack of weatherstripping can be of remarkably minimal consequence. Stack effect infiltration from leaks at the attic & basement is a much bigger contributor to heat loss than the air leaks in between. The former is a 24/7 drive, whereas lateral leakage is mostly wind driven.
As for the "basis in reality", for a rough cut, what's the size & shape of the house, and how many windows & doors? Even if the nighttime setback is 58F, I assume it only gets cool enough inside to trip that setpoint on the coldest nights of the year (?). Setbacks of that magnitude would only result in single-digit percentage savings, whereas domestic hot water, if it uses the same fuel, would usually account for a double-digit percentage of your total annual use. If you were regularly keeping the place at 50F for most of the day you'd be well into double-digit annual savings though. If you check the gas company data against your own fuel use during a winter billing period, using heating degree day data from a third party source, you can find out pretty easily if their numbers are WAY off. No matter what, they're not going to be 2x-3x off, which is what it would take to rationalize installing a 110 KBTU/hr mod-con.
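For what it's worth, the fuel-use check can be roughed out in a few lines. This is only a sketch under stated assumptions: the therms, degree days, 80% efficiency, hot water share, and base-65F degree days below are made-up example inputs, not numbers from this thread, and the function name is just for illustration.

```python
BTU_PER_THERM = 100_000  # 1 therm of gas = 100,000 BTU

def implied_design_load(therms_used, heating_degree_days, efficiency,
                        non_heating_therms=0.0,
                        hdd_base_f=65.0, design_temp_f=-6.0):
    """Estimate design heat load (BTU/hr) from one winter billing period.

    Assumes heat loss scales linearly with the indoor/outdoor
    temperature difference, which is a simplification.
    """
    # Back out fuel that went to hot water, cooking, etc.
    heating_therms = therms_used - non_heating_therms
    # Heat actually delivered to the house after burner losses
    btu_delivered = heating_therms * BTU_PER_THERM * efficiency
    # Average BTU/hr needed per degree F below the HDD base temp
    btu_per_degree_hour = btu_delivered / (heating_degree_days * 24)
    # Scale up to the outdoor design temperature
    return btu_per_degree_hour * (hdd_base_f - design_temp_f)

# Hypothetical billing period: 200 therms, 1200 HDD (base 65F),
# 80% efficient boiler, 20 therms assumed to be hot water.
load = implied_design_load(200, 1200, 0.80, non_heating_therms=20)
print(f"Implied design load: {load:,.0f} BTU/hr")
```

With those example inputs the implied load comes out around 35,500 BTU/hr. If a number like that lands anywhere near the gas company's figure, their data isn't wildly off.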
Blown cellulose tightens up walls by quite a bit, something on the order of 90% tighter than low-density batts or low-density blown fiberglass. The installation process fills in all voids & anomalies and plugs crack-leaks with cellulose. The only way to get tighter in a retrofit is with ultra-fine new-school blown fiberglass dense-packed to 1.8 lbs/ft^3 or higher, or with expanding polyurethane foam (or non-expanding injection foam). Cellulose is better than you might think, independent of absolute R value.
When having the heat loss calc done, insist on using 68F for the interior temp and -6F as the design temp. If they use different numbers, make them explain why. I've had heating professionals tell me with a straight face to use values that were more than 10F lower than my local 99th percentile design temp, which is something I'm very loath to do. Since heat load tools tend to overshoot by double-digit percentages even with the correct inputs, adding another ~15% error to the high side is a ridiculous thing to do.
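To put a number on that padding, here's a quick sketch. It assumes the calculated load scales linearly with the indoor/outdoor temperature difference, and it uses -16F as an example of a design temp padded by 10F; the function name is just for illustration.

```python
def oversize_fraction(interior_f, design_f, padded_design_f):
    """Fractional increase in calculated load from padding the design temp.

    Assumes load is proportional to (interior temp - outdoor design temp).
    """
    true_delta = interior_f - design_f            # e.g. 68 - (-6) = 74F
    padded_delta = interior_f - padded_design_f   # e.g. 68 - (-16) = 84F
    return padded_delta / true_delta - 1.0

# 68F interior, -6F design temp, padded by 10F to -16F (example value)
extra = oversize_fraction(68.0, -6.0, -16.0)
print(f"Extra load from padding: {extra:.1%}")
```

A 10F pad on a 74F delta works out to about 13.5%, and anything more than 10F pushes it higher, which is where the ~15% comes from.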