(1) Census data on who lives where, and, ideally, the paired addresses in the journey-to-work questions.
(2) Map data to make sense of those addresses.
(3) Interstate data: number of lanes, location of overpasses, grade and curvature details, location of tunnels (especially ones that aren't just a single tunnel for the whole interstate both ways), bridges, multi-deck highways.
(4) Traffic data: in an ideal world, by every hour of the year. In a less ideal world, at least enough to know what things look like on holidays (holiday weekends, especially), Sundays, Saturdays, and morning and evening commutes. For rural interstate that provides ag access (such as I-90 through Montana), the pattern and speed of agricultural vehicles. If you're taking away one of the 2 lanes each way, and a tractor is moving down the one remaining car travel lane at <35 mph, what does _that_ do? It may or may not matter, but if it's going to result in multi-hour backups that extend for a dozen miles, you'd sort of like to know about it ahead of time.
(5) Truck traffic data: volume of freight shipments already moving in which corridors, especially LTL. Be sweet if you could get some kind of dollar amount on that.
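To get a feel for why the tractor question in (4) needs real traffic counts, here is a rough sketch using the textbook kinematic-wave (shockwave) picture of a queue forming behind a slow vehicle on a single lane. Everything numeric in it is made up for illustration: the function names, the 600 vehicles per hour, the 70 mph free-flow speed, the 30 mph tractor, and the 50 vehicles per mile packed in behind it are all assumptions, not data.

```python
def backup_growth_mph(flow_vph, free_speed_mph, tractor_speed_mph, queue_density_vpm):
    """Miles of queue added per hour behind a slow vehicle nobody can pass.

    Simple kinematic-wave model: approaching traffic has density
    flow / free_speed, the platoon behind the tractor has an assumed
    density, and the tail of the platoon is the shockwave between them.
    """
    if tractor_speed_mph >= free_speed_mph:
        return 0.0  # nobody catches up, so no queue forms
    upstream_density = flow_vph / free_speed_mph  # vehicles per mile approaching
    if upstream_density >= queue_density_vpm:
        raise ValueError("approaching traffic is already denser than the assumed queue")
    # Queue length grows at (tractor speed - shockwave speed), which simplifies to:
    return (upstream_density * (free_speed_mph - tractor_speed_mph)
            / (queue_density_vpm - upstream_density))


def hours_until_backup(miles, **traffic):
    """Hours for the queue to reach the given length, under the same assumptions."""
    rate = backup_growth_mph(**traffic)
    return float("inf") if rate == 0 else miles / rate


if __name__ == "__main__":
    # All four numbers below are illustrative guesses, not measurements.
    guess = dict(flow_vph=600, free_speed_mph=70,
                 tractor_speed_mph=30, queue_density_vpm=50)
    print(f"queue grows about {backup_growth_mph(**guess):.1f} miles per hour")
    print(f"a 12 mile backup takes about {hours_until_backup(12, **guess):.2f} hours")
```

With those guesses the dozen-mile backup shows up in under an hour and a half, but the answer swings a lot with the hourly flow, which is exactly the number you'd need the traffic data for.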
In order for this system to really work, there almost has to be some kind of surface transportation available along the overpasses where stations are located. The author suggests trams, but buses or jitneys could easily work as well. The construction of interstates is such that these stations will virtually never be within walking distance of a sufficient density of places people want to go; public transit in the country has little motivation to locate stops on these overpasses (altho some buses do stop at park and ride lots located right on a freeway entrance/exit).
Judging by my limited understanding of How to Build a Quick and Dirty Railroad and Make Money on It So You Can Afford to Improve the Line Over Time, you want this thing to connect to communities with people who will be willing to pay you to move them or their stuff from where they are to somewhere else. The Interstate system is generally not bad for that, but some parts of the interstate system are better than others, and in some interesting ways. For example, I-5 through Seattle would in theory generate some lush passenger and freight traffic, altho it might be a bit tricky biting off the 1-3 lanes per direction that would enable one to run a combination of express/through and local trains (I say 1-3 because you might want a lane for people blazing straight up 5 with a stop downtown, a lane for an express that separates at I-90, and a lane for the train that stops at all the stops. And don't be thinking you're going to take the existing HOV/bus lanes, because those have a finer granularity yet.). By the time you make I-5 work for a train setup, you might have precluded even buses, never mind anything else.
Not so in a city with fewer choke points, like Phoenix or Atlanta. You can carve off a few lanes there, and assuming you pull enough vehicles off the road and put their drivers in the trains, there should still be enough lanes left to permit some kind of non-railed traffic. OTOH, you might be getting into trouble, in that you'll be serving so many miles of interstate that the economic density of passengers and light freight is not enough to support the rails.
Moving the problem to a place like San Francisco, well, I think you'll probably be going around San Francisco. But I could be wrong; that's why I want the grade and lane data. And it'll be really interesting trying to figure out what order to build this system in -- my guess is that connecting Denver to everything else isn't going to be really high on the priority list, and Seattle to Portland's connection to anything else is downright fascinating.
Since our proponent is really enamored of a wider gauge (and there are some good reasons to contemplate a wider gauge, not least of which is smoothness of ride and capaciousness/sturdiness of vehicle), you can't easily solve the "gee, that interstate grade is unworkable, can I just run on a Real Railway Right of Way for the Nasty Bits?" problem. Altho you could. If it turns out that the number of miles of I-Train and the number of miles of standard gauge in the country are roughly comparable, and somehow I-Train gets built not-standard-gauge, I would sort of expect that a lot of the standard gauge right of way (particularly over difficult terrain where the existing railroad worked crazy hard to get it below 2.2%, and the highway grade is pushing 6% or has a waiver for more, or the curvature is insanely tight or...) might acquire a third rail to permit both kinds of train to run on it.
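If the grade data ever turned up, the first pass at finding the Nasty Bits could be about this simple. This is only a sketch: the `Segment` record, the `split_by_grade` helper, the segment names, and every mileage and grade in the sample list are invented for illustration, and the 2.2% cutoff is just the railroad figure mentioned above reused as a stand-in for whatever you decided was unacceptable.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str             # hypothetical label for a stretch of interstate
    miles: float
    max_grade_pct: float  # steepest grade on the stretch, in percent


def split_by_grade(segments, limit_pct):
    """Separate stretches a train could plausibly use from the Nasty Bits."""
    ok = [s for s in segments if s.max_grade_pct <= limit_pct]
    nasty = [s for s in segments if s.max_grade_pct > limit_pct]
    return ok, nasty


if __name__ == "__main__":
    # Invented sample data, not real interstate measurements.
    sample = [
        Segment("valley stretch", 40.0, 1.0),
        Segment("foothills", 15.0, 3.5),
        Segment("mountain pass", 8.0, 6.0),
    ]
    ok, nasty = split_by_grade(sample, limit_pct=2.2)
    print("usable miles:", sum(s.miles for s in ok))
    print("miles needing some other right of way:", sum(s.miles for s in nasty))
```

The interesting output is the second number: miles where you'd be looking at borrowing (and third-railing) somebody else's right of way.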
So you'd need a dataset that included all the standard gauge track, traffic on it in terms of congestion and in terms of how much it's charging to run on it, but you _would not_ necessarily need grade details because you can just sort of assume it'll be better than whatever you decided was unacceptable. And then you'd have to do some kind of calculation in terms of figuring out whether to put catenary on it and if so in what order. It would also be nice to put together some kind of guesstimate in terms of which high speed rail corridors will get developed in which order, since as those come online, you probably _don't_ want to duplicate that service on this system, because that kind of thing kills both and leaves everyone looking foolish; further, it probably brings the number of lanes you need to eat on I-5 through Seattle down to 2 each way, and that kind of thing could be very, very helpful.
I'm going to stop now.