Where's the Real Bottleneck in Scientific Computing?
Scientists would do well to pick up some tools widely used in the software industry
When I first started doing computational science in 1986, a new generation of fast, cheap chips had just ushered in the current era of low-cost supercomputers, in which multiple processors work in parallel on a single problem. Suddenly, it seemed as though everyone who took number crunching seriously was rewriting his or her software to take advantage of these new machines. Sure, it hurt—the compilers that translated programs to run on parallel computers were flaky, debugging tools were nonexistent, and thinking about how to solve problems in parallel was often like trying to solve a thousand crossword puzzles at once—but the potential payoff seemed enormous. Many investigators were positive that within a few years, computer modeling would let scientists investigate a whole range of phenomena that were too big, too small, too fast, too slow, too dangerous or too complicated to examine in the lab or to analyze with pencil and paper.
But by the mid-1990s, I had a nagging feeling that something was wrong. For every successful simulation of global climate, there were a dozen or more groups struggling just to get their program to run. Their work was never quite ready to showcase at conferences or on the cover of their local supercomputing center's newsletter. Many struggled on for months or years, tweaking and tinkering until their code did something more interesting than grinding to a halt or dividing by zero. For some reason, getting to computational heaven was taking a lot longer than expected.
I therefore started asking scientists how they wrote their programs. The answers were sobering. Whereas a few knew more than most of the commercial software developers I'd worked with, the overwhelming majority were still using ancient text editors like Vi and Notepad, sharing files with colleagues by emailing them around and testing by, well, actually, not testing their programs systematically at all.
I finally asked a friend who was pursuing a doctorate in particle physics why he insisted on doing everything the hard way. Why not use an integrated development environment with a symbolic debugger? Why not write unit tests? Why not use a version-control system? His answer was, "What's a version-control system?"
A version-control system, I explained, is a piece of software that monitors changes to files—programs, Web pages, grant proposals and pretty much anything else. It works like the "undo" button on your favorite editor: At any point, you can go back to an older version of the file or see the differences between the way the file was then and the way it is now. You can also determine who else has edited the file or find conflicts between their changes and the ones you've just made. Version control is as fundamental to programming as accurate notes about lab procedures are to experimental science. It's what lets you say, "This is how I produced these results," rather than, "Um, I think we were using the new algorithm for that graph—I mean, the old new algorithm, not the new new algorithm."
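To make that description concrete, here is a minimal sketch in Python of the three things the explanation above mentions: going back to an older version, seeing what changed, and seeing who made the changes. It is a toy, not how any real version-control system is built. The class name `TinyHistory`, the authors "alice" and "bob", and the file contents are all invented for illustration. Real tools also store history on disk, merge conflicting edits and much more.

```python
import difflib


class TinyHistory:
    """Toy in-memory history for a single file (hypothetical, for illustration)."""

    def __init__(self):
        # Every saved version, oldest first, as (author, text) pairs.
        self.versions = []

    def commit(self, author, text):
        """Record a new version and return its index in the history."""
        self.versions.append((author, text))
        return len(self.versions) - 1

    def checkout(self, index):
        """Return the text of the file exactly as it was at a given version."""
        return self.versions[index][1]

    def authors(self):
        """Return the sorted names of everyone who has saved a version."""
        return sorted({author for author, _ in self.versions})

    def diff(self, old, new):
        """Return the line-by-line differences between two versions."""
        old_lines = self.checkout(old).splitlines()
        new_lines = self.checkout(new).splitlines()
        return list(
            difflib.unified_diff(
                old_lines,
                new_lines,
                fromfile=f"v{old}",
                tofile=f"v{new}",
                lineterm="",
            )
        )


# Made-up example: two people edit the same settings file.
history = TinyHistory()
v0 = history.commit("alice", "step = 0.1\nmethod = euler\n")
v1 = history.commit("bob", "step = 0.1\nmethod = runge-kutta\n")

print("\n".join(history.diff(v0, v1)))  # what changed between the two versions
print(history.authors())                # who has edited the file
print(history.checkout(v0))             # the file as it was originally
```

The diff marks lines removed from the older version with a leading minus sign and lines added in the newer one with a leading plus sign, which is the kind of record that lets you say which algorithm produced which graph.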
My friend was intelligent and intimately familiar with the problems of writing large programs—he had inherited more than 100,000 lines of computer code and had already added 20,000 more. Discovering that he didn't even know what version control meant was like finding a chemist who didn't realize she needed to clean her test tubes between experiments. It wasn't a happy conversation for him either. Halfway through my explanation, he sighed and said, "Couldn't you have told me this three years ago?"