Where's the Real Bottleneck in Scientific Computing?
Scientists would do well to pick up some tools widely used in the software industry
A Little Knowledge
In 1998, Brent Gorda (now at Lawrence Livermore National Laboratory) and I started trying to address this issue by teaching a short course on software-development skills to scientists at Los Alamos National Laboratory. Our aim wasn't to turn LANL's physicists and metallurgists into computer scientists. Instead, we wanted to show them the 10 percent of modern software engineering that would handle 90 percent of their needs.
The first few rounds had their ups and downs, but from what participants said, and from what they did after the course was over, it was clear that we were on the right track. A few techniques, and an introduction to the tools that supported them, could save scientists immense frustration. What's more, we found that most scientists were very open to these ideas, which probably shouldn't have surprised us as much as it did. After all, the importance of being methodical had been drilled into them from their first undergraduate lab.
Six years and one dot-com boom later, I received funding from the Python Software Foundation to bring that course up to date and to make it available on the Web under an open license so that anyone who wants to use it is free to do so. It covers tools and working practices that can improve both the quality of what scientific programmers produce, and the speed with which they produce it, so that they can spend less time wrestling with their programs and more doing their research. Topics include version control, automating repetitive tasks, systematic testing, coding style and reading code, some basic data crunching and Web programming, and a quick survey of how to manage development in a small, geographically distributed team. None of this is rocket science—it's just the programming equivalent of knowing how to titrate a solution or calibrate an oscilloscope.