Where's the Real Bottleneck in Scientific Computing?

Scientists would do well to pick up some tools widely used in the software industry

Greg Wilson

As the Twig Is Bent…

Once I knew to look, I saw this "computational illiteracy" everywhere. Most scientists had simply never been shown how to program efficiently. After a generic freshman programming course in C or Java, and possibly a course on statistics or numerical methods in their junior or senior year, they were expected to discover or reinvent everything else themselves, which is about as reasonable as showing someone how to differentiate polynomials and then telling them to go and do some tensor calculus.

Yes, the relevant information was all on the Web, but it was, and is, scattered across hundreds of different sites. More important, people would have to invest months or years acquiring background knowledge before they could make sense of it all. As another physicist (somewhat older and more cynical than my friend) said to me when I suggested that he take a couple of weeks and learn some Perl, "Sure, just as soon as you take a couple of weeks and learn some quantum chromodynamics so that you can do my job."

His comment points at another reason why many scientists haven't adopted better working practices. After being run over by one bandwagon after another, these investigators are justifiably skeptical when someone says, "I'm from computer science, and I'm here to help you." From object-oriented languages to today's craze for "agile" programming, scientists have suffered through one fad after another without their lives becoming noticeably better.

Scientists are also often frustrated by the "accidental complexity" of what computer science has to offer. For example, every modern programming language provides a library for regular expressions, which are patterns used to find data in text files. However, each language's rules for how those patterns actually work are slightly different. When something as fundamental as the Unix operating system itself has three or four slightly different notations for the same concept, it's no wonder that so many scientists throw up their hands in despair and stick to lowest common denominators.
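To make the regular-expression complaint concrete, here is a minimal sketch using Python's standard re module. The sample text and variable names are invented for illustration. The comments describe how POSIX basic regular expressions (the dialect used by grep without the -E flag) read the same characters differently. That description is a general statement about the two notations, not something the code itself runs.

```python
import re

# Hypothetical sample text, chosen only to show the dialect differences.
text = "f(x) = a+b, then aaa"

# In Python's dialect, "+" is a quantifier: "a+" means "one or more a".
python_matches = re.findall(r"a+", text)

# In POSIX basic regular expressions, an unescaped "+" is an ordinary
# character, so the same pattern would look for the two characters "a+".
# To get that meaning in Python, the "+" has to be escaped.
literal_matches = re.findall(re.escape("a+"), text)

# Parentheses are reversed as well: Python uses "(" for grouping and "\("
# for a literal parenthesis, while POSIX basic syntax does the opposite.
paren_match = re.search(r"f\((\w)\)", text)

print(python_matches)        # runs of the letter a
print(literal_matches)       # the literal string "a+"
print(paren_match.group(1))  # the argument inside f(...)
```

The same six characters of pattern can therefore mean different things depending on which tool reads them, which is the kind of accidental complexity described above.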

Just how big an impact is the lack of programming savvy among scientists having? To get a handle on the answer, consider a variation on one of the fundamental rules of computer architecture, known as Amdahl's Law. Suppose that it takes six months to write and debug a program that then has to run for another six months on today's hardware to generate publishable results. Even an infinitely fast computer (perhaps one thrown backward in time by some future physics experiment gone wrong) would only cut the mean time between publications in half, because it would only eliminate one restriction in the pipeline. Increasingly, the real limit on what computational scientists can accomplish is how quickly and reliably they can translate their ideas into working code.
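The arithmetic behind this variation on Amdahl's Law can be sketched in a few lines of Python. The function names and the two-phase model (development time plus run time, with only the run time sped up by hardware) are assumptions made for illustration; the six-month figures come from the example above.

```python
import math


def months_per_publication(dev_months, run_months, hardware_speedup):
    """Months from idea to publishable result.

    Only the run phase benefits from faster hardware; the time spent
    writing and debugging the program is unchanged.
    """
    if hardware_speedup <= 0:
        raise ValueError("hardware_speedup must be positive")
    return dev_months + run_months / hardware_speedup


def overall_speedup(dev_months, run_months, hardware_speedup):
    """Ratio of the baseline time to the time with faster hardware."""
    baseline = dev_months + run_months
    return baseline / months_per_publication(
        dev_months, run_months, hardware_speedup
    )


# Six months of programming followed by six months of computing.
# An infinitely fast machine removes the run phase entirely.
with_infinite_hardware = months_per_publication(6, 6, math.inf)
best_case_speedup = overall_speedup(6, 6, math.inf)

# Halving development time on today's hardware, for comparison.
with_faster_coding = months_per_publication(3, 6, 1)

print(with_infinite_hardware, best_case_speedup, with_faster_coding)
```

Under these assumptions, no amount of hardware can push the overall speedup past a factor of two, while every month saved in development shortens the cycle directly.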
