The Nature of Scientific Proof in the Age of Simulations
Is numerical mimicry a third way of establishing truth?
Bigger, Better, Faster: A Need for Standards of Quality
A fundamental limitation of any simulation is that there is a practical limit to how finely one may slice space and time in a computer such that the simulation completes within a reasonable amount of time (say, within the duration of one’s Ph.D. thesis). For multiscale problems, there will always be phenomena operating on scales smaller than the size of one’s simulation pixel. Astrophysicists call these subgrid physics—literally physics happening below the grid of the simulation. This difficulty of simulating phenomena from microscopic to macroscopic scales, across many, many orders of magnitude in size, is known as a dynamic range problem.
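The arithmetic behind the dynamic range problem can be made concrete with a toy cost estimate. The sketch below is illustrative only: the function name and the scaling argument (halving the cell size of a 3D simulation multiplies the cell count by 8 and, via the time-step constraint, the number of steps by 2) are simplifying assumptions, not a description of any particular code.

```python
def relative_cost(refinement: int) -> int:
    """Cost of a 3D simulation relative to refinement level 0.

    Each refinement level halves the cell size: the number of cells grows
    by 2**3 and the number of time steps by 2 (the time step must shrink
    with the cell size), so the total cost grows by roughly 2**4 = 16.
    """
    return 16 ** refinement

# A few levels of refinement already become expensive; reaching scales
# 100 billion times smaller than the simulation box (stars within a
# galaxy) would require ~37 halvings, far beyond any computer.
for level in range(4):
    print(level, relative_cost(level))
```

This is why "just run it at higher resolution" cannot, by itself, close the gap between the grid scale and the subgrid physics.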
As computers become more powerful, one may always run simulations that explore a greater range of sizes and divide up space and time ever more finely, but in multiscale problems there will always be unresolved subgrid phenomena. Astrophysics and climate science appear to share this nightmare. In simulating the formation of galaxies, the birth, evolution, and death of stars determine the global appearance of the synthetic galaxies; yet galaxies typically span tens of thousands of light-years, whereas stars operate on scales roughly 100 billion times smaller.
The climate of Earth appears to be significantly influenced by clouds, which both heat and cool the atmosphere. On scales of tens to hundreds of kilometers, it is the imperfect cancellation between these two effects that matters. To get the details of this cancellation correct, we need to understand how clouds form and how their emergent properties develop, which ultimately requires an intimate understanding of how the microscopic seed particles of clouds are first created. Remarkably, uncertainties about cloud formation on such fine scales even hinder our ability to predict whether a given exoplanet is potentially habitable. Cloud formation remains a largely unsolved puzzle across several scientific disciplines. In both examples, simulating the entire range of phenomena remains challenging, because of the prohibitive amount of computing time required and our incomplete understanding of the physics involved on smaller scales.
Another legitimate concern is the use of simulations as “black boxes” to churn out results and generate seductive graphics or movies without deeply questioning the assumptions involved. For example, simulations involving the Navier-Stokes equation often assume a Newtonian fluid—one that retains no memory of how it was deformed in the past and whose frictional resistance is directly proportional to how fast its layers slide past one another. Newtonian fluids are a plausible starting point for a rich variety of simulations, ranging from planetary atmospheres to accretion disks around black holes. Curiously, several common fluids are non-Newtonian. Dough is an example of a fluid with a memory of its past states, whereas ketchup becomes less viscous the more it is deformed. Attempting to simulate these fluids using a Newtonian assumption is an exercise in futility.
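The distinction can be sketched in a few lines of code. The power-law (Ostwald-de Waele) model used below is a standard idealization of a shear-thinning, ketchup-like fluid; the function names and coefficient values are made up purely for illustration.

```python
def newtonian_stress(shear_rate: float, mu: float = 1.0) -> float:
    """Newtonian fluid: stress proportional to shear rate, viscosity constant."""
    return mu * shear_rate

def power_law_stress(shear_rate: float, k: float = 1.0, n: float = 0.5) -> float:
    """Ostwald-de Waele model: n < 1 gives shear thinning (ketchup-like)."""
    return k * shear_rate ** n

def apparent_viscosity(shear_rate: float, k: float = 1.0, n: float = 0.5) -> float:
    """Effective viscosity of the power-law fluid falls as shearing grows."""
    return k * shear_rate ** (n - 1.0)

# Shear the fluid harder and its apparent viscosity drops -- behavior a
# Newtonian assumption, with its constant viscosity, can never capture.
for rate in (1.0, 4.0, 16.0):
    print(rate, apparent_viscosity(rate))
```

A simulation built on `newtonian_stress` alone would miss this entire class of behavior, no matter how fine its grid.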
To use a simulation as a laboratory, one has to understand how to break it—otherwise, one may mistake an artifact for a result. In approximating continua as discrete, one pays multiple penalties: slicing up space and time may introduce spurious oscillations that look like real waves, or an enhanced viscosity that makes the fluid artificially sticky, and either artifact is easily misinterpreted as physically meaningful. The conservation of mass, momentum, and energy—cornerstones of theoretical physics—may no longer be taken for granted in a simulation; it depends on the numerical scheme being employed, even if the governing equation conserves all of these quantities perfectly on paper.
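One can "break" a simulation deliberately to see such an artifact appear. The sketch below advects a sharp square pulse with the classic first-order upwind scheme: the governing equation preserves the pulse shape exactly, but the discretization smears it out as if the fluid were diffusive. The grid size, Courant number, and pulse shape are arbitrary choices for illustration.

```python
def upwind_advect(u, courant, steps):
    """Advance du/dt + c du/dx = 0 on a periodic grid (courant = c*dt/dx).

    First-order upwind differencing: stable for 0 < courant <= 1, but it
    introduces numerical diffusion that is purely an artifact of the grid.
    """
    u = list(u)
    n = len(u)
    for _ in range(steps):
        # u[i - 1] wraps around at i = 0, giving periodic boundaries.
        u = [u[i] - courant * (u[i] - u[i - 1]) for i in range(n)]
    return u

n = 100
pulse = [1.0 if 10 <= i < 20 else 0.0 for i in range(n)]

# With courant = 0.5, after 200 steps the pulse has crossed the periodic
# domain exactly once and should return unchanged -- yet it comes back
# visibly smeared, its peak well below the original value of 1.0.
smeared = upwind_advect(pulse, courant=0.5, steps=200)
print(max(smeared))
```

Notably, the scheme still conserves the total "mass" (the sum over all cells) to machine precision even while it destroys the pulse shape, which is exactly why a single conservation check is not enough to certify a simulation.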
Despite these concerns, a culture of “bigger, better, faster” prevails. It is not uncommon to hear discussions centered on how to make one’s code more complex and run even faster on a mind-boggling number of computing cores. It is almost as if gathering exponentially increasing amounts of information will automatically translate into knowledge, as if the simulated system will attain self-awareness. As terabytes upon terabytes of information are churned out by ever more massive simulations, the gulf between information and knowledge is widening. We appear to be missing a set of guiding principles—a metacomputational astrophysics, for lack of a better term.
Questions for metacomputational astrophysics include: Is scientific truth more robustly represented by the simplest or the most complex model? (Many would say simplest, but this view is not universally accepted.) How may we judge when a simulation has successfully approximated reality in some way? (The visual inspection of a simulated image of, say, a galaxy versus one obtained with a telescope is sentimentally satisfying, but objectively unsatisfactory.) When is “bigger, better, faster” enough? Does one obtain an ever-better physical answer by simply ramping up the computational complexity?
An alternative approach is to construct a model hierarchy—a suite of models of varying complexity that develops understanding in steps, allowing each physical effect to be isolated. Model hierarchies are standard practice in climate science. Focused models of microprocesses (turbulence, cloud formation, and so on) buttress global simulations of how the atmosphere, hydrosphere, biosphere, cryosphere, and lithosphere interact.
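The simplest rung of such a climate-model hierarchy can be written in a few lines: a zero-dimensional energy-balance model, in which a single equation isolates a single effect, namely how planetary albedo (set in part by clouds) fixes the equilibrium temperature. The numbers below are standard textbook figures, and the function name is merely illustrative.

```python
SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def equilibrium_temperature(solar_constant: float, albedo: float) -> float:
    """Equilibrium temperature (K) where absorbed sunlight balances
    emitted infrared: S * (1 - albedo) / 4 = SIGMA * T**4."""
    return (solar_constant * (1.0 - albedo) / (4.0 * SIGMA)) ** 0.25

# Earth-like numbers: S = 1361 W/m^2 and albedo = 0.3 give roughly 255 K,
# the familiar no-greenhouse equilibrium temperature. Varying the albedo
# isolates the cloud effect that the full simulations struggle to resolve.
print(equilibrium_temperature(1361.0, 0.3))
```

Only after this one-equation rung is understood does one climb to models with an atmosphere, then a hydrosphere, and so on, with each added layer of complexity interpretable against the simpler rung below it.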