Statistics on the Table: The History of Statistical Concepts and Methods. Stephen M. Stigler. 448 pp. Harvard University Press, 1998. $45.
In 1910 Karl Pearson and an assistant tested the claim, ardently voiced by the temperance movement, that parental alcoholism harmed children. Based on data gathered for other purposes, they found no adverse effect. In the ensuing debate, Pearson issued this challenge to a critic: "Statistics—and unselected statistics—on the table, please." Pearson's determination to put qualitative claims to quantitative test will strike most modern readers as reasonable. But it was controversial at the time.
In Statistics on the Table: The History of Statistical Concepts and Methods, statistician and historian of science Stephen M. Stigler collects and revises 22 of his scholarly and often witty essays from the past 25 years, reflecting the combination of detective work and statistical thinking that characterizes his research. As in his classic History of Statistics, Stigler shows modern practitioners of statistics that even basic concepts, methods and applications had to be discovered incrementally and were by no means self-evident to even the most creative and sophisticated thinkers of the past. Although some points that engross Stigler may strike nonhistorians as arcane, most of them compellingly serve this purpose.
To Stigler, the primary barrier to applying statistics in the social sciences in the 19th century, against which Pearson butted even early in the 20th, was the want of a single unifying theory. Until then, statistical methods had been restricted to such fields as astronomy, where error could be attributed solely to measurement. In the social sciences, a host of unknown causes might affect observations. Stigler views Adolphe Quetelet as having catalyzed statistical approaches in social science through his work on "social physics" in the 1830s.
Nineteenth-century scientists who appreciated the importance of quantifying uncertainty did not necessarily use probability—essential in statistical theory—in their own work, but their insights paved the way for later advances by Francis Ysidro Edgeworth, Francis Galton and others. For instance, William Stanley Jevons, who performed an influential analysis of the impact of the 1849 gold discoveries on gold prices, soundly argued that local causal factors should trigger not exclusion of supposedly exceptional cases but rather the inclusion of more cases so that variations between cases would cancel one another. Interestingly, Stigler attributes the relatively early acceptance of statistics in psychology, as opposed to economics or sociology, to the possibility of experimental design, which permitted construction of meaningful baselines via randomization.
Stigler does not revel in exposing the statistical debacles of earnest scientists like Horace Secrist, but expose them he must. Charged with uncovering the causes of the Depression, Secrist conducted a battery of time-series analyses tracking American businesses before and during that period. This work led him to the startling discovery that businesses were converging toward a state of mediocrity. In one analysis, he grouped 49 department stores into quartiles and plotted their profitability from 1920 to 1930; all four groups converged toward the overall mean over time. After his book on the subject received some favorable reviews, Secrist's splashy message took a nose dive when Harold Hotelling noted that he had fallen prey to the regression fallacy. Loosely put, one can expect stores that do particularly well or badly at one point (and are grouped together on that basis) to perform closer to the overall mean at a later point merely due to random variations in performance.
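The regression fallacy described above can be illustrated with a small simulation. This is a minimal sketch and not a reconstruction of Secrist's data: the function name `simulate_quartile_means`, the number of stores, and the noise levels are all assumptions chosen for illustration. Each hypothetical store has a fixed underlying profitability plus independent year-to-year noise, so nothing in the model pushes stores toward mediocrity.

```python
import random
import statistics


def simulate_quartile_means(n_stores=4000, skill_sd=1.0, noise_sd=1.0, seed=0):
    """Group hypothetical stores into quartiles by first-period performance
    and return (first-period mean, later-period mean) for each quartile.

    All parameters are illustrative assumptions, not figures from the book.
    """
    rng = random.Random(seed)
    # Fixed underlying profitability of each store: it never changes.
    skill = [rng.gauss(0.0, skill_sd) for _ in range(n_stores)]
    # Observed performance = fixed profitability + independent random noise.
    first = [s + rng.gauss(0.0, noise_sd) for s in skill]
    later = [s + rng.gauss(0.0, noise_sd) for s in skill]

    # Rank stores by first-period performance only, as in a quartile grouping.
    order = sorted(range(n_stores), key=lambda i: first[i])
    size = n_stores // 4
    rows = []
    for q in range(4):
        members = order[q * size:(q + 1) * size]
        rows.append((
            statistics.mean([first[i] for i in members]),
            statistics.mean([later[i] for i in members]),
        ))
    return rows


if __name__ == "__main__":
    for q, (first_mean, later_mean) in enumerate(simulate_quartile_means(), 1):
        print(f"quartile {q}: first {first_mean:+.2f}, later {later_mean:+.2f}")
```

In this sketch the bottom quartile's later mean lies above its first-period mean and the top quartile's lies below, even though no store's underlying profitability has changed. The apparent convergence comes entirely from grouping on a noisy measurement.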
Misunderstanding of error variance also figures prominently in the "trial of the Pyx." Established in medieval times, the trial involved testing London's Royal Mint coins for adherence to standards of weight and fineness. In the procedure, the weight of a sample of coins was compared to the standard weight for that number of coins. Realizing that some error had to be allowed, the devisers of the trial specified a "remedy," or a margin of error within which the total weight must lie. During the hundreds of years that the trial of the Pyx was held (until 1977), only twice was the remedy not met—an apparently remarkable feat. But by analyzing surviving descriptions of the procedure, Stigler finds that the remedy was allowed to increase linearly with the number of coins, n, rather than with n's square root, as a modern conception of error distributions would suggest. Given the sample sizes typically employed, the total weight could therefore lie hundreds of standard deviations away from the aggregate standard and still meet the remedy. Thus, by shorting coins, the master of the Mint could have turned a tidy profit. Isaac Newton served as master from 1699 to 1727, a fact that has understandably excited speculation in historical circles about his grasp (or nongrasp) of the underlying principle.
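The scaling argument can be made concrete with a short calculation. This is a sketch under stated assumptions: coin weight errors are taken to be independent with a common standard deviation, and the function name `remedy_in_sd_units` and the numbers passed to it are hypothetical, not values from the Mint's records. Under independence, the standard deviation of the total weight of n coins grows as the square root of n, while a remedy that grows linearly is n times the per-coin allowance.

```python
import math


def remedy_in_sd_units(n_coins, per_coin_remedy, per_coin_sd):
    """Express a linearly scaled remedy as a multiple of the standard
    deviation of the total weight of n_coins coins.

    Assumes independent coin errors with a common standard deviation;
    all inputs are hypothetical illustration values.
    """
    total_remedy = n_coins * per_coin_remedy       # grows like n
    total_sd = per_coin_sd * math.sqrt(n_coins)    # grows like sqrt(n)
    return total_remedy / total_sd                 # grows like sqrt(n)


if __name__ == "__main__":
    # Hypothetical case: the per-coin remedy equals one per-coin standard deviation.
    for n in (1, 100, 10_000):
        print(n, remedy_in_sd_units(n, per_coin_remedy=1.0, per_coin_sd=1.0))
```

With a per-coin remedy of one standard deviation, the tolerance for the total is 1 standard deviation at n = 1 but 100 standard deviations at n = 10,000. The larger the sample, the more room a linear remedy leaves for a systematic shortfall per coin to pass unnoticed.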
Stigler devotes particular attention to questions of scientific priority, starting with Stigler's Law of Eponymy: "No scientific discovery is named after its original discoverer." (True to the Law, Stigler acknowledges sociologist of science Robert K. Merton as having earlier expressed many of his points on this topic.) In an amusing passage supporting the Law's descriptive validity, he writes: "Recent scholarship has shown that ... the Pythagorean theorem was known before Pythagoras, was first proved after Pythagoras and in fact Pythagoras may have been unaware of the geometrical significance of the theorem."
What underlies the Law of Eponymy? Stigler mainly attributes it to a desire to honor the best scientists in a nonpartisan way and thereby to win consensus for new terms. Consequently, he argues, most discoveries are eponymously ascribed to scientists remote from the naming community in space and time. He marshals support for this explanation by analyzing a frequency diagram of printed references to either Laplace or Gauss as the discoverer of the normal distribution, split by year and country of publication. In another characteristic application of statistical methodology to a historical question, Stigler attempts to ascertain whether Bayes's theorem is a misnomer by tallying pieces of evidence in favor of Thomas Bayes and Nicholas Saunderson (whom Stigler considers the most likely alternative) and submitting them to a Bayesian (or Saundersonian) analysis. He concludes half-seriously that the evidence favors Saunderson over Bayes three to one.
Although Stigler has written a general introduction and each essay can stand alone, preambles to the five thematic parts of the collection might have provided a welcome wide-angle perspective. Fortunately, even at the chapter level, Stigler skillfully keeps his attention—and ours—on the forest even when stopping to marvel at the trees.—Valerie M. Chase, Center for Adaptive Behavior and Cognition, Max Planck Institute for Human Development, Berlin