The method of least squares is a familiar and trusted implement in the toolkit of statistics, learned by generations of students in all the sciences. It is the usual procedure for fitting a line or a curve to a set of data points that may be subject to errors of measurement. The invention of the method is usually ascribed to Carl Friedrich Gauss, the superstar of German mathematics in the first half of the 19th century, but the French mathematician Adrien-Marie Legendre had a rival claim to priority. Several others also contributed to the development of the technique, most notably Pierre Simon de Laplace, who was Legendre's senior colleague in the French Academy. All of these names are still immediately recognized today; they are to be found inscribed on marble busts in the main rotunda of the mathematics hall of fame.
But the method of least squares was also invented by another mathematician of the same era, whose fame is more narrowly circumscribed. This lesser-known inventor was Robert Adrain, who published an account of the method—and also of the closely related bell-shaped curve we now know as the normal or Gaussian distribution—at roughly the same time as Gauss's own publication. Yet Gauss and Legendre and Laplace knew nothing of Adrain or his writings. The reason is that Adrain lived and worked and published in an out-of-the-way corner of the world, cut off from communication with the main centers of learning. He spent his career teaching at small institutions with names such as Columbia and Rutgers. He was a citizen of a developing country: the United States of America.
Adrain's story is already well known to historians of mathematics, and I have nothing new to add to the factual record. But the story is worth telling again, if only for what it has to say about the practice of science on the margins. One obvious fact is that it can be very hard to get noticed when you are standing on the farther shore of the ocean, no matter how vigorously you wave your arms. Another truth, even more bitter, is that it's also very hard in those circumstances to do anything worth noticing. And yet there is a more cheerful outlook, at least for those who can afford to be patient: The world turns, and eventually the farther shore may become the center.
The Young Schoolmaster
Adrain was born in Ireland in 1775, in the coastal town of Carrickfergus, near Belfast. What is known of his early years has more to do with politics than mathematics. In 1798 he joined the insurgency of the United Irishmen, a coalition of Catholic and Protestant forces opposed to British rule. He survived a gunshot wound, but after the failure of the rebellion he had to flee the country, escaping to New York with his wife, Anna Pollock, and an infant daughter. He found refuge in Princeton with the widow of Theobald Wolfe Tone, the founder of the United Irishmen.
Adrain seems to have become a teacher of mathematics without ever pausing along the way to be a student. Back in Ireland he had already worked as a schoolmaster and tutor, perhaps as early as age 15, and in America he was soon employed as a teacher at the Princeton Academy (not the university, but a school for younger students). A few years later he took a similar position in York, Pennsylvania, then became principal of yet another academy 50 miles away in Reading. In 1809 Adrain moved his family back across the Delaware River into New Jersey, but this time the calling was a grander one: He was named professor of mathematics and natural philosophy at Queens College in New Brunswick. As a matter of fact, Adrain was the first person to be accorded the title of professor at Queens, and the college had to organize a public lottery to pay his salary. They also awarded him an honorary master's degree (perhaps to help justify the title and the salary).
Despite the largesse of Queens, Adrain was soon on the move again. He accepted a professorship at Columbia College in New York, which conferred another honorary degree, this time with doctoral rank. He stayed at Columbia for more than a dozen years, then in 1826 returned to New Brunswick, where in the meantime Queens College, after an interlude of financial distress, had reopened as Rutgers College (it is now officially styled Rutgers, the State University of New Jersey). But this second tenure in New Brunswick was even shorter than the first; a year later Adrain was wooed away to Philadelphia by the University of Pennsylvania.
Adrain's last years took a curious and humiliating turn. In 1834 he was forced to resign from the Penn faculty, apparently because he couldn't maintain classroom discipline. He wound up teaching at a grammar school in New York—quite a step down for a university professor, although the new job may in fact have been better paid. In 1840 Adrain retired to a farm in New Brunswick, where he died three years later.
Even allowing for the late disappointments, Adrain had a distinguished academic career by American standards of the time. But conditions were certainly different in France or Germany; one can scarcely imagine Gauss being dismissed from Göttingen University because of some unruly students. Moreover, Adrain's teaching duties cannot have left him much time for research; at Penn he was expected to teach the entire mathematics curriculum, from remedial arithmetic through calculus, to all four classes of undergraduates. Nevertheless, he not only wrote original articles but also became editor and publisher of the journals they appeared in.
A $6 Problem
Adrain was among the contributors to the very first mathematical periodical published in the United States, the Mathematical Correspondent, begun in 1804 by George Baron, who had been the first mathematics instructor at West Point. When Baron abandoned the journal, Adrain took over, and then transformed the Correspondent into a magazine of his own, renamed The Analyst, or Mathematical Museum. This enterprise also foundered, after just four issues appeared, but not before Adrain published in its pages his one claim to lasting recognition.
Adrain's moment of inspiration came in 1808, in an article summarizing work on a problem posed by a reader, with the offer of a $6 prize. The prize was awarded to Nathaniel Bowditch (who is better remembered than Adrain, primarily for his American Practical Navigator). As editor, Adrain was disqualified from the prize competition, but his discussion of the problem goes deeper than Bowditch's.
The problem concerns a land surveyor who traces the boundary of a polygonal field, measuring each of its five sides by traversing a prescribed distance on a prescribed angular bearing. At the end, the survey should return to the starting point, forming a closed pentagon, but instead there is a small gap. The $6 problem is to adjust the end points of the five segments so that the path closes. Of course there are innumerable ways to achieve this goal; the idea is to choose from among all possible adjustments those that put the vertices in their most probable positions.
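To make the closure problem concrete, here is a minimal sketch in Python. It is not a reconstruction of Adrain's or Bowditch's solution. It assumes that the errors on the legs are independent and that their variance grows in proportion to leg length; under that assumption, the weighted least-squares adjustment spreads the gap over the legs in proportion to their lengths. The function name adjust_traverse and the survey figures are hypothetical, invented for illustration.

```python
import math

def adjust_traverse(legs):
    """Close a surveyed traverse by spreading the gap in proportion to leg length.

    legs: list of (length, bearing_in_degrees), bearings clockwise from north.
    Returns (adjusted, gap): the adjusted (east, north) displacement of each leg
    and the raw closure gap before adjustment.
    """
    # Convert each leg to east and north displacements.
    raw = [(d * math.sin(math.radians(b)), d * math.cos(math.radians(b)))
           for d, b in legs]
    # A perfect survey would sum to (0, 0); whatever is left over is the gap.
    gap_east = sum(e for e, _ in raw)
    gap_north = sum(n for _, n in raw)
    total = sum(d for d, _ in legs)
    # Assumption: error variance is proportional to leg length, so each leg
    # absorbs a share of the gap equal to its share of the total length.
    adjusted = [(e - gap_east * d / total, n - gap_north * d / total)
                for (e, n), (d, _) in zip(raw, legs)]
    return adjusted, (gap_east, gap_north)

# Hypothetical five-sided survey (made-up numbers, not the 1808 data).
legs = [(100.0, 45.0), (80.0, 120.0), (90.0, 200.0), (70.0, 270.0), (85.0, 340.0)]
adjusted, gap = adjust_traverse(legs)
print("gap before adjustment:", gap)
print("gap after adjustment:",
      (sum(e for e, _ in adjusted), sum(n for _, n in adjusted)))
```

The made-up gap here is far larger than any competent surveyor would tolerate, but the arithmetic is the same for a small one: after adjustment the five displacements sum to zero, up to rounding.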
Adrain begins his analysis by simplifying and generalizing the problem, dispensing with the surveyor's vocabulary of perches and chains and bearings. He writes: "The question which I propose to resolve is this: Supposing AB to be the true value of any quantity, of which the measure by observation or experiment is Ab, the error being Bb; what is the expression of the probability that the error Bb happens in measuring AB?" A tiny diagram like the one in the margin here makes it clear that Adrain is thinking of AB as the length of a line segment, which the error Bb can either increase or decrease. He argues that such errors should have a particular distribution, based on the "evident principle" that the uncertainty in measuring the length of a segment is proportional to the length itself. Since the actual length AB is the unknown quantity in this problem, it is not much use as an error estimator; Adrain brushes aside this subtlety, taking the measured length Ab as the basis.
Now suppose there are two measured segments, with unknown individual errors a and b but a known total error c. If the errors are independent, then the probability of both occurring together is the product of the separate probabilities, Pr(a)Pr(b). (This is another trouble spot in the argument: The hypotheses of proportionality and independence appear to be inconsistent. But Adrain presses on.) The likeliest values of a and b are those that maximize the product Pr(a)Pr(b), subject to the constraints that a+b=c and that the errors are proportional to the measured lengths. Adrain proceeds by taking the derivative of the probability equation and setting it to zero, in the usual process for identifying a maximum or minimum. After several further manipulations—some of them a little murky—he emerges with a famous equation. In modernized notation it states:
$$\Pr(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
This is the equation of the normal distribution, or density, which gives the probability of observing the result x as a function of the true result µ and the standard deviation σ. For any given µ and σ, Pr(x) takes on its maximum value when the expression (x−µ)² is made as small as possible. This fact is the origin of the least-squares principle: The best predictor of a normally distributed variable is the one that minimizes the square of the difference between the observed and the predicted values (summed over all the observations when there are several).
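The link between the bell-shaped curve and least squares can be worked out for the two-segment case described above. What follows is a modern sketch that runs the argument forward, starting from normal errors; it is not Adrain's own derivation, which went the other way. The symbols \sigma_a and \sigma_b, the standard deviations of the two errors, are my assumption and do not appear in the original problem.

```latex
% Independent normal errors a and b, constrained by a + b = c.
\begin{align*}
\Pr(a)\,\Pr(b) &\propto \exp\!\left(-\frac{a^2}{2\sigma_a^2} - \frac{b^2}{2\sigma_b^2}\right) \\
\text{so maximizing the product means minimizing}\quad
S(a) &= \frac{a^2}{\sigma_a^2} + \frac{(c-a)^2}{\sigma_b^2}, \\
\frac{dS}{da} = \frac{2a}{\sigma_a^2} - \frac{2(c-a)}{\sigma_b^2} = 0
\quad&\Longrightarrow\quad
a = c\,\frac{\sigma_a^2}{\sigma_a^2+\sigma_b^2}, \qquad
b = c\,\frac{\sigma_b^2}{\sigma_a^2+\sigma_b^2}.
\end{align*}
```

The most probable errors are thus the ones that minimize a weighted sum of squares. If the variances are taken to be proportional to the measured lengths, the total error c is shared out in proportion to those lengths.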
Adrain went on to give a second derivation of the same distribution, based on a geometric argument. He also applied the least-squares method to four practical problems, including a version of the original prize question.
Meanwhile, Back in Europe...
Pragmatic advice on how best to survey a farmer's field is not something you often find today in a journal of research mathematics, and the practical emphasis of Adrain's work might be taken as a sign of his provincialism. But in fact very similar problems in geodesy and astronomy motivated Gauss and Legendre as well. Indeed, both of them not only analyzed survey results; they also went out in the field and made measurements of their own.
Both Gauss and Legendre introduced the method of least squares in works on astronomy. Legendre was first to publish; he presented the technique (and also coined the name) in a book on comets published in 1805. The method is given as part of a recipe for determining an orbit from a set of observations, but Legendre offered no theoretical justification, and he did not mention probability at all.
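A recipe of this kind needs no probability theory at all, and the simplest case, fitting a straight line, shows it. The sketch below is an illustration only, not Legendre's orbit calculation: it uses the standard closed-form solution for the slope and intercept that minimize the sum of squared vertical deviations. The function name fit_line and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical noisy observations scattered around a straight line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
slope, intercept = fit_line(xs, ys)
print(f"fitted line: y = {slope:.3f} x + {intercept:.3f}")
```

The recipe assumes the x values are not all equal; otherwise sxx is zero and no unique line exists.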
Gauss's treatment of the subject appeared four years later, in his major work on celestial mechanics, Theoria Motus Corporum Coelestium. Based on dates of publication, Legendre would seem to have clear priority, but Gauss remarked that he had been using the method since 1795 and insisted the invention was his. Legendre complained in a bitter letter to Gauss: "There is no discovery that one cannot claim for oneself by saying that one had found the same thing some years previously...." They never made peace.
Although Gauss wasn't first to publish, he was certainly more thorough. His treatment was no ad hoc recipe for curve-fitting; he started from the premise that the arithmetic mean of several independent measurements "gives the most probable value, if not rigorously, yet very nearly, so that it is always most safe to hold onto it." He then derived the normal distribution from this premise, and showed how the distribution implies the method of least squares. (The arithmetic mean can be seen as a special case of the method of least squares.)
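The parenthetical claim about the arithmetic mean is easy to check numerically. Here is a minimal sketch, with made-up measurements and a hypothetical helper named sum_of_squares: among candidate values for a single unknown quantity, the mean gives the smallest sum of squared deviations.

```python
from statistics import fmean

def sum_of_squares(estimate, observations):
    """Sum of squared deviations of the observations from a candidate estimate."""
    return sum((x - estimate) ** 2 for x in observations)

# Hypothetical repeated measurements of one quantity.
observations = [10.2, 9.8, 10.5, 10.1, 9.9]
mean = fmean(observations)

# Compare the mean with a few candidates on either side of it.
candidates = [mean + shift for shift in (-0.5, -0.1, 0.1, 0.5)]
for c in candidates:
    print(f"candidate {c:.2f}: sum of squares {sum_of_squares(c, observations):.4f}")
print(f"mean {mean:.2f}: sum of squares {sum_of_squares(mean, observations):.4f}")
```

Moving the estimate away from the mean by any amount d raises the sum of squares by n times d², where n is the number of observations, so the mean always wins.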
What Gauss could not establish was that the errors in real-world data (from land surveys or comet observations, say) actually follow a normal distribution. Gauss's "law of error" was widely accepted anyway. As Henri Poincaré quipped a century later: "Everyone believes in it, because the experimenters imagine that it is a mathematical theorem, and the mathematicians that it is an experimental fact." The actual scope of the "law" was not rigorously settled until the 20th century, with the proof of the Central Limit Theorem.
Rediscovery and Reappraisal
We have no Science Citation Index for the early 19th century, but it seems a safe bet that the works of Gauss and Legendre were more widely noted than those of Robert Adrain. Indeed, Adrain's papers seem to have gone almost entirely unnoticed for 60 years, until Cleveland Abbe and Mansfield Merriman reprinted some excerpts in the 1870s, both in American journals. The republication finally attracted some attention on the other side of the Atlantic: J. W. L. Glaisher wrote a stern critique. Thereafter, Adrain dropped out of sight again for another 50 years, until Julian L. Coolidge and M. J. Babb wrote appreciative biographical articles in the 1920s. There have been a few more re-appraisals since then, such as those of Dirk Struik, E. R. Hogan and Jacques Dutka. Most important, Stephen M. Stigler has included three of Adrain's papers in a compendium of early sources on mathematical statistics in the U.S., making readily available what might otherwise be a very rare item of incunabula.
Modern judgments of Adrain range from warmly sympathetic to glacially cool. Anders Hald, in a vast work on the history of statistics, praises Adrain's "intuition and common sense." But Stigler describes Adrain's arguments in support of the normal distribution as "more wishful thinking than proofs."
Adrain's priority and originality have also been questioned. The article on least squares appeared in an issue of The Analyst dated 1808, but Stigler has found evidence it was not actually printed until 1809, so that it may not have preceded Gauss's publication, which came out in the spring of that year. The doubt here is strictly about priority, not borrowing: Adrain could not possibly have seen Gauss's book before writing his own paper. On the other hand, Adrain could very well have seen Legendre's 1805 description of the least-squares method. Babb found a copy of Legendre's book in Adrain's library, although there is no way of telling when it was acquired. Adrain never mentions Legendre's work.
Given the recent spate of movies about mathematicians, we should brace ourselves for the big-screen version of the Robert Adrain story. The script is easy to guess: The puffed-up, powdered-wig figures of Gauss and Legendre squabble childishly over credit for a discovery that is actually made by a self-taught genius doing brilliant mathematics deep in the hinterland, scratching theorems onto birchbark with a bit of charcoal. If only it were true. Although Adrain's accomplishments are impressive for their time and place, they do not put him in the first rank of 19th-century mathematicians.
Mathematics and other kinds of science are so intensely social that only the most extraordinary talent could overcome the handicap of isolation. It takes more than a village to raise a scientist. It takes a village full of scientists. As it happens, I am writing these words from just such a village: the Abdus Salam International Centre for Theoretical Physics, in Trieste, Italy. The center is named for a distinguished scientist who had to choose in youth between his vocation (physics) and his nation (Pakistan). He went to Cambridge. The center he founded has among its explicit aims to spare others that bitter choice, providing scientists from developing countries opportunities for collaboration without forcing them into emigration. Some 80,000 have visited since 1964.
Technology has also made a difference in the lives of scientists on the farther shores. If Adrain had been able to read e-prints on the arXiv server every morning—and, equally important, if he could have posted his own contributions there—perhaps today we would speak of the Adrainian distribution instead of the Gaussian.
© Brian Hayes