Do We Really Need the S-word?
The use of “significance” in reporting statistical results is fraught with problems—but they could be solved with a simple change in practice
Join the S-word Movement
In my experience, scientists making their first attempts at abandoning the s-word discover how wedded they are to it. The real challenge, however, lies in replacing the s-word with substance, not with an equally ambiguous synonym. If the scenario is a simple one (the p-value was 0.048, the confidence interval did not include 0, or the variable in question ended up in your top model), use the space the s-word would have occupied to state your criteria explicitly. If, in fact, you believe the results are practically meaningful and important, convince your readers with sound justification using both statistical and general scientific reasoning.
Statistical inference is an art, uncomfortably dependent on practitioners and their backgrounds. It should not be construed as a way to make inference objective or as a straightforward means of classifying results as significant or not. Omitting the s-word may seem like a rather insignificant request among the bigger issues facing statistical inference and science in general. However, the change is simple and accessible, and it is well worth making for the improvements it could bring to the dissemination of our scientific results. I hope you will join me and my students in working to curtail use of the s-word and its negative impacts on science.