Certain effects of uncertain models

  • Authors

    • Brian Knaeble, University of Wisconsin-Stout
    2014-11-05
    https://doi.org/10.14419/ijasp.v2i2.3698
  • Statistical summaries of multiple regression analyses often state conclusions as if model uncertainty were of little concern. The error due to a mis-specified model, however, can be more significant in practice than the sampling error associated with commonly reported statistics. The true effect of an explanatory variable may be opposite to that indicated by a fitted coefficient of a linear model, even if the model fits well and the coefficient is deemed statistically significant. Here we study the sensitivity of the sign of a fitted coefficient to changes in the model structure. As a consequence of the principle of least squares, we show, in general, that a set of covariates with a relatively weak coefficient of determination cannot reverse the sign of a relatively strong fitted coefficient of a linear model that has been fit with a regression matrix having orthogonal columns. A consequence of the theory is a necessary condition for Simpson's paradox.

    Keywords: confounding, least squares, model uncertainty, regression, sensitivity analysis.
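
The sign sensitivity described in the abstract can be illustrated with a small numerical sketch. The data below are hypothetical and noise-free, constructed only for illustration; the variable names (`x`, `z`, `u`, `y`) and the helpers `fit` and `r_squared` are assumptions and do not come from the paper. The sketch shows a fitted coefficient on `x` changing sign once a covariate `z` is added to the model. It does not reproduce the paper's exact condition; it only shows that in this example the covariate responsible for the reversal has a high coefficient of determination with both `x` and `y`.

```python
import numpy as np

def fit(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def r_squared(columns, y):
    """Coefficient of determination of the least-squares fit of y on the columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    residuals = y - X @ fit(columns, y)
    return 1.0 - residuals @ residuals / np.sum((y - y.mean()) ** 2)

# Hypothetical data: z is a covariate, u is variation in x unrelated to z.
z = np.repeat([0.0, 1.0, 2.0], 3)   # 0,0,0,1,1,1,2,2,2
u = np.tile([0.0, 0.5, 1.0], 3)     # 0,0.5,1 repeated within each z level
x = 2.0 * z + u                     # x is strongly related to z
y = -x + 5.0 * z                    # holding z fixed, y decreases in x

slope_simple = fit([x], y)[1]       # coefficient on x, z omitted
slope_adjusted = fit([x, z], y)[1]  # coefficient on x, z included

print("coefficient on x without z:", slope_simple)
print("coefficient on x with z:   ", slope_adjusted)
print("R^2 of x on z:", r_squared([z], x))
print("R^2 of y on z:", r_squared([z], y))
```

In this construction the coefficient on `x` is positive when `z` is omitted and negative when `z` is included, which is the kind of reversal the abstract has in mind. A covariate only weakly related to `x` and `y` would be a natural variation to try next.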
