Machine Learning on Big Data Clusters

  • Benjamin Weissman
  • Enrico van de Laar


In the previous chapters, we spent significant time on how to query data stored inside SQL Server instances or on HDFS through Spark. One advantage of having access to data stored in different formats is that you can analyze it at a large, distributed scale. One of the more powerful options inside Big Data Clusters is the ability to implement machine learning solutions on that data. Because Big Data Clusters let us store massive amounts of data in all kinds of formats and sizes, training and using machine learning models across all of that data becomes far easier.

Copyright information

© Benjamin Weissman and Enrico van de Laar 2020

Authors and Affiliations

  • Benjamin Weissman (1)
  • Enrico van de Laar (2)

  1. Nürnberg, Germany
  2. Drachten, The Netherlands
