Isaac Triguero is Distinguished Senior Researcher at the Department of Computer Science and Artificial Intelligence, University of Granada, and Associate Professor of Data Science at the School of Computer Science of the University of Nottingham. He won the 2019 School of Computer Science – University of Nottingham Award for Teaching.

Mikel Galar is Associate Professor of Computer Science and Artificial Intelligence at the Department of Statistics, Computer Science and Mathematics, Public University of Navarre. He is a co-founder of Neuraptic AI and won the 2020 Excellence in Teaching Award of the Public University of Navarre.
'With the growing ubiquity of large and complex datasets, MapReduce and Spark's dataflow programming models have become mission-critical skills for data scientists, data engineers, and ML engineers. Triguero and Galar leverage their extensive teaching experience on this topic to deliver this tour de force deep dive into both the technical concepts and programming knowhow needed for such modern large-scale data analytics. They interleave intuitive exposition of the concepts and examples from data engineering and classical ML pipelines with well-thought-out hands-on code and outputs. This book not only shows how all this knowledge is useful in practice today but also sets up the reader to be able to successfully "generalize" to future workloads.'
Arun Kumar, University of California, San Diego