Prof. Paolo Missier
Newcastle University, UK
Visiting Professor AY 2022-2023

Teaching offer – The teaching offer is designed to fit within Unimore’s Master’s Degrees in Computer Science / Informatics and in Maths (Data Science), and its PhD programs in Computer Science, Computer Engineering, and Maths. It consists of a series of topical seminars and practical lab sessions, about 15 hours in total, organised into two complementary parts:
Part I: Scalable data processing for data science: architectures and programming models
Part II: Data Engineering and Data Science for Healthcare and medicine applications: challenges and case studies


Teaching program – Part I
Scalable data processing for data science: architectures and programming models
Key notions in distributed data processing will be introduced, from theory (the MapReduce programming model) to architecture (Hadoop). PySpark, the Python API for Apache Spark, will then be used in practical lab sessions, where students will first experiment with simple exercises and then tackle the more complex challenge of making their solutions scale to increasingly large inputs.
Class schedule:
Tuesday November 22 Room M0.2 11:00 a.m. – 1:00 p.m.
Wednesday November 23 Room M0.1 9:00 a.m. – 11:00 a.m.
Friday November 25 Room M0.2 9:00 a.m. – 11:00 a.m.

Prof. Paolo Missier is Professor of Scalable Data Analytics with the School of Computing at Newcastle University, where he leads the School’s post-graduate academic teaching on Data Engineering for AI (aka Big Data Analytics). He is also a Fellow (2018-2023) of the Alan Turing Institute, the UK’s National Institute for Data Science and Artificial Intelligence. His current research interests focus on the challenges of Health Data Engineering and Data Science, as well as on the efficient generation of data provenance to make Data Science more explainable and trustworthy.


Host
Prof. Federica Mandreoli

PhD Teaching – Scalable data processing for data science: architectures and programming models