Specialization in Data science for economics, business and finance (Università degli Studi di Milano)


Lectures are in Italian.

Course material

The course is based on the textbook Mining of Massive Datasets (MMD hereafter).


The course covers the topics listed in the lecture calendar.

Lecture calendar

Date        Topic
07/07/2018  Distributed storage (HDFS).
07/07/2018  Map-reduce.
12/07/2018  Examples of map-reduce algorithms.
14/07/2018  Apache Spark.
14/07/2018  Link analysis.
14/07/2018  Similar items.
19/07/2018  Frequent itemsets.
19/07/2018  Clustering.
19/07/2018  Recommender systems.
20/07/2018  Regression.

Exam modalities

The exam is shared among the courses Parallel and distributed computing; Elements of R and Python (Python module); Databases, data linking and data visualization; and Cloud computing, Data Base and Web Scraping Lab (Cloud computing module). Students can download a notebook describing a project to implement, together with a file containing the data to be processed. Projects may be completed individually or in groups of at most two people, and the implementation should be sent by e-mail to the teachers.