MSc in Computer science (Università degli Studi di Milano)


This course introduces the principal techniques related to the analysis of large amounts of data.

News

Date Info
05/06/2013 Schedule change for the Big scale analytics course
On Friday 7/6 there will be two classes for the Big scale analytics course: the first one between 11:30 and 13:30 (aula 5), and the second one between 13:30 and 15:30 (aula 5).
23/05/2013 Cancellation of the Big scale analytics class of 31/5
The 31/5 class is canceled due to a conference.
10/05/2013 Office hours canceled
Regular office hours are canceled until next semester; students can arrange an appointment via e-mail.
10/05/2013 Cancellation of the Big scale analytics classes of 17/5 and 21/5
The 17/5 and 21/5 classes are canceled due to an EC project meeting.
30/04/2013 Tutorial about Hadoop installation
I published a tutorial about installing Hadoop on a virtual machine.
29/04/2013 Office hours canceled
Office hours of 2/5 and 9/5 are canceled; students can arrange an appointment via e-mail.
19/04/2013 Cancellation of the Big scale analytics class of 26/4
The 26/4 class is canceled due to the 25/4 holiday.
26/03/2013 Seminar series on the creation of startup companies
A seminar series on the creation of startup companies is going to be held in our University.
19/03/2013 Cancellation of the Big scale analytics class of 22/3
The 22/3 class is canceled due to a public transportation strike.
13/03/2013 Schedule change for the Big scale analytics course
Starting 19/03, classes will be held on Tuesday between 14:30 and 16:30 in aula 6 and on Friday between 13:30 and 15:30 in aula 5.
13/03/2013 Schedule change for the Big scale analytics class of 15/03
The class of Friday 15/03 will begin at 14:30.
05/03/2013 Cancellation of the Big scale analytics class of 8/3
8/3 class is canceled.
25/02/2013 Office hours canceled
Office hours of 28/02 are canceled; students can arrange an appointment via e-mail.

Language

Lectures are in Italian.

Course schedule

Lectures will take place at the Computer science department, according to the following schedule:

Day Hour Place
Tuesday 14:30 - 16:30 aula 6
Friday 13:30 - 15:30 aula 5

Any change to the schedule will be announced in class and published in the News section of this page.

Office hours

By appointment, room 5015 of the Computer Science Department. The teacher can be contacted by e-mail; please read in advance the guide prepared by Prof. Sebastiano Vigna and clearly specify in the message the course name and the academic year. In particular, students are encouraged to always use their academic address (i.e., one in the studenti.unimi.it domain), to sign with their name and student ID number, and to keep in mind that the response time may vary depending on the teacher's commitments.

Course material

The course is based on the following textbook: Anand Rajaraman and Jeff Ullman, Mining of Massive Datasets, available both as a freely downloadable PDF and in hard copy from Cambridge University Press (ISBN: 9781107015357).

The part on distributed file systems and MapReduce is based on the adopted textbook and on the Hadoop tutorial published by Yahoo!

The part on machine learning is described in the additional chapter of the textbook available online, in the third chapter of S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall, 1999 (ISBN 0-13-908385-5), and in two online tutorials about classification and regression.

The part on dimensionality reduction is described in the additional chapter of the textbook available online.

Syllabus

The course covers the topics listed in the lecture calendar, corresponding to the following textbook chapters: 1, 2 (excluding section 2.6.7), 3 (up to and including section 3.7), 4 (up to and including section 4.5), 5 (excluding sections 5.2.4 and 5.2.5), 6 (up to and including section 6.5.1), 7 (up to and including section 7.5), 8 (up to and including section 8.4.6), 9 (up to and including section 9.4), 10 (sections 10.1, 10.2, 10.4, and 10.5), 11 (up to and including section 11.3), and 12 (up to and including section 12.3), as well as the contents of the remaining documents listed in Course material.

Lecture calendar


Exam modalities

The exam consists of an oral test, by appointment.