Python

Web Mining

Credits: 
2
Hours: 
20
Area: 
Big Data Mining
Description: 

The course presents the main web data analysis techniques. Using the query log of a real search engine as a case study, students are guided through the development of a set of data analysis methodologies aimed at creating the knowledge base for a recommendation system. The course then discusses how the same information can be used to optimize ranking in Web services. In this regard, the course introduces learning-to-rank techniques, which estimate the relevance of objects with respect to specific user information needs.

High Performance & Scalable Analytics, NO-SQL Big Data Platforms

Credits: 
2
Hours: 
22
Area: 
Big Data Technology
Teachers: 
Academic Year: 
Description: 

This course teaches the basic theoretical concepts behind the MapReduce distributed computing paradigm, and Hadoop in particular, and builds expertise in the practical use of high-performance computing tools for data engineering, analysis and mining. In particular, students will learn how classical data mining algorithms can be applied to Big Data using Hadoop (Spark). Real, open-source datasets will be used to present examples and to let students build their own projects.
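As a rough illustration of the MapReduce paradigm mentioned above, the following is a minimal single-process sketch in plain Python, not Hadoop or Spark code. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are assumptions made for this example only.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input, used only to exercise the sketch.
counts = word_count(["big data mining", "big data platforms"])
```

In a real cluster the map and reduce steps would run in parallel on different machines and the framework would handle the shuffle; here everything runs sequentially to keep the three phases visible.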

Text Analysis & Web Mining

Credits: 
3
Hours: 
36
Area: 
Big Data Mining
Description: 

This module introduces the main techniques for analyzing and mining user opinions in Big Data generated mainly on the web. Emphasis is placed on text mining methods applied to text originating from social media. The module also presents the main web data analysis techniques. Using the query log of a real search engine as a case study, students are guided through the development of a set of data analysis methodologies aimed at creating the knowledge base for a recommendation system.

Data Visualization and Data Journalism

Credits: 
3
Hours: 
36
Area: 
Big Data Story Telling
Description: 

The module prepares students for the appropriate presentation of data, and of the knowledge extracted from it, through visualization tools and narratives that exploit multimedia.
The module first presents the basic visualization techniques for the effective presentation of information from several different sources: structured data (relational, hierarchies, trees), relational data (social networks), temporal data, spatial data and spatio-temporal data.

Statistical and Neural Machine Learning for Text Analysis

Credits: 
2
Hours: 
20
Area: 
Big Data Mining
Teachers: 
Academic Year: 
Description: 

This module introduces the main methods for analyzing and mining the opinions and personal evaluations of users, based on Big Data generated on the web or from other sources. Emphasis is placed on text mining methods applied to text originating from social media. Lessons are supported by case studies developed in the SoBigData.eu lab.

Information Retrieval

Credits: 
4
Hours: 
42
Area: 
Big Data Sensing & Procurement
Teachers: 
Description: 

The module describes the structure of a search engine and of Text Mining tools, analyzing their characteristics and limits with respect to computational cost, the precision/recall/F1 measures, and the expressivity of the supported queries. The module also includes hands-on activities that present well-known open-source Python tools for crawling and analyzing web pages, semantically annotating texts (TagMe), and indexing text data collections (ElasticSearch).
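For readers unfamiliar with the precision/recall/F1 measures mentioned above, here is a minimal sketch of how they can be computed for a single query, using the standard set-based definitions. The function name `precision_recall_f1` and the document identifiers are assumptions made for this example; they are not taken from the course material.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute set-based precision, recall and F1 for one query.

    retrieved: identifiers returned by the search engine
    relevant:  identifiers judged relevant for the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical result list and relevance judgments.
p, r, f1 = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
```

With these inputs, two of the four retrieved documents are relevant (precision 1/2) and two of the three relevant documents are retrieved (recall 2/3).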
