Statistical Methods for Data Science

Credits: 2
Hours: 24
Area: Big Data Mining
Teachers: 
Academic Year: 2023-2024
Description: 

The course introduces the main concepts of statistical analysis, the methods used, and the software implementations needed to carry out a rigorous quantitative study of a dataset. After introducing the basic tools of descriptive statistics, the course focuses on probabilistic statistics and its use for data modeling, on estimation methods within an inferential approach, and on statistical hypothesis testing. It also introduces linear and logistic regression (including the multivariate case) and computational bootstrap techniques for estimating parameters and confidence intervals.

Prerequisites: Python

Notions: 
  • Lesson 1: Introduction to data and descriptive statistics
  • Lesson 2: Basic concepts of probability and their use for data modeling
  • Lesson 3: Statistical inference. Statistical estimation methods. Correlation and dependence
  • Lesson 4: Simple and multiple regression, logistic regression and classification
  • Lesson 5: Confidence intervals. Bootstrap
  • Lesson 6: Hypothesis testing
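
As an illustration of the bootstrap topic in Lesson 5, the following is a minimal sketch of a percentile bootstrap confidence interval for the mean, using only the Python standard library. The function name `bootstrap_ci`, its parameters, and the sample values are assumptions made for this example and are not course material.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    n = len(sample)
    # Resample with replacement, recompute the statistic, and sort the estimates.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Made-up measurements, used only to exercise the function.
data = [2.1, 2.5, 1.9, 3.2, 2.8, 2.4, 3.0, 2.2, 2.7, 2.6]
low, high = bootstrap_ci(data)
print(f"95% bootstrap CI for the mean: [{low:.3f}, {high:.3f}]")
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is the shortest to write down.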
Competences: 
  • Understand and know how to use the main tools of descriptive and probabilistic statistics.
  • Know how to conduct a statistical analysis of a dataset.
  • Build a probabilistic model, estimate its parameters, assess its goodness of fit, and use it for prediction.
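
The last competence (build a model, estimate its parameters, check the fit, predict) can be sketched with a simple linear regression fitted by ordinary least squares. This is a hedged illustration only: the helper names `fit_line` and `r_squared` and the data points are assumptions made for the example, and R² is just one possible goodness-of-fit measure.

```python
def fit_line(x, y):
    """Estimate slope and intercept of y = intercept + slope * x by least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: share of the variance of y explained by the line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up data, used only for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)           # estimate the parameters
r2 = r_squared(x, y, slope, intercept)      # assess goodness of fit
prediction = intercept + slope * 6          # use the model for prediction at x = 6
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.4f}, y(6)={prediction:.2f}")
```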
