11396 - Multivariate Statistical Analysis

Academic Year 2017/2018

  • Teacher: Assimo Maris
  • Credits: 6
  • SSD: MAT/06
  • Language: Italian
  • Modules: Assimo Maris (Module 1); Giovanni Valenti (Module 2)
  • Teaching Mode: Traditional lectures (Module 1); Traditional lectures (Module 2)
  • Campus: Ravenna
  • Course: Second cycle degree programme (LM) in Environmental Assessment and Management (cod. 8418)

Learning outcomes

At the end of the course, the student is familiar with several topics in applied multivariate statistical analysis. She/he can use the tools of the multivariate normal distribution for inference on population means, and can apply multivariate analysis of variance (MANOVA), discriminant analysis, multivariate regression, cluster analysis by means of hierarchical methods and multidimensional scaling, principal component analysis (PCA), and factor analysis.

Course contents

THEORY

• Data organization. Multivariate sample descriptive statistics.

• Random vectors and matrices. Mean vector and covariance matrix. Linear combinations of random vectors. Expected values and sample covariance matrix. Generalized and total sample variance.

• Random Samples and the Expected Values of the Sample Mean and Covariance Matrix. Generalized Variance.

• Multivariate normal distribution: from the univariate to the multivariate case, fundamental properties, and contours of constant density. Multivariate normal likelihood. Large samples. Assessing normality: Q-Q plot. Transformations to normality.

• Hotelling's T²: statistical hypotheses and level of significance, from Student's t test to the T² test. Confidence region and simultaneous confidence intervals for the mean vector of a normal distribution. Large samples. Charts for monitoring a sample of individual multivariate observations for stability.

• Paired comparisons. Comparing mean vectors from two populations. Comparing several multivariate population means: from ANOVA to MANOVA (with Wilks' lambda test).

• Classical multivariate regression model. Least squares estimation. Confidence intervals for the estimates. Model evaluation. Regression for new predictive variables.

• Principal component analysis (PCA). Analysis of data variability. Data reduction and compression of images.

• Similarity and distance measures. Cluster analysis with agglomerative methods.
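
As an illustration of the step from Student's t to Hotelling's T² listed above, the one-sample statistic T² = n (x̄ − μ0)' S⁻¹ (x̄ − μ0) can be sketched for two variables (p = 2). This is a minimal sketch and not course material: the function name, the use of Python, and the data values are assumptions made for the example. Under normality, (n − p) / (p (n − 1)) · T² follows an F distribution with p and n − p degrees of freedom.

```python
from statistics import mean


def hotelling_t2(data, mu0):
    """One-sample Hotelling T^2 for bivariate data (p = 2).

    data: list of (x1, x2) observations; mu0: hypothesised mean vector.
    Returns (T^2, F statistic). Hypothetical helper for illustration only.
    """
    n = len(data)
    p = 2
    xbar = [mean(col) for col in zip(*data)]
    # Deviations from the sample mean vector.
    dev = [[x - m for x, m in zip(row, xbar)] for row in data]
    # Entries of the sample covariance matrix S (divisor n - 1).
    s11 = sum(d[0] * d[0] for d in dev) / (n - 1)
    s22 = sum(d[1] * d[1] for d in dev) / (n - 1)
    s12 = sum(d[0] * d[1] for d in dev) / (n - 1)
    det = s11 * s22 - s12 * s12  # must be non-zero for S to be invertible
    d1, d2 = xbar[0] - mu0[0], xbar[1] - mu0[1]
    # Quadratic form d' S^{-1} d, using the closed-form 2x2 inverse.
    quad = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
    t2 = n * quad
    # Scaled statistic to compare with an F(p, n - p) critical value.
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat


# Invented data, purely for illustration.
sample = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
t2, f_stat = hotelling_t2(sample, mu0=(1.5, 2.5))
print(t2, f_stat)  # approximately 3.75 and 1.25
```

The F statistic would then be compared with a tabulated F critical value at the chosen significance level; computing the p-value is left out because it needs a statistics library.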

LAB

Univariate statistical analysis. Descriptive statistics. Confidence intervals. Significance tests.

Linear regression. Multiple linear regression (MLR). Leverages. Regression coefficients. Evaluation parameters of a regression model. Correlation coefficient.

Structure of multivariate data. Main matrix operations: transposition, centering, covariance, correlation. Data pretreatment. Transformation of variables. Missing data.

Principal component analysis. Loading plots. Score plots. Choice of the number of principal components (rank analysis).

Cluster analysis. Distance matrix, similarity matrix. Dendrograms. Cluster analysis on PCA.

Models and Classification. Classification methods. Validation of a model.

Regression methods. Principal Component Regression (PCR). Partial Least Squares Method (PLS).
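
To make the lab sequence of centering, covariance, loadings, and scores concrete, here is a minimal two-variable PCA sketch. The course page does not state which software the lab uses, so Python, the function name, and the data values are all assumptions made for the example. It uses the closed-form eigendecomposition of a symmetric 2x2 covariance matrix; with more variables a linear algebra library would be used instead.

```python
import math


def pca_2d(data):
    """PCA of bivariate data via the covariance matrix (illustrative helper).

    Returns eigenvalues, loadings (one row per component), scores, and the
    fraction of total variance explained by each component.
    Assumes the total sample variance is not zero.
    """
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    # Column centering: subtract each variable's mean.
    centered = [[x - m for x, m in zip(row, means)] for row in data]
    # Sample covariance matrix entries (divisor n - 1).
    s11 = sum(r[0] * r[0] for r in centered) / (n - 1)
    s22 = sum(r[1] * r[1] for r in centered) / (n - 1)
    s12 = sum(r[0] * r[1] for r in centered) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix, largest first.
    half_trace = (s11 + s22) / 2
    root = math.sqrt(((s11 - s22) / 2) ** 2 + s12 ** 2)
    eigvals = [half_trace + root, half_trace - root]
    # Angle of the first principal axis; the second axis is orthogonal to it.
    theta = 0.5 * math.atan2(2 * s12, s11 - s22)
    loadings = [
        [math.cos(theta), math.sin(theta)],
        [-math.sin(theta), math.cos(theta)],
    ]
    # Scores: projection of each centered observation on each loading vector.
    scores = [
        [sum(c * w for c, w in zip(row, load)) for load in loadings]
        for row in centered
    ]
    total = s11 + s22
    explained = [v / total for v in eigvals]
    return eigvals, loadings, scores, explained


# Invented data, purely for illustration.
sample = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
eigvals, loadings, scores, explained = pca_2d(sample)
print(explained)  # roughly 0.8 and 0.2 for this data
```

The explained-variance fractions are what a rank analysis inspects when choosing how many components to keep, and the scores are the coordinates drawn in a score plot.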

Readings/Bibliography

Applied Multivariate Statistical Analysis, R. A. Johnson and D. W. Wichern, Prentice Hall, 5th edition, 2002.

Introduzione alla chemiometria, Tedeschi Roberto, Edises, 1998

Metodi Statistici per la Sperimentazione Biologica, A. Camussi, F. Möller, E. Ottaviano, M. Sari Gorla, Zanichelli, 2nd edition, 1995.

Teaching methods

Classroom lectures and computer lab sessions.

Assessment methods

Written examination plus computer-lab exercise (about 120-180 minutes in total).

Teaching tools

1) Blackboard (lectures and exercises) and video projector. Lecture notes.

2) Computer laboratory practicals.

Office hours

See the website of Assimo Maris

See the website of Giovanni Valenti