# Tools and Issues in Data Collection and Analysis (Scuola Normale Superiore, PhD, 2014)

Tools and Issues in Data Collection and Analysis (Third part)

April-June 2014, Palazzo Strozzi

Dr. Federico Russo

The third part covers the basic concepts of inferential statistical analysis and introduces the classical linear regression model. The recommended text for the third part (the final set of nine sessions) is:

Alan Agresti and Barbara Finlay, Statistical Methods for the Social Sciences (3rd or 4th Edition), Pearson.

More mathematically inclined students may also find the following book useful:

Damodar N. Gujarati, Basic Econometrics with Applications (4th or 5th Edition), McGraw-Hill.

23 April – 10 a.m.-12:30 p.m. (SIENA)

Introduction to the third part

Statistical methods are increasingly employed in political science to test hypotheses about social and political phenomena. The growing power of computers and easy-to-use statistical packages has opened new analytical possibilities, but it is no substitute for a firm understanding of the basic inferential techniques.

Slides

23 April – 2-4:30 p.m. (SIENA)

Probability distributions

Inferential statistical methods use sample statistics to make predictions about the values of parameters of the population of interest. To understand how this is done, it is essential to introduce the concepts of probability and sampling distributions.

REQUIRED READING: Chapter 4. Probability Distributions (Agresti & Finlay, 4th edition)

Slides

7 May – 10 a.m.-12:30 p.m.

Estimation

Sample data can be used to form two types of estimate of a population parameter: a point estimate and an interval estimate. Both can be computed for quantitative variables (means) and for qualitative variables (proportions).
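As a minimal illustration (not taken from the course slides), the two kinds of estimate can be sketched in Python. The sketch assumes large samples and therefore uses the normal (z) quantile; for small samples a t quantile would be used for means instead. The function names and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_interval(sample, confidence=0.95):
    """Point estimate and large-sample (z-based) interval for a population mean."""
    n = len(sample)
    point = mean(sample)                    # point estimate: the sample mean
    se = stdev(sample) / sqrt(n)            # estimated standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return point, (point - z * se, point + z * se)


def proportion_interval(successes, n, confidence=0.95):
    """Point estimate and large-sample interval for a population proportion."""
    point = successes / n                   # point estimate: the sample proportion
    se = sqrt(point * (1 - point) / n)      # estimated standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return point, (point - z * se, point + z * se)


# Hypothetical survey: 540 of 1000 respondents support a proposal.
p_hat, (p_low, p_high) = proportion_interval(540, 1000)
print(round(p_hat, 3), round(p_low, 3), round(p_high, 3))
```

With these made-up numbers the point estimate is 0.54 and the 95% interval runs from roughly 0.51 to 0.57.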

REQUIRED READING: Chapter 5. Statistical Inference: Estimation (Agresti & Finlay, 4th edition)

Slides

14 May – 3-5:30 p.m.

Significance test

Theories generate hypotheses. A common aim in many studies is to check whether the hypotheses generated by a theory are compatible with the empirically observed data. This can be done with two complementary approaches: the significance test and the confidence interval.

REQUIRED READING: Chapter 6. Significance Tests (Agresti & Finlay, 4th edition)
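A minimal sketch of a significance test, assuming a large-sample two-sided z test for a single proportion. The function name and the data are hypothetical and are not drawn from the course materials.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(successes, n, p0):
    """Two-sided large-sample z test of H0: population proportion equals p0."""
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)           # standard error computed under H0
    z = (p_hat - p0) / se0                  # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical survey: 540 of 1000 respondents in favour; H0: p = 0.5.
z, p_value = z_test_proportion(540, 1000, 0.5)
print(round(z, 2), round(p_value, 3))
```

With these made-up numbers z is about 2.53 and the P-value about 0.011, so H0 would be rejected at the 0.05 level. The confidence interval approach leads to the same conclusion: a 95% interval around 0.54 with this sample size does not contain 0.5.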

Slides

21 May – 3-5:30 p.m.

Introduction to the two-variable regression model

The two-variable regression model studies whether an association exists between two quantitative variables, and describes the strength and the form of that relationship.

REQUIRED READING: Chapter 9. Linear Regression and Correlation (Agresti & Finlay, 4th edition)
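As a minimal sketch, the form of the relationship can be summarised by the least-squares line and its strength by the Pearson correlation. The function name and the data below are hypothetical.

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept of y on x, plus the Pearson correlation."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx                  # form: change in predicted y per unit of x
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)          # strength: correlation between -1 and 1
    return slope, intercept, r


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r = fit_line(x, y)
print(round(slope, 3), round(intercept, 3), round(r, 3))
```

For these made-up data the fitted line has slope 0.6 and intercept 2.2, and the correlation is about 0.775.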

Slides

26 May – 3-5:30 p.m.

When there are several Independent Variables: Multiple Regression

It is often necessary to go beyond bivariate analysis and study the partial relationship between two variables while controlling for other variables. The multiple regression model makes this possible.

REQUIRED READING: Chapter 11. Multiple Regression and Correlation (Agresti & Finlay, 4th edition)
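One way to see what "controlling for" means is the following sketch. It relies on a standard result (the Frisch-Waugh-Lovell theorem): the coefficient of x1 in a multiple regression of y on x1 and x2 equals the slope obtained after removing the linear effect of x2 from both y and x1. This is an illustration only, not necessarily how the course or any statistical package computes the estimate; the function names and data are hypothetical.

```python
from statistics import mean


def simple_fit(x, y):
    """Least-squares slope and intercept of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(x, y):
    """Part of y that is not linearly explained by x."""
    slope, intercept = simple_fit(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def partial_slope(y, x1, x2):
    """Slope of y on x1 controlling for x2 (equals the multiple regression coefficient)."""
    ry = residuals(x2, y)      # y with the linear effect of x2 removed
    rx1 = residuals(x2, x1)    # x1 with the linear effect of x2 removed
    return simple_fit(rx1, ry)[0]


# Made-up data generated exactly as y = 1 + 2*x1 + 3*x2, with x1 and x2 correlated.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

print(round(simple_fit(x1, y)[0], 3))      # bivariate slope, ignoring x2
print(round(partial_slope(y, x1, x2), 3))  # partial slope, controlling for x2
```

In this constructed example the bivariate slope is 5, because it also absorbs part of the effect of x2, while the partial slope recovers the value 2 used to generate the data.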

Slides

28 May – 3-5:30 p.m.

ANOVA models

Qualitative explanatory variables often play an important role in political theories. For quantitative response variables, the ANOVA model provides a way to compare the mean responses of several groups defined by the categories of the qualitative explanatory variable.

REQUIRED READING: Chapter 12. Comparing Groups (Agresti & Finlay, 4th edition)
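A minimal sketch of the one-way ANOVA F statistic, which compares the variability between group means with the variability within groups. The function name and the data are hypothetical; obtaining a P-value would additionally require the F distribution, which is not covered by this sketch.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for comparing the means of several groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    n_total, n_groups = len(all_values), len(groups)

    # Between-groups sum of squares: group means around the grand mean.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: observations around their own group mean.
    within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = n_groups - 1
    df_within = n_total - n_groups
    return (between / df_between) / (within / df_within)


# Made-up responses for three categories of a qualitative variable.
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
print(round(one_way_anova_f(groups), 3))
```

For these made-up groups the F statistic is 19, with 2 and 6 degrees of freedom; large values indicate that the group means differ by more than the within-group variation would suggest.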

Slides

4 June – 3-5:30 p.m.

ANCOVA models

When there are both quantitative and qualitative explanatory variables, regression and ANOVA must be combined.

REQUIRED READING: Chapter 13. Combining Regression and ANOVA (Agresti & Finlay, 4th edition)

Slides

11 June – 3-5:30 p.m.

Issues and tools in model building

Building a regression model involves various steps that are often neglected, such as checking the regression assumptions and taking remedial action when some of them are not entirely satisfied.

REQUIRED READING: Chapter 14. Model Building with Multiple Regression (Agresti & Finlay, 4th edition)

Slides