2.2. Learning Objectives

Conceptual

This week we are thinking about how to describe data – covering measures of centre (mean, median, mode), measures of spread (variance, standard deviation, interquartile range, percentiles), and descriptions of distributions (shape and skew).

After this week you should understand:

The conceptual difference between the mean, median and mode, and when each is used
The conceptual difference between the standard deviation and interquartile range, and when each is used
Why measures based on ranks (median and interquartile range) are robust to outliers
Why the mean is useful in predicting the behaviour of large samples
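
As a small illustration of the robustness point above, the sketch below compares the four summary statistics on a made-up sample with and without one extreme value. The numbers and the helper `iqr` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: five reaction times (ms), then the same data
# with the last value replaced by an extreme outlier
clean = pd.Series([300, 310, 320, 330, 340])
with_outlier = pd.Series([300, 310, 320, 330, 3400])


def iqr(s):
    """Interquartile range: 75th percentile minus 25th percentile."""
    return s.quantile(0.75) - s.quantile(0.25)


summary = pd.DataFrame(
    {
        "mean": [clean.mean(), with_outlier.mean()],
        "median": [clean.median(), with_outlier.median()],
        "sd": [clean.std(), with_outlier.std()],
        "iqr": [iqr(clean), iqr(with_outlier)],
    },
    index=["clean", "with_outlier"],
)
print(summary)
```

In this example the mean and standard deviation change a great deal when the outlier is added, while the median and interquartile range stay the same, because they depend only on the rank order of the values near the middle of the data.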

Describe the shape and skew of a distribution in words (based on viewing a data plot)
Make predictions about the shape of a distribution from summary statistics (for example, what is the skew for a distribution where the median is higher than the mean?)
Appreciate common factors affecting the shape of distributions (for example, what happens when a measure can only take values above zero)
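
The link between skew and the relative positions of the mean and median can be checked on a small example. The sketch below uses made-up values that cannot go below zero and have a long tail of large values, then mirrors them to produce the opposite skew. The data are hypothetical, and the pattern shown is typical rather than guaranteed for every possible dataset.

```python
import pandas as pd

# Hypothetical measure that cannot go below zero, with a long right tail
right_skewed = pd.Series([1, 1, 2, 2, 3, 3, 4, 5, 8, 21])

# Mirror image of the same data, giving a long left tail
left_skewed = 22 - right_skewed

for name, s in [("right_skewed", right_skewed), ("left_skewed", left_skewed)]:
    print(name, "mean:", s.mean(), "median:", s.median(), "skew:", round(s.skew(), 2))
```

Here the long right tail pulls the mean above the median and gives a positive skew, while the mirrored data has a mean below the median and a negative skew.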

This material is covered in the lecture (and in the lecture videos on Canvas).

Python skills

We are working with pandas dataframes and some of their associated methods.

After this week you should be able to:

Read data from a .csv file into a pandas dataframe using pandas.read_csv
View a dataframe using display() including viewing only certain rows (selected by row index or condition)
Obtain a set of descriptive statistics using describe(), including for a subset of rows or columns
Obtain specific descriptive statistics using methods such as mean(), count(), quantile(), including for a subset of rows or columns
Remove rows from a dataframe (e.g. those corresponding to bad data records)
Replace values in a dataframe with new values or with NaN

This material is covered in the Jupyter Notebooks in this section.
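
The skills listed above can be sketched together in one short example. This is only a minimal illustration, not the course's own notebook code: the data, the column names and the use of 999 as a bad-data marker are all assumptions made up for the example, and the text is read from a string so that the example runs without a file on disk.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data standing in for a .csv file; 999 marks a bad record
csv_text = """name,age,height
Ann,23,165
Bob,35,180
Cat,999,170
Dan,41,999
Eve,29,158
"""

# With a real file this would be pd.read_csv("some_file.csv")
df = pd.read_csv(io.StringIO(csv_text))

# View selected rows, by position or by condition
# (in a Jupyter notebook, display(...) can be used in place of print(...))
print(df.iloc[0:2])
print(df[df["age"] > 30])

# Descriptive statistics for all numeric columns, or for a subset of columns
print(df.describe())
print(df[["height"]].describe())

# Replace the bad height value with NaN, keeping the rest of that row
df["height"] = df["height"].replace(999, np.nan)

# Remove the row where age is a bad record
df_clean = df[df["age"] != 999]

# Specific descriptive statistics on the cleaned data
mean_age = df_clean["age"].mean()
n_height = df_clean["height"].count()  # count() ignores NaN
age_q25 = df_clean["age"].quantile(0.25)
print(mean_age, n_height, age_q25)
```

Replacing a bad value with NaN keeps the other values in that row available for analysis, whereas removing the row discards them. Which is appropriate depends on the dataset.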
