3.2. Learning Objectives
Conceptual
This week is heavier on Python than conceptual material.
However, it is important that you understand which plots successfully show the distribution of data and which show the relationship within paired data.
It is also important that you view your plots critically and choose appropriate settings so that your graph successfully illustrates the point you are trying to make.
Examples would be:
- appropriate axis labels and legend
- appropriate axis scaling
- choice of colours
- inclusion of reference lines (such as the line x=y in the scatterplot example)
This material is covered in the lecture.
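As a rough illustration of the checklist above, the sketch below draws a scatterplot of paired data with axis labels, a legend, matched axis ranges and the reference line x=y. The data are randomly generated for this example only, and the column names before and after are assumptions, not part of the course datasets.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up paired data, for illustration only (not a course dataset)
rng = np.random.default_rng(0)
before = rng.normal(loc=50, scale=10, size=40)
after = before + rng.normal(loc=2, scale=5, size=40)
scores = pd.DataFrame({"before": before, "after": after})

fig, ax = plt.subplots(figsize=(5, 5))
sns.scatterplot(data=scores, x="before", y="after",
                color="tab:blue", label="participant", ax=ax)

# Use the same range on both axes so the line x=y runs corner to corner
lo = min(scores["before"].min(), scores["after"].min()) - 5
hi = max(scores["before"].max(), scores["after"].max()) + 5
ax.plot([lo, hi], [lo, hi], color="grey", linestyle="--", label="x = y")
ax.set_xlim(lo, hi)
ax.set_ylim(lo, hi)

# Axis labels with units, and a legend identifying the reference line
ax.set_xlabel("Score before (points)")
ax.set_ylabel("Score after (points)")
ax.legend()
```

Points above the dashed line have a higher after value than before value, which is hard to judge by eye when the two axes are scaled differently.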
Python skills
In this course we will use plotting functions from the libraries matplotlib (imported as plt) and seaborn (imported as sns).
Therefore all the plotting commands will be preceded by either plt. or sns.
Seaborn is designed to work seamlessly with Pandas dataframes. It also produces aesthetically pleasing plots.
Matplotlib is another plotting library that contains, amongst other things, many useful functions to customize plots (for example, editing the axis ranges).
After this week you should be able to:
- Plot a data distribution using sns.histplot choosing appropriate bin sizes and locations
- Plot data using a Kernel Density Estimate (KDE) plot, using sns.kdeplot or the kde=True option of sns.histplot
- Add a rug plot to a KDE plot using sns.kdeplot and sns.rugplot
- Plot category counts using sns.countplot
- Plot category means using sns.barplot
- Plot data using a box and whisker plot with sns.boxplot
- Plot data using a violin plot with sns.violinplot
- Plot paired data using sns.scatterplot
- Plot paired data using a scatterplot + histograms using sns.jointplot
- Break down plots by a categorical variable from a pandas dataframe, using the arguments x, y and hue of the plotting tools
- Make a plot with multiple panels using plt.subplot
- Adjust axis ranges
- Change axis labels
This material is covered in the Jupyter Notebooks in this section.