{
"cells": [
{
"cell_type": "markdown",
"id": "8501b536",
"metadata": {},
"source": [
"# Histogram\n",
"\n",
"If we want to see the shape of a data distribution, the histgram can be a good choice\n",
"\n",
"In this section we will see how to plot a histogram using Python and what choices we can make to show the data distribution clearly and accurately\n",
"\n",
"We will also consider some of the limitations of the histogram for small datasets. In the next section we meet a related plot, the Kernel Density Estimate plot, which can mitigate these limitations."
]
},
{
"cell_type": "markdown",
"id": "06a3540a",
"metadata": {},
"source": [
"## Example\n",
"\n",
"We will look at a small sample of height data for brother-sister pairs.\n",
"\n",
"\n",
"\n",
"### Set up Python libraries\n",
"\n",
"As usual, run the code cell below to import the relevant Python libraries"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "7f1d34e0",
"metadata": {},
"outputs": [],
"source": [
"# Set-up Python libraries - you need to run this but you don't need to change it\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import scipy.stats as stats\n",
"import pandas \n",
"import seaborn as sns\n",
"sns.set_theme() # use pretty defaults"
]
},
{
"cell_type": "markdown",
"id": "fb218a2a",
"metadata": {},
"source": [
"### Load and inspect the data"
]
},
{
"cell_type": "markdown",
"id": "3bbd70d4",
"metadata": {},
"source": [
"Load the file brotherSisterData.csv which contains heights in cm for 25 brother-sister pairs"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "5b37c633",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n", " | brother | \n", "sister | \n", "
---|---|---|
0 | \n", "174 | \n", "172 | \n", "
1 | \n", "183 | \n", "180 | \n", "
2 | \n", "154 | \n", "148 | \n", "
3 | \n", "172 | \n", "180 | \n", "
4 | \n", "172 | \n", "165 | \n", "
5 | \n", "161 | \n", "159 | \n", "
6 | \n", "167 | \n", "159 | \n", "
7 | \n", "172 | \n", "164 | \n", "
8 | \n", "195 | \n", "188 | \n", "
9 | \n", "189 | \n", "175 | \n", "
10 | \n", "161 | \n", "160 | \n", "
11 | \n", "181 | \n", "177 | \n", "
12 | \n", "175 | \n", "168 | \n", "
13 | \n", "170 | \n", "169 | \n", "
14 | \n", "175 | \n", "165 | \n", "
15 | \n", "169 | \n", "164 | \n", "
16 | \n", "169 | \n", "163 | \n", "
17 | \n", "180 | \n", "176 | \n", "
18 | \n", "180 | \n", "176 | \n", "
19 | \n", "180 | \n", "172 | \n", "
20 | \n", "175 | \n", "170 | \n", "
21 | \n", "162 | \n", "157 | \n", "
22 | \n", "175 | \n", "172 | \n", "
23 | \n", "181 | \n", "179 | \n", "
24 | \n", "173 | \n", "171 | \n", "