{ "cells": [ { "cell_type": "markdown", "id": "5cfdceec", "metadata": {}, "source": [ "# Independent samples t-test" ] }, { "cell_type": "markdown", "id": "741220b6", "metadata": {}, "source": [ "### Set up Python libraries\n", "\n", "As usual, run the code cell below to import the relevant Python libraries" ] }, { "cell_type": "code", "execution_count": 2, "id": "692abf91", "metadata": {}, "outputs": [], "source": [ "# Set-up Python libraries - you need to run this but you don't need to change it\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import scipy.stats as stats\n", "import pandas \n", "import seaborn as sns" ] }, { "cell_type": "markdown", "id": "38124844", "metadata": {}, "source": [ "## Example\n", "\n", "\n", "\n", "\n", "Below we have data on the average weight, in grams, of apples from each of 22 apple trees.\n", "\n", "12 of the trees received MiracleGro fertilizer and the other 10 received Brand X.\n", "\n", "Test the hypothesis that the trees given MiracleGro produced heavier apples." ] }, { "cell_type": "markdown", "id": "f2ba6e37", "metadata": {}, "source": [ "### Inspect the data\n", "\n", "The data are provided in a text (.csv) file.\n", "\n", "Let's load the data as a Pandas dataframe, and plot them to get a sense for their distribution (is it normal?) and any outliers" ] }, { "cell_type": "code", "execution_count": 3, "id": "3f184f9d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
FertilizermeanAppleWeight
0BrandX172
1BrandX165
2BrandX175
3BrandX164
4BrandX165
5BrandX157
6BrandX183
7BrandX186
8BrandX191
9BrandX173
10MiracleGro164
11MiracleGro198
12MiracleGro184
13MiracleGro200
14MiracleGro180
15MiracleGro189
16MiracleGro177
17MiracleGro170
18MiracleGro192
19MiracleGro193
20MiracleGro187
21MiracleGro176
\n", "
" ], "text/plain": [ " Fertilizer meanAppleWeight\n", "0 BrandX 172\n", "1 BrandX 165\n", "2 BrandX 175\n", "3 BrandX 164\n", "4 BrandX 165\n", "5 BrandX 157\n", "6 BrandX 183\n", "7 BrandX 186\n", "8 BrandX 191\n", "9 BrandX 173\n", "10 MiracleGro 164\n", "11 MiracleGro 198\n", "12 MiracleGro 184\n", "13 MiracleGro 200\n", "14 MiracleGro 180\n", "15 MiracleGro 189\n", "16 MiracleGro 177\n", "17 MiracleGro 170\n", "18 MiracleGro 192\n", "19 MiracleGro 193\n", "20 MiracleGro 187\n", "21 MiracleGro 176" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# load the data and have a look\n", "pandas.read_csv('https://raw.githubusercontent.com/jillxoreilly/StatsCourseBook/main/data/AppleWeights.csv')" ] }, { "cell_type": "markdown", "id": "4ce8aef6", "metadata": {}, "source": [ "Let's plot the data and see if they look Normally distributed.\n", "\n", "As we saw in the session on plotting, a good choice here will be a KDE plot (to get an estimate of the shape of the distribution) and a rug plot (individual data values as the KDE plot is based on only a small sample)" ] }, { "cell_type": "code", "execution_count": 4, "id": "5b4ca4f7", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "apples = pandas.read_csv('https://raw.githubusercontent.com/jillxoreilly/StatsCourseBook/main/data/AppleWeights.csv')\n", "\n", "# let's make separate dataframes for the two brands of fertilizer\n", "apples_BrandX = apples[apples[\"Fertilizer\"]==\"BrandX\"]\n", "apples_MiracleGro = apples[apples[\"Fertilizer\"]==\"MiracleGro\"]\n", "\n", "# fill=True replaces the deprecated shade=True argument of sns.kdeplot\n", "sns.kdeplot(apples_BrandX[\"meanAppleWeight\"], color='b', fill=True, label='BrandX')\n", "sns.kdeplot(apples_MiracleGro[\"meanAppleWeight\"], color='r', fill=True, label='MiracleGro')\n", "\n", "sns.rugplot(apples_BrandX[\"meanAppleWeight\"], color='b')\n", "sns.rugplot(apples_MiracleGro[\"meanAppleWeight\"], color='r')\n", "\n", "plt.xlabel(\"Mean weight of apples (g)\", fontsize = 12)\n", "plt.ylabel(\"Density\", fontsize = 12)\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "08178c3d", "metadata": {}, "source": [ "It looks like both distributions are unimodal but a bit skewed. \n", "\n", "Typically, it would be hard to say if the data are really non-normal due to the small number \n", "of data points (the apparent skew could be a fluke). By default, I would err on the side of caution and use a non-parametric test.\n", "\n", "However, in this particular case I am confident that the data in each sample are drawn from a normal distribution \n", "because each data point represents the mean of a large sample (the weights of all apples on one tree) \n", "and such means are approximately normally distributed due to the Central Limit Theorem.\n", "There is a video explaining why here.\n", "\n", "So we can go ahead and use a t-test." 
] }, { "cell_type": "markdown", "id": "97b80d53", "metadata": {}, "source": [ "### Hypotheses\n", "\n", "Ho: the mean weight of apples is the same for trees fertilized with MiracleGro and Brand X\n", "\n", "Ha: the mean weight of apples is greater for trees fertilized with MiracleGro\n", " \n", "This is a one tailed test as the manufacturers of MiracleGro are only looking for an effect in one direction \n", "(evidence that MiracleGro is better!)\n", "\n", "We will test at the $\\alpha = 0.05$ significance level" ] }, { "cell_type": "markdown", "id": "b79fc39f", "metadata": {}, "source": [ "### Descriptive statistics\n", "\n", "First, we obtain the relevant descriptive statistics. By relevant, I mean the ones that go into the equation for the t-test:\n", "\n", "\n", "$$ t = \\frac{\\bar{x_1} - \\bar{x_2}}{s \\sqrt{\\frac{1}{n_1}+\\frac{1}{n_2}}} $$\n", "\n", "This would be the sample means $\\bar{x_1}$ and $\\bar{x_2}$, the standard deviations for both samples (these feed into the pooled standard deviation $s$) and the sample sizes $n_1$ and $n_2$.\n", "\n", "Remember, apples is our original Pandas dataframe:" ] }, { "cell_type": "code", "execution_count": 9, "id": "4825c72f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
FertilizermeanAppleWeight
0BrandX172
1BrandX165
2BrandX175
3BrandX164
4BrandX165
5BrandX157
6BrandX183
7BrandX186
8BrandX191
9BrandX173
10MiracleGro164
11MiracleGro198
12MiracleGro184
13MiracleGro200
14MiracleGro180
15MiracleGro189
16MiracleGro177
17MiracleGro170
18MiracleGro192
19MiracleGro193
20MiracleGro187
21MiracleGro176
\n", "
" ], "text/plain": [ " Fertilizer meanAppleWeight\n", "0 BrandX 172\n", "1 BrandX 165\n", "2 BrandX 175\n", "3 BrandX 164\n", "4 BrandX 165\n", "5 BrandX 157\n", "6 BrandX 183\n", "7 BrandX 186\n", "8 BrandX 191\n", "9 BrandX 173\n", "10 MiracleGro 164\n", "11 MiracleGro 198\n", "12 MiracleGro 184\n", "13 MiracleGro 200\n", "14 MiracleGro 180\n", "15 MiracleGro 189\n", "16 MiracleGro 177\n", "17 MiracleGro 170\n", "18 MiracleGro 192\n", "19 MiracleGro 193\n", "20 MiracleGro 187\n", "21 MiracleGro 176" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(apples)" ] }, { "cell_type": "markdown", "id": "9f37ee4a", "metadata": {}, "source": [ "We obtain some commonly used descriptive statistics using the describe() method in pandas" ] }, { "cell_type": "code", "execution_count": 12, "id": "3832fb73", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
meanAppleWeight
count22.000000
mean179.136364
std12.123552
min157.000000
25%170.500000
50%178.500000
75%188.500000
max200.000000
\n", "
" ], "text/plain": [ " meanAppleWeight\n", "count 22.000000\n", "mean 179.136364\n", "std 12.123552\n", "min 157.000000\n", "25% 170.500000\n", "50% 178.500000\n", "75% 188.500000\n", "max 200.000000" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "apples.describe()" ] }, { "cell_type": "markdown", "id": "09028ae0", "metadata": {}, "source": [ "We need the descriptive statistics separately for each fertilizer type. \n", "\n", "We could use the separate dataframes that we created for plotting, but pandas has a handy method called groupby that will do the job:" ] }, { "cell_type": "code", "execution_count": 11, "id": "3b0c2aae", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
meanAppleWeight
countmeanstdmin25%50%75%max
Fertilizer
BrandX10.0173.10000010.867382157.0165.00172.5181.00191.0
MiracleGro12.0184.16666711.101460164.0176.75185.5192.25200.0
\n", "
" ], "text/plain": [ " meanAppleWeight \\\n", " count mean std min 25% 50% \n", "Fertilizer \n", "BrandX 10.0 173.100000 10.867382 157.0 165.00 172.5 \n", "MiracleGro 12.0 184.166667 11.101460 164.0 176.75 185.5 \n", "\n", " \n", " 75% max \n", "Fertilizer \n", "BrandX 181.00 191.0 \n", "MiracleGro 192.25 200.0 " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "apples.groupby([\"Fertilizer\"]).describe()" ] }, { "cell_type": "markdown", "id": "e1c672fd", "metadata": {}, "source": [ "It does look like the mean weight of apples from the MiracleGro trees is higher, but is the difference statistically significant?" ] }, { "cell_type": "markdown", "id": "255f692a", "metadata": {}, "source": [ "### Carry out the test\n", "\n", "We carry out an independent samples t-test using the function ttest_ind from scipy.stats, here loaded as stats" ] }, { "cell_type": "code", "execution_count": 14, "id": "744b1ab0", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Ttest_indResult(statistic=2.350347501385599, pvalue=0.014564862730138283)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stats.ttest_ind(apples_MiracleGro[\"meanAppleWeight\"], apples_BrandX[\"meanAppleWeight\"], alternative='greater')" ] }, { "cell_type": "markdown", "id": "4dd4feae", "metadata": {}, "source": [ "The inputs to stats.ttest_ind are the two samples to be compared (the values in the meanAppleWeight column from our separated Pandas data frames apples_MiracleGro and apples_BrandX) \n", "and the argument alternative='greater', which tells the computer to run a one tailed test \n", "of whether the mean of the first input apples_MiracleGro is greater than the mean of the second input apples_BrandX.\n", "\n", "The outputs are statistic ($t=2.35$) and pvalue ($p=0.0146$) - if the p value is less than our $\\alpha$ value of 0.05, there is a significant difference.\n", "\n", "### Degrees of freedom\n", "\n", "In a scientific write-up we also need to report 
the degrees of freedom of the test. This tells us how many observations (data-points) the test was based on, corrected for the number of means we had to estimate from the data in order to do the test.\n", "\n", "In the case of the independent samples t-test $df = n_1 + n_2 - 2$ so in this case, df=(10+12-2)=20 and we can report our test results as:\n", "\n", "$t(20) = 2.35, p=0.0146$ (one-tailed)\n", "\n", "### Interpretation\n", "\n", "Our t value of 2.35 means that the difference in mean apple weights between MiracleGro and BrandX trees is 2.35 times the standard error (where $ SE = s \\sqrt{\\frac{1}{n_1}+\\frac{1}{n_2}}$).\n", "\n", "A difference at least this large (in the expected direction) would occur 0.0146 (1.46%) of the time due to chance if the null hypothesis were true (if MiracleGro was really no better than Brand X), hence the p value of 0.0146.\n", "\n", "This diagram shows the expected distribution of t-values if the null were true, with our obtained t-value marked:\n", "\n", "\"There" ] }, { "cell_type": "markdown", "id": "5f263742", "metadata": {}, "source": [ "### Draw conclusions\n", "\n", "As p<0.05 we conclude that the mean weight of apples on trees fertilized with MiracleGro is indeed greater than the mean weight of apples on trees fertilized with Brand X" ] }, { "cell_type": "markdown", "id": "99f1c539", "metadata": {}, "source": [ "## Write-up \n", "
\n", "\n", "
\n", " \n", "Above, I walked you through how to run the t-test and why we make different choices. \n", " \n", "In this section we revisit the analysis, but here we practice writing up our analysis in the correct style for a scientific report. \n", " \n", "Replace the XXXs with the correct values! \n", "\n", "
\n", "\n", "
" ] }, { "cell_type": "markdown", "id": "6f5439f3", "metadata": {}, "source": [ " \n", "We tested the hypothesis that the mean weight of apples produced by trees fertilized with MiracleGro was higher than for trees fertilized with Brand X.\n", "\n", "The mean apple weight was measured for each of XX trees fertilized with MiracleGro (mean of mean apple weights over XX trees XXX.Xg, sd of mean apple weights over XX trees XXX.Xg) and XX trees fertilized with Brand X (mean over XX trees XXX.Xg, sd XXX.Xg).\n" ] }, { "cell_type": "code", "execution_count": 17, "id": "5403735d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
meanAppleWeight
countmeanstdmin25%50%75%max
Fertilizer
BrandX10.0173.10000010.867382157.0165.00172.5181.00191.0
MiracleGro12.0184.16666711.101460164.0176.75185.5192.25200.0
\n", "
" ], "text/plain": [ " meanAppleWeight \\\n", " count mean std min 25% 50% \n", "Fertilizer \n", "BrandX 10.0 173.100000 10.867382 157.0 165.00 172.5 \n", "MiracleGro 12.0 184.166667 11.101460 164.0 176.75 185.5 \n", "\n", " \n", " 75% max \n", "Fertilizer \n", "BrandX 181.00 191.0 \n", "MiracleGro 192.25 200.0 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "apples = pandas.read_csv('https://raw.githubusercontent.com/jillxoreilly/StatsCourseBook/main/data/AppleWeights.csv')\n", "apples.groupby([\"Fertilizer\"]).describe()" ] }, { "cell_type": "markdown", "id": "59fd722c", "metadata": {}, "source": [ "Theoretical considerations suggest that data for each group of trees should be drawn from a normal distribution: as individual data points were themselves the means of large samples (all apples from a given tree), these data points should follow a normal distribution due to the Central Limit Theorem. This was supported by a plot of the data:" ] }, { "cell_type": "code", "execution_count": 5, "id": "e06d0871", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# let's make separate dataframes for the two brands of fertilizer\n", "apples_BrandX = apples[apples[\"Fertilizer\"]==\"BrandX\"]\n", "apples_MiracleGro = apples[apples[\"Fertilizer\"]==\"MiracleGro\"]\n", "\n", "# fill=True replaces the deprecated shade=True argument of sns.kdeplot\n", "sns.kdeplot(apples_BrandX[\"meanAppleWeight\"], color='b', fill=True, label='BrandX')\n", "sns.kdeplot(apples_MiracleGro[\"meanAppleWeight\"], color='r', fill=True, label='MiracleGro')\n", "\n", "sns.rugplot(apples_BrandX[\"meanAppleWeight\"], color='b')\n", "sns.rugplot(apples_MiracleGro[\"meanAppleWeight\"], color='r')\n", "\n", "plt.xlabel(\"Mean weight of apples (g)\", fontsize = 12)\n", "plt.ylabel(\"Density\", fontsize = 12)\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "6550a8cb", "metadata": {}, "source": [ "An independent samples t-test was therefore used to compare the means (alpha = XXX, XXX-tailed). " ] }, { "cell_type": "code", "execution_count": 18, "id": "36f701e7", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Ttest_indResult(statistic=2.350347501385599, pvalue=0.014564862730138283)" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stats.ttest_ind(apples_MiracleGro[\"meanAppleWeight\"], apples_BrandX[\"meanAppleWeight\"], alternative='greater')" ] }, { "cell_type": "markdown", "id": "7e5d4dc7", "metadata": {}, "source": [ "The weight of apples from the trees fertilized with MiracleGro was indeed found to be significantly higher: t(20) = 2.35, p=0.0146." ] }, { "cell_type": "markdown", "id": "6a113d25", "metadata": {}, "source": [ "## Exercises\n", "\n", "
    \n", "
  1. Can you work out how to run a two-tailed test on the data?\n", " \n", "
  2. What happens to the p value if you run a two tailed test instead of one tailed? Is it more or less significant? Why?\n", "
" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 5 }