{ "cells": [ { "cell_type": "markdown", "id": "a418aa9e", "metadata": {}, "source": [ "# Grouping data" ] }, { "cell_type": "markdown", "id": "e0643307", "metadata": {}, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jillxoreilly/StatsCourse/blob//main/notebooks/descriptives_groupby.ipynb#)\n", "\n", "In many datasets, the data fall into categories, and we want to report descriptive statistics separately for each category." ] }, { "cell_type": "markdown", "id": "ab5d768b", "metadata": {}, "source": [ "### Set up Python libraries\n", "\n", "As usual, run the code cell below to import the relevant Python libraries." ] }, { "cell_type": "code", "execution_count": 2, "id": "b37a7c8e", "metadata": {}, "outputs": [], "source": [ "# Set-up Python libraries - you need to run this but you don't need to change it\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import scipy.stats as stats\n", "import pandas\n", "import seaborn as sns\n", "sns.set_theme()" ] }, { "cell_type": "markdown", "id": "254c65ff", "metadata": {}, "source": [ "### Load and view the data\n", "\n", "Let's load the datafile \"vehicles.csv\", which contains size data on vehicles parked at a vehicle-ferry terminal at 1pm on Sunday 24th April 2022. This is treated as a representative sample." ] }, { "cell_type": "code", "execution_count": 3, "id": "60152969", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
lengthheightwidthtype
03.91871.53201.8030car
14.64861.59361.6463car
23.57851.54471.7140car
33.55631.55491.7331car
44.03211.50691.7320car
...............
135915.50004.20652.5112truck
136014.49604.19652.5166truck
136115.98904.19642.4757truck
136214.37004.20092.5047truck
136314.23504.20162.5212truck
\n", "

1364 rows × 4 columns

\n", "
" ], "text/plain": [ " length height width type\n", "0 3.9187 1.5320 1.8030 car\n", "1 4.6486 1.5936 1.6463 car\n", "2 3.5785 1.5447 1.7140 car\n", "3 3.5563 1.5549 1.7331 car\n", "4 4.0321 1.5069 1.7320 car\n", "... ... ... ... ...\n", "1359 15.5000 4.2065 2.5112 truck\n", "1360 14.4960 4.1965 2.5166 truck\n", "1361 15.9890 4.1964 2.4757 truck\n", "1362 14.3700 4.2009 2.5047 truck\n", "1363 14.2350 4.2016 2.5212 truck\n", "\n", "[1364 rows x 4 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "vehicles = pandas.read_csv('https://raw.githubusercontent.com/jillxoreilly/StatsCourseBook/main/data/vehicles.csv')\n", "display(vehicles)\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "11272619", "metadata": {}, "source": [ "That was a long list of vehicles!\n", "\n", "* What information do we have about each vehicle?" ] }, { "cell_type": "markdown", "id": "982fd768", "metadata": {}, "source": [ "### Obtain descriptive statistics\n", "\n", "We can use the pandas method describe() to return descriptive statistics for our data. Here we apply it to the length column only." ] }, { "cell_type": "code", "execution_count": 3, "id": "48ac89ab", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "count 1364.000000\n", "mean 6.722972\n", "std 4.232075\n", "min 3.110900\n", "25% 3.929450\n", "50% 4.419300\n", "75% 9.260325\n", "max 16.231000\n", "Name: length, dtype: float64" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vehicles['length'].describe()" ] }, { "cell_type": "markdown", "id": "317be025", "metadata": {}, "source": [ "### Why group the data?" ] }, { "cell_type": "markdown", "id": "089a2a2f", "metadata": {}, "source": [ "You can see above that the mean length of vehicles in the car park is 6.72 m.\n", "\n", "This is surprising, as it is rather longer than even a large family car.\n", "\n", "To get a better sense of the length data, I am going to plot them.
\n", "\n", "Don't worry too much about the plotting code for now, as there are dedicated exercises on plotting later." ] }, { "cell_type": "code", "execution_count": 31, "id": "2e3b7d7c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 0, 'vehicle length (m)')" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sns.histplot(data=vehicles, x=\"length\", bins = np.arange(0,16,0.5))\n", "plt.xlabel('vehicle length (m)')" ] }, { "cell_type": "markdown", "id": "920de4e2", "metadata": {}, "source": [ "Interesting. It looks like there are several clusters of vehicle lengths. \n", "\n", "Have a look back at our dataframe - is there some information there that could explain the different clusters?\n", "\n", "\n", "I can plot vehicle types in different colours (again no need ot worry about the plotting code at this stage)" ] }, { "cell_type": "code", "execution_count": 30, "id": "6e139522", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 0, 'vehicle length (m)')" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEJCAYAAAB/pOvWAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAAssklEQVR4nO3de1iUdeL//+cMA6jhEQYtMuuXlmV5rJQ0KHdBC8hEMw+pZQf9roeyPpQi6aZF5lKmHdx1M6vVDkYp6irWph0UzeJTmqZdZuKRcPCEoBxm5v794afZcBQGYg7h63FdXhdzz/u+53XPCK+573vmvk2GYRiIiIj8htnfAUREJPCoHERExI3KQURE3KgcRETEjcpBRETcqBxERMSNxZsLnzNnDmvWrMFkMjFw4EDuv/9+Jk+eTG5uLg0bNgRg3LhxxMXFsWPHDqZMmUJJSQk33HADTz/9NBaLV+OJiMh5eO2v7+bNm9m0aRPLly/Hbrdzxx13EBsby7Zt21i0aBGRkZGVxqekpPDMM8/QuXNnUlNTWbJkCUOHDvX48Y4dK8HprPlXNsLDwzhypLjG8/lKIOdTttoJ5GwQ2PmUrXbOlc1sNtG8+UXnncdr5XDTTTfx9ttvY7FYKCgowOFw0KBBAw4dOkRqaioFBQXExcUxbtw48vPzKS0tpXPnzgAkJyczd+7cGpWD02nUqhx+nTeQBXI+ZaudQM4GgZ1P2Wqnptm8eswhODiYuXPnkpCQQHR0NHa7nR49epCens6SJUv45ptvyMzM5PDhw1itVtd8VquVgoICb0YTEZEqeH2n/oQJE3jooYcYM2YMGzdu5NVXX3XdN3z4cJYtW8aVV16JyWRyTTcMo9JtT4SHh9U6o9XauNbz+kIg51O22gnkbBDY+ZStdmqazWvlsHv3bsrLy7nmmmto2LAh8fHxrFq1imbNmtGnTx/gTAlYLBZatWqFzWZzzVtYWOh2TKI6R44U12qTzmptjM12ssbz+Uog51O22gnkbBDY+arKZhgGx47ZKC8vBXy/e8dsNuN0On3+uFUzERLSgHbtruDIkZJK95jNpirfVHutHA4cOMDcuXN59913Afj000+58cYbSU9Pp0ePHjRq1Ij333+f/v37ExUVRWhoKLm5uXTr1
o2srCxiYmK8FU1E6qHi4hOYTCZatrwUk8n3n9K3WMzY7YFVDobh5PjxQgoLCzGZGtZoXq+VQ2xsLFu3buWuu+4iKCiI+Ph4xo0bR/PmzRkyZAh2u534+HgSExMByMjIIC0tjeLiYjp06MCIESO8FU1E6qHTp4tp0aKlX4ohUJlMZho3bs6xYzZatKhZOZjqyym7tVvJ95StdgI5GwR2vqqy/fLLXlq2vKzGxyvrSiBuOcCZ3W022wEiI1tXml7dbiVVrIjUG/4qhkBW2+dEX0EWN02bNiI4uPr3DXZH4L1LEvGmiRPHMm3aszRr1szfUbxO5SBugoPNvL50a7XjHuzf0QdpRALH119/5e8IPqNyEBHxQHr60wD069cHh8PBF19sxmw2U1paysCBSbz7bib3338vf/5zH77++iuKi08yePC99O8/EID167/grbcWYLdX0KBBA8aOfZTrrgvcN1g65iAi4oHU1GkAZGWtoW3bq/jqqxwA/vOfNXTrdiPNmzcHoKjoBK+//jYvv/wPFiz4O7t3/8T+/fuYP/9VMjLmsHDhO6SkTGHKlBROnz7tt/WpjrYcRERqKDn5bpYvX0Z0dC+ysj5i7NhHfnPfIEwmE5GRLenePZrNmzcRGhrKkSOFPPLIX1zjTCYzBw7sp127q/yxCtVSOYiI1FB8/O3Mn/8q//u/33D69Gk6d+7qui8oKMj1s9NpEBRkxul00K3bTUyf/pzrvoKCX4iIsBKotFtJRMRDQUFB2O12GjRoQHz87Tz33HTuuiu50pjs7H8D8Msvv/D115vo0eNmunW7ic2bN7F3bx4AGzeuZ+TIIZSVlfl6FTymLQepNYfTICKi+hMeVlQ4OXHilA8SiXjXrbf+iXHjHiY9fRZ33HEny5cvpW/fxEpj8vMPMWrUvZSXl/HII//DZZddDsATT0xh2rRUDMMgKCiI559/kUaNGvlhLTyjcpBaCzKb9JFXuaA8/XQ6cOZbx4sWvUXfvgmEhVV+gzR06HDat7/Wbd7evf9M795/9knOuqByEBGpoUGD+hEeHsFzz73g7yheo3IQEamhDz5Yfs7pmZkrfJzEe3RAWkRE3KgcRETEjcpBRETcqBxERMSNykFERNzo00oiUi81bdaIkOCg6gfWUHmFgxPH6/+XOlUOIlIvhQQH8c+PttT5ch9K7lTnywxEKgcRkTpmGAbz5r3MF198hsUSxJ13JtOu3VXMn/8aZWWlnDxZzIQJE7nlllt59tm/cuLECQ4e3M//+38T6NUrxt/xAZWDiEidW7fuU77/fgtvv/0edrudv/zlQZo2bcakSU/Rps3l5OZ+zZw5Gdxyy60ANG3alFmzZvs39FlUDiIidey773Lp3TuOkJAQQkJCePPNdygrKyMn50vWrfsP27d/X+lCP9dee50f056bVz+tNGfOHO644w4SEhJYuHAhADk5OSQlJREfH8/s2f9tyh07dpCcnEyfPn2YMmUKdrvdm9FERLzGYrFgMv33dn7+IcaOfYgdO7Zz9dXtGTFiFIZhuO4PDQ31Q8qqea0cNm/ezKZNm1i+fDkffvgh//rXv9i5cyepqam89tprrFq1im3btvH5558DkJKSwtSpU1mzZg2GYbBkyRJvRRMR8apOnbry2WdrsdvtlJaWMnHiOH7+eTcPPDCGHj168uWXn+N0Ov0ds0pe261000038fbbb2OxWCgoKMDhcFBUVESbNm1o3bo1AElJSWRnZ9O2bVtKS0vp3LkzAMnJycydO5ehQ4d6K56I1HPlFQ6vfLKovMJR7ZjY2NvYufMHRo0ahtNpcM89QzlwYB/Dhw/CYrHQteuNlJaWXrjXkA4ODmbu3Lm88cYb9O3bl8OHD2O1/veyeJGRkRQUFLhNt1qtFBQUeDOaiNRz/v4uwujRYxk9emylaePHP+b6+X/+ZxIAU6b81ZexPOb1A9ITJkzgoYceYsyYMeTl5WH6zY44wzAwmUw4nc5zTq+J8PDqr0h2P
lZr41rP6wu+zmcYBhdd5Nk+UE/H+eM5DuTXNZCzQWDnO1+2w4fNWCz+PemDvx+/KjV9Tb1WDrt376a8vJxrrrmGhg0bEh8fT3Z2dqWLb9tsNiIjI2nVqhU2m801vbCwkMjIyBo93pEjxTidRvUDz2K1NsZmO1nj+XzFH/kiIsIoKfHs2raejvP1OgTy6xrI2SCw81WVzel0Yrf7bz++xWL26+NX5+znzWw2Vfmm2ms1d+DAAdLS0igvL6e8vJxPP/2UwYMHs2fPHvbu3YvD4WDlypXExMQQFRVFaGgoubm5AGRlZRETExhfBBERuRB5bcshNjaWrVu3ctdddxEUFER8fDwJCQm0aNGC8ePHU1ZWRmxsLH379gUgIyODtLQ0iouL6dChAyNGjPBWNBERqYZXjzmMHz+e8ePHV5oWHR3N8uXul9hr3749mZmZ3owjIiIeCtyjJyIi4jc6fYaI1EstmjUgKDi4zpfrqKjg6PHSKscUFxfz7LN/5bnnMmr9OPfdN5Q333yn1vP/XioHEamXgoKDObD67Tpf7qW3jwCqLoeTJ4vYtevH3/U4/iwGUDmIiNS5l176G4WFNiZP/h969YrhvfcWYTKZuPrqa5g48Qn+8Y9XuPzy/4/+/QeSlfURS5a8w+LFmdjtdgYN6seSJVncemsP1q//hgUL/kFhoY39+/dRUPALiYn9GDnyAex2O3/7Wzpbt36H1RqJyWRi5MgH6Nr1hjpZBx1zEBGpY48+mkJEhJUHHxzD22+/wSuvzOftt9+nQYOGLFz4T6Kje5GbuxmA//3frykqKuLo0SNs3fod113XEYul8vv2n37axezZrzJ//pssWvQWJ0+eZNmyTEpLT/POOx+SmjqNHTt+qNN1UDmIiHjJd9/l0rPnLTRt2gyAO+/sT27uZrp06cYPP2zH4XCwd+9e/vSneL777ls2bdrAzTf3cltO1643EBwcTPPmLWjSpAklJcV8/fVXxMXdjslkolWri+nW7cY6za5yEBHxEvezNhg4HA5CQ0Np2/YqPv54NW3atKFLl258910umzd/RY8ePd2WExIS4vrZZDJhGAZmcxCG4b1vZKscRETqWFBQEA6Hgy5durF+/RcUFZ0AYPnyZXTpcuaYwM039+TNN1+nS5durnENGzakWbNmHj3GDTfcxH/+8zGGYVBYaOPbb3NrfE66quiAtIhIHWvRIpyWLVsxZ04Gw4ffz7hxD2O327n66mtISZkMQHR0LzIyZtKlyw00adKEZs2an3OX0vn065fMTz/tYsSIewgPj6BVq4vr9KJBKgcRqZccFRX/97HTul9udSwWC3//+xuu20lJd7mNadmyFevXf+O6/cYbiyrd/+t9DzwwutL0zMwVAOTkrKdXrxiefHIKxcXF3H//MC69tLXH61EdlYOI1EtnvqhW9fcR/sguv/wKZsyYyj//OQ+ABx8cTZMmTets+SoHEZE/oEsuiWLevAVeW74OSIuIiBuVg4iIuFE5iIiIG5WDiIi4UTmIiIgbfVpJROqlZs0bEmyp+z9xFXY7x4+drnJMXVzPAWDcuIcZNerhOjvTak2oHESkXgq2WHjzqw/rfLn3dR9Q7Zi6uJ6Dv6kcRETq2G+v57B37x6aNm1GaGgo8fG38+23uUyZ8lfgv1sGXbp0Y968l/nii8+wWIK4885kBg0a4lresWNHmTBhDA8//BduueVWn6yDjjmIiNSxX6/nMGHCY+zbt5epU2fw0kuvnXf8unWf8v33W3j77feYP/8tVq1awZEjhQCUlBSTkvIoo0Y97LNiAC9vObzyyiusXr0agNjYWJ544gkmT55Mbm4uDRs2BGDcuHHExcWxY8cOpkyZQklJCTfccANPP/202wUvRET+aJo3b8HFF19S5Zjvvsuld+84QkJCCAkJqXSJ0L/97TlatAgnNra3t6NW4rUth5ycHNavX8/SpUtZtmwZ27dv55NPPmHbtm0sWrSIrKwssrKyiIuLAyAlJ
YWpU6eyZs0aDMNgyZIl3oomIuIzvz1T6q/XYviVw2EHzpyo77dn287PP8Tp02cOeg8bNoJmzZqxdGmmbwL/H6+Vg9VqZdKkSYSEhBAcHMyVV17JoUOHOHToEKmpqSQlJTF37lycTicHDx6ktLSUzp07A5CcnEx2dra3oomIeNWv13M4W9Omzdi7dw+GYXDo0EF++uknADp16spnn63FbrdTWlrK44+Px2Y7DEC7dlfz+OOTWLjwn65pvuC1/Tbt2rVz/ZyXl8fq1atZvHgxmzdvZtq0aTRu3JjRo0eTmZlJu3btsFqtrvFWq5WCggJvRRORC0CF3e7RJ4tqs9zq/Ho9h/T0pytNv+GGm/j3v7MYMmQAbdq0oWPHzgDExt7Gzp0/MGrUMJxOg7vvHsJll7Vxzde69WUkJ9/Niy/O+t0fj/WUyfjtNo4X7Nq1i9GjRzN+/Hj69+9f6b5PPvmEZcuWMWrUKF544QXeeefMfra8vDzGjBmjrQc/MQyDdz+u/mN4Q+Kv9nhcXV6hSuRctm//gUsuaVP9wAvQoUN76dDh2hrN49Ujvrm5uUyYMIHU1FQSEhL48ccfycvLo0+fPsCZP0IWi4VWrVphs9lc8xUWFhIZGVmjxzpypPgc12utntXaGJvtZI3n8xV/5IuICKOkpMyjsZ6O8/U6BPLrGsjZILDzVZXN6XRit3vvmsrVsVjMfn386pz9vJnNJsLDw8473mvHHPLz8xk7diwZGRkkJCQAZ8ogPT2dEydOUFFRwfvvv09cXBxRUVGEhoaSm5sLQFZWFjExMd6KJiIi1fDalsOCBQsoKytj5syZrmmDBw/m4YcfZsiQIdjtduLj40lMTAQgIyODtLQ0iouL6dChAyNG1P3l/URExDNeK4e0tDTS0tLOed+wYcPcprVv357MTN9+VEtE6hfDMHR86yy1Paysb0iLSL1gsYRQUlJU6z+G9ZFhGJSUFNGwYYMaz6uvIItIvdC8uZVjx2wUFx/3y+ObzWaczsA7IG2xhNC27RUcP15as/m8lEdExKeCgixERFzst8cP5E95BQcHAzUrB+1WEhERNyoHERFxo3IQERE3KgcREXGjchARETcqBxERcaNyEBERNyoHERFxo3IQERE3KgcREXGjchARETcqBxERcaNyEBERNyoHERFxo3IQERE3KgcREXGjchARETcqBxERcaNyEBERN14th1deeYWEhAQSEhKYNWsWADk5OSQlJREfH8/s2bNdY3fs2EFycjJ9+vRhypQp2O12b0YTEZEqeK0ccnJyWL9+PUuXLmXZsmVs376dlStXkpqaymuvvcaqVavYtm0bn3/+OQApKSlMnTqVNWvWYBgGS5Ys8VY0ERGphtfKwWq1MmnSJEJCQggODubKK68kLy+PNm3a0Lp1aywWC0lJSWRnZ3Pw4EFKS0vp3LkzAMnJyWRnZ3srmoiIVMPiyaDU1FTS09MrTZswYQJz58497zzt2rVz/ZyXl8fq1au59957sVqtrumRkZEUFBRw+PDhStOtVisFBQUerwRAeHhYjcb/ltXauNbz+oKv8xmGwUUXhXo01tNx/niOA/l1DeRsENj5lK12apqtynKYNm0aBQUF5ObmcvToUdd0u93O/v37PXqAXbt2MXr0aJ544gmCgoLIy8tz3WcYBiaTCafTiclkcpteE0eOFON0GjWaB848YTbbyRrP5yv+yBcREUZJSZlHYz0d5+t1COTXNZCzQWDnU7baOVc2s9lU5ZvqKsth4MCB7Nq1ix9//JE+ffq4pgcFBbl2AVUlNzeXCRMmkJqaSkJCAps3b8Zms7nut9lsREZG0qpVq0rTCwsLiYyMrHb5IiLiHVWWw/XXX8/111/PzTffTKtWrWq04Pz8fMaOHcvs2bOJjo4GoFOnTuzZs4e9e/dy6aWXsnLlSgYMGEBUVBShoaHk5ubSrVs3srKyiImJqf1aiYjI7+LRMYf8/HxSUlI4ceIEhvHfXTcrVqw47
zwLFiygrKyMmTNnuqYNHjyYmTNnMn78eMrKyoiNjaVv374AZGRkkJaWRnFxMR06dGDEiBG1XScREfmdPCqHqVOnkpyczLXXXuvxsYC0tDTS0tLOed/y5cvdprVv357MzEyPli0iIt7lUTlYLBbuv/9+b2cREZEA4dH3HNq1a8ePP/7o7SwiIhIgPNpy2L9/PwMGDOCSSy4hNPS/n2uv6piDiIj8cXlUDhMnTvR2DhERCSAelcNVV13l7RwiIhJAPCqHHj16YDKZKn1z2Wq18sUXX3g1nIiI+IdH5bBz507Xz+Xl5axcuZI9e/Z4LZSIiPhXjc/KGhISQnJyMhs2bPBGHqmHHE6DiIiwKv81bdrI3zFF5Dc82nI4fvy462fDMNi2bRtFRUXeyiT1TJDZxOtLt1Y55sH+HX2URkQ8UeNjDgDh4eFMmTLFq8FERMR/anzMQURE6j+PysHpdLJgwQK++OIL7HY7PXv2ZMyYMVgsHs0uIiJ/MB4dkH7hhRfYtGkTI0eO5P777+fbb79l1qxZ3s4mIiJ+4tFb/y+//JIPP/yQ4OBgAG699VbuvPNOUlNTvRpORET8w6MtB8MwXMUAZz7O+tvbIiJSv3hUDu3btyc9PZ19+/axf/9+0tPTdUoNEZF6zKNymDZtGkVFRQwePJi7776bY8eO8dRTT3k7m4iI+EmV5VBeXs6TTz7Jxo0bmTlzJjk5OXTs2JGgoCDCwsJ8lVFERHysynKYO3cuxcXFdO3a1TVtxowZFBUV8fLLL3s9nIiI+EeV5fDZZ5/xwgsvEB4e7prWsmVLZs2axX/+8x+vhxMREf+oshyCg4Np0KCB2/SwsDBCQkK8FkpERPyrynIwm80UFxe7TS8uLsZut1e78OLiYhITEzlw4AAAkydPJj4+nn79+tGvXz8++eQTAHbs2EFycjJ9+vRhypQpHi1bRES8p8pySExMJC0tjVOnTrmmnTp1irS0NOLj46tc8JYtWxgyZAh5eXmuadu2bWPRokVkZWWRlZVFXFwcACkpKUydOpU1a9ZgGAZLliz5HaskIiK/V5XlMHLkSBo3bkzPnj0ZNGgQAwcOpGfPnjRp0oSxY8dWueAlS5Ywbdo0IiMjATh9+jSHDh0iNTWVpKQk5s6di9Pp5ODBg5SWltK5c2cAkpOTyc7Orpu1ExGRWqny9Blms5kZM2YwZswYtm/fjtlspmPHjq4/+FV59tlnK90uLCykR48eTJs2jcaNGzN69GgyMzNp164dVqvVNc5qtVJQUFDL1RERkbrg0bmVoqKiiIqK+l0P1Lp1a1599VXX7eHDh7Ns2TKuvPJK13WpgUrXqa6J8PDaf+/Cam1c63l9wdf5DMPgootCPRpbl+Pqej0D+XUN5GwQ2PmUrXZqms1n59z+8ccfycvLo0+fPsCZP0AWi4VWrVphs9lc4woLCz3aMjnbkSPFOJ1GjeezWhtjs52s8Xy+4o98ERFhlJSUeTS2LsfV5XoG8usayNkgsPMpW+2cK5vZbKryTXWNryFdW4ZhkJ6ezokTJ6ioqOD9998nLi6OqKgoQkNDyc3NBSArK4uYmBhfxRIRkXPw2ZZD+/btefjhhxkyZAh2u534+HgSExMByMjIIC0tjeLiYjp06MCIESN8FUtERM7B6+Wwdu1a18/Dhg1j2LBhbmPat29PZmamt6OIiIiHfLZbSURE/jhUDiIi4kblICIiblQOIiLiRuUgIiJuVA4iIuJG5SAiIm5UDiIi4kblICIiblQOIiLiRuUgIiJuVA4iIuJG5SAiIm5UDiIi4kblICIiblQOIiLiRuUgIiJuVA4iIuJG5SAiIm5UDiIi4kblICIiblQOIiLixqvlUFxcTGJiIgcOHAAgJyeHpKQk4uPjmT17tmvcjh07SE5Opk+fPkyZMgW73e7NWCIiUg2vlcOWLVsYMmQIeXl5AJSWlpKamsprr73GqlWr2LZtG59//jkAKSkpTJ06lTVr1mAYBkuWLPFWL
BER8YDXymHJkiVMmzaNyMhIALZu3UqbNm1o3bo1FouFpKQksrOzOXjwIKWlpXTu3BmA5ORksrOzvRVLREQ8YPHWgp999tlKtw8fPozVanXdjoyMpKCgwG261WqloKCgxo8XHh5W66xWa+Naz+sLvs5nGAYXXRTq0di6HFfX6xnIr2sgZ4PAzqdstVPTbF4rh7M5nU5MJpPrtmEYmEym806vqSNHinE6jRrPZ7U2xmY7WeP5fMUf+SIiwigpKfNobF2Oq8v1DOTXNZCzQWDnU7baOVc2s9lU5Ztqn31aqVWrVthsNtdtm81GZGSk2/TCwkLXrigREfEPn5VDp06d2LNnD3v37sXhcLBy5UpiYmKIiooiNDSU3NxcALKysoiJifFVLBEROQef7VYKDQ1l5syZjB8/nrKyMmJjY+nbty8AGRkZpKWlUVxcTIcOHRgxYoSvYomIyDl4vRzWrl3r+jk6Oprly5e7jWnfvj2ZmZnejiIiIh7SN6RFRMSNykFERNyoHERExI3KQURE3KgcRETEjcpBRETcqBxERMSNykFERNyoHERExI3KQURE3KgcRETEjcpBRETcqBxERMSNykFERNyoHERExI3KQURE3KgcRETEjc8uEyoiItC0WQOCLdX/6a2w2zlxvNQHic5N5SAi4kPBFgtvbf6o2nH33ngXERFh1Y7zVomoHCQgOJyGZ78IFU5OnDjlg0Qi/hVkMntUIiNvSvbK46scLiBNmzYiODgwDzMFmU28vnRrteMe7N/RB2lExC/lMHz4cI4ePYrl//a7TZ8+nZKSEp577jnKysq4/fbbmThxoj+i1WvBwWb9ARYRj/i8HAzDIC8vj3Xr1rnKobS0lL59+/Kvf/2Liy++mNGjR/P5558TGxvr63giEsAcTodf98NfSHxeDj///DMAo0aN4vjx4wwaNIirrrqKNm3a0Lp1awCSkpLIzs5WOYhIJWY/74e/kPh8B3RRURHR0dG8+uqrvPnmm7z33nscOnQIq9XqGhMZGUlBQYGvo4mIyP/x+ZZDly5d6NKli+v2wIEDmTt3Lt26dXNNMwwDk8lUo+WGh1e/qXk+VmvjWs/rC3WVzzAMLroo1KOx/hjn6bI8fT4C+XUN5GwQuPkMw6BRHf8/qUuePGZN1qEu17Wmz4fPy+Gbb76hoqKC6Oho4MwTFRUVhc1mc42x2WxERkbWaLlHjhTjdBo1zmO1NsZmO1nj+XylLvNFRIRRUlLm0Vh/jPN0WZ48H4H8ugZyNgjsfBERYZyqw/8ndcnT560m61BX63qubGazqco31T7frXTy5ElmzZpFWVkZxcXFLF26lMcee4w9e/awd+9eHA4HK1euJCYmxtfRRKSecBhOIiLCqv3XtFkDf0cNWD7fcrjtttvYsmULd911F06nk6FDh9KlSxdmzpzJ+PHjKSsrIzY2lr59+/o6mojUE/7+All94JfvOTz66KM8+uijlaZFR0ezfPlyf8QREZGzBObXZUVExK9UDiIi4kblICIiblQOIiLiRuUgIiJuVA4iIuJG5SAiIm5UDiIi4kblICIibnSZUBGROuDphYj+KFQOIiJ1oL5diEi7lURExI22HETE75o2a0CwRX+OAoleDRHxu2CLpV7tkqkPtFtJRETcqBxERMSNdiuJyAXr18uJVqXCbufE8VIfJQocKgf5Q3E4DY8+S253OH2QpnYMh2efh3fY7Rzz4I9S82YNCPLgYK6ny7uQeHI50Qv1OIfKQf5QgswmXl+6tdpxD/bv6IM0tWQ2czD7X9UOi+o73KPFBVksdbo8EdAxBxEROQdtOYjUIU938Xiknp2O4Y/Kk+MS9ZHKQaQOebKLx+PdO+agut1d5GHZGA6HZ8u7QHhyXALq37GJgCqHFStWMG/ePOx2OyNHjmTYsGH+jiRSf9R12Ui9FjDlUFBQwOzZs/noo48ICQlh8ODBdO/enbZt2/o7mojIBSdgyiEnJ4cePXrQrFkzAPr06UN2d
jbjxo3zaH6z2VTrx/498/pCXeYLaxQcsOPq+jHr8nlr2iTU42MJQQ0vqpMx/hznj9+JsJBGATsukLOBZ6/X2WOqm8dkGIbh0aN72T/+8Q9OnTrFxIkTAfjggw/YunUrM2bM8HMyEZELT8B8lNXpdGIy/bfJDMOodFtERHwnYMqhVatW2Gw2122bzUZkZKQfE4mIXLgCphxuvvlmNm7cyNGjRzl9+jQff/wxMTEx/o4lInJBCpgD0i1btmTixImMGDGCiooKBg4cSMeOAXwKBBGReixgDkiLiEjgCJjdSiIiEjhUDiIi4kblICIiblQOIiLi5oIuhxUrVnDHHXcQHx/P4sWL/R2nkldeeYWEhAQSEhKYNWuWv+Oc0/PPP8+kSZP8HaOStWvXkpyczO23384zzzzj7zhusrKyXK/r888/7+84ABQXF5OYmMiBAweAM6eySUpKIj4+ntmzZwdUtvfff5/ExESSkpKYPHky5eXlAZPtV4sWLWL4cP+fvPDsfN9++y2DBg0iISGBxx57rPrnzrhA/fLLL8Ztt91mHDt2zCgpKTGSkpKMXbt2+TuWYRiGsWHDBuOee+4xysrKjPLycmPEiBHGxx9/7O9YleTk5Bjdu3c3nnzySX9Hcdm3b5/Rq1cvIz8/3ygvLzeGDBlifPbZZ/6O5XLq1CnjxhtvNI4cOWJUVFQYAwcONDZs2ODXTN99952RmJhodOjQwdi/f79x+vRpIzY21ti3b59RUVFhjBo1ym/P4dnZfv75ZyMuLs44efKk4XQ6jSeeeMJYuHBhQGT71a5du4xbbrnFuPfee/2S61dn5zt58qTRs2dPY8eOHYZhGMbEiRONxYsXV7mMC3bL4bcn+mvUqJHrRH+BwGq1MmnSJEJCQggODubKK6/k0KFD/o7lcvz4cWbPns2YMWP8HaWSTz75hDvuuINWrVoRHBzM7Nmz6dSpk79juTgcDpxOJ6dPn8Zut2O32wkNDfVrpiVLljBt2jTX2Qi2bt1KmzZtaN26NRaLhaSkJL/9XpydLSQkhGnTphEWFobJZOKqq67y2+/F2dkAysvLmTp1KhMmTPBLpt86O9+GDRvo3Lkz7du3ByAtLY24uLgqlxEwX4LztcOHD2O1Wl23IyMj2bq1+msT+0K7du1cP+fl5bF69WreffddPyaqbOrUqUycOJH8/Hx/R6lk7969BAcHM2bMGPLz87n11lt59NFH/R3LJSwsjEceeYTbb7+dhg0bcuONN9K1a1e/Znr22Wcr3T7X70VBQYGvYwHu2aKiooiKigLg6NGjLF68mOeee84f0dyyAbzwwgsMGDCASy+91A+JKjs73969e2nUqBETJ07k559/pmvXrtXuEr5gtxz+CCf627VrF6NGjeKJJ57g8ssv93cc4MzZci+++GKio6P9HcWNw+Fg48aNpKen8/7777N161aWLl3q71guO3fu5MMPP2TdunV8+eWXmM1mFixY4O9YlfwRfi8KCgoYOXIkAwYMoHv37v6OA5x5Z56fn8+AAQP8HeWcHA4H69ev57HHHuOjjz7i9OnTzJ8/v8p5LthyCPQT/eXm5nLffffx+OOP079/f3/HcVm1ahUbNmygX79+zJ07l7Vr15Kenu7vWABEREQQHR1NixYtaNCgAX/+858DZmsQYP369URHRxMeHk5ISAjJycls3rzZ37EqCfTfi927dzN48GD69+/P2LFj/R3HZeXKlezatYt+/fqRlpbGtm3bAmqrNSIigk6dOtG6dWuCgoK4/fbbq/3duGDLIZBP9Jefn8/YsWPJyMggISHB33EqWbhwIStXriQrK4sJEybQu3dvUlNT/R0LgNtuu43169dTVFSEw+Hgyy+/pEOHDv6O5dK+fXtycnI4deoUhmGwdu1arr/+en/HqqRTp07s2bOHvXv34nA4WLlyZcD8XhQXF/PAAw/wyCOPMGrUKH/HqeS5555j9erVZGVl8cwzz3Ddddfx0ksv+TuWS69ev
di+fbtrV/C6deuq/d24YI85BPKJ/hYsWEBZWRkzZ850TRs8eDBDhgzxY6rA16lTJx588EGGDh1KRUUFPXv2DKjN/F69evHDDz+QnJxMcHAw119/PQ8//LC/Y1USGhrKzJkzGT9+PGVlZcTGxtK3b19/xwIgMzOTwsJCFi5cyMKFCwHo3bs3jzzyiJ+TBb6LL76Y6dOnM2bMGMrKyrjmmmt48sknq5xHJ94TERE3F+xuJREROT+Vg4iIuFE5iIiIG5WDiIi4UTmIiIgblYPUSy+//DLTp08/530PPfQQP/3003nn/eqrr0hMTKyzx/u9PvjgA9dZg2vyOA6Hg9GjR1NYWOjxY23bto2nnnqqVjmlflE5yAXnn//8J23btvV3DI/l5uZSWlpa4/neeOMNbrrpJiIiIjye57rrrsNut7Nu3boaP57ULxfsl+Dkj+Pxxx+nQ4cOrm/FvvPOO2zevJmXXnqJtWvXMm/ePCoqKmjQoAFPPvkkXbp0AeDnn39m+PDh2Gw2IiIiePHFF4mMjKR3797MmTOH66+/nszMTBYuXIjZbKZ58+Zu11goLy8nIyODr7/+GofDwbXXXktaWhphYWHnzVtQUMD06dPJz8+noqKChIQExowZw4EDB7jvvvuIjY1ly5YtFBUVkZKSQlxcHKdPn2batGls2bKFxo0bu8rrT3/6E2vXrmXDhg00aNCgyvX6rdOnT/PWW2+xYsUK4MwWx759+ygoKMBms9GhQwe6d+/OsmXLOHDgACkpKa6tpXvuuYe//vWv3HbbbXXw6skflbYcJODdfffdlU6gt3TpUgYNGkReXh6zZ89m/vz5LFu2jBkzZjB+/HhOnToFwP79+5kzZw7Z2dk0adKEDz74oNJyd+7cSUZGBq+//jorVqygd+/ezJs3r9KY+fPnExQUxEcffcTy5cuJjIwkIyOjyrwpKSkMGDCAjz76iMzMTHJycli1apUrU69evcjMzOTxxx93nZfqtddew+FwsHr1at58801++OEHAOLi4ujduzf33Xcfw4YN82i9ADZt2sQVV1xB8+bNXdNyc3N59dVXWbp0KV988QW7d+9m8eLFPPXUU7z88suucZ07d2bfvn3s37+/6hdG6jVtOUjA6969O2VlZXz//fc0bNiQo0ePEh0dzTvvvMPhw4e57777XGNNJhP79u0DoGfPnrRo0QI4c16jo0ePVlruxo0b6dWrFxdffDGAazlfffWVa8xnn33GyZMnycnJAaCiooLw8PDzZj116hRff/01J06cYM6cOa5pO3fupGPHjgQHBxMbGwvAtddey/HjxwH4/PPPmTx5MmazmbCwMPr378+PP/54zseobr3gzNbFZZddVmnazTffTOPGjYEzp+K+5ZZbALjssstcOX516aWXsmfPHlq3bn3edZX6TeUgAc9kMjFw4ECysrIIDg5m4MCBmEwmnE4n0dHRlU5wlp+fT2RkJJ988gkWi6XSMs4+U0xQUFCl01GXlpZy8ODBSmOcTiepqamuP+glJSWUlZWdN6vT6cQwDN577z0aNmwInLn2QGhoKMeOHSM4OBiz2ezK9CuLxVIp369jzqW69fp1utPprDQtJCTkvMs512MEBQWd936p/7RbSf4Q+vfvz9q1a1mzZg3JyckAREdHs2HDBnbv3g2cefd95513enzwtnv37mzcuJHDhw8D8N577/G3v/2t0phevXqxePFiysvLcTqdPPXUU7z44ovnXWZYWBidO3d2nRiuqKiIIUOG8Omnn1aZJTY2lg8//NB1pbiVK1e6yiMoKAi73e7ROv3qiiuuqPVuIcMwOHToEFdccUWt5pf6QVsO8odgtVq59tprsdvttGzZEoC2bdsyffp0HnvsMQzDwGKxMG/ePC666CKPlnn11VeTkpLCgw8+6HqM9PR08vLyXGP+8pe/8Pzzz9O/f38cDgfXXHNNtVfQysjIYMaMGSQlJVFeXk5iYiJ33nmn24Xof2v06NFMnz6dp
KQkGjduTHh4uOsAdExMTKUz9Hri5ptvZsqUKRQVFdGkSZMazfv9999z2WWXcckll9RoPqlfdFZWkQDw73//m7CwMGJjY3E6nYwfP56ePXsydOjQWi/z73//O0FBQTz00EM1mm/SpEn07duXW2+9tdaPLX982q0kEgDatWvHvHnz6NevH4mJiURGRnL33Xf/rmWOGjWKTZs2VbqyW3W2bduGyWRSMYi2HERExJ22HERExI3KQURE3KgcRETEjcpBRETcqBxERMSNykFERNz8//rPNPQU1rp6AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sns.histplot(data=vehicles, x=\"length\", bins = np.arange(0,16,0.5), hue=\"type\")\n", "plt.xlabel('vehicle length (m)')" ] }, { "cell_type": "markdown", "id": "499ed448", "metadata": {}, "source": [ "Aha. We might want to describe our data separately for each vehicle type.\n", "\n", "### Grouping into separate dataframes\n", "\n", "One way to do this is to create a separate dataframe for each vehicle type:" ] }, { "cell_type": "code", "execution_count": 37, "id": "0aa7145e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
lengthheightwidth
count981.000000981.000000981.000000
mean4.1979941.5808101.791925
std0.5177610.0592630.046921
min3.1109001.4304001.624100
25%3.8154001.5400001.760200
50%4.1216001.5745001.790400
75%4.5184001.6119001.820900
max6.1024001.8993001.958000
\n", "
" ], "text/plain": [ " length height width\n", "count 981.000000 981.000000 981.000000\n", "mean 4.197994 1.580810 1.791925\n", "std 0.517761 0.059263 0.046921\n", "min 3.110900 1.430400 1.624100\n", "25% 3.815400 1.540000 1.760200\n", "50% 4.121600 1.574500 1.790400\n", "75% 4.518400 1.611900 1.820900\n", "max 6.102400 1.899300 1.958000" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cars = vehicles[vehicles['type']=='car']\n", "cars.describe()" ] }, { "cell_type": "markdown", "id": "ba7987fd", "metadata": {}, "source": [ "We can see that 981 of the vehicles were cars, and their mean length was 4.198 m, much shorter than the mean over all vehicles.\n", "\n", "Try modifying the code below to get descriptive statistics for trucks:" ] }, { "cell_type": "code", "execution_count": 38, "id": "15f86927", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
lengthheightwidth
count981.000000981.000000981.000000
mean4.1979941.5808101.791925
std0.5177610.0592630.046921
min3.1109001.4304001.624100
25%3.8154001.5400001.760200
50%4.1216001.5745001.790400
75%4.5184001.6119001.820900
max6.1024001.8993001.958000
\n", "
" ], "text/plain": [ " length height width\n", "count 981.000000 981.000000 981.000000\n", "mean 4.197994 1.580810 1.791925\n", "std 0.517761 0.059263 0.046921\n", "min 3.110900 1.430400 1.624100\n", "25% 3.815400 1.540000 1.760200\n", "50% 4.121600 1.574500 1.790400\n", "75% 4.518400 1.611900 1.820900\n", "max 6.102400 1.899300 1.958000" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# modify the code to get descriptives for trucks\n", "cars = vehicles[vehicles['type']=='car']\n", "cars.describe()" ] }, { "cell_type": "markdown", "id": "09ef22e0", "metadata": {}, "source": [ "### pandas.groupby\n", "\n", "We can also use the pandas method groupby to split up our dataframe according to a categorical variable, in this case vehicle type." ] }, { "cell_type": "code", "execution_count": 41, "id": "f8ae074e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
lengthheightwidth
countmeanstdmin25%50%75%maxcountmean...75%maxcountmeanstdmin25%50%75%max
type
car981.04.1979940.5177613.11093.81544.12164.51846.1024981.01.580810...1.61191.8993981.01.7919250.0469211.62411.76021.790401.820901.9580
towing53.08.6729510.7134607.25618.13238.68949.219110.098053.02.897838...2.90642.944553.02.2483260.0082222.22922.24422.247902.254002.2642
truck330.013.9158641.34302811.148012.564014.365015.075016.2310330.04.072725...4.20094.2137330.02.5013040.0158712.46292.48982.501452.511552.5467
\n", "

3 rows × 24 columns

\n", "
" ], "text/plain": [ " length \\\n", " count mean std min 25% 50% 75% \n", "type \n", "car 981.0 4.197994 0.517761 3.1109 3.8154 4.1216 4.5184 \n", "towing 53.0 8.672951 0.713460 7.2561 8.1323 8.6894 9.2191 \n", "truck 330.0 13.915864 1.343028 11.1480 12.5640 14.3650 15.0750 \n", "\n", " height ... width \\\n", " max count mean ... 75% max count mean \n", "type ... \n", "car 6.1024 981.0 1.580810 ... 1.6119 1.8993 981.0 1.791925 \n", "towing 10.0980 53.0 2.897838 ... 2.9064 2.9445 53.0 2.248326 \n", "truck 16.2310 330.0 4.072725 ... 4.2009 4.2137 330.0 2.501304 \n", "\n", " \n", " std min 25% 50% 75% max \n", "type \n", "car 0.046921 1.6241 1.7602 1.79040 1.82090 1.9580 \n", "towing 0.008222 2.2292 2.2442 2.24790 2.25400 2.2642 \n", "truck 0.015871 2.4629 2.4898 2.50145 2.51155 2.5467 \n", "\n", "[3 rows x 24 columns]" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vehicles.groupby(['type']).describe()" ] }, { "cell_type": "markdown", "id": "3a561698", "metadata": {}, "source": [ "Yikes, that was an unwieldy table!\n", "\n", "It may be preferable to output descriptives for only one measure (e.g. length):" ] }, { "cell_type": "code", "execution_count": 43, "id": "e39d894a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countmeanstdmin25%50%75%max
type
car981.04.1979940.5177613.11093.81544.12164.51846.1024
towing53.08.6729510.7134607.25618.13238.68949.219110.0980
truck330.013.9158641.34302811.148012.564014.365015.075016.2310
\n", "
" ], "text/plain": [ " count mean std min 25% 50% 75% \\\n", "type \n", "car 981.0 4.197994 0.517761 3.1109 3.8154 4.1216 4.5184 \n", "towing 53.0 8.672951 0.713460 7.2561 8.1323 8.6894 9.2191 \n", "truck 330.0 13.915864 1.343028 11.1480 12.5640 14.3650 15.0750 \n", "\n", " max \n", "type \n", "car 6.1024 \n", "towing 10.0980 \n", "truck 16.2310 " ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vehicles.groupby(['type'])['length'].describe()" ] }, { "cell_type": "markdown", "id": "837e21f4", "metadata": {}, "source": [ "... or to output one descriptive (such as the mean) at a time, rather than the whole table:" ] }, { "cell_type": "code", "execution_count": 45, "id": "cf15c7ef", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
lengthheightwidth
type
car4.1979941.5808101.791925
towing8.6729512.8978382.248326
truck13.9158644.0727252.501304
\n", "
" ], "text/plain": [ " length height width\n", "type \n", "car 4.197994 1.580810 1.791925\n", "towing 8.672951 2.897838 2.248326\n", "truck 13.915864 4.072725 2.501304" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vehicles.groupby(['type']).mean()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (Spyder)", "language": "python3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 5 }