{ "cells": [ { "cell_type": "markdown", "id": "47fbeb51-864b-4e14-a0b2-fed06b014c2e", "metadata": {}, "source": [ "# Pandas Exercises" ] }, { "cell_type": "markdown", "id": "1d26ef1e-3c49-4a46-a4c5-b68825d5a953", "metadata": {}, "source": [ "## Creating DataFrames and Using Sample Data Sets" ] }, { "cell_type": "markdown", "id": "5252953e-79d1-43f3-9bf9-e4ef72e0987c", "metadata": {}, "source": [ "This is the Jupyter Notebook **solution set** for the article, [Pandas Practice Questions – Fifty-Two Examples to Make You an Expert](https://codesolid.com/pandas-practice-questions-twenty-one-examples-to-make-you-an-expert/)." ] }, { "cell_type": "code", "execution_count": 1, "id": "df91e92f-5f2b-4e98-81f7-84e7478ed432", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import seaborn as sb" ] }, { "cell_type": "markdown", "id": "6950cc47-3cc0-4565-8df1-d01741b7ea5f", "metadata": {}, "source": [ "**1.** Using NumPy, create a Pandas DataFrame with five rows and three columns:" ] }, { "cell_type": "code", "execution_count": 2, "id": "dd1a5014-d17e-4fd0-b7a2-db261a53a02a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>0</th><th>1</th><th>2</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>0</td><td>1</td><td>2</td></tr>\n",
" <tr><th>1</th><td>3</td><td>4</td><td>5</td></tr>\n",
" <tr><th>2</th><td>6</td><td>7</td><td>8</td></tr>\n",
" <tr><th>3</th><td>9</td><td>10</td><td>11</td></tr>\n",
" <tr><th>4</th><td>12</td><td>13</td><td>14</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " 0 1 2\n", "0 0 1 2\n", "1 3 4 5\n", "2 6 7 8\n", "3 9 10 11\n", "4 12 13 14" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(np.arange(15).reshape(5, 3))" ] }, { "cell_type": "markdown", "id": "9c74f94d-4adf-403d-a2c4-9efd200ba5b2", "metadata": {}, "source": [ "**2.** For a Pandas DataFrame created from a NumPy array, what is the default behavior for the labels for the columns? For the rows?" ] }, { "cell_type": "markdown", "id": "e7508b50-d724-47a7-8272-9ba040cd9c65", "metadata": {}, "source": [ "By default, both the \"columns\" value and the \"index\" value (for the rows) are zero-based integer ranges (a RangeIndex)." ] }, { "cell_type": "markdown", "id": "1a20442a-ea2c-4d32-877b-bfe5ae349318", "metadata": {}, "source": [ "**3.** Create a second DataFrame as above with five rows and three columns, setting the row labels to the names of any five major US cities and the column labels to the first three months of the year." ] }, { "cell_type": "code", "execution_count": 3, "id": "02686eae-cba2-4180-94ff-4dc85eca0731", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>January</th><th>February</th><th>March</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>NewYork</th><td>0</td><td>1</td><td>2</td></tr>\n",
" <tr><th>LosAngeles</th><td>3</td><td>4</td><td>5</td></tr>\n",
" <tr><th>Atlanta</th><td>6</td><td>7</td><td>8</td></tr>\n",
" <tr><th>Boston</th><td>9</td><td>10</td><td>11</td></tr>\n",
" <tr><th>SanFrancisco</th><td>12</td><td>13</td><td>14</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " January February March\n", "NewYork 0 1 2\n", "LosAngeles 3 4 5\n", "Atlanta 6 7 8\n", "Boston 9 10 11\n", "SanFrancisco 12 13 14" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(np.arange(15).reshape(5,3))\n", "df.index = [\"NewYork\", \"LosAngeles\", \"Atlanta\", \"Boston\", \"SanFrancisco\"]\n", "df.columns = [\"January\", \"February\", \"March\"]\n", "df" ] }, { "cell_type": "markdown", "id": "f6c2f45b-0fea-49bf-98a0-572d594a0680", "metadata": {}, "source": [ "**4.** You recall that the Seaborn package has some data sets built in, but can't remember how to list and load them. Assuming the functions to do so have \"data\" in the name, how might you locate them? You can assume a Jupyter Notebook / IPython environment and explain the process, or write the code to do it in Python." ] }, { "cell_type": "markdown", "id": "8afae59d-eb5a-4692-9924-b54d49942769", "metadata": {}, "source": [ "Method 1: In an empty code cell, type `sb.` and press Tab to bring up a list of names. Type \"data\" to filter the names." 
] }, { "cell_type": "code", "execution_count": 4, "id": "ce709b0a-cb22-4f33-9492-62f8346d0ef4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['get_data_home', 'get_dataset_names', 'load_dataset']" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Method 2:\n", "[x for x in dir(sb) if \"data\" in x]" ] }, { "cell_type": "code", "execution_count": 5, "id": "81dabb30-8220-49c3-981c-106d190a5de1", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['anagrams',\n", " 'anscombe',\n", " 'attention',\n", " 'brain_networks',\n", " 'car_crashes',\n", " 'diamonds',\n", " 'dots',\n", " 'exercise',\n", " 'flights',\n", " 'fmri',\n", " 'gammas',\n", " 'geyser',\n", " 'iris',\n", " 'mpg',\n", " 'penguins',\n", " 'planets',\n", " 'taxis',\n", " 'tips',\n", " 'titanic']" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sb.get_dataset_names()" ] }, { "cell_type": "markdown", "id": "70efcde8-cd73-4d61-8534-e6a3c469dbff", "metadata": {}, "source": [ "## Loading data from CSV" ] }, { "cell_type": "markdown", "id": "00214582-1340-4700-99b5-9f36afce3e16", "metadata": {}, "source": [ "**5**. Zillow home data is available at this URL: https://files.zillowstatic.com/research/public_csvs/zhvi/Metro_zhvi_uc_sfrcondo_tier_0.33_0.67_sm_sa_month.csv\n", "\n", "Open this file as a DataFrame in Pandas." ] }, { "cell_type": "code", "execution_count": 6, "id": "2838d3d2-5f1c-4e97-aab6-c7236ced82ad", "metadata": {}, "outputs": [], "source": [ "df_homes = pd.read_csv(\"https://files.zillowstatic.com/research/public_csvs/zhvi/Metro_zhvi_uc_sfrcondo_tier_0.33_0.67_sm_sa_month.csv\")\n" ] }, { "cell_type": "markdown", "id": "4476553a-1f93-4259-87df-69670ef486c0", "metadata": {}, "source": [ "**6.** Save the DataFrame, df_homes, to a local CSV file, \"zillow_home_data.csv\". 
" ] }, { "cell_type": "code", "execution_count": 7, "id": "1d237615-d23c-4f62-a0b8-c709deb95647", "metadata": {}, "outputs": [], "source": [ "df_homes.to_csv(\"../data/zillow_home_data.csv\")" ] }, { "cell_type": "markdown", "id": "182e0039-63dc-49df-9413-7eede1453fe0", "metadata": {}, "source": [ "**7.** Load zillow_home_data.csv back into a new DataFrame, df_homes_2." ] }, { "cell_type": "code", "execution_count": 8, "id": "53665adb-adb2-462f-a244-01665182e870", "metadata": {}, "outputs": [], "source": [ "df_homes_2 = pd.read_csv(\"../data/zillow_home_data.csv\")" ] }, { "cell_type": "markdown", "id": "eaaa6d3b-21e6-4e4a-916a-d522f1fd60c6", "metadata": {}, "source": [ "**8.** Compare the dimensions of the two DataFrames, df_homes and df_homes_2. Are they equal? If not, how can you fix it?" ] }, { "cell_type": "code", "execution_count": 9, "id": "dff1485e-fb64-4a19-a6ee-15dc38c72b2e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(908, 271)\n", "(908, 272)\n", "False\n" ] } ], "source": [ "print(df_homes.shape)\n", "print(df_homes_2.shape)\n", "print(df_homes.shape == df_homes_2.shape)" ] }, { "cell_type": "markdown", "id": "7e6d423c-75b5-4813-85cd-700f598491ee", "metadata": {}, "source": [ "They are not equal: the reloaded DataFrame has one extra column because to_csv writes the index by default. To fix this, save the file again with index=False so that the index is not written as a CSV column." 
] }, { "cell_type": "code", "execution_count": 10, "id": "553a674c-345a-46ff-8466-6a8f003630db", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n" ] } ], "source": [ "df_homes.to_csv(\"../data/zillow_home_data.csv\", index=False)\n", "df_homes_2 = pd.read_csv(\"../data/zillow_home_data.csv\")\n", "print(df_homes.shape == df_homes_2.shape)" ] }, { "cell_type": "markdown", "id": "0f34b104-fc3a-492c-964b-45daa12850b0", "metadata": {}, "source": [ "**9.** A remote spreadsheet showing a snapshot of how traffic increased for a hypothetical website is available here: https://github.com/CodeSolid/CodeSolid.github.io/raw/main/booksource/data/AnalyticsSnapshot.xlsx. Load the worksheet of the spreadsheet labelled \"February 2022\" as a DataFrame named \"feb\". Note: the leftmost column in the spreadsheet is the index column." ] }, { "cell_type": "code", "execution_count": 11, "id": "7b8772dd-5d19-4510-a725-80c7273ce61b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>This Month</th><th>Last Month</th><th>Month to Month Increase</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>Users</th><td>1800.0</td><td>280.0</td><td>5.428571</td></tr>\n",
" <tr><th>New Users</th><td>1700.0</td><td>298.0</td><td>4.704698</td></tr>\n",
" <tr><th>Page Views</th><td>2534.0</td><td>436.0</td><td>4.811927</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " This Month Last Month Month to Month Increase\n", "Users 1800.0 280.0 5.428571\n", "New Users 1700.0 298.0 4.704698\n", "Page Views 2534.0 436.0 4.811927" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url = \"https://github.com/CodeSolid/CodeSolid.github.io/raw/main/booksource/data/AnalyticsSnapshot.xlsx\"\n", "feb = pd.read_excel(url, sheet_name=\"February 2022\", index_col=0)\n", "feb" ] }, { "cell_type": "markdown", "id": "ad366534-b168-490c-a55d-654f3ef44288", "metadata": {}, "source": [ "**10.** The \"Month to Month Increase\" column is a bit hard to understand, so ignore it for now. Given the values for \"This Month\" and \"Last Month\", create a new column, \"Percentage Increase\"." ] }, { "cell_type": "code", "execution_count": 12, "id": "d053f773-c9a9-4592-aaaa-660ae8e189e5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>This Month</th><th>Last Month</th><th>Month to Month Increase</th><th>Percentage Increase</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>Users</th><td>1800.0</td><td>280.0</td><td>5.428571</td><td>542.857143</td></tr>\n",
" <tr><th>New Users</th><td>1700.0</td><td>298.0</td><td>4.704698</td><td>470.469799</td></tr>\n",
" <tr><th>Page Views</th><td>2534.0</td><td>436.0</td><td>4.811927</td><td>481.192661</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " This Month Last Month Month to Month Increase \\\n", "Users 1800.0 280.0 5.428571 \n", "New Users 1700.0 298.0 4.704698 \n", "Page Views 2534.0 436.0 4.811927 \n", "\n", " Percentage Increase \n", "Users 542.857143 \n", "New Users 470.469799 \n", "Page Views 481.192661 " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "feb[\"Percentage Increase\"] = (feb[\"This Month\"] - feb[\"Last Month\"]) / feb[\"Last Month\"] * 100\n", "feb" ] }, { "cell_type": "markdown", "id": "8a71222f-2ab4-47bb-806d-2610e25b3a91", "metadata": {}, "source": [ "## Basic Operations on Data" ] }, { "cell_type": "markdown", "id": "0b9cf32b-3132-40cc-a5ca-26d6e6a36e4b", "metadata": {}, "source": [ "**11.** Using Seaborn, get a dataset about penguins into a DataFrame named \"df_penguins\". Note that because all of the following questions depend on this example, we'll provide the solution here so no one gets stuck:" ] }, { "cell_type": "code", "execution_count": 13, "id": "a8b68caf-a998-414a-9de7-899eae7213c7", "metadata": {}, "outputs": [], "source": [ "df_penguins = sb.load_dataset('penguins')" ] }, { "cell_type": "markdown", "id": "f7170135-17bd-45e2-9239-dad647ed6eaf", "metadata": {}, "source": [ "**12.** Write the code to show the number of rows and columns in df_penguins." ] }, { "cell_type": "code", "execution_count": 14, "id": "6f565afb-d5f4-462d-b3e5-946bc91bd83e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(344, 7)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_penguins.shape" ] }, { "cell_type": "markdown", "id": "9d3396f3-3fe6-4571-83b2-1e70f8a11513", "metadata": {}, "source": [ "**13.** How might you show the first few rows of df_penguins?" ] }, { "cell_type": "code", "execution_count": 15, "id": "68d5946e-5011-4735-983e-7485f44a60f2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>species</th><th>island</th><th>bill_length_mm</th><th>bill_depth_mm</th><th>flipper_length_mm</th><th>body_mass_g</th><th>sex</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>Adelie</td><td>Torgersen</td><td>39.1</td><td>18.7</td><td>181.0</td><td>3750.0</td><td>Male</td></tr>\n",
" <tr><th>1</th><td>Adelie</td><td>Torgersen</td><td>39.5</td><td>17.4</td><td>186.0</td><td>3800.0</td><td>Female</td></tr>\n",
" <tr><th>2</th><td>Adelie</td><td>Torgersen</td><td>40.3</td><td>18.0</td><td>195.0</td><td>3250.0</td><td>Female</td></tr>\n",
" <tr><th>3</th><td>Adelie</td><td>Torgersen</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" <tr><th>4</th><td>Adelie</td><td>Torgersen</td><td>36.7</td><td>19.3</td><td>193.0</td><td>3450.0</td><td>Female</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " species island bill_length_mm bill_depth_mm flipper_length_mm \\\n", "0 Adelie Torgersen 39.1 18.7 181.0 \n", "1 Adelie Torgersen 39.5 17.4 186.0 \n", "2 Adelie Torgersen 40.3 18.0 195.0 \n", "3 Adelie Torgersen NaN NaN NaN \n", "4 Adelie Torgersen 36.7 19.3 193.0 \n", "\n", " body_mass_g sex \n", "0 3750.0 Male \n", "1 3800.0 Female \n", "2 3250.0 Female \n", "3 NaN NaN \n", "4 3450.0 Female " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_penguins.head()" ] }, { "cell_type": "markdown", "id": "c841930d-e235-468a-809d-8427f8739076", "metadata": {}, "source": [ "**14.** How can you return the unique species of penguins from df_penguins? How many unique species are there?" ] }, { "cell_type": "code", "execution_count": 16, "id": "130ed9bd-397b-4504-bf26-6b0dc716ba8b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 Adelie\n", "152 Chinstrap\n", "220 Gentoo\n", "Name: species, dtype: object\n", "There are 3 unique species, ['Adelie', 'Chinstrap', 'Gentoo'].\n" ] } ], "source": [ "species = df_penguins[\"species\"].copy()\n", "unique = species.fillna(0)\n", "unique = unique.drop_duplicates()\n", "nrows = unique.shape[0]\n", "print(unique)\n", "print(f\"There are {nrows} unique species, {list(unique.values)}.\")" ] }, { "cell_type": "markdown", "id": "3912a3df-d931-4d74-a6fa-34ad11f83f94", "metadata": {}, "source": [ "**15.** What function can we use to drop the rows that have missing data?" ] }, { "cell_type": "code", "execution_count": 17, "id": "436cf3d3-db4a-42e3-81d3-9f2df28265f1", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>species</th><th>island</th><th>bill_length_mm</th><th>bill_depth_mm</th><th>flipper_length_mm</th><th>body_mass_g</th><th>sex</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>Adelie</td><td>Torgersen</td><td>39.1</td><td>18.7</td><td>181.0</td><td>3750.0</td><td>Male</td></tr>\n",
" <tr><th>1</th><td>Adelie</td><td>Torgersen</td><td>39.5</td><td>17.4</td><td>186.0</td><td>3800.0</td><td>Female</td></tr>\n",
" <tr><th>2</th><td>Adelie</td><td>Torgersen</td><td>40.3</td><td>18.0</td><td>195.0</td><td>3250.0</td><td>Female</td></tr>\n",
" <tr><th>4</th><td>Adelie</td><td>Torgersen</td><td>36.7</td><td>19.3</td><td>193.0</td><td>3450.0</td><td>Female</td></tr>\n",
" <tr><th>5</th><td>Adelie</td><td>Torgersen</td><td>39.3</td><td>20.6</td><td>190.0</td><td>3650.0</td><td>Male</td></tr>\n",
" <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
" <tr><th>338</th><td>Gentoo</td><td>Biscoe</td><td>47.2</td><td>13.7</td><td>214.0</td><td>4925.0</td><td>Female</td></tr>\n",
" <tr><th>340</th><td>Gentoo</td><td>Biscoe</td><td>46.8</td><td>14.3</td><td>215.0</td><td>4850.0</td><td>Female</td></tr>\n",
" <tr><th>341</th><td>Gentoo</td><td>Biscoe</td><td>50.4</td><td>15.7</td><td>222.0</td><td>5750.0</td><td>Male</td></tr>\n",
" <tr><th>342</th><td>Gentoo</td><td>Biscoe</td><td>45.2</td><td>14.8</td><td>212.0</td><td>5200.0</td><td>Female</td></tr>\n",
" <tr><th>343</th><td>Gentoo</td><td>Biscoe</td><td>49.9</td><td>16.1</td><td>213.0</td><td>5400.0</td><td>Male</td></tr>\n",
" </tbody>\n",
"</table>\n",
"<p>333 rows × 7 columns</p>\n",
"</div>
" ], "text/plain": [ " species island bill_length_mm bill_depth_mm flipper_length_mm \\\n", "0 Adelie Torgersen 39.1 18.7 181.0 \n", "1 Adelie Torgersen 39.5 17.4 186.0 \n", "2 Adelie Torgersen 40.3 18.0 195.0 \n", "4 Adelie Torgersen 36.7 19.3 193.0 \n", "5 Adelie Torgersen 39.3 20.6 190.0 \n", ".. ... ... ... ... ... \n", "338 Gentoo Biscoe 47.2 13.7 214.0 \n", "340 Gentoo Biscoe 46.8 14.3 215.0 \n", "341 Gentoo Biscoe 50.4 15.7 222.0 \n", "342 Gentoo Biscoe 45.2 14.8 212.0 \n", "343 Gentoo Biscoe 49.9 16.1 213.0 \n", "\n", " body_mass_g sex \n", "0 3750.0 Male \n", "1 3800.0 Female \n", "2 3250.0 Female \n", "4 3450.0 Female \n", "5 3650.0 Male \n", ".. ... ... \n", "338 4925.0 Female \n", "340 4850.0 Female \n", "341 5750.0 Male \n", "342 5200.0 Female \n", "343 5400.0 Male \n", "\n", "[333 rows x 7 columns]" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_penguins.dropna()" ] }, { "cell_type": "markdown", "id": "f1feab44-7136-4736-a11c-ee005fdb0698", "metadata": {}, "source": [ "**16.** By default, will this modify df_penguins or will it return a copy?" ] }, { "cell_type": "markdown", "id": "8b0981bc-d8b1-42a5-9df6-e2723f61013c", "metadata": {}, "source": [ "It will return a copy." ] }, { "cell_type": "markdown", "id": "7f35c024-ac5c-43e0-8ea7-3e03953f644b", "metadata": {}, "source": [ "**17.** How can we override the default?" ] }, { "cell_type": "markdown", "id": "3f62e100-1000-40df-b21b-6af919fbb945", "metadata": {}, "source": [ "We can use ```df_penguins.dropna(inplace=True)```" ] }, { "cell_type": "markdown", "id": "a17f9c69-c8e2-4331-b9c7-c5be6a13a968", "metadata": {}, "source": [ "**18.** Create a new DataFrame, df_penguins_full, with the missing data deleted." 
] }, { "cell_type": "code", "execution_count": 18, "id": "f160094c-ba0c-4fe1-96f0-d6b23e8162c9", "metadata": {}, "outputs": [], "source": [ "df_penguins_full = df_penguins.dropna()" ] }, { "cell_type": "code", "execution_count": 19, "id": "54156eb4-45a6-4fcf-b40c-b406ee493180", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['species', 'island', 'bill_length_mm', 'bill_depth_mm',\n", " 'flipper_length_mm', 'body_mass_g', 'sex'],\n", " dtype='object')" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Exploratory only\n", "df_penguins_full.columns" ] }, { "cell_type": "markdown", "id": "c34afa01-c0c4-40fa-b3a6-3b7c6abc58e1", "metadata": {}, "source": [ "**19.** What is the average bill length of a penguin, in millimeters, in this (df_penguins_full) data set?" ] }, { "cell_type": "code", "execution_count": 20, "id": "6012e669-3c97-4949-be33-3e44daf9534a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "43.99279279279279" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_penguins_full['bill_length_mm'].mean()" ] }, { "cell_type": "markdown", "id": "38076367-ae05-4567-8963-12b5e0d77214", "metadata": {}, "source": [ "**20.** Which of the following is most strongly correlated with bill length? a) Body mass? b) Flipper length? c) Bill depth? Show how you arrived at the answer." ] }, { "cell_type": "markdown", "id": "2edbbf08-8f1d-40ea-b137-cdcdbfe02787", "metadata": {}, "source": [ "The answer is b) Flipper length. 
See below:" ] }, { "cell_type": "code", "execution_count": 21, "id": "93840bc0-eda0-48fd-b14d-4405a484c547", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.5894511101769488\n", "0.6530956386670861\n", "-0.2286256359130292\n" ] } ], "source": [ "print(df_penguins_full['bill_length_mm'].corr(df_penguins_full['body_mass_g']))\n", "print(df_penguins_full['bill_length_mm'].corr(df_penguins_full['flipper_length_mm']))\n", "print(df_penguins_full['bill_length_mm'].corr(df_penguins_full['bill_depth_mm']))\n" ] }, { "cell_type": "markdown", "id": "538ed22f-f2c0-46df-bf41-6a27246e356d", "metadata": {}, "source": [ "**21.** How could you show the median flipper length, grouped by species?" ] }, { "cell_type": "code", "execution_count": null, "id": "a39321c8-1822-4078-91f9-6f9a36a3bab7", "metadata": {}, "outputs": [], "source": [ "df_penguins_full.groupby('species')['flipper_length_mm'].median()" ] }, { "cell_type": "markdown", "id": "0d6ff0a0-f72b-47a0-a8c5-50cf2e9c8ef4", "metadata": {}, "source": [ "**22.** Which species has the longest flippers?" ] }, { "cell_type": "markdown", "id": "20f5ec8f-1580-438d-8ddb-ad2eeabed1ba", "metadata": {}, "source": [ "Gentoo" ] }, { "cell_type": "markdown", "id": "f50dacfd-a59d-4763-b8bf-004a02d8f05f", "metadata": {}, "source": [ "**23.** Which two species have the most similar mean weight? Show how you arrived at the answer." ] }, { "cell_type": "markdown", "id": "be5582aa-447d-4645-85cd-072894fcbb82", "metadata": {}, "source": [ "Adelie and Chinstrap. See below:" ] }, { "cell_type": "code", "execution_count": 23, "id": "c61f152b-d401-4c28-aea1-014d01100136", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>bill_length_mm</th><th>bill_depth_mm</th><th>flipper_length_mm</th><th>body_mass_g</th></tr>\n",
" <tr><th>species</th><th></th><th></th><th></th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>Adelie</th><td>38.823973</td><td>18.347260</td><td>190.102740</td><td>3706.164384</td></tr>\n",
" <tr><th>Chinstrap</th><td>48.833824</td><td>18.420588</td><td>195.823529</td><td>3733.088235</td></tr>\n",
" <tr><th>Gentoo</th><td>47.568067</td><td>14.996639</td><td>217.235294</td><td>5092.436975</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " bill_length_mm bill_depth_mm flipper_length_mm body_mass_g\n", "species \n", "Adelie 38.823973 18.347260 190.102740 3706.164384\n", "Chinstrap 48.833824 18.420588 195.823529 3733.088235\n", "Gentoo 47.568067 14.996639 217.235294 5092.436975" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_penguins_full.groupby('species').mean(numeric_only=True)" ] }, { "cell_type": "markdown", "id": "ddca88e5-374f-4c8a-bf61-26b70a8b3a90", "metadata": {}, "source": [ "**24.** How could you sort the rows by bill length?" ] }, { "cell_type": "code", "execution_count": 24, "id": "4faabd67-0417-45ca-b907-7a170f4e9b71", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>species</th><th>island</th><th>bill_length_mm</th><th>bill_depth_mm</th><th>flipper_length_mm</th><th>body_mass_g</th><th>sex</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>142</th><td>Adelie</td><td>Dream</td><td>32.1</td><td>15.5</td><td>188.0</td><td>3050.0</td><td>Female</td></tr>\n",
" <tr><th>98</th><td>Adelie</td><td>Dream</td><td>33.1</td><td>16.1</td><td>178.0</td><td>2900.0</td><td>Female</td></tr>\n",
" <tr><th>70</th><td>Adelie</td><td>Torgersen</td><td>33.5</td><td>19.0</td><td>190.0</td><td>3600.0</td><td>Female</td></tr>\n",
" <tr><th>92</th><td>Adelie</td><td>Dream</td><td>34.0</td><td>17.1</td><td>185.0</td><td>3400.0</td><td>Female</td></tr>\n",
" <tr><th>8</th><td>Adelie</td><td>Torgersen</td><td>34.1</td><td>18.1</td><td>193.0</td><td>3475.0</td><td>NaN</td></tr>\n",
" <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
" <tr><th>321</th><td>Gentoo</td><td>Biscoe</td><td>55.9</td><td>17.0</td><td>228.0</td><td>5600.0</td><td>Male</td></tr>\n",
" <tr><th>169</th><td>Chinstrap</td><td>Dream</td><td>58.0</td><td>17.8</td><td>181.0</td><td>3700.0</td><td>Female</td></tr>\n",
" <tr><th>253</th><td>Gentoo</td><td>Biscoe</td><td>59.6</td><td>17.0</td><td>230.0</td><td>6050.0</td><td>Male</td></tr>\n",
" <tr><th>3</th><td>Adelie</td><td>Torgersen</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" <tr><th>339</th><td>Gentoo</td><td>Biscoe</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" </tbody>\n",
"</table>\n",
"<p>344 rows × 7 columns</p>\n",
"</div>
" ], "text/plain": [ " species island bill_length_mm bill_depth_mm flipper_length_mm \\\n", "142 Adelie Dream 32.1 15.5 188.0 \n", "98 Adelie Dream 33.1 16.1 178.0 \n", "70 Adelie Torgersen 33.5 19.0 190.0 \n", "92 Adelie Dream 34.0 17.1 185.0 \n", "8 Adelie Torgersen 34.1 18.1 193.0 \n", ".. ... ... ... ... ... \n", "321 Gentoo Biscoe 55.9 17.0 228.0 \n", "169 Chinstrap Dream 58.0 17.8 181.0 \n", "253 Gentoo Biscoe 59.6 17.0 230.0 \n", "3 Adelie Torgersen NaN NaN NaN \n", "339 Gentoo Biscoe NaN NaN NaN \n", "\n", " body_mass_g sex \n", "142 3050.0 Female \n", "98 2900.0 Female \n", "70 3600.0 Female \n", "92 3400.0 Female \n", "8 3475.0 NaN \n", ".. ... ... \n", "321 5600.0 Male \n", "169 3700.0 Female \n", "253 6050.0 Male \n", "3 NaN NaN \n", "339 NaN NaN \n", "\n", "[344 rows x 7 columns]" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_penguins.sort_values('bill_length_mm')" ] }, { "cell_type": "markdown", "id": "56d12fe3-805d-4f79-bf83-e515b5070229", "metadata": {}, "source": [ "**25.** How could you run the same sort in descending order?" ] }, { "cell_type": "code", "execution_count": 25, "id": "02544f08-974e-463c-a2da-ac0b3ba3ff0d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>species</th><th>island</th><th>bill_length_mm</th><th>bill_depth_mm</th><th>flipper_length_mm</th><th>body_mass_g</th><th>sex</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>253</th><td>Gentoo</td><td>Biscoe</td><td>59.6</td><td>17.0</td><td>230.0</td><td>6050.0</td><td>Male</td></tr>\n",
" <tr><th>169</th><td>Chinstrap</td><td>Dream</td><td>58.0</td><td>17.8</td><td>181.0</td><td>3700.0</td><td>Female</td></tr>\n",
" <tr><th>321</th><td>Gentoo</td><td>Biscoe</td><td>55.9</td><td>17.0</td><td>228.0</td><td>5600.0</td><td>Male</td></tr>\n",
" <tr><th>215</th><td>Chinstrap</td><td>Dream</td><td>55.8</td><td>19.8</td><td>207.0</td><td>4000.0</td><td>Male</td></tr>\n",
" <tr><th>335</th><td>Gentoo</td><td>Biscoe</td><td>55.1</td><td>16.0</td><td>230.0</td><td>5850.0</td><td>Male</td></tr>\n",
" <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
" <tr><th>70</th><td>Adelie</td><td>Torgersen</td><td>33.5</td><td>19.0</td><td>190.0</td><td>3600.0</td><td>Female</td></tr>\n",
" <tr><th>98</th><td>Adelie</td><td>Dream</td><td>33.1</td><td>16.1</td><td>178.0</td><td>2900.0</td><td>Female</td></tr>\n",
" <tr><th>142</th><td>Adelie</td><td>Dream</td><td>32.1</td><td>15.5</td><td>188.0</td><td>3050.0</td><td>Female</td></tr>\n",
" <tr><th>3</th><td>Adelie</td><td>Torgersen</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" <tr><th>339</th><td>Gentoo</td><td>Biscoe</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" </tbody>\n",
"</table>\n",
"<p>344 rows × 7 columns</p>\n",
"</div>
" ], "text/plain": [ " species island bill_length_mm bill_depth_mm flipper_length_mm \\\n", "253 Gentoo Biscoe 59.6 17.0 230.0 \n", "169 Chinstrap Dream 58.0 17.8 181.0 \n", "321 Gentoo Biscoe 55.9 17.0 228.0 \n", "215 Chinstrap Dream 55.8 19.8 207.0 \n", "335 Gentoo Biscoe 55.1 16.0 230.0 \n", ".. ... ... ... ... ... \n", "70 Adelie Torgersen 33.5 19.0 190.0 \n", "98 Adelie Dream 33.1 16.1 178.0 \n", "142 Adelie Dream 32.1 15.5 188.0 \n", "3 Adelie Torgersen NaN NaN NaN \n", "339 Gentoo Biscoe NaN NaN NaN \n", "\n", " body_mass_g sex \n", "253 6050.0 Male \n", "169 3700.0 Female \n", "321 5600.0 Male \n", "215 4000.0 Male \n", "335 5850.0 Male \n", ".. ... ... \n", "70 3600.0 Female \n", "98 2900.0 Female \n", "142 3050.0 Female \n", "3 NaN NaN \n", "339 NaN NaN \n", "\n", "[344 rows x 7 columns]" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_penguins.sort_values(['bill_length_mm'], ascending=False)" ] }, { "cell_type": "markdown", "id": "3b69d24f-cd5b-41c9-a4eb-b8325251fe7b", "metadata": {}, "source": [ "**26.** How could you sort by species first, then by body mass?" ] }, { "cell_type": "code", "execution_count": 26, "id": "bda80b1d-2773-493f-b13d-7b2fa60e6849", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>species</th><th>island</th><th>bill_length_mm</th><th>bill_depth_mm</th><th>flipper_length_mm</th><th>body_mass_g</th><th>sex</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>58</th><td>Adelie</td><td>Biscoe</td><td>36.5</td><td>16.6</td><td>181.0</td><td>2850.0</td><td>Female</td></tr>\n",
" <tr><th>64</th><td>Adelie</td><td>Biscoe</td><td>36.4</td><td>17.1</td><td>184.0</td><td>2850.0</td><td>Female</td></tr>\n",
" <tr><th>54</th><td>Adelie</td><td>Biscoe</td><td>34.5</td><td>18.1</td><td>187.0</td><td>2900.0</td><td>Female</td></tr>\n",
" <tr><th>98</th><td>Adelie</td><td>Dream</td><td>33.1</td><td>16.1</td><td>178.0</td><td>2900.0</td><td>Female</td></tr>\n",
" <tr><th>116</th><td>Adelie</td><td>Torgersen</td><td>38.6</td><td>17.0</td><td>188.0</td><td>2900.0</td><td>Female</td></tr>\n",
" <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
" <tr><th>297</th><td>Gentoo</td><td>Biscoe</td><td>51.1</td><td>16.3</td><td>220.0</td><td>6000.0</td><td>Male</td></tr>\n",
" <tr><th>337</th><td>Gentoo</td><td>Biscoe</td><td>48.8</td><td>16.2</td><td>222.0</td><td>6000.0</td><td>Male</td></tr>\n",
" <tr><th>253</th><td>Gentoo</td><td>Biscoe</td><td>59.6</td><td>17.0</td><td>230.0</td><td>6050.0</td><td>Male</td></tr>\n",
" <tr><th>237</th><td>Gentoo</td><td>Biscoe</td><td>49.2</td><td>15.2</td><td>221.0</td><td>6300.0</td><td>Male</td></tr>\n",
" <tr><th>339</th><td>Gentoo</td><td>Biscoe</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" </tbody>\n",
"</table>\n",
"<p>344 rows × 7 columns</p>\n",
"</div>
" ], "text/plain": [ " species island bill_length_mm bill_depth_mm flipper_length_mm \\\n", "58 Adelie Biscoe 36.5 16.6 181.0 \n", "64 Adelie Biscoe 36.4 17.1 184.0 \n", "54 Adelie Biscoe 34.5 18.1 187.0 \n", "98 Adelie Dream 33.1 16.1 178.0 \n", "116 Adelie Torgersen 38.6 17.0 188.0 \n", ".. ... ... ... ... ... \n", "297 Gentoo Biscoe 51.1 16.3 220.0 \n", "337 Gentoo Biscoe 48.8 16.2 222.0 \n", "253 Gentoo Biscoe 59.6 17.0 230.0 \n", "237 Gentoo Biscoe 49.2 15.2 221.0 \n", "339 Gentoo Biscoe NaN NaN NaN \n", "\n", " body_mass_g sex \n", "58 2850.0 Female \n", "64 2850.0 Female \n", "54 2900.0 Female \n", "98 2900.0 Female \n", "116 2900.0 Female \n", ".. ... ... \n", "297 6000.0 Male \n", "337 6000.0 Male \n", "253 6050.0 Male \n", "237 6300.0 Male \n", "339 NaN NaN \n", "\n", "[344 rows x 7 columns]" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_penguins.sort_values(['species', 'body_mass_g'])" ] }, { "cell_type": "markdown", "id": "82890e5f-d9f7-4729-a179-b0883f1bc498", "metadata": {}, "source": [ "## Selecting Rows, Columns, and Cells\n", "\n", "Let's look at some precious stones now, and leave the poor penguins alone for a while." ] }, { "cell_type": "markdown", "id": "5d1abb35-5d24-4bd2-89d4-dd9dbbdadccf", "metadata": {}, "source": [ "**27.** Load the Seaborn \"diamonds\" dataset into a Pandas DataFrame named diamonds." ] }, { "cell_type": "code", "execution_count": 27, "id": "9247186e-0c21-43d3-88db-79c7f4ecbc06", "metadata": {}, "outputs": [], "source": [ "diamonds = sb.load_dataset('diamonds')" ] }, { "cell_type": "markdown", "id": "cbe20f32-5cb2-4191-9994-dbadaf7488dd", "metadata": {}, "source": [ "**28.** Display the columns that are available." 
] }, { "cell_type": "code", "execution_count": 28, "id": "c12fd634-2c83-4130-87f3-5580e42ce496", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['carat', 'cut', 'color', 'clarity', 'depth', 'table', 'price', 'x', 'y',\n", " 'z'],\n", " dtype='object')" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "diamonds.columns" ] }, { "cell_type": "markdown", "id": "505b677e-c1bb-4daa-8615-a78ea4860a71", "metadata": {}, "source": [ "**29.** If you select a single column from the diamonds DataFrame, what will be the type of the return value?" ] }, { "cell_type": "markdown", "id": "10307c17-e2ad-48ce-b0b9-ddde83b5f6c3", "metadata": {}, "source": [ "A Pandas Series." ] }, { "cell_type": "markdown", "id": "5f5d7cae-b784-449d-be28-a1473721b4fa", "metadata": {}, "source": [ "**30.** Select the 'table' column and show its type." ] }, { "cell_type": "code", "execution_count": 29, "id": "18963cfe-266d-4ca6-9471-b47000c35ee5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.series.Series" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "table = diamonds['table']\n", "type(table)" ] }, { "cell_type": "markdown", "id": "82de5bb4-baed-422e-83a6-d7581ea15ab8", "metadata": {}, "source": [ "**31.** Select the first ten rows of the price and carat columns of the diamonds DataFrame into a variable called subset, and display them." ] }, { "cell_type": "code", "execution_count": 30, "id": "55ccbc17-b267-4ffb-9876-d2192ba2437d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>price</th><th>carat</th></tr>\n",
" </thead>\n", " <tbody>\n",
" <tr><th>0</th><td>326</td><td>0.23</td></tr>\n",
" <tr><th>1</th><td>326</td><td>0.21</td></tr>\n",
" <tr><th>2</th><td>327</td><td>0.23</td></tr>\n",
" <tr><th>3</th><td>334</td><td>0.29</td></tr>\n",
" <tr><th>4</th><td>335</td><td>0.31</td></tr>\n",
" <tr><th>5</th><td>336</td><td>0.24</td></tr>\n",
" <tr><th>6</th><td>336</td><td>0.24</td></tr>\n",
" <tr><th>7</th><td>337</td><td>0.26</td></tr>\n",
" <tr><th>8</th><td>337</td><td>0.22</td></tr>\n",
" <tr><th>9</th><td>338</td><td>0.23</td></tr>\n",
" </tbody>\n", "</table>\n", "</div>\n", "
" ], "text/plain": [ " price carat\n", "0 326 0.23\n", "1 326 0.21\n", "2 327 0.23\n", "3 334 0.29\n", "4 335 0.31\n", "5 336 0.24\n", "6 336 0.24\n", "7 337 0.26\n", "8 337 0.22\n", "9 338 0.23" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "subset = diamonds.loc[0:9, ['price', 'carat']]\n", "subset" ] }, { "cell_type": "markdown", "id": "fdb4c150-ebdc-420d-979b-14231fc4764b", "metadata": {}, "source": [ "**32.** For a given column, show the code to display the datatype of the _values_ in the column." ] }, { "cell_type": "code", "execution_count": 31, "id": "bb6b2d54-e58f-4da4-bcf2-a4d08ee34602", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dtype('int64')" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "diamonds['price'].dtype" ] }, { "cell_type": "markdown", "id": "ca510203-8c7e-48a1-81c3-c9e1bc5aa7b1", "metadata": {}, "source": [ "**33.** Select the first row of the diamonds DataFrame into a variable called row." ] }, { "cell_type": "code", "execution_count": 32, "id": "8535dd1a-db9d-49c7-a6b2-0257616d7334", "metadata": {}, "outputs": [], "source": [ "row = diamonds.iloc[0,:]" ] }, { "cell_type": "markdown", "id": "edbd9394-ec43-479c-919a-5eab17acb792", "metadata": {}, "source": [ "**34.** What would you expect the data type of the row to be? Display it." ] }, { "cell_type": "markdown", "id": "22d7c0bd-cf41-491e-8fc5-41aed540c16e", "metadata": {}, "source": [ "A Pandas Series." ] }, { "cell_type": "code", "execution_count": 33, "id": "777274bd-3653-4d0c-b73e-75f879cb48ae", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.series.Series" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(row)" ] }, { "cell_type": "markdown", "id": "f6027d4a-4df5-45c6-bb82-c4ed046c2681", "metadata": {}, "source": [ "**35.** Can you discover the names of the columns using only the row returned in #33? 
Why or why not?" ] }, { "cell_type": "markdown", "id": "55c063e1-2346-4782-9133-e684c2714fe0", "metadata": {}, "source": [ "Yes, because a row Series has the column names as its index (see below):" ] }, { "cell_type": "code", "execution_count": 34, "id": "1a5b9a02-9764-43b7-b49b-4522506e5669", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['carat', 'cut', 'color', 'clarity', 'depth', 'table', 'price', 'x', 'y',\n", " 'z'],\n", " dtype='object')" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "row.index" ] }, { "cell_type": "markdown", "id": "369f761e-0dbe-4589-835f-526ab8f9884f", "metadata": {}, "source": [ "**36.** Select the row with the highest priced diamond." ] }, { "cell_type": "code", "execution_count": 35, "id": "41f8510d-cf79-43eb-910c-1c03aebc21ba", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "carat 2.29\n", "cut Premium\n", "color I\n", "clarity VS2\n", "depth 60.8\n", "table 60.0\n", "price 18823\n", "x 8.5\n", "y 8.47\n", "z 5.16\n", "Name: 27749, dtype: object" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "diamonds.loc[diamonds['price'].idxmax(), :]" ] }, { "cell_type": "markdown", "id": "2694d362-a5d1-4ad9-8b09-f408d7531948", "metadata": {}, "source": [ "**37.** Select the row with the lowest priced diamond." 
] }, { "cell_type": "code", "execution_count": 36, "id": "b6559ffa-fbb6-4ce9-b882-5bccbf5403e4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "carat 0.23\n", "cut Ideal\n", "color E\n", "clarity SI2\n", "depth 61.5\n", "table 55.0\n", "price 326\n", "x 3.95\n", "y 3.98\n", "z 2.43\n", "Name: 0, dtype: object" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "diamonds.loc[diamonds['price'].idxmin(), :]" ] }, { "cell_type": "markdown", "id": "e366af69-1739-48c5-b964-298fa294bc7a", "metadata": {}, "source": [ "## Some Exercises Using Time Series" ] }, { "cell_type": "markdown", "id": "ca92f2e5-c4f3-460e-b3b1-0d6bff4694ff", "metadata": {}, "source": [ "**38.** Load the taxis dataset into a DataFrame, ```taxis```." ] }, { "cell_type": "code", "execution_count": 37, "id": "5bb6dee5-3828-4a80-8a35-98397639e5db", "metadata": {}, "outputs": [], "source": [ "taxis = sb.load_dataset('taxis')" ] }, { "cell_type": "markdown", "id": "06ef6738-b5fb-4341-8ced-15f02ee5aad6", "metadata": {}, "source": [ "**39.** The 'pickup' column contains the date and time the customer was picked up, but it's a string. Add a column to the DataFrame, 'pickup_time', containing the value in 'pickup' as a DateTime." ] }, { "cell_type": "code", "execution_count": 38, "id": "a857614a-ada0-43c5-9ba0-6e00422bf518", "metadata": {}, "outputs": [], "source": [ "taxis['pickup_time'] = pd.to_datetime(taxis['pickup'])" ] }, { "cell_type": "markdown", "id": "d2aca5a0-b67d-4c2b-928f-49a0df783a4a", "metadata": {}, "source": [ "**40.** We have a hypothesis that as the day goes on, the tips get higher. We'll need to wrangle the data a bit before testing this, however. First, now that we have a datetime column, pickup_time, use it to select a subset of the rows into a new DataFrame, taxis_one_day. This new DataFrame should have values between '2019-03-23 06:00:00' (inclusive) and '2019-03-24 00:00:00' (exclusive)." 
] }, { "cell_type": "code", "execution_count": 39, "id": "4c92be8d-ee1a-452d-84e9-64617cbcd850", "metadata": {}, "outputs": [], "source": [ "mask = (taxis['pickup_time'] >= '2019-03-23 06:00:00') & (taxis['pickup_time'] < '2019-03-24 00:00:00')\n", "taxis_one_day = taxis.loc[mask]" ] }, { "cell_type": "markdown", "id": "14b66e0e-762c-433c-9041-e40cf624eaa5", "metadata": {}, "source": [ "**41.** We now have a range from morning until midnight, but we want to take the mean of the numeric columns, grouped at one hour intervals. Save the result as taxis_means, and display it." ] }, { "cell_type": "code", "execution_count": 40, "id": "baca4e69-dc2d-492e-918b-c15d7d560b4b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>passengers</th><th>distance</th><th>fare</th><th>tip</th><th>tolls</th><th>total</th></tr>\n",
" <tr><th>pickup_time</th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
" </thead>\n", " <tbody>\n",
" <tr><th>2019-03-23 06:00:00</th><td>1.000000</td><td>0.400000</td><td>21.500000</td><td>0.000000</td><td>0.000000</td><td>23.133333</td></tr>\n",
" <tr><th>2019-03-23 07:00:00</th><td>2.333333</td><td>0.980000</td><td>5.250000</td><td>1.165000</td><td>0.000000</td><td>9.298333</td></tr>\n",
" <tr><th>2019-03-23 08:00:00</th><td>1.000000</td><td>0.020000</td><td>2.500000</td><td>0.000000</td><td>0.000000</td><td>3.300000</td></tr>\n",
" <tr><th>2019-03-23 09:00:00</th><td>1.500000</td><td>1.352000</td><td>7.400000</td><td>1.674000</td><td>0.000000</td><td>12.124000</td></tr>\n",
" <tr><th>2019-03-23 10:00:00</th><td>1.000000</td><td>1.760000</td><td>8.750000</td><td>0.727500</td><td>0.000000</td><td>12.152500</td></tr>\n",
" <tr><th>2019-03-23 11:00:00</th><td>1.909091</td><td>2.070000</td><td>11.090909</td><td>0.803636</td><td>0.000000</td><td>14.667273</td></tr>\n",
" <tr><th>2019-03-23 12:00:00</th><td>2.000000</td><td>2.267143</td><td>10.260000</td><td>0.645714</td><td>0.000000</td><td>13.420000</td></tr>\n",
" <tr><th>2019-03-23 13:00:00</th><td>2.500000</td><td>1.167000</td><td>7.550000</td><td>2.074000</td><td>0.000000</td><td>12.344000</td></tr>\n",
" <tr><th>2019-03-23 14:00:00</th><td>2.470588</td><td>4.752941</td><td>18.330000</td><td>1.945294</td><td>1.003529</td><td>24.267059</td></tr>\n",
" <tr><th>2019-03-23 15:00:00</th><td>1.000000</td><td>6.557143</td><td>22.214286</td><td>3.210000</td><td>1.645714</td><td>30.370000</td></tr>\n",
" <tr><th>2019-03-23 16:00:00</th><td>2.000000</td><td>2.194545</td><td>10.454545</td><td>1.109091</td><td>0.000000</td><td>14.431818</td></tr>\n",
" <tr><th>2019-03-23 17:00:00</th><td>1.090909</td><td>1.913636</td><td>14.818182</td><td>2.688182</td><td>0.523636</td><td>20.739091</td></tr>\n",
" <tr><th>2019-03-23 18:00:00</th><td>1.571429</td><td>3.206429</td><td>12.821429</td><td>0.844286</td><td>0.411429</td><td>16.427143</td></tr>\n",
" <tr><th>2019-03-23 19:00:00</th><td>1.526316</td><td>2.097895</td><td>10.263158</td><td>1.176316</td><td>0.000000</td><td>14.226316</td></tr>\n",
" <tr><th>2019-03-23 20:00:00</th><td>1.400000</td><td>2.448000</td><td>11.100000</td><td>1.544000</td><td>0.000000</td><td>15.944000</td></tr>\n",
" <tr><th>2019-03-23 21:00:00</th><td>1.000000</td><td>2.017143</td><td>10.571429</td><td>1.420000</td><td>0.000000</td><td>15.791429</td></tr>\n",
" <tr><th>2019-03-23 22:00:00</th><td>1.307692</td><td>1.881538</td><td>8.923077</td><td>1.094615</td><td>0.000000</td><td>13.433077</td></tr>\n",
" <tr><th>2019-03-23 23:00:00</th><td>1.615385</td><td>3.725385</td><td>15.115385</td><td>1.696154</td><td>0.000000</td><td>20.034615</td></tr>\n",
" </tbody>\n", "</table>\n", "</div>\n", "
" ], "text/plain": [ " passengers distance fare tip tolls \\\n", "pickup_time \n", "2019-03-23 06:00:00 1.000000 0.400000 21.500000 0.000000 0.000000 \n", "2019-03-23 07:00:00 2.333333 0.980000 5.250000 1.165000 0.000000 \n", "2019-03-23 08:00:00 1.000000 0.020000 2.500000 0.000000 0.000000 \n", "2019-03-23 09:00:00 1.500000 1.352000 7.400000 1.674000 0.000000 \n", "2019-03-23 10:00:00 1.000000 1.760000 8.750000 0.727500 0.000000 \n", "2019-03-23 11:00:00 1.909091 2.070000 11.090909 0.803636 0.000000 \n", "2019-03-23 12:00:00 2.000000 2.267143 10.260000 0.645714 0.000000 \n", "2019-03-23 13:00:00 2.500000 1.167000 7.550000 2.074000 0.000000 \n", "2019-03-23 14:00:00 2.470588 4.752941 18.330000 1.945294 1.003529 \n", "2019-03-23 15:00:00 1.000000 6.557143 22.214286 3.210000 1.645714 \n", "2019-03-23 16:00:00 2.000000 2.194545 10.454545 1.109091 0.000000 \n", "2019-03-23 17:00:00 1.090909 1.913636 14.818182 2.688182 0.523636 \n", "2019-03-23 18:00:00 1.571429 3.206429 12.821429 0.844286 0.411429 \n", "2019-03-23 19:00:00 1.526316 2.097895 10.263158 1.176316 0.000000 \n", "2019-03-23 20:00:00 1.400000 2.448000 11.100000 1.544000 0.000000 \n", "2019-03-23 21:00:00 1.000000 2.017143 10.571429 1.420000 0.000000 \n", "2019-03-23 22:00:00 1.307692 1.881538 8.923077 1.094615 0.000000 \n", "2019-03-23 23:00:00 1.615385 3.725385 15.115385 1.696154 0.000000 \n", "\n", " total \n", "pickup_time \n", "2019-03-23 06:00:00 23.133333 \n", "2019-03-23 07:00:00 9.298333 \n", "2019-03-23 08:00:00 3.300000 \n", "2019-03-23 09:00:00 12.124000 \n", "2019-03-23 10:00:00 12.152500 \n", "2019-03-23 11:00:00 14.667273 \n", "2019-03-23 12:00:00 13.420000 \n", "2019-03-23 13:00:00 12.344000 \n", "2019-03-23 14:00:00 24.267059 \n", "2019-03-23 15:00:00 30.370000 \n", "2019-03-23 16:00:00 14.431818 \n", "2019-03-23 17:00:00 20.739091 \n", "2019-03-23 18:00:00 16.427143 \n", "2019-03-23 19:00:00 14.226316 \n", "2019-03-23 20:00:00 15.944000 \n", "2019-03-23 21:00:00 15.791429 \n", "2019-03-23 
22:00:00 13.433077 \n", "2019-03-23 23:00:00 20.034615 " ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "taxis_means = taxis_one_day.groupby(pd.Grouper(key='pickup_time', freq='1h')).mean()\n", "taxis_means" ] }, { "cell_type": "markdown", "id": "601743a0-8e97-46f3-8c46-6681353d374c", "metadata": {}, "source": [ "**42.** Create a simple line plot of the value \"distance\". " ] }, { "cell_type": "code", "execution_count": 41, "id": "d270aaf8-b52e-4d1f-8373-3ce72f9d4a81", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWsAAAEHCAYAAABocGdZAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAA1/ElEQVR4nO3dd3iUVfbA8e9NJyEFUmhJSEJooYQSUAGxIaAioGLdXQULll1dd3VXt/ys21zbunbsrqy6Is2ChSZVeighARJaeiW9J/f3x8zEGFNmkumcz/PkcTLzlvuS8cyd+957jtJaI4QQwrl5OLoBQgghuibBWgghXIAEayGEcAESrIUQwgVIsBZCCBfgZYuDhoWF6ZiYGFscWggh3NaePXuKtNbh7b1mk2AdExPD7t27bXFoIYRwW0qpUx29JsMgQgjhAiRYCyGEC5BgLYQQLsAmY9ZCiLNHQ0MDWVlZ1NbWOropLsPPz4/IyEi8vb3N3keCtRCiR7KysggMDCQmJgallKOb4/S01hQXF5OVlUVsbKzZ+8kwiBCiR2prawkNDZVAbSalFKGhoRZ/E5FgLYToMQnUlunOv5cEayHasfFIATX1TY5uhhAtZMxaiDb2nDrDwnd28derRvOzcwY7ujnCQo899hi9e/emvLyc6dOnM2PGjHa3W7lyJcOGDSMhIcHOLewe6VkL0caKfVkApBdUOrgloieeeOKJDgM1GIL14cOH7diinpFgLUQr9Y3NfH4gF4ATRVUObo0w11//+leGDRvGtGnTOHLkCAALFy5k2bJlADz88MMkJCQwduxYHnzwQbZt28bq1av53e9+x7hx48jIyOCNN95g0qRJJCYmcs0111BdXd1ynPvuu48pU6YQFxfXckyAp556ijFjxpCYmMjDDz8MQEZGBrNnz2bixImcf/75pKWlWeUaZRhEiFY2HimgtLqBsN6+HC+UYG2pxz9L4XBOuVWPmTAwiEevHNXh63v27OGjjz4iOTmZxsZGJkyYwMSJE1teLy4uZsWKFaSlpaGUorS0lJCQEObOncucOXNYsGABACEhIdxxxx0A/PnPf+att97i3nvvBSA3N5ctW7aQlpbG3LlzWbBgAWvWrGHVqlXs2LEDf39/SkpKAFi8eDGvvfYaQ4cOZceOHdxzzz2sX7++x/8OEqyFaGVlcjahAT5clxTJa99lUNfYhK+Xp6ObJTqxefNmrrrqKvz9/QGYO3fuj14PDg7Gz8+P2267jTlz5jBnzpx2j3Po0CH+/Oc/U1paSmVlJbNmzWp5bf78+Xh4eJCQkEB+fj4Aa9euZd
GiRS3n7du3L5WVlWzbto1rr722Zd+6ujqrXKcEayGMymoaWJtawE2ToxnWL5BmDZkl1cRHBDq6aS6jsx6wo3h5ebFz507WrVvHsmXLeOmll9rt6S5cuJCVK1eSmJjIu+++y8aNG1te8/X1bXncWZHx5uZmQkJCSE5OtuYlADJmLUSLNQdzqW9s5qrxg4gNCwCQoRAXMH36dFauXElNTQ0VFRV89tlnP3q9srKSsrIyLr/8cp5//nn2798PQGBgIBUVFS3bVVRUMGDAABoaGli6dGmX57300kt55513Wsa2S0pKCAoKIjY2lk8++QQwBHbT+XpKgrUQRiv2ZRMXFsDYyGBiw43BWm4yOr0JEyZw/fXXk5iYyGWXXcakSZN+9HpFRQVz5sxh7NixTJs2jeeeew6AG264gaeffprx48eTkZHBk08+yTnnnMPUqVMZMWJEl+edPXs2c+fOJSkpiXHjxvHMM88AsHTpUt566y0SExMZNWoUq1atssp1qs669N2VlJSkpfiAcCVZZ6qZ9tQGHrh0GPdeMhSApL+s5ZIRETy1YKyDW+fcUlNTGTlypKOb4XLa+3dTSu3RWie1t730rIUAViXnADB//KCW5+LCAmT6nnAaEqzFWU9rzYp92UyK6UNUX/+W52PDAmQYRDgNCdbirJeSU056QeWPetUAseEBFFXWUV7b4KCWuQ5bDKe6s+78e0mwFme9Ffuy8fH04IoxA370fJxxRsgJmRHSKT8/P4qLiyVgm8mUz9rPz8+i/WSetTirNTY1syo5h4tGhBPi7/Oj1+KMM0JOFFWRGBXigNa5hsjISLKysigsLHR0U1yGqVKMJcwO1kqpEOBNYDSggVu11tstOpsQTmZrRjFFlXVc1WYIBCCqrz8eSqbvdcXb29uiiieieyzpWb8AfKW1XqCU8gH8u9pBCGe3Ym8WQX5eXDQi4iev+Xp5EtnHX2aECKdgVrBWSgUD04GFAFrreqDeds0Swvaq6hr5OiWf+eMHdZj/Iy48gOOFkipVOJ65NxhjgULgHaXUPqXUm0qpgNYbKKUWK6V2K6V2y9iVcAXfHM6jpqGJqyf8dAjEJNY411punglHMzdYewETgFe11uOBKuDh1htorZdorZO01knh4eFWbqYQ1rd8bzaRfXoxMbpPh9vEhQVQXd9EQYV1MqcJ0V3mBussIEtrvcP4+zIMwVsIl1RQXsvW9CLmjxuEh0fHxUtjw3oDktBJOJ5ZwVprnQdkKqWGG5+6BHCdejhCtLF6fw7Nmp8shGkrttX0PSEcyZLZIPcCS40zQY4Di2zTJCFsb8W+bMZGBhMf0bvT7QYE+eHn7SE3GYXDmR2stdbJQLvZoIRwJUfzK0jJKefRK7uuau3hoYgJlYROwvFkubk466zYl42nh2LO2IFmbR8XLsFaOJ4Ea3FWaW7WrNqXzflDwwgP9O16BwzT906XVNPQ1Gzj1gnRMQnW4qyy40QJOWW17S4v70hsWG8amzWZJdU2bJkQnZNgLc4qK/dlE+DjycyE/mbvEyczQoQTkGAtzhq1DU18eTCX2aMH0Mun/eXl7WlJlSrBWjiQBGtx1liXWkBFXaNFQyAAIf4+9PH3lux7wqEkWIuzxop92fQL8uW8IaEW7xsbFiBFCIRDSbAWZ4WSqno2Hilg3rhBeHayvLwjceG9OV4kC2OE40iwFmeFLw7k0NismT/OsiEQk9iwAPLL66iqa7Ryy4QwjwRrcVZYvi+bEf0DSRgY1K395SajcDQJ1sLtnSyqYt/p0i6TNnVGEjoJR5NgLdzeyuRslIJ548xbXt6emNAAlJJUqcJxJFgLt6a1ZsW+bM6LC2VAcK9uH8fP25OBwb04ITcZhYNIsBZubV9mKaeKq3s0BGIiCZ2EI0mwFm5txd5sfL08uGy0+cvLOxIbFsBxqccoHESCtXBb9Y3NfH4gh0sT+hHo593j48WGBVBR20hxVb0VWieEZSRYC7e16WghZ6obOq1ebo
lY4/Q9uckoHEGCtXBbK/Zl0zfAh/OHhlvleEPCDSXA5CajcAQJ1sItldc28G1qPleOHYC3p3Xe5gNDeuHj6SEJnYRDSLAWbmnNwVzqG5u5akKk1Y7p6aEYHOovCZ2EQ0iwFm5pxb5sYsMCSIwMtupxY8Nk+p5wDLODtVLqpFLqoFIqWSm125aNEqInsktr+P54CVeNH4RSlmfY60xceG9OFVfT1CzT94R9eVm4/UVa6yKbtEQIK1mVnA3Q7Qx7nYkLC6C+qZnsMzVEh/pb/fhCdESGQYRb0VqzYm82Ewf3sUkwNSV0ktzWwt4sCdYa+EYptUcptbjti0qpxUqp3Uqp3YWFhdZroRAWSMkp51hBpcWlu8wVK6lShYNYEqynaa0nAJcBv1RKTW/9otZ6idY6SWudFB5unXmtQlhq5b5svD0VV4wZYJPjhwb4EOjnJQtjhN2ZHay11tnG/xYAK4DJtmqUEN3R2NTMqv05XDQ8gj4BPjY5h1KKuPDe0rMWdmdWsFZKBSilAk2PgZnAIVs2TAhLbcsoprCizmZDICZxMn1POIC5Pet+wBal1H5gJ/CF1vor2zVLCMut3JdNkJ8XF42IsOl5YsMCyC6tobahyabnEaI1s6buaa2PA4k2bosQ3VZd38hXKXnMGzcQP29Pm57LdJPxZHEVI/p3r6ajEJaSqXvCLXyTkk91fZNN5la3Jdn3hCNIsBZuYVVyNoNCejEppq/NzyXT94QjSLAWLk9rzZ5TZ7hgeDgeHtZdXt6eAF8v+gf5Sc9a2JUEa+HycspqKa9tZOQA+40fGxI6ySpGYT8SrIXLS8stB2Bk/0C7nTM2PEDyWgu7kmAtXF6qMVgPt2OwjgsLoLS6gTNSj1HYiQRr4fJS8yqI6tvLKkVxzRXXktBJetfCPiRYC5eXmlvOSDvPd44NM9VjlGAt7EOCtXBpNfVNnCyqYoQdby4CRPbphZeHkpuMwm4kWAuXdjS/gmYNCQPsN14N4O3pQXRff5m+J+xGgrVwaWl5hpuLjlj2HRcuCZ2E/UiwFi4tNbcCfx9Povvav8SWqXhus9RjFHYgwVq4tNTccob3D7TLysW2YsN6U9fYTG55rd3PLc4+EqyFy9JaG2aC2PnmoskPCZ3kJqOwPQnWwmXlmpaZ23ExTGumudYybi3sQYK1cFmmlYuO6llHBPoS4OMpM0KEXUiwFi4rLa8CsO8y89aUUsTKjBBhJxKshcs6nFtu92XmbcWGSfFcYR8SrIXLSsstd3hZrdiwALLOVFPXKPUYhW1JsBYuqbahiRNFVQ4brzYZEh5As4bTxdUObYdwfxKshUsyLTN31EwQk5bpezIUImzMomCtlPJUSu1TSn1uqwYJYQ5HzwQxiZF6jMJOLO1Z/xpItUVDhLCEI5eZtxbk501Yb19ZGCNszuxgrZSKBK4A3rRdc4QwjyOXmbcVFybT94TtWdKz/hfwe6DZNk0Rwjxaa9LyKhw+E8REsu8JezArWCul5gAFWus9nWyzWCm1Wym1u7Cw0GoNFKKt3LJaymoa7J7DuiOxYQEUVdZTVtPg6KYIN2Zuz3oqMFcpdRL4CLhYKfVB6w201ku01kla66Tw8HArN1OIH7TksHbwzUUT04yQk9K7FjZkVrDWWv9Bax2ptY4BbgDWa61/btOWCdGB1FzHLjNv64fiuXKTUdiOzLMWLic1t5zIPr0IcuAy89ai+vrjoeCEJHQSNuRl6Q5a643ARqu3RAgzOTKHdXt8vTyJ6usvC2OETUnPWriUlmXmTjIEYhIr0/eEjUmwFi6lZZm5E/Ws4YdgrbXUYxS2IcFauJQ0481FZ5kJYhIXFkB1fRP55XWObopwUe9sPdHp6xKshUs5nFtOL29PBjt4mXlbceG9AZkRIrrv071Znb4uwVq4lLQ851lm3lqsJHQSPZBeUMmh7PJOt5FgLVyGoZp5hdONVwP0D/LDz9tDpu+JblmdnI3qov8hwVq4jLxywzLzkU6yzL
w1Dw9FTGiATN8TFtNas2p/DlOGhHa6nQRr4TKcJYd1RyShk+iO/VllnCquZt64QZ1uJ8FauAxnW2beVlxYb06XVNPQJIkphflW7svGx8uD2aP7d7qdBGvhMpxtmXlbsWEBNDVrMkukHqMwT2NTM58fyOWSERFdvq8lWAuX4Uw5rNsTGy4zQoRltmUUU1RZx7xxA7vcVoK1cAm1DU0cL6x0mhzW7YkzFc+VGSHCTKuScwj08+LC4RFdbivBWriEY/mVNGvnW7nYWoi/D30DfGRGiDBLbUMTX6fkcdno/vh5e3a5vQRr4RKcfSaIiSFHiKxiFF1bl1pAZV1jl7NATCRYC5eQmmdYZu7oauZdkex7wlyrkrOJCPTl3LjO51ebSLAWLsFUzdzTyZaZtxUbFkB+eR2VdY2ObopwYmXVDWw8UsiViQPNfk9LsBZOz1TN3BlXLrYVJ/UYhRnWHMqlvqmZ+WYOgYAEa+EC8sprKa1ucPrxamidfc91gnWjLOKxu5XJ2cSFBTB6kPnvaQnWwum15LB24jnWJoND/VEuVI+xsKKOKf9Yz5JNGY5uylkjt6yGHSdKmDduEKqr7E2tSLAWTu+wcSbICBcYBvHz9mRgcC+XmRHy+ncZFFTU8ew3R8k6Iysv7eGz/TlojVkLYVqTYC2cXlpeBYNCnHeZeVtx4a6Rfa+gopYPdpzigmHhKAV/+zLV0U06K6xKziExKoQY4/0Nc0mwFk7P2aqZdyU2LIAThc5fj/G1jcdpaNI8MW8U91wYz5cH89iWXuToZrm19IIKUnLKmZdoWa8azAzWSik/pdROpdR+pVSKUupxi88kRDeYlpm7wkwQk7iwACrqGimqrHd0UzpUUF7L0h2nuHr8IAaHBrB4ehxRfXvx2GcpkjXQhlYl5+ChYE7iAIv3NbdnXQdcrLVOBMYBs5VS51p8NiEsZFpm7lI9a+OMEGdeHPPKxgyamjX3XjwUMIy1//mKBI7mV/LB96cc3Dr3pLVmVXIOU+PDiAj0s3h/s4K1NjDdMfE2/jj3dzzhFlLzjDcXnTSHdXt+SOjknDcZ88pq+e/O01wzIZLo0B9WhM5M6Mf5Q8N47tujFFVKlXZr25dZyumSauZ2YwgELBizVkp5KqWSgQLgW631jjavL1ZK7VZK7S4sLOxWY4RoK9VUzTzUspsxjjQwpBc+Xh5O27N+dWM6zc2aX10c/6PnlVI8euUoauqbeObrIw5qnftanZxjVpGBjpgdrLXWTVrrcUAkMFkpNbrN60u01kla66Tw8PBuNUaIttJyKxjmAsvMW/P0UMSE+jvljJDcsho+3JnJtUmRRLWTZyU+ojeLpsbw8e5MDmSV2r+BbspQZCCHGSMjCOzmrCaLZ4NorUuBDcDsbp1RCDNprUnNK3fqHNYdcdaETq9syKBZa355UXyH29x3yVBCA3x5dHUKzc0y2mkNWzOKKaqsNzvDXnvMnQ0SrpQKMT7uBVwKpHX7rEKYIb+8jtLqBpdYudhWbFhvThVX0eREwS6ntIaPd2VybVIUkX06zl4Y6OfNQ7OHs+90KSv2Zduxhe5rVXK2schA90cdzO1ZDwA2KKUOALswjFl/3u2zCmEGV8lh3Z64sAAamrRTrQp8eUM6mp+OVbfnmgmRjIsK4e9r0qiobbBD69xXTX0TXx/K4/LRA/D16rrIQEfMnQ1yQGs9Xms9Vms9Wmv9RLfPKISZTDNBnLWaeWdM9RidZdw660w1/9udyXVJUQwK6dXl9h4eisfnjqK4qo4X16fboYXua11aPlX1Tcwb371ZICayglE4rdRcwzLz4F6uscy8NdP0PWdJ6PTyhgwUqtOx6rYSo0K4bmIUb285QXqBc05DdAUr9+XQL8iXc2LNKzLQEQnWwmml5Za71MrF1voG+BDk5+UUNxkzS6r5ZHcm10+KYqAZverWfjd7OL28PXni88NOv3zeGZVW1/Pd0QLmWlBkoCMSrIVTqm1o4nhRlUuOV4Nhzn
JseG+OO0H2vZc3pOOhFPdcNMTifcN6+3L/pcPYdLSQtakFNmide/vyYB4NTbpHs0BMJFgLp5ReUElTs3bJmSAmccaETo6UWVLNsj1Z3Dg5igHBlvWqTW4+bzBDI3rz5OeHqW1osnIL3duq5GyGhAcwamDP38cSrIVTOtwyE8Q1h0HAMNc6p6yWmnrHBbgX1x/Dw0NxjwVj1W15e3rw2NxRnC6p5s3Nx63YOveWU1rDzpOWFxnoiARr4ZTScivw8/ZwqWXmbcUZZ4ScLHZM7/p0cTWf7s3mpsnR9AuyPHFQa1Pjw7hsdH9e3pBBTmmNlVro3kxFBrqbC6QtCdbCKRmqmQe51DLztmJNM0IcdJPxxfXH8PJQ3HOh5WPV7fnj5SNp1lqKFJhpVXIO47pRZKAjEqyF0zFUMy9npAvOr24tJtRx2fdOFlWxfF82PztnMBE97FWbRPX15+4Lh/D5gVy+P15slWO6q2P5FRzOLbe4dFdnJFgLp5NfXscZF6lm3pkAXy/6B/k5ZGHMi+vT8fZU3HVhnFWPe9cFQxgU0ovHVqdIVfROtBQZGCvBWrgxV8xh3ZG4cPsndDpRVMWKfVn8/JzB3Upy3xlDkYKRpOVV8N+dp6167PbUNzZT1+haM1C01qzan83U+DDCA32tdlwJ1sLppLZUM3ftnjU4Jvvei+uO4ePlwZ0XWGesuq3Zo/szNT6UZ785SkmVbUqXNTQ1s3THKaY+tZ6kJ9fy8a7TLrMoZ+/pUjJLaphvhbnVrUmwFk4nzYWXmbcVGxZAaXUDZ2wU1NrKKKxkZXI2N58XY9VeXWumIgWVdY088411ixRorVlzMJdZz2/iTysOEd3Xn4SBQTz06UFufnsn2S4wE2V1cja+Xh7MHNXPqseVYC2cTqoLLzNvK64loZN9bjK+uO4Yvl6eLJ5u3bHqtob1C+SW82L4cOdpDmWXWeWY2zKKmP/KNu5euhdPD8UbNyex7K7z+PCOc3ly3ij2nDrDzOe+Y+mOU07by25oaubzA7nMGNmv20UGOiLBWjgV0zJzV1652FpsmKF47nE7rGRML6hk9f4cbp4ymLDetulVt/brGUPp6+/DY6tTehQ8D+eUc8vbO7npjR0UlNfyzwVj+er+6Vya0A+lFB4eil+cF8PX909nXHQIf1pxiJ+9uYPMEudJP2uyNb2I4qp6q84CMZFgLZyKaZm5q88EMYnq0wsvD2WXcet/rzuGn7cnd063zVh1W8G9vHlo9gh2nzrDquQci/fPLKnm/o/2ccWLm0nOLOWPl49gw4MXcl1SVLvz66P6+vPBbefwt6vGcCCrjFn/2sT72086VTWbVck5BPl5cUEPigx0xMvqRxRuJ7eshnWpBSRnlvLbS4dZnLnNEj/cXHSPYRAvTw+iQ/1tHqyP5Vfw2YEc7rpgCH0DfGx6rtYWTIzkgx2n+NuXqcxI6Edv365DSnFlHS9tSOeD70/hoRR3Th/C3RcMIdi/62EDpRQ3nRPNBcPDefjTAzyyKoUvDuTyzwVjHb7ataa+ia9T8pg3bmCPigx0RIK1+AmtNSk55Xx7OJ91afkcyi5vea28poElNyfZ7NypxmXmMS68zLytuLAAmw+DvLDuGP7eniw+37Zj1W2ZihRc9co2XlqfzsOXjehw26q6Rt7acoIlm45TXd/IdUlR/HrG0G4lmBoU0ov3b53MJ7uzePLzw8z61yZ+P2sEC6fE4OGgVa/fpuZTXd/E3ETrzgIxkWAtAMNY8faMYtam5rMutYC88lqUgonRfXho9ghmjIzgm8P5PP31Eb47WsgFw2xTwT4tr5zh/VyrmnlXYsMC2HSsiMKKOpvM0DiaX8EXB3O558Ih9LFjr9pkfHQfFkyM5K0tx7l+UlTLMnuThqZmPtp5mhfWpVNUWcesUf343azhxEf07NuTUorrJkVx/rAw/rj8IE98fpgvDxp62XHhvXt07O5YnZxN/yA/zonta5PjS7A+ixVV1rE+rYC1h/
PZkl5EdX0T/j6eTB8azoyEflw0PJzQVjeqokP9WbYni8dXp/DV/dPx8bLuLQ+tNam55cwa1d+qx3W0aUPDeWPzCab+Yz1zEgdw69RYRg8KttrxX1h7jAAfL+6wc6+6td/PHs5Xh/J44rMU3lk0GYDmZs0XB3N59psjnCyuZnJMX17/xUQmDu5j1XMPCO7F2wsnsXxvNo9/lsJlL2zmwZnDuXVarN0+9M9U1bPxSCG3Tou1Wc9egvVZRGvNsYJKw/BGaj77MkvRGgYE+3HNhEguGRnBuXGh+Hm3P97m6+XJI1cmsOidXby99QR3WXnRRUGFYZm5O6xcbO2CYeGs/e0FvL/9JMv2ZLF8bzZJg/uwaGoss0b1w8uz+x96aXnlfHEwl3svjifE3/69apOIQD/unzGUv3yRyvq0fHy9PPnHmjQOZpcxvF8gby9M4qLhEVZJFdoepRTXTIxk2tAw/rTiEH/9MpUvD+Xy9IKxPe7Bm+PLQ7k0NmurZdhrj7LFfMWkpCS9e/duqx9XWK6hqZmdJ0paxp8zSwyLCsZGBnPJiH7MSIggYUCQRf8T3f7eLrZnFLPugQvpH2y95cwbjhSw6J1dfLz4XM6J61m9OmdVXtvA/3Zl8t72k2SW1DAg2I9fnDeYGyZFd+vG4N0f7GHLsSK2PHSxWTfobKm+sZnLXthE1pka6hqbGRTSi99eOoz54wfZdVhLa83q/Tk8ujqF6vomfjNjGHecH9ujD8WuXPf6dkqq6vn2N9N79IGklNqjtW73ppBZPWulVBTwPtAP0MASrfUL3W6RsLkjeRUs2XScbw7nUVHbiK+XB1Pjw7j7gnguGRnRo/zG/zcngUuf38Tf16Tywg3jrdbmtNwKALeZY92eID9vbj8/jkVTY1mfVsC7207wz6+O8MLaY8wfN4iFU2PMnrZ4OKecNYfyuO+SoQ4P1AA+Xh789aox/GnFQW6cHM3Pzx3c4bc0W1JKMW/cIM4bEsojK1N46qs0vjqUyz8XJDLcBt/asktr2HmihAcuHWazbw5g/jBII/CA1nqvUioQ2KOU+lZrfdhmLRPdsu/0GV7ekMHa1Hz8fTyZM3YAlyYYcjn4+1hn1GtwaAB3TY/j3+vTuWlytNV6wam55YZl5k4QeGzN00NxaUI/Lk3ox5G8Ct7ddpIV+7L4eHcm58b1ZdHUWGaM7Ndpj/SFdUcJ9PPitmmxdmx5586NC2XdAxc6uhmAYWjm1Z9P4IuDuTyyKoU5L27m15cM5fbz46z6IfLZfsMcc2vUWexMt4ZBlFKrgJe01t+297oMg9iX1pptGcW8vCGdbRnFBPfyZtHUGBZOibHZOGZNfRMznvuOQD8vPr93mlW+Ys58/jui+vjz1sJJVmih6ymtrufjXZm8v/0U2aU1RPbpxc3nDeb6pOiffICl5JRxxb+3cP+Modw/Y5iDWuw6iivreGS1YU62t6diRP8gxkYGkxgVQmJkCPERvbs9VDP7X5vo5ePJinum9ridnQ2DWByslVIxwCZgtNa6vNXzi4HFANHR0RNPnTrV7QYL8zQ3a75NzeeVjRnszywlItCXO86P48Zzos1anNBTaw7mcvfSvTwxbxQ3nxfTo2PVNjQx6tGvufuCITw4a7h1GuiiGpuaWZuaz9tbT7LzRAm9vD25esIgFk2NablZdsf7u9lxvJjND13sFgmv7GXLsSK2pBexP7OUg9llVNY1AuDv48noQcEkRgYzNjKEcVEhRPbp1eWwxpG8Cmb9axOPzx3FLVNiety+Ho9ZtzpQb+BT4P7WgRpAa70EWAKGnnU32yrM0NjUzGcHcnhlQwbHCiqJ7uvP364aw9UTBtl1jNCUKvOZr49wxZgBP5rmZ6mWauZusnKxJ7w8PZg9egCzRw8gJaeM97ad5JM9WSzdcZrzh4ZxyYgIvj2cz29mDJNAbaFpQ8OYNjQMMHR2jhdVsT+zlANZpezPKuO97aeobzwBQB9/b8ZGhhh734Yg3nae/O
r92Xh6KC4fM8DmbTc7WCulvDEE6qVa6+W2a5LoSG1DE8v2ZPH6pgwyS2oY3i+QF24YxxVjBtj0TndHlFI8duUoLnthM898c4S/Xz2228dKbalm7r43F7tj1MBg/rkgkYdmj+CjXZn8Z/spNh8rIsjPi0XTYhzdPJfm4aGIj+hNfERvrpkYCRhmtBzJq2B/liGAH8gq46X1xzClHxkY7EdiVIghiEcGsyo5x+pFBjpi7mwQBbwFpGqtn7Ntk0RblXWN/HfHKd7YfILCijoSo0J4ZM4oLhkR4bCltSZD+wWycEoMb209wQ2TokmMCunWcdLy3G+ZuTWF9vbllxfFs3h6HGsP5xPa25cgK6fgFIYZLWMigxkTGQwMBgzL5FNyylt63/szS1lzKK9ln9/Y6Z6BuT3rqcAvgINKqWTjc3/UWn9pk1YJwLAq6p1tJ3lv20nKahqYGh/KC9eP47whoTadImSpX88YysrkHB5ZncKKu6d06wMkNdf9lpnbgrenB5fZ4Su3+EGArxeTY/syudUy8jNV9RzILiP7TA1zbZAOtT1mBWut9RZA/i+yk7yyWt7cfJz/7jxNdX0TMxP6cc9F8YzrZq/V1gL9vPnj5SP47f/2s2xvFtclRVm0v2mZ+cwE91pmLtxXnwAfm+XH6YgsN3ciWWeqeXlDBp/uyaJJG5au3n3hEIb1c/6bbleNH8TSHad5ak0as0b1t+jGl2mZubtUhxHCFiRYO4HGpmbe3nqC5789RlOz5tqkSO6cPoToUH9HN81sShlSZV750hb+tfYoj145yux93alArhC2IsHawZIzS/nD8oOk5pZzyYgIHp83isg+rhOkWxs9KJibJkfz/vZT3DAp2uylvanGZeYj3XiZuRA9JWW9HKSitoFHVx3iqle2UlJVx2s/n8CbtyS5bKA2eXDmcAL9vHh09SGz6/Kl5ZUzMNjvrFhmLkR3SbC2M601Xx3KZcZz3/H+96e4+dzBfPvbC5g9eoBTzfDorj4BPjw4czjfHy/h8wO5Zu1jqGYuvWohOiPB2o6yS2u44/3d3PXBXvr4+7D87ik8Pm+0282XvXFyNKMGBvG3L1OpMi7n7UhdYxMZhVWyclGILkiwtoPGpmbe3HycS5/7jq3pxfzx8hF8du80xkdbt2KGs/D0UDwxbxS5ZbW8vCG9022P5btXNXMhbEVuMNrYwawy/rDiAIeyy7lweDhPzhtNVF/XHpc2x8TBfbl6wiDe3HyCa5N+WpfPJC3P/XNYC2EN0rO2kcq6Rh7/LIV5L28hv7yOl2+awDsLJ50Vgdrk4ctG4OPlweOfpXR4szE1txxfL48Og7kQwkB61jbwTUoej65OIa+8lp+dE83vZo04K7Ojta7Lty61gBkJ/X6yTVpeOcP7yzJzIboiPWsryi2rYfH7u1n8nz0E9/Jm2V1T+Mv8MWdloDa5ZUoM8RG9eeLzw9Q2NP3oNcMy8wqZXy2EGSRYW0FTs+adrSeY8ex3bDpWyEOzDTcQJw52zxuIlvD29ODxuaM4XVLNG5uO/+i1woo6SqrqZSaIEGaQYZAeOpRdxh9XHORAVhnTh4Xzl3mjXWqZuD1MjQ/j8jH9eXljOldPjGRQSC8ADksOayHMJj3rHth1soR5L28lp7SWf984nvcWTZJA3YE/XZEAwF+/+KHGsmkmiAyDCNE1CdY98Nw3RwkN8GHtb6czN3GgW6xAtJVBIb345YXxfHkwj63pRYBhJogsMxfCPBKsu2nXyRK2Hy9m8fQ4m1UQdzd3TI8juq8/j65OoaGpmbTcCsm0J4SZJFh307/XHSM0wIefnTPY0U1xGX7enjwyJ4H0gkre2HycjMJKyWEthJkkWHdDcmYpm48Vcfv5cfTysV81cXdwycgILhoezrPfHKWxWcvKRSHMJMG6G15cd4wQf29+cZ70qi2llOKRK0fhaRzfl5kgQphHgrWFDmWXsS6tgNumxtLbV2Y+dkdsWAC/vCie/kF+xMjsGS
HMItHGQi+tTyfQz4tbpsY4uiku7dczhvKri+NlmbkQZjKrZ62UelspVaCUOmTrBjmzI3kVfJWSx6IpMW6Xg9oRJFALYT5zh0HeBWbbsB0u4cX1xwjw8eTWabGObooQ4ixjVrDWWm8CSmzcFqeWXlDJFwdzuXlKjMyrFkLYndVuMCqlFiuldiuldhcWFlrlmFprXv8ug2e+PmKV4/XEKxvS8fPy5HbpVQshHMBqNxi11kuAJQBJSUnmlbXuRENTM39cfpBP9mQBMLx/IFcmDuzpYbvlVHEVq/bnsGhKDKG9fR3SBiHE2c0pp+5V1TVyx/u7+WRPFvddHE9iVAj/t+oQBeW1DmnPKxsy8PRQLJ4e55DzCyGE0wXroso6bnzjezYdLeTvV4/htzOH8+y1idTUN/GH5Qc7LA9lK1lnqvl0bxY3TooiIsjPrucWQggTc6fufQhsB4YrpbKUUrfZojEni6q45tVtHM2v4I2bk7hxcjQA8RG9+f3sEaxLK2CZcVjEXl7dmIFScOcFQ+x6XiGEaM2sMWut9Y22bkhyZim3vrsLgA/vOJfx0T+usrJoSgxfp+TxxGeHmRIf1pLA3pbyymr5ZHcW1yZFMdAO5xNCiI44xTDI+rR8blzyPQG+niy767yfBGoADw/FMwsSadKah5YdsMtwyGvfZdCsNXdLr1oI4WAOD9Yf7TzNHe/vIT6iN8vvnkpceO8Ot40O9edPV4xkS3oRH+w4bdN2FVTU8uHO01w1fhBRfSV/hRDCsRwWrLXW/GvtUR5efpCp8WF8tPhcwgO7nhZ30+Rozh8axt++SOVUcZXN2vfm5hM0NDXzy4vibXYOIYQwl0OCdWNTM39YfpB/rT3GNRMieeuWJALMzGCnlOKpa8bi5al48JP9NDVbfzikuLKO/2w/xdzEgcSEBVj9+EIIYSm7B+vq+kbu/M8ePtqVya8uiueZa8fi7WlZMwaG9OKxK0ex6+QZ3tl6wuptfGvLCWobm/jVxdKrFkI4B7sG6+LKOm58YwcbjhTw5PzRPDhreLeLzF49YRAzRvbjn18fIb2gwmptLK2u5/3tp7h8zADiI6TklBDCOdgtWJ8qNsyhTsst59WfT+QX5/asyopSir9dPZoAH08e+N9+GpuardLOd7aepLKukXulVy2EcCJ2CdYHskq55tVtlNY08N87zmHWqP5WOW5EoB9/mT+G/VllvPZdRo+PV1HbwDtbTzAzoZ/UBhRCOBWbB+uNRwq4Ycn3+Hp5suyuKUwc3Neqx79i7ADmjB3AC+uOkZJT1qNjvb/9FOW1jdx78VArtU4IIazDpsH6k92Z3P7ebmJCA1hxzxTiIzqeQ90TT84bTXAvHx74337qG7s3HFJV18ibm49z0fBwxkQGW7mFQgjRMzYL1i+tP8bvlh3g3LhQPr7zXJsmQeoT4MM/rh5DWl4F/153rFvH+OD7U5ypbuDeS6RXLYRwPjYJ1tmlNTzzzVHmjxvI2wsnEWiHeoUzEvpx7cRIXtmYzr7TZyzat6a+iTc2H+f8oWFMaGepuxBCOJpNgnVJVT13XTCE564bh4+X/WYH/t+VCfQP8uOBT/ZT29Bk9n4f7jxNUWW9jFULIZyWTSLp4L7+PHzZCDzsXL06yM+bfy5I5HhhldmlwGobmnh9UwbnxPZlcqx1b34KIYS12CRYB/Wy/bBHR6YNDeMX5w7mra0n2HG8uMvtP9mTRX55HffJWLUQwok5POueLTx82Qii+vjz4LL9VNU1drhdfWMzr23MYOLgPkwZEmrHFgohhGXcMlgH+HrxzLWJZJ2p4e9rUjvcbsW+LLJLa7j34vhuL3sXQgh7cMtgDTA5ti+3T4vlg+9Ps+lo4U9eb2xq5uUNGYyNDOaCYeEOaKEQQpjPbYM1wAMzhzMkPICHPj1AWU3Dj15blZzD6ZJq7r14qPSqhRBOz62DtZ+3J89eN46Cijqe/Pxwy/NNzZqXN6QzckAQM0ZGOLCFQghhHrcO1g
DjokK458IhLNuTxbeH8wH44mAux4uqZKxaCOEyzA7WSqnZSqkjSql0pdTDtmyUtd178VBGDgjiD8sPUlxZx0vrjzE0ojezrZT9TwghbM2sYK2U8gReBi4DEoAblVIJtmyYNfl4efDstYmU1dRz7evbOZpfya8ujrf7oh0hhOguc3vWk4F0rfVxrXU98BEwz3bNsr6EgUHcP2MYxwuriAsLYM7YgY5ukhBCmM28KrUwCMhs9XsWcI71m2Nbd06Po7Cijlmj+uMpvWohhAsxN1h3SSm1GFgMEB0dba3DWpWXpwePzR3l6GYIIYTFzB0GyQaiWv0eaXyuhdZ6idY6SWudFB4ui0yEEMKazA3Wu4ChSqlYpZQPcAOw2nbNEkII0ZpZwyBa60al1K+ArwFP4G2tdYpNWyaEEKKF2WPWWusvgS9t2BYhhBAdcPsVjEII4Q4kWAshhAuQYC2EEC5AgrUQQrgApbW2/kGVqgDMq1jr+oKBMkc3wo7OpuuVa3VPznytw7XWge29YLUVjG0c0Von2ejYTkUptURrvdjR7bCXs+l65VrdkzNfq1Jqd0evyTBIz33m6AbY2dl0vXKt7sklr9VWwyC7z5aetRBCWEtnsdNWPeslNjquEEK4sw5jp0161kIIIaxLxqzbaK98mVLqYqXUXqXUIaXUe0qpdm/MKqVuUUodM/7c0ur5iUqpg8Zj/ls5QeFHpdTbSqkCpdShVs89rZRKU0odUEqtUEqFdLBvuyXejIm+dhif/9iY9MspdHC9jymlspVSycafyzvY16Wut4NrHaeU+t54nbuVUpM72NeV3sNRSqkNSqnDSqkUpdSvjc9fa/y9WSnV4XCsq/1d0VrLj/EHQ5KqDCAO8AH2YyhjlgkMM27zBHBbO/v2BY4b/9vH+LiP8bWdwLmAAtYAlznBtU4HJgCHWj03E/AyPn4KeMrcfyPja/8DbjA+fg2429HX2cX1PgY82J33hDNfbwfX+o3pfQdcDmx0g/fwAGCC8XEgcNT4/+tIYDiwEUhyl7+ruTUY2+ttKqXUX5VSR5VSqUqp+zrY12U+qWm/fNk1QL3W+qhxm2+Nz7U1C/hWa12itT5j3G62UmoAEKS1/l4b/vrvA/NtfSFd0VpvAkraPPeN1rrR+Ov3GPKWt9VuiTfj3+9iYJlxu/dwgus0ae96zeRy19vBtWogyPg4GMhpZ1dXew/naq33Gh9XAKnAIK11qta6q3UeLvd37TJYq46L5S7EUJBghNZ6JIaLbbtvX+BRDCXAJgOPKqX6GF9+FbgDGGr8md3Ti7GC9sqX9Qe8Wn2dWoCxEINSKkkp9WYn+w4y/mS187yzuxVDDwql1ECllCnjYkfXGQqUtgr2rnKdvzIO+7xtem+66fXeDzytlMoEngH+AO7zHlZKxQDjgR2dbOPSf1dzetYdFcu9G3hCa90MoLUuaGdfl/qk7oDGUGzheaXUTqACaALQWu/WWt/uyMbZglLqT0AjsBRAa52jtW53PNfFvQoMAcYBucCz4LbXezfwG611FPAb4C1wj/ewUqo38Clwv9a6vKPtXP3vak6w7ugTaAhwvfFmxRql1FBw+U/qdsuXaa23a63P11pPBjZhGBsza1/jT2Q7zzslpdRCYA7wM+MHaVsdXWcxEKJ+uPnq1NcJoLXO11o3GTscb2DomLTlLtd7C7Dc+PgTLLtWp30PK6W8MQTqpVrr5V1t34rL/V17MhvEF6jVhgncbwBvg8t/UrdbvkwpFQGglPIFHsJw06Gtr4GZSqk+xq/TM4Gvtda5QLlS6lzjeNjNwCp7XIyllFKzgd8Dc7XW1R1s1u6/kTGwb8AwTASG4OCU12li/IZnchVwqJ3N3OV6c4ALjI8vBo61s41LvYeNbXkLSNVaP2fh7q73dzXjjut5GP5gpt//YPxJA2KNzymgrJ19bwReb/X768bnBgBpHW3nyB8Md8qPYrhT/Cfjc09juHlxBMNXLdO2ScCbrX6/FUg3/i
xqs90h4zFfwji/3cHX+SGGr/4NGL7Z3GZsdyaQbPx5zbjtQODLzv6NjM/HYZg1kI6h9+br6Ovs4nr/AxwEDmCoKTrAHa63g2udBuzBMOthBzDRDd7D0zAMUx5o9Z69HMMHbxZQB+RjjF+u/nftclGM8evAUeASDF8HdgE3Ab8Ajmqt31ZKXQg8rbWe1GbfvsY3yATjU3uNb5IS4/jvfcY3zpfAi9pQOkwIIUQbXQ6DaMNdUVOx3FTgf9pQLPcfwDVKqYPA34Hb4cdj1lrrEuBJDAF+F4YbkqYpRfcAb2L49MrAOPNACCHET8lycyGEcAGy3FwIIVyABGshhHABEqyFEMIFSLAWQggXIMFaCCFcgARrIYRwARKshVNSSr1pzO7Y0euPKaUetNG5x6lWhQiUUnNbJ6cXwhHarXgihKNpx+aXGYdhefWXxrasxrAcXQiHkZ61cCilVIwylBJbaixisUwp5a+U2mjKIW4sfrFXKbVfKbWunWPcYcz82EspVdnq+QVKqXeNj99VSr1mzBJ5VCk1p4P2+GCoBnS9MpTAul4ptVAp9VKr47yqDCWyjiulLjTmwk41ncu43Uyl1HZjuz8xpvEUotskWAtnMBx4RRuKWJRjSEUAgFIqHENWx2u01onAta13VEr9CkNK1/la65ouzhODITXoFcBrSim/thtoQ872R4CPtdbjtNYft3OcPhgSnP0GQ4/7eWAUMMY4hBIG/BmYobWeAOwGfttF24TolAyDCGeQqbXeanz8AYYEXybnApu01iegJd+Myc0YsgTO11o3mHGe/2lD7upjSqnjwAgMmdos9ZnWWhvz4uRrrQ8CKKVSMHwgRGKoqrTVkMUTH2B7N84jRAsJ1sIZtE1QY27CmoMYxpcjgRPt7Nu259zd87RVZ/xvc6vHpt+9MFQS+lZrfWM3jy/ET8gwiHAG0Uqp84yPbwK2tHrte2C6UioWWtLumuwD7sRQIGKg8bl8pdRIpZQHhrzGrV2rlPJQSg3BkLO4o6KqFRiqZXfX98BUpVS8sc0BSqlhPTieEBKshVM4AvxSKZWKYTz4VdMLWutCYDGwXCm1H/jRGLLWegvwIPCFcaz4YeBzYBuGBPytncaQVH4NcJfWuraD9mwAEkw3GC29GGObFwIfKqUOYBgCGWHpcYRoTVKkCodShqrUn2utR9v4PO8az7PMlucRwlakZy2EEC5AetbirKWUmgU81ebpE1rrtmPdQjicBGshhHABMgwihBAuQIK1EEK4AAnWQgjhAiRYCyGEC/h/XCBz4hEcGQUAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "taxis_means.plot(y='distance')" ] }, { "cell_type": "markdown", "id": "deba998f-24b1-4fc3-93e6-807955ba111b", "metadata": {}, "source": [ "**43.** Overall, do riders travel further or less far as the day progresses?" ] }, { "cell_type": "markdown", "id": "4b69ad07-bd6c-4ef8-8caa-ec0b5b4a2b21", "metadata": {}, "source": [ "They travel further." ] }, { "cell_type": "markdown", "id": "3c1784d1-e66d-40c7-87d1-a5a4bc7dd43b", "metadata": {}, "source": [ "**44.** Create a new column in taxis_means, ```tip_in_percent```. The source columns for this should be \"fare\" and \"tip\"." ] }, { "cell_type": "code", "execution_count": 42, "id": "014637ae-70f7-4289-8077-5264bf46daab", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pickup_time\n", "2019-03-23 06:00:00 0.000000\n", "2019-03-23 07:00:00 22.190476\n", "2019-03-23 08:00:00 0.000000\n", "2019-03-23 09:00:00 22.621622\n", "2019-03-23 10:00:00 8.314286\n", "2019-03-23 11:00:00 7.245902\n", "2019-03-23 12:00:00 6.293512\n", "2019-03-23 13:00:00 27.470199\n", "2019-03-23 14:00:00 10.612625\n", "2019-03-23 15:00:00 14.450161\n", "2019-03-23 16:00:00 10.608696\n", "2019-03-23 17:00:00 18.141104\n", "2019-03-23 18:00:00 6.584958\n", "2019-03-23 19:00:00 11.461538\n", "2019-03-23 20:00:00 13.909910\n", "2019-03-23 21:00:00 13.432432\n", "2019-03-23 22:00:00 12.267241\n", "2019-03-23 23:00:00 11.221374\n", "Freq: H, Name: tip_in_percent, dtype: float64" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "taxis_means['tip_in_percent'] = taxis_means.tip / taxis_means.fare * 100\n", "taxis_means.tip_in_percent" ] }, { "cell_type": "markdown", "id": "15be5b49-d465-48a7-9b3b-6d58291752d9", "metadata": {}, "source": [ "**45.** Create a new column, time_interval, as a range of integer values beginning with zero." 
] }, { "cell_type": "code", "execution_count": 43, "id": "bcc2dca5-7682-4e55-b218-e94006071ebd", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pickup_time\n", "2019-03-23 06:00:00 0\n", "2019-03-23 07:00:00 1\n", "2019-03-23 08:00:00 2\n", "2019-03-23 09:00:00 3\n", "2019-03-23 10:00:00 4\n", "2019-03-23 11:00:00 5\n", "2019-03-23 12:00:00 6\n", "2019-03-23 13:00:00 7\n", "2019-03-23 14:00:00 8\n", "2019-03-23 15:00:00 9\n", "2019-03-23 16:00:00 10\n", "2019-03-23 17:00:00 11\n", "2019-03-23 18:00:00 12\n", "2019-03-23 19:00:00 13\n", "2019-03-23 20:00:00 14\n", "2019-03-23 21:00:00 15\n", "2019-03-23 22:00:00 16\n", "2019-03-23 23:00:00 17\n", "Freq: H, Name: time_interval, dtype: int64" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "taxis_means['time_interval'] = np.arange(0, taxis_means.shape[0])\n", "taxis_means.time_interval" ] }, { "cell_type": "markdown", "id": "269b329e-cade-4e54-9ed8-ce5d23006750", "metadata": {}, "source": [ "**46.** Display the correlations between the following pairs of values:\n", "1. tip_in_percent and distance.\n", "1. tip_in_percent and passengers.\n", "1. tip_in_percent and time_interval." ] }, { "cell_type": "code", "execution_count": 44, "id": "7003165a-2e86-46fa-9b0c-bb1e9dab4afd", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.058068558052138404\n", "0.39614201273484234\n", "0.11904714170082598\n" ] } ], "source": [ "print(taxis_means['tip_in_percent'].corr(taxis_means['distance']))\n", "print(taxis_means['tip_in_percent'].corr(taxis_means['passengers']))\n", "print(taxis_means['tip_in_percent'].corr(taxis_means['time_interval']))" ] }, { "cell_type": "markdown", "id": "934d0379-8d2c-4579-af88-fafffdb58767", "metadata": {}, "source": [ "**47.** Admittedly, the size of the data set is fairly small given how we've subsetted it. But based on the values in #46, which of the three pairs shows the strongest correlation?" 
] }, { "cell_type": "markdown", "id": "04dfb838-f8de-4e79-8076-12ef0c187fc4", "metadata": {}, "source": [ "tip_in_percent and passengers." ] }, { "cell_type": "markdown", "id": "279a9e91-2488-4e53-8234-ddde076fbe30", "metadata": {}, "source": [ "**48.** Did our hypothesis that people tip more as the day goes on turn out to be warranted?" ] }, { "cell_type": "markdown", "id": "6aaf603f-5099-457c-99eb-bf8370583a0c", "metadata": {}, "source": [ "Not based on this dataset, no." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.2" } }, "nbformat": 4, "nbformat_minor": 5 }