{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# A Brief Introduction to Python\n",
    "\n",
    "_Aidan Slingsby\n",
    "a.slingsby@city.ac.uk_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This introduction is intended for those who already have some familiarity with python. You should know what variables, functions, operators are and how loops and if statement work. The most important thing you can do is *practice*. This tutorial won't give you much practice. But it will help to show how Python fits together, some general principles, and how we will be using it during the MSc in Data Science. We will make reference to the [W3 Schools' Python tutorial](https://www.w3schools.com/python/) which gives the basics.\n",
    "\n",
    "Please install Anaconda and check that the Spyder editor works, in advance. Anaconda is a suite of Python tools that includes Python itself. See below for more details.\n",
    "\n",
    "## Introducing Python\n",
    "\n",
    "Python is an interpreted, high-level, general-purpose programming language that works on many platforms. Its popularity for Data Science is largely down to its simplicity and the huge number of libraries that are available for it. There are a only few basics to learn, but note that most of your work will be using libraries and most of your Python effort will be about learning how to use individual libraries.\n",
    "\n",
    "It is relately easy to write Python by hacking together code from examples on the web, but I recommend that you try and understand the syntax and how this code works. This will make things easier in the long term.\n",
    "\n",
    "Python has been around since the early 1990s, but 2008 saw the release of *Python 3*, a major revision that is not completely compatible with previous releases.\n",
    "\n",
    "Before you start this, *please install Anaconda on your computer* (see below).\n",
    "\n",
    "\n",
    "### Anaconda\n",
    "\n",
    "Python is free. We will be using the *Anaconda distribution*, which includes a suite of tools including those that help you install/update libraries. Install it [here](https://www.anaconda.com/products/individual) and see the quick instructions in their [cheatsheet](https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf).\n",
    "\n",
    "\n",
    "### \"Spyder\" as a Python editor\n",
    "\n",
    "Python code is simply plain text file. You can write it in any editor that saves plain-text (e.g. Notepad) and then running this file through a python interpreter to execute it (`python myPythonCode.py`).\n",
    "\n",
    "However, using specific python editor makes life a bit easier for us. We will be using the \"Spyder\" editor in this tutorial, because it contains a lot of built-in tools for helping you write python. It is part of Anaconda, so you will have it on your computer. Other editors will be covered later. You can launch it from the *Anaconda Navigator*.\n",
    "\n",
    "![Spyder interface](spyder.png)\n",
    "\n",
    "I want to draw your attention to three panels of the Spyder interface:\n",
    "\n",
    "- *Code* (left): this is where your python files can be edited\n",
    "- *IPython console* (bottom right): to run python commands immediately\n",
    "- *Variable explorer* (top right): to see what variables you have\n",
    "\n",
    "Write your first line of python in the traditional way by pasting the following into the IPython console:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello world\n"
     ]
    }
   ],
   "source": [
    "print(\"Hello world\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Syntax and comments\n",
    "\n",
    "See the [basic syntax](https://www.w3schools.com/python/python_syntax.asp) and [comments](https://www.w3schools.com/python/python_comments.asp). Unlike most languages, indentation has meaning (that we'll come on to) so don't accidentally give your code different indentation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Variables\n",
    "\n",
    "[Variables](https://www.w3schools.com/python/python_variables.asp) are named labels that represent values or more complex types of data. Don't make life difficult by using uninformative variable names such as `my_number` - try and make your code understandable.\n",
    "\n",
    "In python, conventionally, variables start with a lowercase letter and use the `_` character to separate words. Variables names cannot start with numbers, cannot contain spaces, can only have alphanumeric characters of `_`, and are *case-sensitive*.\n",
    "\n",
    "There's no need to declare them in advance, you just initiate them using the `=` assignment operator. If it already has a value, it will be overwritten.\n",
    "\n",
    "Once a variable is initialised, we can easily access the value (and change it if we like). Note that variables persist, so unless you go to the `Consoles` menu item and select `Remove all variables` they'll all still be there.\n",
    "\n",
    "Type the following into the IPython console in Spyder:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "pet_type = \"Hamster\"\n",
    "pet_weight_g = 47.3\n",
    "pet_favourite = True\n",
    "pet_num_children = 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Print the values to screen:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hamster 47.3 True 0\n"
     ]
    }
   ],
   "source": [
    "print(pet_type, pet_weight_g, pet_favourite, pet_num_children)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "Now take a look at the \"variable explorer\".\n",
    "\n",
    "![The four variables you created are listed](variables.png)\n",
    "\n",
    "The four variables are listed along with their (inferred) types and their values. This is one of the advantages of using a python editor.\n",
    "\n",
    "You can also see what types of variables are, by using the built-in `type()` function (also using the built-in function `print()`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'str'>\n",
      "<class 'float'>\n",
      "<class 'bool'>\n",
      "<class 'int'>\n"
     ]
    }
   ],
   "source": [
    "print(type(pet_type))\n",
    "print(type(pet_weight_g))\n",
    "print(type(pet_favourite))\n",
    "print(type(pet_num_children))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data types\n",
    "\n",
    "As you've seen, variables can be of different [data types](https://www.w3schools.com/python/python_datatypes.asp).\n",
    "\n",
    "### Data types that store single values\n",
    "\n",
    "`bool` is for a `True`/`False` Boolean value which can also be represented as `0` or `1`.\n",
    "\n",
    "`int` and `float` are for whole and non-whole numbers respectively.\n",
    "\n",
    "`str` is for text. If you specify text directly in Python, it needs either single or double quotes around it.\n",
    "\n",
    "### Data types that are *collections* that store more than one value\n",
    "\n",
    "A *[`list`](https://www.w3schools.com/python/python_lists.asp)* is a collection of values that is ordered (i.e. a sequence) and for which you can change the values. Values can be any data type (even `list`s!) but normally they'd be of the same type. Square brackets are used to create, access and change values in lists. List indexes start from *zero*, so `fruits[1]` is the second item."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['apple', 'banana', 'cherry', 'orange', 'kiwi', 'melon', 'mango']\n",
      "<class 'list'>\n",
      "banana\n",
      "<class 'str'>\n",
      "mango\n",
      "['cherry', 'orange', 'kiwi']\n",
      "<class 'list'>\n",
      "['apple', 'banana', 'cherry', 'orange']\n",
      "['cherry', 'orange', 'kiwi', 'melon', 'mango']\n",
      "['apple', 'blackcurrant', 'cherry', 'orange', 'kiwi', 'melon', 'mango']\n"
     ]
    }
   ],
   "source": [
    "#Create a list of `str` values (using single or double quotes)\n",
    "fruits = [\"apple\", \"banana\", \"cherry\", \"orange\", \"kiwi\", \"melon\", \"mango\"]\n",
    "\n",
    "print(fruits)        \n",
    "print(type(fruits))       #to how that this variable is a list\n",
    "print(fruits[1])          #to show the SECOND item in the list\n",
    "print(type(fruits[1]))    #to show that the item is a str\n",
    "print(fruits[-1])         #negative values are from the end of the list (last value)\n",
    "print(fruits[2:5])        #a range\n",
    "print(type(fruits[2:5]))  #to show this is a list\n",
    "print(fruits[:4])         #from the first item to the fifth\n",
    "print(fruits[2:])         #from the the third item to the end\n",
    "fruits[1] = \"blackcurrant\"#change the second item in the list\n",
    "print(fruits)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also see these in the variable explorer (click the list to get the table)\n",
    "\n",
    "![List variables](variables2.png)\n",
    "![Table of the list contents](variables3.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `str` type is actually a list of characters, so individual letter and substrings can be accessed from strings using the techniques above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "happy!\n"
     ]
    }
   ],
   "source": [
    "message = \"Unhappy!\"\n",
    "print(message[2:])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A *[`set`](https://www.w3schools.com/python/python_sets.asp)* is like a list, but *unordered* and *cannot have duplicates*. It is created using curly brackets. Since it's unordered, you can't access individual elements except by looping through the list (see later).\n",
    "\n",
    "A *[dictionary (`dict`)](https://www.w3schools.com/python/python_dictionaries.asp)* is really useful. It stores key-value pairs allowing you to relate information. Keys are unique, but you can have as many values as you want. Continuing the fruits example, we could store the colour of each fruit. In this example, both the keys and values are `str` data types."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'apple': 'green/red', 'banana': 'yellow', 'cherry': 'red', 'orange': 'orange', 'kiwi': 'green', 'melon': 'yellow', 'mango': 'orange'}\n",
      "A banana is yellow\n"
     ]
    }
   ],
   "source": [
    "#Create a dictionary of `str` values (using single or double quotes)\n",
    "fruit_colours = {\"apple\":\"green/red\", \"banana\":\"yellow\", \"cherry\":\"red\", \"orange\":\"orange\", \"kiwi\":\"green\", \"melon\":\"yellow\", \"mango\":\"orange\"}\n",
    "print(fruit_colours)\n",
    "\n",
    "print(\"A banana is\",fruit_colours['banana']) #access the value with the key \"banana\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Dictionaries are used extensively when doing data science with Python. Values can be of any data type. Examples of their use are:\n",
    "\n",
    "- substituting values e.g. replacing country codes with names (see [example])(https://stackoverflow.com/questions/12906090/country-name-from-iso-short-code-in-dictionary-how-to-deal-with-non-ascii-chars)\n",
    "- mapping colours to values in charting libraries\n",
    "- setting parameters in some python libraries\n",
    "\n",
    "A *[`tuple`](https://www.w3schools.com/python/python_tuples.asp)* is like a `list`, but values can't be changed (immutable). Generally its used differently to lists - to store a set of values what describe something (like a coordinate). Often returned by functions, they use round brackets (`(` and `)`) in their construction. \n",
    "\n",
    "### Data types that are classes\n",
    "\n",
    "We'll come onto classes later. In actual Python is an object-oriented language and all data types are classes and values are objects (or class instances). We'll look later at how classes can:\n",
    "\n",
    "- represent complex data types comprising a mixture of different data types\n",
    "- have their own functions that operate on the objects themselves\n",
    "\n",
    "Most Python libraries use classes to implement sophisticated and complex behaviour as we'll see.\n",
    "\n",
    "### Typecasting\n",
    "\n",
    "When we type data values into our code, Python guesses the data type. We can also [cast the data type](https://www.w3schools.com/python/python_casting.asp) by telling Python to treat it like another data type. We do this by using the type name like a function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'int'>\n",
      "<class 'float'>\n"
     ]
    }
   ],
   "source": [
    "pet_weight_g = 47          #will be inferred to be an int\n",
    "print(type(pet_weight_g))\n",
    "pet_weight_g = float(47)   #specify to treat as a float\n",
    "print(type(pet_weight_g))"
   ]
  },
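  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Casting also converts between numbers and text. A short sketch (the variable names are illustrative); note that `int()` discards the fractional part of a float rather than rounding it:\n",
    "\n",
    "```python\n",
    "weight_text = '47'\n",
    "weight = int(weight_text)      # str -> int\n",
    "print(weight + 1)              # 48\n",
    "\n",
    "truncated = int(47.9)          # float -> int discards the fractional part\n",
    "print(truncated)               # 47\n",
    "\n",
    "label = str(47.3) + ' g'       # float -> str, so it can be joined to text\n",
    "print(label)                   # 47.3 g\n",
    "```"
   ]
  },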
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Operators\n",
    "\n",
    "\n",
    "[Operators](https://www.w3schools.com/python/python_operators.asp) operate on data. The ones you'll use most are arithmetic operators, assignment operators, comparison operators and logical operators, but there are also identity operators, membership operators and bitwise operators.\n",
    "\n",
    "Some operators work differently depending on the data types. `+` is arithmetic addition if the values are numerical; but it joins (concatenates) values together if the values are `str` types.\n",
    "\n",
    "## Loops\n",
    "\n",
    "[For loops](https://www.w3schools.com/python/python_for_loops.asp) let you repeat things, either a fixed number of times or iterate through a list. Indentation is essential.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "1\n",
      "2\n",
      "3\n",
      "4\n",
      "5\n"
     ]
    }
   ],
   "source": [
    "#Fixed number of times\n",
    "for i in range(6):\n",
    "  print(i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "apple\n",
      "banana\n",
      "cherry\n",
      "orange\n",
      "kiwi\n",
      "melon\n",
      "mango\n"
     ]
    }
   ],
   "source": [
    "#Iterate over a collection\n",
    "fruits = [\"apple\", \"banana\", \"cherry\", \"orange\", \"kiwi\", \"melon\", \"mango\"]\n",
    "for fruit in fruits:  #iterates through all the fruits\n",
    "  print(fruit)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are also [while loops](https://www.w3schools.com/python/python_while_loops.asp)."
   ]
  },
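  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A while loop repeats for as long as its condition is true, which is useful when you don't know the number of repetitions in advance. A minimal sketch, with made-up starting values:\n",
    "\n",
    "```python\n",
    "# Keep halving a number until it drops below 1\n",
    "value = 20.0\n",
    "steps = 0\n",
    "while value >= 1:\n",
    "    value = value / 2\n",
    "    steps = steps + 1\n",
    "\n",
    "print(steps, value)   # 5 0.625\n",
    "```"
   ]
  },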
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## If statements\n",
    "\n",
    "[If statements](https://www.w3schools.com/python/python_conditions.asp) work in the same way as in many languages, and require the use of operators, and use indenting."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Odd numbers from 0 to 10\n",
      "1\n",
      "3\n",
      "5\n",
      "7\n",
      "9\n"
     ]
    }
   ],
   "source": [
    "#Print only odd numbers (% operator here is modulus, if you divide a (whole)\n",
    "#odd number by 2, you'll get 1)\n",
    "limit=10\n",
    "print(\"Odd numbers from 0 to\",limit)\n",
    "for i in range(limit):\n",
    "    if i%2==1:\n",
    "      print(i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Functions\n",
    "\n",
    "A [function](https://www.w3schools.com/python/python_functions.asp) (usually) names  a block of code which only runs when it is called. You can pass it arguments (`args`; of various data types) and it can return values (of various data types).\n",
    "\n",
    "So far, we've been using [Python's built-in functions](https://docs.python.org/3/library/functions.html) such as `print()` and `type()`. Functions may take any number of parameters (including none) of different types and may return any number of values of different types.\n",
    "\n",
    "Programming by example is great, but it's worth learning to read documentation. The good news is that it's really easy to get a summary of how a method works: In a Jupyter notebook, you can just put a `? ` followed by the function name e.g. `? print`. In the Spyder console you can either type the function e.g. `print()` and a pop up will provide summary information or you can achieve the same result by using the help function e.g. `help(print)`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\u001b[0;31mDocstring:\u001b[0m\n",
       "print(value, ..., sep=' ', end='\\n', file=sys.stdout, flush=False)\n",
       "\n",
       "Prints the values to a stream, or to sys.stdout by default.\n",
       "Optional keyword arguments:\n",
       "file:  a file-like object (stream); defaults to the current sys.stdout.\n",
       "sep:   string inserted between values, default a space.\n",
       "end:   string appended after the last value, default a newline.\n",
       "flush: whether to forcibly flush the stream.\n",
       "\u001b[0;31mType:\u001b[0m      builtin_function_or_method"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "? print"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The bad news is that this documentation is inconsistent and can be rather cryptic. It will help you to learn how to interpret them, which you may have to do in conjunction with a bit of Googling to find the web documentation. Hopefully this will prompt you to write good documentation!\n",
    "\n",
    "### The `print` function\n",
    "\n",
    "What this (above) means is:\n",
    "\n",
    "- \"prints the values to a stream, or to sys.stdout by default\"\n",
    "- it takes *any number( (denoted by `...`) of arguments called `value'\n",
    "- an (optional) keyword argument (kwarg) called `sep` with a default value of ` ` (a space)\n",
    "- an (optional) keyword argument (kwarg) called `end` with a default value of `\\n` (a new line)\n",
    "- an (optional) keyword argument (kwarg) called `file` with a default value of `sys.stdout` (standard output is usually the screen)\n",
    "- an optional keyword argument (kwarg) called `flush` with a default value of `False`\n",
    "\n",
    "The keyword parameters (kwargs) are optional. An example of the used of 'sep' is thus:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hamster - 47.3 - True - 0\n"
     ]
    }
   ],
   "source": [
    "pet_type = \"Hamster\"\n",
    "pet_weight_g = 47.3\n",
    "pet_favourite = True\n",
    "pet_num_children = 0\n",
    "\n",
    "print(pet_type, pet_weight_g, pet_favourite, pet_num_children, sep=\" - \")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that this extra optional named argument simply changes the separator when writing out these values.\n",
    "\n",
    "### The `type` function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\u001b[0;31mInit signature:\u001b[0m  \u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m/\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
       "\u001b[0;31mDocstring:\u001b[0m     \n",
       "type(object_or_name, bases, dict)\n",
       "type(object) -> the object's type\n",
       "type(name, bases, dict) -> a new type\n",
       "\u001b[0;31mType:\u001b[0m           type\n",
       "\u001b[0;31mSubclasses:\u001b[0m     ABCMeta, EnumMeta, NamedTupleMeta, _TypedDictMeta, _ABC, MetaHasDescriptors, _TemplateMetaclass, PyCStructType, UnionType, PyCPointerType, ..."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "? type"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The [`type()` method](https://docs.python.org/3/library/functions.html#type) actually has some different variants. Ignore all, but the one we've been using: the middle one (after the `Docstring:` line):\n",
    "\n",
    "```\n",
    "type(object) -> the object's type\n",
    "```\n",
    "\n",
    "This:\n",
    "\n",
    "- takes a parameter called object (which can be any variable for any type/object)\n",
    "- return the type\n",
    "\n",
    "So this method return a `type` object, as illustrated. The `print()` method prints `class XXX` where `XXX` is the data type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'str'>\n",
      "<class 'type'>\n"
     ]
    }
   ],
   "source": [
    "pet_type = \"Hamster\"\n",
    "print(type(pet_type))         #returns a string\n",
    "print(type(type(pet_type)))   #return an object of class `type`\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The `pow` function\n",
    "\n",
    "Finally, let's look at [`pow`](https://docs.python.org/3/library/functions.html#pow), one of the built-in arithmetic functions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\u001b[0;31mSignature:\u001b[0m  \u001b[0mpow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbase\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mexp\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmod\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
       "\u001b[0;31mDocstring:\u001b[0m\n",
       "Equivalent to base**exp with 2 arguments or base**exp % mod with 3 arguments\n",
       "\n",
       "Some types, such as ints, are able to use a more efficient algorithm when\n",
       "invoked using the three argument form.\n",
       "\u001b[0;31mType:\u001b[0m      builtin_function_or_method"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "? pow"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This raises `x` to the power of `y`, with an options kwarg `z` which is `None` by default. This is clearer in the [web documentation](https://docs.python.org/3/library/functions.html#pow). This:\n",
    "\n",
    "- take two arguments, `x` and `y` (with an optional `z`)\n",
    "- returns the answer\n",
    "\n",
    "It also notes that the `**` operator does the same thing.\n",
    "\n",
    "### Using methods from other modules and packages\n",
    "\n",
    "A module is simply a python file that has a set of functions and/or constants (like variables, but cannot be changed) defined. Modules may be organised into packages. Python has a lot of [prefined modules](https://docs.python.org/3/py-modindex.html) that give you amazing functionality. You use them, you simply import them, like:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can see what's available within a module (note that those that start with `__` are generally internal ones that we wouldn't normally call). You'll also find the documentation on the web\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['__doc__',\n",
       " '__file__',\n",
       " '__loader__',\n",
       " '__name__',\n",
       " '__package__',\n",
       " '__spec__',\n",
       " 'acos',\n",
       " 'acosh',\n",
       " 'asin',\n",
       " 'asinh',\n",
       " 'atan',\n",
       " 'atan2',\n",
       " 'atanh',\n",
       " 'ceil',\n",
       " 'comb',\n",
       " 'copysign',\n",
       " 'cos',\n",
       " 'cosh',\n",
       " 'degrees',\n",
       " 'dist',\n",
       " 'e',\n",
       " 'erf',\n",
       " 'erfc',\n",
       " 'exp',\n",
       " 'expm1',\n",
       " 'fabs',\n",
       " 'factorial',\n",
       " 'floor',\n",
       " 'fmod',\n",
       " 'frexp',\n",
       " 'fsum',\n",
       " 'gamma',\n",
       " 'gcd',\n",
       " 'hypot',\n",
       " 'inf',\n",
       " 'isclose',\n",
       " 'isfinite',\n",
       " 'isinf',\n",
       " 'isnan',\n",
       " 'isqrt',\n",
       " 'ldexp',\n",
       " 'lgamma',\n",
       " 'log',\n",
       " 'log10',\n",
       " 'log1p',\n",
       " 'log2',\n",
       " 'modf',\n",
       " 'nan',\n",
       " 'perm',\n",
       " 'pi',\n",
       " 'pow',\n",
       " 'prod',\n",
       " 'radians',\n",
       " 'remainder',\n",
       " 'sin',\n",
       " 'sinh',\n",
       " 'sqrt',\n",
       " 'tan',\n",
       " 'tanh',\n",
       " 'tau',\n",
       " 'trunc']"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dir(math)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An example is the `math` module. See the [documentation here](https://docs.python.org/3/library/math.html) and you can use them like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "PI is  3.141592653589793\n"
     ]
    }
   ],
   "source": [
    "print(\"PI is \", math.pi)"
   ]
  },
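  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are a few other common ways of importing, sketched here using only the standard `math` module. Which form to use is largely a matter of style; the alias `m` is just an illustrative choice:\n",
    "\n",
    "```python\n",
    "import math as m             # give the module a shorter alias\n",
    "from math import sqrt, pi    # import specific names directly\n",
    "\n",
    "rounded_down = m.floor(2.7)\n",
    "root = sqrt(16)\n",
    "\n",
    "print(rounded_down)          # 2\n",
    "print(root)                  # 4.0\n",
    "print(pi == m.pi)            # True, both names refer to the same constant\n",
    "```"
   ]
  },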
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "\n",
    "### Making your own functions.\n",
    "\n",
    "Making your own function is worth doing if there's some simple functionality that's small in scope you want to reuse.\n",
    "\n",
    "One example is to construct a URL to get some data based on some parameters.\n",
    "\n",
    "![Beautiful Stamen maps](stamen.png)\n",
    "\n",
    "Stamen are a design company (amongst other things) have designed some really nice [maps that look like watercolour](https://stadiamaps.com/explore-the-map/#style=stamen_watercolor&map=14/52.50676/13.38465). These map tiles are on a tile server and they provide an API to grab those tiles - it's simply a URL [as described on their website](https://docs.stadiamaps.com/map-styles/stamen-watercolor/#__tabbed_1_1):\n",
    "\n",
    "```\n",
    "https://tiles.stadiamaps.com/tiles/stamen_watercolor/{z}/{x}/{y}@2x.jpg?api_key=6ace8e1f-ea73-40a9-898e-a6978a5d4b67\n",
    "``` \n",
    "\n",
    "The [OpenStreetMap website](https://wiki.openstreetmap.org/wiki/Slippy_map_tilenames) (on which Stamen maps are based) describes how to convert latitude and longitude into these `x` and `y` values, providing pseudocode.\n",
    "\n",
    "```\n",
    "n = 2 ^ zoom\n",
    "xtile = n * ((lon_deg + 180) / 360)\n",
    "ytile = n * (1 - (log(tan(lat_rad) + sec(lat_rad)) / π)) / 2\n",
    "```\n",
    "\n",
    "We can convert this to two Python functions (always reference any sources you use!)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "# Returns the tile x from longitude\n",
    "# Modified from http://wiki.openstreetmap.org/wiki/Slippy_map_tilenames\n",
    "def getTileXFromLon(lon, zoom):\n",
    "    return (int)(math.floor((lon+180.0)/360.0*math.pow(2.0,zoom)))\n",
    "\n",
    "# Returns the tile y from longitude\n",
    "# Modified from http://wiki.openstreetmap.org/wiki/Slippy_map_tilenames\n",
    "def getTileYFromLat(lat, zoom):\n",
    "    return (int)(math.floor((1.0-math.log(math.tan(lat*math.pi/180.0) + 1.0/math.cos(lat*math.pi/180.0))/math.pi)/2.0 *math.pow(2.0,zoom)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We need to use the `math` module, w\n",
    "Note that *variables initialised in functions can only be seen within the function*. Also note that the *indenting* is essential to define the function block.\n",
    "\n",
    "We can then use them, just like any other function. Note here that I'm typecasting the numbers to strings (though this may not be necessary)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "https://tiles.stadiamaps.com/tiles/stamen_watercolor/16/32749/21786.jpg?api_key=6ace8e1f-ea73-40a9-898e-a6978a5d4b67\n"
     ]
    }
   ],
   "source": [
    "zoom=16 #zoom level\n",
    "x=getTileXFromLon(-0.102644086,zoom)\n",
    "y=getTileYFromLat(51.527701,zoom)\n",
    "url = \"https://tiles.stadiamaps.com/tiles/stamen_watercolor/\"+str(zoom)+\"/\"+str(x)+\"/\"+str(y)+\".jpg?api_key=6ace8e1f-ea73-40a9-898e-a6978a5d4b67\"\n",
    "print(url)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Try putting this URL in your browser.\n",
    "\n",
    "Try putting this code into its own method.\n",
    "\n",
    "In actual fact, the [OpenStreetMap website](https://wiki.openstreetmap.org/wiki/Slippy_map_tilenames) does provide a function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "def deg2num(lat_deg, lon_deg, zoom):\n",
    "  lat_rad = math.radians(lat_deg)\n",
    "  n = 2.0 ** zoom\n",
    "  xtile = int((lon_deg + 180.0) / 360.0 * n)\n",
    "  ytile = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)\n",
    "  return (xtile, ytile)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that this does this all in one method, returning the two values as a `tuple`. Again, note the indentation.\n",
    "\n",
    "### Document your functions\n",
    "\n",
    "It's good practice to provide documentation so that someone else can type `? yourFunction` and get a good summary.\n",
    "\n",
    "The help for the function we wrote is"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\u001b[0;31mSignature:\u001b[0m  \u001b[0mgetTileXFromLon\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlon\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mzoom\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
       "\u001b[0;31mDocstring:\u001b[0m <no docstring>\n",
       "\u001b[0;31mFile:\u001b[0m      /var/folders/qp/833_d7651js_jq_n0ydl3k480000gp/T/ipykernel_29853/830997071.py\n",
       "\u001b[0;31mType:\u001b[0m      function"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "? getTileXFromLon"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note the `<no docstring>`. Let's add one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\u001b[0;31mSignature:\u001b[0m \u001b[0mgetTileXFromLon\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlon\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mzoom\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
       "\u001b[0;31mDocstring:\u001b[0m\n",
       "Finds the Staman tile x from the longitude\n",
       "Parameters:\n",
       "argument1 (lon): Longitude\n",
       "argument2 (zoom): Zoom level (int from 0-16)\n",
       "\n",
       "Returns:\n",
       "int: The tile's x\n",
       "\u001b[0;31mFile:\u001b[0m      /var/folders/qp/833_d7651js_jq_n0ydl3k480000gp/T/ipykernel_29853/3365258485.py\n",
       "\u001b[0;31mType:\u001b[0m      function"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def getTileXFromLon(lon, zoom):\n",
    "  \"\"\"Finds the Staman tile x from the longitude\n",
    "    Parameters:\n",
    "    argument1 (lon): Longitude\n",
    "    argument2 (zoom): Zoom level (int from 0-16)\n",
    "\n",
    "    Returns:\n",
    "    int: The tile's x\n",
    "   \"\"\"\n",
    "  return (int)(math.floor((lon+180.0)/360.0*math.pow(2.0,zoom)))\n",
    "\n",
    "?getTileXFromLon"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That's better!\n",
    "\n",
    "## Using python source code files\n",
    "\n",
    "So far, we've been putting python in the IPython console, where it runs immediately.\n",
    "\n",
    "Let's instead write code in a file. In Spyder, Choose `File` > `New file` from the menu. This will create a new python file (extension `.py`) in some temporary location. You'll probably want to save it somewhere, perhaps call it `mapTiles.py`.\n",
    "\n",
    "Put the functions we made and the code to generate the tile URL in there and then run is (green triangle. Note that in the IPython console, it issues the `runFile()` function to run your file. This is also where the output goes.\n",
    "\n",
    "![Spyder interface](spyder2.png)\n",
    "\n",
    "Note that the variables are accessible to both, because it's all run through the IPython console.\n",
    "\n",
    "### Autocomplete and documentation\n",
    "\n",
    "Spyder also give you autocomplete and documentation. Press tab after typing the beginning of a function and will list the available functions, tell you what the arguments are and even give you the documentation.\n",
    "\n",
    "### Cells\n",
    "\n",
    "Another structural thing is cells. `#%%` breaks your code into `cells` which can be run separately (using the button with green triangle with yellow square on). Note that this calls the `runcell()` function in the IPython console.\n",
    "\n",
    "### Keyboard shortcuts\n",
    "\n",
    "Learn the keyboard shortcuts. [Here are some](http://e-callisto.org/cospar2018/SpyderKeyboardShortcutsEditor.pdf).\n",
    "\n",
    "### Debugging\n",
    "\n",
    "This incredibly powerful feature lets you *pause* the execution of code and show you how the code executes and what the variable values are at any point. Add one or more `breakpoints` by clicking to the right of the line number. Then if you run it using the \"debug file\" button or menu option, the code will pause at the breakpoint. The buttons to the right of the debug button will allow you to step through the code, including *into* functions that are called. Whilst execute is paused, you can see the current state of the variables.\n",
    "\n",
    "Have a go at using this on a loop:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "15\n"
     ]
    }
   ],
   "source": [
    "sum=0;\n",
    "for i in range(6):\n",
    "    sum+=i\n",
    "print(sum)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Classes and objects\n",
    "\n",
    "Python is an object-oriented language, in that everything is an *object*. Objects not only hold data, but they hold functions that manipulate those data. These are defined by its *class*; effectively a *template* for the object. And you can find out what functions a class has by using the `dir()` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['__add__',\n",
       " '__class__',\n",
       " '__contains__',\n",
       " '__delattr__',\n",
       " '__dir__',\n",
       " '__doc__',\n",
       " '__eq__',\n",
       " '__format__',\n",
       " '__ge__',\n",
       " '__getattribute__',\n",
       " '__getitem__',\n",
       " '__getnewargs__',\n",
       " '__gt__',\n",
       " '__hash__',\n",
       " '__init__',\n",
       " '__init_subclass__',\n",
       " '__iter__',\n",
       " '__le__',\n",
       " '__len__',\n",
       " '__lt__',\n",
       " '__mod__',\n",
       " '__mul__',\n",
       " '__ne__',\n",
       " '__new__',\n",
       " '__reduce__',\n",
       " '__reduce_ex__',\n",
       " '__repr__',\n",
       " '__rmod__',\n",
       " '__rmul__',\n",
       " '__setattr__',\n",
       " '__sizeof__',\n",
       " '__str__',\n",
       " '__subclasshook__',\n",
       " 'capitalize',\n",
       " 'casefold',\n",
       " 'center',\n",
       " 'count',\n",
       " 'encode',\n",
       " 'endswith',\n",
       " 'expandtabs',\n",
       " 'find',\n",
       " 'format',\n",
       " 'format_map',\n",
       " 'index',\n",
       " 'isalnum',\n",
       " 'isalpha',\n",
       " 'isascii',\n",
       " 'isdecimal',\n",
       " 'isdigit',\n",
       " 'isidentifier',\n",
       " 'islower',\n",
       " 'isnumeric',\n",
       " 'isprintable',\n",
       " 'isspace',\n",
       " 'istitle',\n",
       " 'isupper',\n",
       " 'join',\n",
       " 'ljust',\n",
       " 'lower',\n",
       " 'lstrip',\n",
       " 'maketrans',\n",
       " 'partition',\n",
       " 'replace',\n",
       " 'rfind',\n",
       " 'rindex',\n",
       " 'rjust',\n",
       " 'rpartition',\n",
       " 'rsplit',\n",
       " 'rstrip',\n",
       " 'split',\n",
       " 'splitlines',\n",
       " 'startswith',\n",
       " 'strip',\n",
       " 'swapcase',\n",
       " 'title',\n",
       " 'translate',\n",
       " 'upper',\n",
       " 'zfill']"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dir(str)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And for each method, we can use `?`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\u001b[0;31mSignature:\u001b[0m  \u001b[0mstr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcapitalize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m/\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
       "\u001b[0;31mDocstring:\u001b[0m\n",
       "Return a capitalized version of the string.\n",
       "\n",
       "More specifically, make the first character have upper case and the rest lower\n",
       "case.\n",
       "\u001b[0;31mType:\u001b[0m      method_descriptor"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "? str.capitalize"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you know, `str` is a class. You can use these methods by using a `.` after the variable name. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "aidan\n",
      "Aidan\n"
     ]
    }
   ],
   "source": [
    "myName=\"aidan\"\n",
    "print(myName)\n",
    "print(myName.capitalize())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So what you been to know is that a *class* is a *data type*, and object is the value that contains variable and functions. So a `str` variable actually references a more complex object that you might have expected with the ability to do things. This is a fundamental characteristic of object oriented language.\n",
    "\n",
    "In practice terms for Data Science is when we use libraries that do complicated machine-learning, the complexity is hidden inside the objects that we use. And we can query and manipulate these objects by using the documented functions.\n",
    "\n",
    "If you have a look again at the [Dictionary documentation](https://www.w3schools.com/python/python_dictionaries.asp), you'll notice reference to many functions that help use dictionaries. Yes, you've guessed it... dictionaries are actually *classes* and have built-in functions that relate to use of dictionarys."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Defining your own class\n",
    "\n",
    "Just like functions you can define your own bespoke classes to package together related data and associated functions for that data.\n",
    "\n",
    "Below we can see a simple example of a class. \n",
    "\n",
    "For the most part you will not need to define your own classes but it is useful to see how classes are defined in Python as it will allow you to understand how to interact and work with other classes built by others (eg. builtin classes and impprrted libraries (see next section))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello my name is John\n",
      "Hello my name is Gina\n"
     ]
    }
   ],
   "source": [
    "class Person:\n",
    "    \"\"\"This defines an object of type Person that has a name and age attribute.\n",
    "    The Person class will return a statement describing who they are and what age they are\"\"\"\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "    \n",
    "    def myname(self):\n",
    "        print(\"Hello my name is \" + self.name)\n",
    "        \n",
    "    #def myage(self, ):TO DO!\n",
    "        \n",
    "\n",
    "p1 = Person(\"John\", 36)\n",
    "p1.myname()\n",
    "\n",
    "p2 = Person(\"Gina\", 56)\n",
    "p2.myname()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The class object is defined using the argument `class` followed by the name of the class. You can define your class by any name.\n",
    "\n",
    "Most classes will have a `__init__` method which is where you can initialise your class with any number of attributes so here we are providing the class object with name and age attributes.\n",
    "\n",
    "The arguments that are provided to the `__init__` method indicate what arguments we need to provide to the class when we call the class. So when we first define (or instantiate) class, we provide it with those required arguments:\n",
    "\n",
    "`p1 = Person(\"John\", 36)`\n",
    "\n",
    "Now p1 defined here is an example or an \"instance\" of our Person class and we can define any number of Person class instances eg.\n",
    "\n",
    "`p2 = Person(\"Gina\", 56)`\n",
    "\n",
    "Within the class we can define specific methods that are associated with processesing the data packaged within the Person class. So `myname` is a method that will take the name attribute and print a statement describing the name of a particular person class instance.\n",
    "\n",
    "eg. `p1.myname()`\n",
    "\n",
    "Returns:\n",
    "`Hello my name is John`\n",
    "\n",
    "The first argument in each of these methods contains this argument `self`. This argument indicates that in order to call this function we must first instantiate the class in other words we must first define the variable `p1` before we can call the method `myname()`.\n",
    "\n",
    "\n",
    "Now over to you to have a go at defining a myage method that will print out a statement describing the age of the specific class instance.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Libraries\n",
    "\n",
    "Now we'll talk about libraries. Libraries are \"packages\" (collections of modules) that define classes and functions for some specific functionality. This is what makes Python (and other languages so powerful).\n",
    "\n",
    "### Example: Which bike hire station in London currently holds the most bikes?\n",
    "\n",
    "This example will tell ous which bike hire station in London has the most bikes available.\n",
    "\n",
    "The data is provided by Transport for London as an XML file - https://tfl.gov.uk/tfl/syndication/feeds/cycle-hire/livecyclehireupdates.xml. Try it in a browser! It's live (used by apps that tell you how many bikes there are at stations. Some browsers even format it for you. Here's an abridged version of how the first two stations are represented:\n",
    "\n",
    "```\n",
    "<stations lastUpdate=\"1599228180865\" version=\"2.0\">\n",
    "  <station>\n",
    "    <id>1</id>\n",
    "    <name>River Street , Clerkenwell</name>\n",
    "    <lat>51.52916347</lat>\n",
    "    <long>-0.109970527</long>\n",
    "    <nbBikes>8</nbBikes>\n",
    "    <nbEmptyDocks>11</nbEmptyDocks>\n",
    "    <nbDocks>19</nbDocks>\n",
    "  </station>\n",
    "  <station>\n",
    "    <id>2</id>\n",
    "    <name>Phillimore Gardens, Kensington</name>\n",
    "    <lat>51.49960695</lat>\n",
    "    <long>-0.197574246</long>\n",
    "    <nbBikes>16</nbBikes>\n",
    "    <nbEmptyDocks>17</nbEmptyDocks>\n",
    "    <nbDocks>37</nbDocks>\n",
    "  </station>\n",
    "...\n",
    "</stations>\n",
    "```\n",
    "\n",
    "Since the data needs to be retrieved from the web we will also use a library called `requests` that retrieves data from a URL."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Status code is 200\n",
      "<class 'requests.models.Response'>\n"
     ]
    }
   ],
   "source": [
    "import requests\n",
    "url = \"https://tfl.gov.uk/tfl/syndication/feeds/cycle-hire/livecyclehireupdates.xml\"\n",
    "response = requests.get(url)\n",
    "print(\"Status code is\",response.status_code)\n",
    "print(type(response))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our `response` variable contains an object of type `Response`. You can use `type()`, `dir()` and `?` to find out more about its variables and methods.\n",
    "\n",
    "One of its variable is called `status_code` and this tells use whether the HTTP request was successful. `200` means success - see a [list of status codes here](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status). To make code more robust, you would use an `if` statement to check for success before proceeding.\n",
    "\n",
    "One of its variables is called `text` gives us the text (the whole XML file).\n",
    "\n",
    "Again, these variables/functions are part of the `Response` class.\n",
    "\n",
    "Now we have the XML, we use another python library called `xml` that gives us class called `ElementTree` for extracting the data we want. `ElementTree` is designed for parsing XML files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'xml.etree.ElementTree.Element'>\n"
     ]
    }
   ],
   "source": [
    "import xml.etree.ElementTree as et\n",
    "tree = et.fromstring(response.text)\n",
    "print(type(tree))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This gives us an `Element` object. Note that when we import the library, we say `as et` which lets us abbreviate this in our code. This is a common convention.\n",
    "\n",
    "Again, you can use `type()`, `dir()` and `?` to find out more. It can be iterated over (it has the a function called `iter`), so we can use a for loop. Each item is a station and we can use its `find` method to get another `Element` object corresponding to characterisics of the station.\n",
    "\n",
    "We then add these to a dictionary.\n",
    "\n",
    "We can then iterate though the keys of the dictionary to find the biggest station.\n",
    "\n",
    "The code is below - hopefully, it's self explanatory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Worship Street, Shoreditch currently has the most bikes, with 51\n"
     ]
    }
   ],
   "source": [
    "import requests\n",
    "import xml.etree.ElementTree as et\n",
    "\n",
    "#create an empty dictionary\n",
    "stations_numBikes={}\n",
    "\n",
    "\n",
    "url = \"https://tfl.gov.uk/tfl/syndication/feeds/cycle-hire/livecyclehireupdates.xml\"\n",
    "\n",
    "#retrieve the XML content from the web using the request library\n",
    "response = requests.get(url)\n",
    "\n",
    "#parse the XML from the text\n",
    "tree = et.fromstring(response.text)\n",
    "\n",
    "#iterate through all the elements\n",
    "for station_node in tree:\n",
    "    name=station_node.find(\"name\").text        #find the name\n",
    "    numBikes=station_node.find(\"nbBikes\").text #find the number of bikes\n",
    "    stations_numBikes[name]=numBikes          #add to the dictionary \n",
    "\n",
    "#iterate and find the one with the highest\n",
    "\n",
    "max_bikes=int(0);                  \n",
    "most_bikes_station=\"\";\n",
    "#iterate through the keys in the dictionary\n",
    "for station_name in stations_numBikes:\n",
    "    #get the number of bikes from the dictionary\n",
    "    num_bikes=int(stations_numBikes[station_name])\n",
    "    #check if it's greater than the biggest station we'd found so far\n",
    "    if num_bikes>max_bikes:\n",
    "        max_bikes=num_bikes;\n",
    "        most_bikes_station=station_name\n",
    "\n",
    "#print the result        \n",
    "print(most_bikes_station, \"currently has the most bikes, with\", max_bikes);\n",
    "      "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using the Pandas library to handle tabular data\n",
    "\n",
    "We will work with a lot of tabular table and don't want to mess around with lists and dictionaries for tabular data.\n",
    "\n",
    "Fortunately, the `Pandas` library for Python incorporates pretty much everything you need work with tabular data. This includes:\n",
    "\n",
    "- reading and writing from/to file\n",
    "- restructuring the data\n",
    "- deriving new data in new columns\n",
    "- selecting subsets of data\n",
    "- generating numerical summaries\n",
    "- doing statistical plots\n",
    "\n",
    "There are plenty of thing it can't do or doesn't do well, but we can easily use other libraries for this.\n",
    "\n",
    "As before, this is all handled through classes.\n",
    "\n",
    "### Loading tabular data from a CSV file\n",
    "\n",
    "The example here will be based on the bike data again, but we will use a CSV version, since Pandas only really reads tabular data directly. This URL - http://staff.city.ac.uk/~sbbb717/tfl_bikes/latest - returns an CSV version of the XML data we just used\n",
    "\n",
    "When we import the library, people conventionally use `pd` as the abbreviation, you may as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "      id                                  name        lat      long  \\\n",
      "0      1            River Street , Clerkenwell  51.529163 -0.109971   \n",
      "1      2        Phillimore Gardens, Kensington  51.499607 -0.197574   \n",
      "2      3  Christopher Street, Liverpool Street  51.521284 -0.084606   \n",
      "3      4       St. Chad's Street, King's Cross  51.530059 -0.120974   \n",
      "4      5         Sedding Street, Sloane Square  51.493130 -0.156876   \n",
      "..   ...                                   ...        ...       ...   \n",
      "790  851                  The Blue, Bermondsey  51.492221 -0.062513   \n",
      "791  852         Coomer Place, West Kensington  51.483571 -0.202039   \n",
      "792  857                        Strand, Strand  51.512582 -0.115057   \n",
      "793  864     Abbey Orchard Street, Westminster  51.498126 -0.132102   \n",
      "794  865           Leonard Circus , Shoreditch  51.524696 -0.084439   \n",
      "\n",
      "             updatedDate  numBikes  numEmptyDocks  installed  locked  \\\n",
      "0    2024-09-02 14:55:00         2             15       True   False   \n",
      "1    2024-09-02 14:55:00         5             30       True   False   \n",
      "2    2024-09-02 14:55:00        20             12       True   False   \n",
      "3    2024-09-02 14:55:00        13             10       True   False   \n",
      "4    2024-09-02 14:55:00        24              3       True   False   \n",
      "..                   ...       ...            ...        ...     ...   \n",
      "790  2024-09-02 14:55:00         4             17       True   False   \n",
      "791  2024-09-02 14:55:00        19              6       True   False   \n",
      "792  2024-09-02 14:55:00        35              0       True   False   \n",
      "793  2024-09-02 14:55:00        19              9       True   False   \n",
      "794  2024-09-02 14:55:00        40              3       True   False   \n",
      "\n",
      "           installedDate  \n",
      "0    2010-07-12 16:08:00  \n",
      "1    2010-07-08 11:43:00  \n",
      "2    2010-07-04 11:46:00  \n",
      "3    2010-07-04 11:58:00  \n",
      "4    2010-07-04 12:04:00  \n",
      "..                   ...  \n",
      "790  2022-10-17 23:00:00  \n",
      "791  1970-01-01 01:00:00  \n",
      "792  1970-01-01 01:00:00  \n",
      "793  2010-07-14 12:42:00  \n",
      "794  2010-07-07 13:45:00  \n",
      "\n",
      "[795 rows x 10 columns]\n"
     ]
    }
   ],
   "source": [
    "import pandas as pd\n",
    "latest = pd.read_csv ('http://staff.city.ac.uk/~sbbb717/tfl_bikes/latest')\n",
    "print(latest)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That's it! The data are now in a `DataFrame` object called `latest`. If you double-click it in the Spyder's variable explorer, you'll see all the data.\n",
    "\n",
    "![The DataFrame in the variable explorer](spyder3.png)\n",
    "\n",
    "Now it's in a data frame, we can work with it. However, working with data in Pandas is very different from working with data using basic Python data types. It has its own way of working with data which you need to learn and understand. This is why I said that the challenge you'll face is learning to use libraries, rather than learning to use Python! There many advantages to using Pandas way of working - it's faster and more convenient... once you've learnt how to do it.\n",
    "\n",
    "[I recommend this cheat sheet](https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf).\n",
    "\n",
    "### Deriving new columns\n",
    "\n",
    "Panda makes it easy to make new columns, without having a do any looping. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
    "latest[\"capacity\"] = latest[\"numBikes\"]+latest[\"numEmptyDocks\"]\n",
    "latest[\"percentageFull\"] = latest[\"numBikes\"]/latest[\"capacity\"]\n",
    "\n",
    "latest[\"areaName\"] = latest[\"name\"].apply(lambda text: text.split(\",\")[-1].strip())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first two are easy and obvious (I hope). We are creating two new columns based on derived data: the capacity of each station and the percentage full.\n",
    "\n",
    "The third one is a bit more complex. The text after the last comma of the station name is the local London area name. To do this, we\n",
    "\n",
    "- use `str`'s`split()` function to split the text by its commas\n",
    "- take the last element (using [-1])\n",
    "- use `str`'s`strip()` function to remove white spaces\n",
    "\n",
    "See below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Clarkenwell\n"
     ]
    }
   ],
   "source": [
    "print(\"Farringdon, Clarkenwell\".split(\",\")[-1].strip())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can't do this as easily as the first two, because it's more complex. So instead, we use a [lambda function](https://www.w3schools.com/python/python_lambda.asp) that applies this to every value.\n",
    "\n",
    "### Accessing the data\n",
    "\n",
    "Here's how you would find the station name with the largest number of bikes in Pandas:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Worship Street, Shoreditch currently has the most bikes, with 51\n"
     ]
    }
   ],
   "source": [
    "#get the numBikes column\n",
    "numBikes_column = latest[\"numBikes\"]\n",
    "#calculate the maximum\n",
    "most_bikes=numBikes_column.max()\n",
    "#find the row index of the maximum\n",
    "most_bikes_row_idx = numBikes_column.idxmax()\n",
    "#find the value at that row index and column \"name\"\n",
    "most_bikes_station = latest.loc[most_bikes_row_idx,\"name\"]\n",
    "#print it\n",
    "print(most_bikes_station, \"currently has the most bikes, with\", most_bikes);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But you'd normally see it all together. This code does the same, but without doing it in stages. It's very hard to work out what's going on! I don't recommend this. But you'll see code like this."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Worship Street, Shoreditch currently has the most bikes, with 51\n"
     ]
    }
   ],
   "source": [
    "print(latest.loc[latest[\"numBikes\"].idxmax(),\"name\"], \"currently has the most bikes, with\", latest[\"numBikes\"].max());"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So rather than using loops, we are using Pandas' methods that operate on rows, columns and cells. We:\n",
    "\n",
    " - get the numBikes column\n",
    " - find it maximum value\n",
    " - find the row index of its maximum value\n",
    " - extract the station name from that row\n",
    "\n",
    "As you see below, `numBikes_column` is a `Series` object that represents the whole column. `max()` and `idxmax()` are both function of the Series class.\n",
    "\n",
    "`loc` is a function of the `DataFrame` class and returns either:\n",
    " - a `DataFrame` object (for a range of rows and columns)\n",
    " - a `Series` object (for range of rows OR a range of columns)\n",
    " - the object in the cell (for a single row and column)\n",
    "\n",
    "This is illustrated below.\n",
    "\n",
    "It uses the same way as accessing values as for lists."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.series.Series'>\n",
      "<class 'int'>\n",
      "<class 'int'>\n",
      "<class 'str'>\n",
      "\n",
      "A whole column: <class 'pandas.core.series.Series'>\n",
      "A partial column: <class 'pandas.core.series.Series'>\n",
      "A whole row: <class 'pandas.core.series.Series'>\n",
      "A partial row: <class 'pandas.core.series.Series'>\n",
      "A value: <class 'str'>\n"
     ]
    }
   ],
   "source": [
    "print(type(numBikes_column))\n",
    "print(type(most_bikes))\n",
    "print(type(most_bikes_row_idx))\n",
    "print(type(most_bikes_station))\n",
    "print()\n",
    "print(\"A whole column:\",type(latest.loc[:,\"name\"]))\n",
    "print(\"A partial column:\",type(latest.loc[3:8,\"name\"]))\n",
    "print(\"A whole row:\",type(latest.loc[2,:]))\n",
    "print(\"A partial row:\",type(latest.loc[2,\"name\":\"long\"]))\n",
    "print(\"A value:\",type(latest.loc[2,\"name\"]))"
   ]
  },
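  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same steps can be tried on a tiny, made-up table. This is only a sketch: the `demo` DataFrame and its three stations are invented for illustration and are not part of the bike data.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical data: three made-up stations\n",
    "demo = pd.DataFrame({'name': ['A', 'B', 'C'], 'numBikes': [4, 9, 2]})\n",
    "\n",
    "demo_column = demo['numBikes']                 # a Series holding the whole column\n",
    "demo_max = demo_column.max()                   # the largest value in the column\n",
    "demo_max_idx = demo_column.idxmax()            # the row label of that value\n",
    "demo_station = demo.loc[demo_max_idx, 'name']  # the cell at that row and column\n",
    "print(demo_station, 'has the most bikes, with', demo_max)\n",
    "\n",
    "# Unlike list slices, loc slices are label-based and include the end label\n",
    "print(len(demo.loc[0:1, 'name']))\n",
    "```\n",
    "\n",
    "Note that `demo.loc[0:1, 'name']` returns two rows (labels 0 and 1), whereas the list slice `[0:1]` would return one item."
   ]
  },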
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also select rows based on the values in a column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "# percentageFull is stored as a fraction between 0 and 1, so half full is 0.5 (not 50)\n",
    "over_half_full_stations=latest.loc[latest[\"percentageFull\"]>0.5,:]\n",
    "print(over_half_full_stations['name'].count(), \"stations are over half full\")\n"
   ]
  },
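  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Selecting rows by a condition works in two steps: the comparison produces a boolean `Series` (one `True` or `False` per row), and `loc` keeps the rows where it is `True`. A minimal sketch on made-up data (the `demo` table and its values are invented for illustration):\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical data: fractions between 0 and 1, as in the percentageFull column\n",
    "demo = pd.DataFrame({'name': ['A', 'B', 'C'], 'percentageFull': [0.2, 0.75, 0.9]})\n",
    "\n",
    "mask = demo['percentageFull'] > 0.5   # boolean Series, one value per row\n",
    "over_half = demo.loc[mask, :]         # keep only the rows where mask is True\n",
    "print(mask.tolist())\n",
    "print(len(over_half), 'stations are over half full')\n",
    "\n",
    "# Conditions are combined with & (and) and | (or), each wrapped in parentheses\n",
    "mid_range = demo.loc[(demo['percentageFull'] > 0.5) & (demo['percentageFull'] < 0.8), 'name']\n",
    "print(mid_range.tolist())\n",
    "```\n",
    "\n",
    "The parentheses around each condition matter, because `&` and `|` bind more tightly than `>` and `<` in Python."
   ]
  },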
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Statistics\n",
    "\n",
    "As you've seen, it is easy to calculate statistics. `DataFrame`'s `describe()` method produces a new `DataFrame` object with summary statistics for all numerical columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>lat</th>\n",
       "      <th>long</th>\n",
       "      <th>numBikes</th>\n",
       "      <th>numEmptyDocks</th>\n",
       "      <th>capacity</th>\n",
       "      <th>percentageFull</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>795.000000</td>\n",
       "      <td>795.000000</td>\n",
       "      <td>795.000000</td>\n",
       "      <td>795.000000</td>\n",
       "      <td>795.000000</td>\n",
       "      <td>795.000000</td>\n",
       "      <td>795.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>429.040252</td>\n",
       "      <td>51.505905</td>\n",
       "      <td>-0.127512</td>\n",
       "      <td>12.415094</td>\n",
       "      <td>13.161006</td>\n",
       "      <td>25.576101</td>\n",
       "      <td>0.481746</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>247.224428</td>\n",
       "      <td>0.020331</td>\n",
       "      <td>0.055178</td>\n",
       "      <td>9.414672</td>\n",
       "      <td>9.117439</td>\n",
       "      <td>8.577117</td>\n",
       "      <td>0.316509</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>51.452997</td>\n",
       "      <td>-0.236770</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>8.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>214.500000</td>\n",
       "      <td>51.492976</td>\n",
       "      <td>-0.172134</td>\n",
       "      <td>4.500000</td>\n",
       "      <td>6.000000</td>\n",
       "      <td>19.000000</td>\n",
       "      <td>0.187500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>439.000000</td>\n",
       "      <td>51.509087</td>\n",
       "      <td>-0.129362</td>\n",
       "      <td>12.000000</td>\n",
       "      <td>13.000000</td>\n",
       "      <td>24.000000</td>\n",
       "      <td>0.485714</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>644.500000</td>\n",
       "      <td>51.520978</td>\n",
       "      <td>-0.091125</td>\n",
       "      <td>18.000000</td>\n",
       "      <td>18.000000</td>\n",
       "      <td>30.000000</td>\n",
       "      <td>0.750000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>865.000000</td>\n",
       "      <td>51.549369</td>\n",
       "      <td>-0.002275</td>\n",
       "      <td>51.000000</td>\n",
       "      <td>52.000000</td>\n",
       "      <td>62.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               id         lat        long    numBikes  numEmptyDocks  \\\n",
       "count  795.000000  795.000000  795.000000  795.000000     795.000000   \n",
       "mean   429.040252   51.505905   -0.127512   12.415094      13.161006   \n",
       "std    247.224428    0.020331    0.055178    9.414672       9.117439   \n",
       "min      1.000000   51.452997   -0.236770    0.000000       0.000000   \n",
       "25%    214.500000   51.492976   -0.172134    4.500000       6.000000   \n",
       "50%    439.000000   51.509087   -0.129362   12.000000      13.000000   \n",
       "75%    644.500000   51.520978   -0.091125   18.000000      18.000000   \n",
       "max    865.000000   51.549369   -0.002275   51.000000      52.000000   \n",
       "\n",
       "         capacity  percentageFull  \n",
       "count  795.000000      795.000000  \n",
       "mean    25.576101        0.481746  \n",
       "std      8.577117        0.316509  \n",
       "min      8.000000        0.000000  \n",
       "25%     19.000000        0.187500  \n",
       "50%     24.000000        0.485714  \n",
       "75%     30.000000        0.750000  \n",
       "max     62.000000        1.000000  "
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\n",
    "latest.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And if we want the summary statistics by the area names we created, we can use `DataFrame`'s `groupby` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th colspan=\"8\" halign=\"left\">id</th>\n",
       "      <th colspan=\"2\" halign=\"left\">lat</th>\n",
       "      <th>...</th>\n",
       "      <th colspan=\"2\" halign=\"left\">capacity</th>\n",
       "      <th colspan=\"8\" halign=\"left\">percentageFull</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>count</th>\n",
       "      <th>mean</th>\n",
       "      <th>std</th>\n",
       "      <th>min</th>\n",
       "      <th>25%</th>\n",
       "      <th>50%</th>\n",
       "      <th>75%</th>\n",
       "      <th>max</th>\n",
       "      <th>count</th>\n",
       "      <th>mean</th>\n",
       "      <th>...</th>\n",
       "      <th>75%</th>\n",
       "      <th>max</th>\n",
       "      <th>count</th>\n",
       "      <th>mean</th>\n",
       "      <th>std</th>\n",
       "      <th>min</th>\n",
       "      <th>25%</th>\n",
       "      <th>50%</th>\n",
       "      <th>75%</th>\n",
       "      <th>max</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>areaName</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Aldgate</th>\n",
       "      <td>6.0</td>\n",
       "      <td>249.000000</td>\n",
       "      <td>271.855108</td>\n",
       "      <td>33.0</td>\n",
       "      <td>105.25</td>\n",
       "      <td>158.5</td>\n",
       "      <td>247.75</td>\n",
       "      <td>779.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>51.513985</td>\n",
       "      <td>...</td>\n",
       "      <td>30.75</td>\n",
       "      <td>37.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>0.586216</td>\n",
       "      <td>0.349546</td>\n",
       "      <td>0.055556</td>\n",
       "      <td>0.401014</td>\n",
       "      <td>0.603125</td>\n",
       "      <td>0.869945</td>\n",
       "      <td>0.962963</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Angel</th>\n",
       "      <td>10.0</td>\n",
       "      <td>326.700000</td>\n",
       "      <td>217.356364</td>\n",
       "      <td>75.0</td>\n",
       "      <td>200.25</td>\n",
       "      <td>290.0</td>\n",
       "      <td>358.50</td>\n",
       "      <td>697.0</td>\n",
       "      <td>10.0</td>\n",
       "      <td>51.533240</td>\n",
       "      <td>...</td>\n",
       "      <td>25.50</td>\n",
       "      <td>47.0</td>\n",
       "      <td>10.0</td>\n",
       "      <td>0.314786</td>\n",
       "      <td>0.212971</td>\n",
       "      <td>0.038462</td>\n",
       "      <td>0.109524</td>\n",
       "      <td>0.312500</td>\n",
       "      <td>0.504762</td>\n",
       "      <td>0.583333</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Avondale</th>\n",
       "      <td>7.0</td>\n",
       "      <td>680.428571</td>\n",
       "      <td>114.596185</td>\n",
       "      <td>442.0</td>\n",
       "      <td>657.50</td>\n",
       "      <td>740.0</td>\n",
       "      <td>747.50</td>\n",
       "      <td>771.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>51.511550</td>\n",
       "      <td>...</td>\n",
       "      <td>25.00</td>\n",
       "      <td>29.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>0.306563</td>\n",
       "      <td>0.187339</td>\n",
       "      <td>0.038462</td>\n",
       "      <td>0.190374</td>\n",
       "      <td>0.291667</td>\n",
       "      <td>0.431818</td>\n",
       "      <td>0.571429</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bank</th>\n",
       "      <td>4.0</td>\n",
       "      <td>361.750000</td>\n",
       "      <td>199.932280</td>\n",
       "      <td>101.0</td>\n",
       "      <td>280.25</td>\n",
       "      <td>383.5</td>\n",
       "      <td>465.00</td>\n",
       "      <td>579.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>51.512803</td>\n",
       "      <td>...</td>\n",
       "      <td>35.25</td>\n",
       "      <td>42.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>0.843398</td>\n",
       "      <td>0.127911</td>\n",
       "      <td>0.681818</td>\n",
       "      <td>0.770455</td>\n",
       "      <td>0.869697</td>\n",
       "      <td>0.942641</td>\n",
       "      <td>0.952381</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bankside</th>\n",
       "      <td>7.0</td>\n",
       "      <td>408.000000</td>\n",
       "      <td>380.261314</td>\n",
       "      <td>9.0</td>\n",
       "      <td>101.50</td>\n",
       "      <td>230.0</td>\n",
       "      <td>802.00</td>\n",
       "      <td>810.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>51.506176</td>\n",
       "      <td>...</td>\n",
       "      <td>30.00</td>\n",
       "      <td>60.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>0.495514</td>\n",
       "      <td>0.196020</td>\n",
       "      <td>0.277778</td>\n",
       "      <td>0.342544</td>\n",
       "      <td>0.482759</td>\n",
       "      <td>0.598443</td>\n",
       "      <td>0.826087</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>West Kensington</th>\n",
       "      <td>8.0</td>\n",
       "      <td>718.750000</td>\n",
       "      <td>79.218955</td>\n",
       "      <td>626.0</td>\n",
       "      <td>653.25</td>\n",
       "      <td>713.5</td>\n",
       "      <td>773.00</td>\n",
       "      <td>852.0</td>\n",
       "      <td>8.0</td>\n",
       "      <td>51.487602</td>\n",
       "      <td>...</td>\n",
       "      <td>30.00</td>\n",
       "      <td>32.0</td>\n",
       "      <td>8.0</td>\n",
       "      <td>0.358097</td>\n",
       "      <td>0.199511</td>\n",
       "      <td>0.125000</td>\n",
       "      <td>0.250000</td>\n",
       "      <td>0.292424</td>\n",
       "      <td>0.437500</td>\n",
       "      <td>0.760000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Westbourne</th>\n",
       "      <td>1.0</td>\n",
       "      <td>327.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>327.0</td>\n",
       "      <td>327.00</td>\n",
       "      <td>327.0</td>\n",
       "      <td>327.00</td>\n",
       "      <td>327.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>51.522168</td>\n",
       "      <td>...</td>\n",
       "      <td>20.00</td>\n",
       "      <td>20.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.100000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.100000</td>\n",
       "      <td>0.100000</td>\n",
       "      <td>0.100000</td>\n",
       "      <td>0.100000</td>\n",
       "      <td>0.100000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Westminster</th>\n",
       "      <td>16.0</td>\n",
       "      <td>475.250000</td>\n",
       "      <td>241.494651</td>\n",
       "      <td>118.0</td>\n",
       "      <td>294.50</td>\n",
       "      <td>359.5</td>\n",
       "      <td>675.00</td>\n",
       "      <td>864.0</td>\n",
       "      <td>16.0</td>\n",
       "      <td>51.496762</td>\n",
       "      <td>...</td>\n",
       "      <td>23.25</td>\n",
       "      <td>28.0</td>\n",
       "      <td>16.0</td>\n",
       "      <td>0.687383</td>\n",
       "      <td>0.262222</td>\n",
       "      <td>0.136364</td>\n",
       "      <td>0.661765</td>\n",
       "      <td>0.750000</td>\n",
       "      <td>0.831481</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>White City</th>\n",
       "      <td>2.0</td>\n",
       "      <td>583.500000</td>\n",
       "      <td>24.748737</td>\n",
       "      <td>566.0</td>\n",
       "      <td>574.75</td>\n",
       "      <td>583.5</td>\n",
       "      <td>592.25</td>\n",
       "      <td>601.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>51.511962</td>\n",
       "      <td>...</td>\n",
       "      <td>37.25</td>\n",
       "      <td>38.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.873684</td>\n",
       "      <td>0.104205</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>0.836842</td>\n",
       "      <td>0.873684</td>\n",
       "      <td>0.910526</td>\n",
       "      <td>0.947368</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Whitechapel</th>\n",
       "      <td>8.0</td>\n",
       "      <td>403.750000</td>\n",
       "      <td>150.653576</td>\n",
       "      <td>200.0</td>\n",
       "      <td>263.00</td>\n",
       "      <td>466.0</td>\n",
       "      <td>515.25</td>\n",
       "      <td>565.0</td>\n",
       "      <td>8.0</td>\n",
       "      <td>51.517410</td>\n",
       "      <td>...</td>\n",
       "      <td>34.25</td>\n",
       "      <td>42.0</td>\n",
       "      <td>8.0</td>\n",
       "      <td>0.453601</td>\n",
       "      <td>0.251162</td>\n",
       "      <td>0.147059</td>\n",
       "      <td>0.328571</td>\n",
       "      <td>0.426587</td>\n",
       "      <td>0.500000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>123 rows × 56 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                   id                                                        \\\n",
       "                count        mean         std    min     25%    50%     75%   \n",
       "areaName                                                                      \n",
       "Aldgate           6.0  249.000000  271.855108   33.0  105.25  158.5  247.75   \n",
       "Angel            10.0  326.700000  217.356364   75.0  200.25  290.0  358.50   \n",
       "Avondale          7.0  680.428571  114.596185  442.0  657.50  740.0  747.50   \n",
       "Bank              4.0  361.750000  199.932280  101.0  280.25  383.5  465.00   \n",
       "Bankside          7.0  408.000000  380.261314    9.0  101.50  230.0  802.00   \n",
       "...               ...         ...         ...    ...     ...    ...     ...   \n",
       "West Kensington   8.0  718.750000   79.218955  626.0  653.25  713.5  773.00   \n",
       "Westbourne        1.0  327.000000         NaN  327.0  327.00  327.0  327.00   \n",
       "Westminster      16.0  475.250000  241.494651  118.0  294.50  359.5  675.00   \n",
       "White City        2.0  583.500000   24.748737  566.0  574.75  583.5  592.25   \n",
       "Whitechapel       8.0  403.750000  150.653576  200.0  263.00  466.0  515.25   \n",
       "\n",
       "                         lat             ... capacity       percentageFull  \\\n",
       "                   max count       mean  ...      75%   max          count   \n",
       "areaName                                 ...                                 \n",
       "Aldgate          779.0   6.0  51.513985  ...    30.75  37.0            6.0   \n",
       "Angel            697.0  10.0  51.533240  ...    25.50  47.0           10.0   \n",
       "Avondale         771.0   7.0  51.511550  ...    25.00  29.0            7.0   \n",
       "Bank             579.0   4.0  51.512803  ...    35.25  42.0            4.0   \n",
       "Bankside         810.0   7.0  51.506176  ...    30.00  60.0            7.0   \n",
       "...                ...   ...        ...  ...      ...   ...            ...   \n",
       "West Kensington  852.0   8.0  51.487602  ...    30.00  32.0            8.0   \n",
       "Westbourne       327.0   1.0  51.522168  ...    20.00  20.0            1.0   \n",
       "Westminster      864.0  16.0  51.496762  ...    23.25  28.0           16.0   \n",
       "White City       601.0   2.0  51.511962  ...    37.25  38.0            2.0   \n",
       "Whitechapel      565.0   8.0  51.517410  ...    34.25  42.0            8.0   \n",
       "\n",
       "                                                                             \\\n",
       "                     mean       std       min       25%       50%       75%   \n",
       "areaName                                                                      \n",
       "Aldgate          0.586216  0.349546  0.055556  0.401014  0.603125  0.869945   \n",
       "Angel            0.314786  0.212971  0.038462  0.109524  0.312500  0.504762   \n",
       "Avondale         0.306563  0.187339  0.038462  0.190374  0.291667  0.431818   \n",
       "Bank             0.843398  0.127911  0.681818  0.770455  0.869697  0.942641   \n",
       "Bankside         0.495514  0.196020  0.277778  0.342544  0.482759  0.598443   \n",
       "...                   ...       ...       ...       ...       ...       ...   \n",
       "West Kensington  0.358097  0.199511  0.125000  0.250000  0.292424  0.437500   \n",
       "Westbourne       0.100000       NaN  0.100000  0.100000  0.100000  0.100000   \n",
       "Westminster      0.687383  0.262222  0.136364  0.661765  0.750000  0.831481   \n",
       "White City       0.873684  0.104205  0.800000  0.836842  0.873684  0.910526   \n",
       "Whitechapel      0.453601  0.251162  0.147059  0.328571  0.426587  0.500000   \n",
       "\n",
       "                           \n",
       "                      max  \n",
       "areaName                   \n",
       "Aldgate          0.962963  \n",
       "Angel            0.583333  \n",
       "Avondale         0.571429  \n",
       "Bank             0.952381  \n",
       "Bankside         0.826087  \n",
       "...                   ...  \n",
       "West Kensington  0.760000  \n",
       "Westbourne       0.100000  \n",
       "Westminster      1.000000  \n",
       "White City       0.947368  \n",
       "Whitechapel      1.000000  \n",
       "\n",
       "[123 rows x 56 columns]"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "latest.groupby(by=\"areaName\").describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And if we want to total the available bikes in each area, we select the `numBikes` column from the grouped data and call `sum()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>numBikes</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>areaName</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Aldgate</th>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Angel</th>\n",
       "      <td>89</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Avondale</th>\n",
       "      <td>50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bank</th>\n",
       "      <td>98</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bankside</th>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>West Kensington</th>\n",
       "      <td>74</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Westbourne</th>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Westminster</th>\n",
       "      <td>225</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>White City</th>\n",
       "      <td>64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Whitechapel</th>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>123 rows × 1 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                 numBikes\n",
       "areaName                 \n",
       "Aldgate                88\n",
       "Angel                  89\n",
       "Avondale               50\n",
       "Bank                   98\n",
       "Bankside              100\n",
       "...                   ...\n",
       "West Kensington        74\n",
       "Westbourne              2\n",
       "Westminster           225\n",
       "White City             64\n",
       "Whitechapel            81\n",
       "\n",
       "[123 rows x 1 columns]"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "latest.groupby(by=\"areaName\")[[\"numBikes\"]].sum()"
   ]
  },
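  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If several statistics are needed for one column, `agg` can compute them in a single call. The sketch below uses a made-up table (the area names and counts are invented for illustration) rather than the bike data:\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical data: four made-up stations in two made-up areas\n",
    "demo = pd.DataFrame({\n",
    "    'areaName': ['North', 'North', 'South', 'South'],\n",
    "    'numBikes': [3, 5, 10, 0],\n",
    "})\n",
    "\n",
    "# Group the rows by area, pick one column, then compute several statistics\n",
    "per_area = demo.groupby(by='areaName')['numBikes'].agg(['sum', 'mean', 'max'])\n",
    "print(per_area)\n",
    "\n",
    "# The result is indexed by area name, so loc can look up a single area\n",
    "print(per_area.loc['North', 'sum'])\n",
    "```\n",
    "\n",
    "Selecting the column before aggregating avoids computing statistics for columns that are not needed."
   ]
  },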
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plotting graphics\n",
    "\n",
    "Simple graphics can be plotted, showing that more docking stations tend to be on the empty rather than the full side."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<Axes: ylabel='Frequency'>"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjsAAAGdCAYAAAD0e7I1AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8pXeV/AAAACXBIWXMAAA9hAAAPYQGoP6dpAAAmMklEQVR4nO3de3TU5YHG8WfIZUiySeQiM0QihBqqGOoFCmvQAoXEFYQqx4ILCir24IkXIlCaLG0Nrk0grDHVFLwsDViN0Fqw7npporIRpLty1Qo9YCGGIElz0DQXLklI3v2Dw9QxoGQyycy8fj/nzB+/d9755Zn30M7jO7+ZcRhjjAAAACzVK9ABAAAAuhNlBwAAWI2yAwAArEbZAQAAVqPsAAAAq1F2AACA1Sg7AADAapQdAABgtfBABwgG7e3tOnr0qGJjY+VwOAIdBwAAXABjjBobG5WQkKBevc6/f0PZkXT06FElJiYGOgYAAPBBVVWVBg0adN77KTuSYmNjJZ1ZrLi4uACnAQAAF6KhoUGJiYme1/HzoexInreu4uLiKDsAAISYr7sEhQuUAQCA1Sg7AADAapQdAABgNcoOAACwGmUHAABYjbIDAACsRtkBAABWo+wAAACrUXYAAIDVKDsAAMBqAS077777rqZOnaqEhAQ5HA698sornvtaW1v1k5/8RCNGjFBMTIwSEhI0Z84cHT161Osczc3NevDBB9W/f3/FxMRo2rRpOnLkSA8/EwAAEKwCWnaOHz+uq666SkVFRR3uO3HihHbt2qWf/exn2rVrlzZu3KgDBw5o2rRpXvMyMzO1adMmrV+/Xlu3blVTU5NuvvlmtbW19dTTAAAAQcxhjDGBDiGd+RGvTZs26ZZbbjnvnO3bt2v06NGqrKzUpZdeqvr6el188cX6zW9+o5kzZ0qSjh49qsTERL3++uu68cYbL+hvNzQ0KD4+XvX19fwQKAAAIeJCX79D6pqd+vp6ORwOXXTRRZKknTt3qrW1Venp6Z45CQkJSklJ0bZt2857nubmZjU0NHjdAACAncIDHeBCnTp1SllZWZo1a5anvdXU1CgyMlJ9+vTxmutyuVRTU3Pec+Xl5WnZsmXdmvesIVmv9cjf8bdPlk8JdAQAAPwiJHZ2Wltbdfvtt6u9vV2rVq362vnGGDkcjvPen52drfr6es+tqqrKn3EBAEAQCfqy09raqhkzZqiiokJlZWVe78m53W61tLSorq7O6zG1tbVyuVznPafT6VRcXJzXDQAA2Cmoy87ZovPxxx/rrbfeUr9+/bzuHzlypCIiIlRWVuYZq66u1kcffaTU1NSejgsAAIJQQK/ZaWpq0l//+lfPcUVFhfbs2aO+ffsqISFBt912m3bt2qX//u//Vltbm+c6nL59+yoyMlLx8fGaN2+eFi1apH79+qlv375avHixRowYoUmTJgXqaQEAgCAS0LKzY8cOTZgwwXO8cOFCSdLcuXOVk5OjV199VZJ09dVXez1u8+bNGj9+vCTpiSeeUHh4uGbMmKGTJ09q4sSJWrt2rcLCwnrkOQAAgOAWNN+zE0jd+T07fBoLAIDuYeX37AAAAHQWZQcAAFiNsgMAAKxG2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAAYDXKDgAAsBplBwAAWI2yAwAArEbZAQAAVqPsAAAAq1F2AACA1Sg7AADAapQdAABgNcoOAACwGmUHAABYjbIDAACsRtkBAABWo+wAAACrUXYAAIDVKDsAAMBqlB0AAGA1yg4AALAaZQcAAFiNsgMAAKxG2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAAYDXKDgAAsBplBwAAWI2yAwAArEbZAQAAVqPsAAAAq1F2AACA1Sg7AADAapQdAABgtfBABwAAABduSNZrgY7QaZ8snxLQv8/ODgAAsBplBwAAWI2yAwAArEbZAQAAVqPsAAAAq1F2AACA1Sg7AADAagEt
O++++66mTp2qhIQEORwOvfLKK173G2OUk5OjhIQERUVFafz48dq7d6/XnObmZj344IPq37+/YmJiNG3aNB05cqQHnwUAAAhmAS07x48f11VXXaWioqJz3p+fn6+CggIVFRVp+/btcrvdSktLU2Njo2dOZmamNm3apPXr12vr1q1qamrSzTffrLa2tp56GgAAIIgF9BuUb7rpJt10003nvM8Yo8LCQi1dulTTp0+XJK1bt04ul0slJSWaP3++6uvrtWbNGv3mN7/RpEmTJEkvvPCCEhMT9dZbb+nGG2/ssecCAACCU9Bes1NRUaGamhqlp6d7xpxOp8aNG6dt27ZJknbu3KnW1lavOQkJCUpJSfHMOZfm5mY1NDR43QAAgJ2CtuzU1NRIklwul9e4y+Xy3FdTU6PIyEj16dPnvHPOJS8vT/Hx8Z5bYmKin9MDAIBgEbRl5yyHw+F1bIzpMPZlXzcnOztb9fX1nltVVZVfsgIAgOATtGXH7XZLUocdmtraWs9uj9vtVktLi+rq6s4751ycTqfi4uK8bgAAwE5BW3aSkpLkdrtVVlbmGWtpaVF5eblSU1MlSSNHjlRERITXnOrqan300UeeOQAA4JstoJ/Gampq0l//+lfPcUVFhfbs2aO+ffvq0ksvVWZmpnJzc5WcnKzk5GTl5uYqOjpas2bNkiTFx8dr3rx5WrRokfr166e+fftq8eLFGjFihOfTWQAA4JstoGVnx44dmjBhgud44cKFkqS5c+dq7dq1WrJkiU6ePKmMjAzV1dVpzJgxKi0tVWxsrOcxTzzxhMLDwzVjxgydPHlSEydO1Nq1axUWFtbjzwcAAAQfhzHGBDpEoDU0NCg+Pl719fV+v35nSNZrfj1fT/lk+ZRARwAAnEMovq5012vKhb5+B+01OwAAAP5A2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAAYDXKDgAAsBplBwAAWI2yAwAArEbZAQAAVqPsAAAAq1F2AACA1Sg7AADAapQdAABgNcoOAACwGmUHAABYjbIDAACsRtkBAABWo+wAAACrUXYAAIDVKDsAAMBqlB0AAGA1yg4AALAaZQcAAFiNsgMAAKxG2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAAYDXKDgAAsFp4oAMgOA3Jei3QETrtk+VTAh0BABCE2NkBAABWo+wAAACrUXYAAIDVKDsAAMBqlB0AAGA1yg4AALAaZQcAAFiNsgMAAKxG2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAAYDXKDgAAsFp4oAMAQHcbkvVaoCN02ifLpwQ6AmCNoN7ZOX36tH76058qKSlJUVFRGjp0qB599FG1t7d75hhjlJOTo4SEBEVFRWn8+PHau3dvAFMDAIBgEtRlZ8WKFXr66adVVFSkv/zlL8rPz9fKlSv11FNPeebk5+eroKBARUVF2r59u9xut9LS0tTY2BjA5AAAIFgEddn505/+pB/84AeaMmWKhgwZottuu03p6enasWOHpDO7OoWFhVq6dKmmT5+ulJQUrVu3TidOnFBJSUmA0wMAgGAQ1GXn+uuv19tvv60DBw5Ikj744ANt3bpVkydPliRVVFSopqZG6enpnsc4nU6NGzdO27ZtC0hmAAAQXIL6AuWf/OQnqq+v1+WXX66wsDC1tbXpF7/4hf71X/9VklRTUyNJcrlcXo9zuVyqrKw873mbm5vV3NzsOW5oaOiG9AAAIBgE9c7Ohg0b9MILL6ikpES7du3SunXr9B//8R9at26d1zyHw+F1bIzpMPZFeXl5io+P99wSExO7JT8AAAi8oC47P/7xj5WVlaXbb79dI0aM0J133qmHH35YeXl5kiS32y3pHzs8Z9XW1nbY7fmi7Oxs1dfXe25VVVXd9yQAAEBABXXZOXHihHr18o4YFhbm+eh5UlKS3G63ysrKPPe3tLSovLxcqamp5z2v0+lUXFyc1w0AANgpqK/ZmTp1qn7xi1/o
0ksv1ZVXXqndu3eroKBA99xzj6Qzb19lZmYqNzdXycnJSk5OVm5urqKjozVr1qwApwcAAMEgqMvOU089pZ/97GfKyMhQbW2tEhISNH/+fP385z/3zFmyZIlOnjypjIwM1dXVacyYMSotLVVsbGwAkwMAgGAR1GUnNjZWhYWFKiwsPO8ch8OhnJwc5eTk9FguAAAQOoL6mh0AAICuouwAAACrUXYAAIDVKDsAAMBqlB0AAGC1oP40FgB8Uw3Jei3QETrtk+VTAh0BOCd2dgAAgNUoOwAAwGqUHQAAYDXKDgAAsBplBwAAWM2nslNRUeHvHAAAAN3Cp7Jz2WWXacKECXrhhRd06tQpf2cCAADwG5/KzgcffKBrrrlGixYtktvt1vz58/X+++/7OxsAAECX+fSlgikpKSooKFB+fr7+67/+S2vXrtX111+v5ORkzZs3T3feeacuvvhif2cFAAQxvggRwapLFyiHh4fr1ltv1W9/+1utWLFCBw8e1OLFizVo0CDNmTNH1dXV/soJAADgky6VnR07digjI0MDBw5UQUGBFi9erIMHD+qdd97Rp59+qh/84Af+ygkAAOATn97GKigoUHFxsfbv36/Jkyfr+eef1+TJk9Wr15nulJSUpGeeeUaXX365X8MCAAB0lk9lZ/Xq1brnnnt09913y+12n3POpZdeqjVr1nQpHAAAQFf5VHY+/vjjr50TGRmpuXPn+nJ6AAAAv/Gp7BQXF+uf/umf9MMf/tBr/He/+51OnDhByUFA8EkQAMC5+HSB8vLly9W/f/8O4wMGDFBubm6XQwEAAPiLT2WnsrJSSUlJHcYHDx6sw4cPdzkUAACAv/hUdgYMGKAPP/yww/gHH3ygfv36dTkUAACAv/hUdm6//XY99NBD2rx5s9ra2tTW1qZ33nlHCxYs0O233+7vjAAAAD7z6QLlxx57TJWVlZo4caLCw8+cor29XXPmzOGaHQAAEFR8KjuRkZHasGGD/v3f/10ffPCBoqKiNGLECA0ePNjf+QAAALrEp7Jz1rBhwzRs2DB/ZQEAAPA7n8pOW1ub1q5dq7ffflu1tbVqb2/3uv+dd97xSzgAAICu8qnsLFiwQGvXrtWUKVOUkpIih8Ph71zANwJfhAgA3c+nsrN+/Xr99re/1eTJk/2dBwAAwK98+uh5ZGSkLrvsMn9nAQAA8Dufys6iRYv0y1/+UsYYf+cBAADwK5/extq6das2b96sN954Q1deeaUiIiK87t+4caNfwgEAAHSVT2Xnoosu0q233urvLAAAAH7nU9kpLi72dw4AAIBu4dM1O5J0+vRpvfXWW3rmmWfU2NgoSTp69Kiampr8Fg4AAKCrfNrZqays1L/8y7/o8OHDam5uVlpammJjY5Wfn69Tp07p6aef9ndOAAAAn/i0s7NgwQKNGjVKdXV1ioqK8ozfeuutevvtt/0WDgAAoKt8/jTWe++9p8jISK/xwYMH69NPP/VLMAAAAH/waWenvb1dbW1tHcaPHDmi2NjYLocCAADwF5/KTlpamgoLCz3HDodDTU1NeuSRR/gJCQAAEFR8ehvriSee0IQJEzR8+HCdOnVKs2bN0scff6z+/fvrpZde8ndGAAAAn/lUdhISErRnzx699NJL2rVrl9rb2zVv3jzNnj3b64JlAACAQPOp7EhSVFSU7rnnHt1zzz3+zAMAAOBXPpWd559//ivvnzNnjk9hAAAA/M2nsrNgwQKv49bWVp04cUKRkZGKjo6m7AAAQsKQrNcCHQE9wKdPY9XV1XndmpqatH//fl1//fVcoAwAAIKKz7+N9WXJyclavnx5h12frvr00091xx13qF+/foqOjtbVV1+tnTt3eu43xignJ0cJCQmKiorS+PHjtXfvXr9mAAAAoctvZUeSwsLCdPToUb+dr66uTmPHjlVERITeeOMN7du3T48//rguuugiz5z8/HwVFBSoqKhI27dvl9vtVlpamufHSQEAwDebT9fsvPrq
q17HxhhVV1erqKhIY8eO9UswSVqxYoUSExNVXFzsGRsyZIjX3y0sLNTSpUs1ffp0SdK6devkcrlUUlKi+fPn+y0LAAAITT6VnVtuucXr2OFw6OKLL9b3v/99Pf744/7IJelMqbrxxhv1wx/+UOXl5brkkkuUkZGhH/3oR5KkiooK1dTUKD093fMYp9OpcePGadu2bectO83NzWpubvYcNzQ0+C0zAAAILj6Vnfb2dn/nOKdDhw5p9erVWrhwof7t3/5N77//vh566CE5nU7NmTNHNTU1kiSXy+X1OJfLpcrKyvOeNy8vT8uWLevW7AAAIDj49Zodf2tvb9e1116r3NxcXXPNNZo/f75+9KMfafXq1V7zHA6H17ExpsPYF2VnZ6u+vt5zq6qq6pb8AAAg8Hza2Vm4cOEFzy0oKPDlT0iSBg4cqOHDh3uNXXHFFfr9738vSXK73ZKkmpoaDRw40DOntra2w27PFzmdTjmdTp9zAQCA0OFT2dm9e7d27dql06dP69vf/rYk6cCBAwoLC9O1117rmfdVuysXYuzYsdq/f7/X2IEDBzR48GBJUlJSktxut8rKynTNNddIklpaWlReXq4VK1Z06W8DAAA7+FR2pk6dqtjYWK1bt059+vSRdOZj4nfffbduuOEGLVq0yC/hHn74YaWmpio3N1czZszQ+++/r2effVbPPvuspDNlKjMzU7m5uUpOTlZycrJyc3MVHR2tWbNm+SUDAAAIbT6Vnccff1ylpaWeoiNJffr00WOPPab09HS/lZ3vfve72rRpk7Kzs/Xoo48qKSlJhYWFmj17tmfOkiVLdPLkSWVkZKiurk5jxoxRaWmpYmNj/ZIBAACENp/KTkNDg/72t7/pyiuv9Bqvra31+5f53Xzzzbr55pvPe7/D4VBOTo5ycnL8+ncBAIAdfPo01q233qq7775bL7/8so4cOaIjR47o5Zdf1rx58zxf7gcAABAMfNrZefrpp7V48WLdcccdam1tPXOi8HDNmzdPK1eu9GtAAACArvCp7ERHR2vVqlVauXKlDh48KGOMLrvsMsXExPg7HwAAQJd06UsFq6urVV1drWHDhikmJkbGGH/lAgAA8Aufys5nn32miRMnatiwYZo8ebKqq6slSffee6/fPokFAADgDz6VnYcfflgRERE6fPiwoqOjPeMzZ87Um2++6bdwAAAAXeXTNTulpaX64x//qEGDBnmNJycnf+UPcAIAAPQ0n3Z2jh8/7rWjc9axY8f4zSkAABBUfCo73/ve9/T88897jh0Oh9rb27Vy5UpNmDDBb+EAAAC6yqe3sVauXKnx48drx44damlp0ZIlS7R37159/vnneu+99/ydEQAAwGc+7ewMHz5cH374oUaPHq20tDQdP35c06dP1+7du/Wtb33L3xkBAAB81umdndbWVqWnp+uZZ57RsmXLuiMTAACA33R6ZyciIkIfffSRHA5Hd+QBAADwK5/expozZ47WrFnj7ywAAAB+59MFyi0tLfrP//xPlZWVadSoUR1+E6ugoMAv4QAAALqqU2Xn0KFDGjJkiD766CNde+21kqQDBw54zeHtLQAAEEw6VXaSk5NVXV2tzZs3Szrz8xBPPvmkXC5Xt4QDAADoqk6VnS//qvkbb7yh48eP+zUQgOA2JOu1QEcAgE7x6QLls75cfgAAAIJNp8qOw+HocE0O1+gAAIBg1um3se666y7Pj32eOnVK9913X4dPY23cuNF/CQEAALqgU2Vn7ty5Xsd33HGHX8MAAAD4W6fKTnFxcXflAAAA6BZdukAZAAAg2FF2AACA1Sg7AADAapQdAABgNcoOAACwGmUHAABYjbIDAACsRtkBAABWo+wAAACrUXYAAIDVKDsAAMBqlB0AAGA1yg4AALAaZQcAAFiNsgMAAKxG2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAAYDXKDgAAsBplBwAAWI2yAwAArEbZAQAAVqPsAAAAq4VU2cnL
y5PD4VBmZqZnzBijnJwcJSQkKCoqSuPHj9fevXsDFxIAAASVkCk727dv17PPPqvvfOc7XuP5+fkqKChQUVGRtm/fLrfbrbS0NDU2NgYoKQAACCYhUXaampo0e/ZsPffcc+rTp49n3BijwsJCLV26VNOnT1dKSorWrVunEydOqKSkJICJAQBAsAiJsnP//fdrypQpmjRpktd4RUWFampqlJ6e7hlzOp0aN26ctm3bdt7zNTc3q6GhwesGAADsFB7oAF9n/fr12rVrl7Zv397hvpqaGkmSy+XyGne5XKqsrDzvOfPy8rRs2TL/BgUAAEEpqHd2qqqqtGDBAr3wwgvq3bv3eec5HA6vY2NMh7Evys7OVn19vedWVVXlt8wAACC4BPXOzs6dO1VbW6uRI0d6xtra2vTuu++qqKhI+/fvl3Rmh2fgwIGeObW1tR12e77I6XTK6XR2X3AAABA0gnpnZ+LEifrzn/+sPXv2eG6jRo3S7NmztWfPHg0dOlRut1tlZWWex7S0tKi8vFypqakBTA4AAIJFUO/sxMbGKiUlxWssJiZG/fr184xnZmYqNzdXycnJSk5OVm5urqKjozVr1qxARAYAAEEmqMvOhViyZIlOnjypjIwM1dXVacyYMSotLVVsbGygowEAgCDgMMaYQIcItIaGBsXHx6u+vl5xcXF+PfeQrNf8ej4AAELNJ8undMt5L/T1O6iv2QEAAOgqyg4AALAaZQcAAFiNsgMAAKxG2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAAYDXKDgAAsBplBwAAWI2yAwAArEbZAQAAVqPsAAAAq1F2AACA1Sg7AADAapQdAABgNcoOAACwGmUHAABYjbIDAACsRtkBAABWo+wAAACrUXYAAIDVKDsAAMBqlB0AAGA1yg4AALAaZQcAAFiNsgMAAKxG2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAAYDXKDgAAsBplBwAAWI2yAwAArEbZAQAAVqPsAAAAq1F2AACA1Sg7AADAapQdAABgNcoOAACwGmUHAABYjbIDAACsRtkBAABWo+wAAACrUXYAAIDVgrrs5OXl6bvf/a5iY2M1YMAA3XLLLdq/f7/XHGOMcnJylJCQoKioKI0fP1579+4NUGIAABBsgrrslJeX6/7779f//u//qqysTKdPn1Z6erqOHz/umZOfn6+CggIVFRVp+/btcrvdSktLU2NjYwCTAwCAYBEe6ABf5c033/Q6Li4u1oABA7Rz505973vfkzFGhYWFWrp0qaZPny5JWrdunVwul0pKSjR//vxAxAYAAEEkqHd2vqy+vl6S1LdvX0lSRUWFampqlJ6e7pnjdDo1btw4bdu2LSAZAQBAcAnqnZ0vMsZo4cKFuv7665WSkiJJqqmpkSS5XC6vuS6XS5WVlec9V3Nzs5qbmz3HDQ0N3ZAYAAAEg5DZ2XnggQf04Ycf6qWXXupwn8Ph8Do2xnQY+6K8vDzFx8d7bomJiX7PCwAAgkNIlJ0HH3xQr776qjZv3qxBgwZ5xt1ut6R/7PCcVVtb22G354uys7NVX1/vuVVVVXVPcAAAEHBBXXaMMXrggQe0ceNGvfPOO0pKSvK6PykpSW63W2VlZZ6xlpYWlZeXKzU19bzndTqdiouL87oBAAA7BfU1O/fff79KSkr0hz/8QbGxsZ4dnPj4eEVFRcnhcCgzM1O5ublKTk5WcnKycnNzFR0drVmzZgU4PQAACAZBXXZWr14tSRo/frzXeHFxse666y5J0pIlS3Ty5EllZGSorq5OY8aMUWlpqWJjY3s4LQAACEZBXXaMMV87x+FwKCcnRzk5Od0fCAAAhJygvmYHAACgqyg7AADAapQdAABgNcoOAACwGmUHAABYjbIDAACsRtkBAABWo+wAAACrUXYAAIDVKDsAAMBqlB0AAGA1yg4AALAaZQcAAFiNsgMAAKxG2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAA
YDXKDgAAsBplBwAAWI2yAwAArEbZAQAAVqPsAAAAq1F2AACA1Sg7AADAapQdAABgNcoOAACwGmUHAABYjbIDAACsRtkBAABWo+wAAACrUXYAAIDVKDsAAMBqlB0AAGA1yg4AALAaZQcAAFiNsgMAAKxG2QEAAFaj7AAAAKtRdgAAgNUoOwAAwGqUHQAAYDXKDgAAsBplBwAAWI2yAwAArEbZAQAAVqPsAAAAq1lTdlatWqWkpCT17t1bI0eO1JYtWwIdCQAABAErys6GDRuUmZmppUuXavfu3brhhht000036fDhw4GOBgAAAsyKslNQUKB58+bp3nvv1RVXXKHCwkIlJiZq9erVgY4GAAACLDzQAbqqpaVFO3fuVFZWltd4enq6tm3bds7HNDc3q7m52XNcX18vSWpoaPB7vvbmE34/JwAAoaQ7Xl+/eF5jzFfOC/myc+zYMbW1tcnlcnmNu1wu1dTUnPMxeXl5WrZsWYfxxMTEbskIAMA3WXxh956/sbFR8fHx570/5MvOWQ6Hw+vYGNNh7Kzs7GwtXLjQc9ze3q7PP/9c/fr1O+9jfNHQ0KDExERVVVUpLi7Ob+dFR6x1z2Cdewbr3DNY557RnetsjFFjY6MSEhK+cl7Il53+/fsrLCyswy5ObW1th92es5xOp5xOp9fYRRdd1F0RFRcXx/+Qeghr3TNY557BOvcM1rlndNc6f9WOzlkhf4FyZGSkRo4cqbKyMq/xsrIypaamBigVAAAIFiG/syNJCxcu1J133qlRo0bpuuuu07PPPqvDhw/rvvvuC3Q0AAAQYFaUnZkzZ+qzzz7To48+qurqaqWkpOj111/X4MGDA5rL6XTqkUce6fCWGfyPte4ZrHPPYJ17BuvcM4JhnR3m6z6vBQAAEMJC/podAACAr0LZAQAAVqPsAAAAq1F2AACA1Sg7XbRq1SolJSWpd+/eGjlypLZs2fKV88vLyzVy5Ej17t1bQ4cO1dNPP91DSUNbZ9Z548aNSktL08UXX6y4uDhdd911+uMf/9iDaUNbZ/9Nn/Xee+8pPDxcV199dfcGtERn17m5uVlLly7V4MGD5XQ69a1vfUu//vWveyht6OrsOr/44ou66qqrFB0drYEDB+ruu+/WZ5991kNpQ9O7776rqVOnKiEhQQ6HQ6+88srXPqbHXwsNfLZ+/XoTERFhnnvuObNv3z6zYMECExMTYyorK885/9ChQyY6OtosWLDA7Nu3zzz33HMmIiLCvPzyyz2cPLR0dp0XLFhgVqxYYd5//31z4MABk52dbSIiIsyuXbt6OHno6exan/X3v//dDB061KSnp5urrrqqZ8KGMF/Wedq0aWbMmDGmrKzMVFRUmP/7v/8z7733Xg+mDj2dXectW7aYXr16mV/+8pfm0KFDZsuWLebKK680t9xySw8nDy2vv/66Wbp0qfn9739vJJlNmzZ95fxAvBZSdrpg9OjR5r777vMau/zyy01WVtY55y9ZssRcfvnlXmPz5883//zP/9xtGW3Q2XU+l+HDh5tly5b5O5p1fF3rmTNnmp/+9KfmkUceoexcgM6u8xtvvGHi4+PNZ5991hPxrNHZdV65cqUZOnSo19iTTz5pBg0a1G0ZbXMhZScQr4W8jeWjlpYW7dy5U+np6V7j6enp2rZt2zkf86c//anD/BtvvFE7duxQa2trt2UNZb6s85e1t7ersbFRffv27Y6I1vB1rYuLi3Xw4EE98sgj3R3RCr6s86uvvqpRo0YpPz9fl1xyiYYNG6bFixfr5MmTPRE5JPmyzqmpqTpy5Ihef/11GWP0t7/9TS+//LKmTJnSE5G/MQLxWmjFNygHwrFjx9TW1tbhx0ZdLleHHyU9q6am5pzzT58+rWPHjmngwIHdljdU+bLOX/b444/r+PHjmjFjRndEtIYva/3xxx8rKytLW7ZsUXg4/3dyIXxZ50OHDmnr1q3q3bu3Nm3apGPHjikjI0Off/451+2chy/rnJqa
qhdffFEzZ87UqVOndPr0aU2bNk1PPfVUT0T+xgjEayE7O13kcDi8jo0xHca+bv65xuGts+t81ksvvaScnBxt2LBBAwYM6K54VrnQtW5ra9OsWbO0bNkyDRs2rKfiWaMz/6bb29vlcDj04osvavTo0Zo8ebIKCgq0du1adne+RmfWed++fXrooYf085//XDt37tSbb76piooKfmexG/T0ayH/Keaj/v37KywsrMN/IdTW1nZorGe53e5zzg8PD1e/fv26LWso82Wdz9qwYYPmzZun3/3ud5o0aVJ3xrRCZ9e6sbFRO3bs0O7du/XAAw9IOvOibIxReHi4SktL9f3vf79HsocSX/5NDxw4UJdcconi4+M9Y1dccYWMMTpy5IiSk5O7NXMo8mWd8/LyNHbsWP34xz+WJH3nO99RTEyMbrjhBj322GPsvvtJIF4L2dnxUWRkpEaOHKmysjKv8bKyMqWmpp7zMdddd12H+aWlpRo1apQiIiK6LWso82WdpTM7OnfddZdKSkp4v/0CdXat4+Li9Oc//1l79uzx3O677z59+9vf1p49ezRmzJieih5SfPk3PXbsWB09elRNTU2esQMHDqhXr14aNGhQt+YNVb6s84kTJ9Srl/fLYlhYmKR/7Dyg6wLyWthtlz5/A5z9WOOaNWvMvn37TGZmpomJiTGffPKJMcaYrKwsc+edd3rmn/243cMPP2z27dtn1qxZw0fPL0Bn17mkpMSEh4ebX/3qV6a6utpz+/vf/x6opxAyOrvWX8ansS5MZ9e5sbHRDBo0yNx2221m7969pry83CQnJ5t77703UE8hJHR2nYuLi014eLhZtWqVOXjwoNm6dasZNWqUGT16dKCeQkhobGw0u3fvNrt37zaSTEFBgdm9e7fnI/7B8FpI2emiX/3qV2bw4MEmMjLSXHvttaa8vNxz39y5c824ceO85v/P//yPueaaa0xkZKQZMmSIWb16dQ8nDk2dWedx48YZSR1uc+fO7fngIaiz/6a/iLJz4Tq7zn/5y1/MpEmTTFRUlBk0aJBZuHChOXHiRA+nDj2dXecnn3zSDB8+3ERFRZmBAwea2bNnmyNHjvRw6tCyefPmr/z/3GB4LXQYw94cAACwF9fsAAAAq1F2AACA1Sg7AADAapQdAABgNcoOAACwGmUHAABYjbIDAACsRtkBAABWo+wAAACrUXYAAIDVKDsAAMBqlB0AAGC1/wcbD60z4ayu4wAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt # you need to import the matplotlib library \n",
    "latest[\"percentageFull\"].plot.hist()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### More data\n",
    "\n",
    "If you want to play with a day's worth of data, try this URL - http://staff.city.ac.uk/~sbbb717/tfl_bikes/last24h - this live data from the latest 24. It is more minimal, so you'll need to join to station name data (look up Pandas' `merge` function). It also has a time column, so have a look at the `datetime` modules and its `strptime()` function."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Python notebooks\n",
    "\n",
    "Finally, a bit about Python notebooks. We've been using Spyder because of the autocomplete, debugger and variable explorer.\n",
    "\n",
    "A python notebook is a document in which can intersperse blocks of python code and output with markdown that gives a narrative. This document is a Python notebook. You can [download it here](pythonSession.ipynb) and open it in \"Jupyter Lab\" from the Anaconda launcher. This will enable you to open the notebook in your web browser and execute the code in the browser. It's a really nice way to build a narrative around your work and we will be using it during the MSc. You can easily export as an HTML page (as you've been reading this) so you can easily show what you've done.\n",
    "\n",
    "[Google colab](https://colab.research.google.com/) is a hosted solution where you can edit the notebook on the server and share with others."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.17"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}