{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab 1: Introduction\n", "\n", "Some helpful links:\n", " * Python (https://docs.python.org/3.9/tutorial/index.html): an introduction to the Python programming language\n", " * Google Colab (https://colab.research.google.com/): for Python development in your web-browser\n", " * Miniconda (https://docs.conda.io/en/latest/miniconda.html): a free minimal installer for conda\n", " * NumPy (https://numpy.org/doc/stable/user/quickstart.html): a widely used library for mathematical operations in Python\n", " * Seaborn (https://seaborn.pydata.org/): a library for creating nice looking graphs and figures" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## NumPy array basics\n", "\n", "The basic data structure is a NumPy array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import NumPy\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define a 1D array (i.e. a vector)\n", "x = np.array([1, 2, 3, 4])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check the \"shape\" of the array\n", "x.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define a 2D array (i.e. a matrix)\n", "w = np.array([[1, 2, 3, 4], \n", " [1, 2, 3, 4]])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check the shape of w\n", "w.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Do a matrix multiplication\n", "out = np.matmul(w, x)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Note, this is the same\n", "out = w @ x\n", "print(out)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "out.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading data with Numpy" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = np.loadtxt(fname=\"data/leapfrog_sho.txt\")\n", "\n", "print(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The expression `np.loadtxt(...)` is a function call that asks Python to run the function] `loadtxt` that\n", "belongs to the NumPy library.\n", "The dot notation in Python is used most of all as an object attribute/property specifier or for invoking its method. \n", "`object.property` will give you the object.property value, `object_name.method()` will invoke on object_name method.\n", "\n", "`np.loadtxt` has two parameters: the name of the file we want to read and the delimiter that separates values on a line. \n", "These both need to be character strings (or strings for short), so we put them in quotes.\n", "\n", "By default, only a few rows and columns are shown (with `...` to omit elements when displaying big arrays).\n", "Note that, to save space when displaying NumPy arrays, Python does not show us trailing zeros, so `1.0` becomes `1.`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's print out some features of this data\n", "print(type(data))\n", "print(data.dtype)\n", "print(data.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The output tells us that the `data` array variable contains 65 rows and 4 columns. \n", "When we created the variable `data` to store our data, we did not only create the array; we also created information about the array, called members or attributes. \n", "`data.shape` is an attribute of `data` which describes the dimensions of `data`.\n", "\n", "If we want to get a single number from the array, we must provide an index in square brackets after the variable name, just as we do in math when referring to an element of a matrix. \n", "\n", "\n", "Our data has two dimensions, so we will need to use two indices to refer to one specific value:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"first value in data: {data[0, 0]}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"middle value in data: {data[30, 0]}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The expression `data[30, 0]` accesses the element at row 30, column 0, while `data[0, 0]` accesses the element at row 0, column 0.\n", "Languages in the C family (including C++, Java, Perl, and Python) count from 0 because it represents an offset from the first value in the array (the second value is offset by one index from the first value).\n", "As a result, if we have an $M\\times N$ array in Python, its indices go from $0$ to $M-1$ on the first axis and $0$ to $N-1$ on the second.\n", "\n", "![\"data\" is a 3 by 3 numpy array containing row 0: ['A', 'B', 'C'], row 1: ['D', 'E', 'F'], and\n", "row 2: ['G', 'H', 'I']. Starting in the upper left hand corner, data[0, 0] = 'A', data[0, 1] = 'B', data[0, 2] = 'C', data[1, 0] = 'D', data[1, 1] = 'E', data[1, 2] = 'F', data[2, 0] = 'G', data[2, 1] = 'H', and data[2, 2] = 'I', in the bottom right hand corner.](images/python-zero-index.svg)\n", "\n", "When Python displays an array, it shows the element with index `[0, 0]` in the upper left corner rather than the lower left.\n", "This is consistent with the way mathematicians draw matrices but different from the Cartesian coordinates.\n", "The indices are (row, column) instead of (column, row) for the same reason, which can be confusing when plotting data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Slicing data\n", "An index like `[30, 0]` selects a single element of an array, but we can select whole sections as well." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(data[0:4, :])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The slice `0:4` means, “Start at index 0 and go up to, but not including, index 4”. \n", "The difference between the upper and lower bounds is the number of values in the slice.\n", "\n", "We don’t have to start slices at 0:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(data[5:10, :])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We also don't have to include the upper and lower bound on the slice. \n", "If we don't include the lower bound, Python uses `0` by default; if we don't include the upper, the slice runs to the end of the axis, and if we don’t include either (i.e., if we use `:` on its own), the slice includes everything:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "small = data[:3, 2:]\n", "print(f\"small is:\\n{small}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above example selects rows 0 through 2 and columns 2 through to the end of the array." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Analyzing data\n", "\n", "NumPy has several useful functions that take an array as input to perform operations on its values.\n", "If we want to find the average inflammation for all patients on\n", "all days, for example, we can ask NumPy to compute `data`'s mean value:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.mean(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, this is not very meaningful because it's a mean over positions, velocities, etc.\n", "Instead, we can ask for the average position (or velocity) over all time steps." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# average position\n", "print(np.mean(data[:, 2]))\n", "\n", "# average velocity\n", "print(np.mean(data[:, 3]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing data\n", "Visualization deserves an entire lecture of its own, but we can explore a few features of Python's `matplotlib` library here.\n", "While there is no official plotting library, `matplotlib` is the _de facto_ standard.\n", "First, we will import the `pyplot` module from `matplotlib` and use it to create and display our data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "\n", "plt.plot(data[:, 0], data[:, 2], label=\"Position\")\n", "plt.plot(data[:, 0], data[:, 3], label=\"Velocity\", ls=\"--\")\n", "\n", "plt.ylabel(\"Position or velocity\")\n", "plt.xlabel(\"Time\")\n", "\n", "plt.legend()\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make your own plot\n", "\n", "Create a plot showing the position, velocity, and acceleration of the point at each time step." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code here" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" }, "vscode": { "interpreter": { "hash": "d0ea348b636367bcdf67fd2d6d24251712b38670f61fdee14f28eb58fe74f081" } } }, "nbformat": 4, "nbformat_minor": 4 }