{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Lecture 5: Training I"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "\n",
    "[Run in colab](https://colab.research.google.com/drive/10z5cZZcHnp1cfCRiD4ESsdwLR7NFb5p0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-01-10T00:19:25.718881Z",
     "iopub.status.busy": "2024-01-10T00:19:25.718648Z",
     "iopub.status.idle": "2024-01-10T00:19:25.726287Z",
     "shell.execute_reply": "2024-01-10T00:19:25.725751Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Last executed: 2024-01-10 00:19:25\n"
     ]
    }
   ],
   "source": [
    "import datetime\n",
    "now = datetime.datetime.now()\n",
    "print(\"Last executed: \" + now.strftime(\"%Y-%m-%d %H:%M:%S\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Linear regression"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Linear regression model (scalar form)\n",
    "\n",
    "$$\\hat{y} = \\theta_0 + \\theta_1 x_1 + ... + \\theta_n x_n$$\n",
    "\n",
    "- $\\hat{y}$ is the predicted value.\n",
    "- $n$ is the number of features.\n",
    "- $x_j$ is the $j$th feature, for $j=1,2,...,n$.\n",
    "- $\\theta_j$ is the $j$th model parameter, for $j=0,1,...,n$\n",
    "<br>(with bias $\\theta_0$ and feature weights $\\theta_1, ..., \\theta_n$)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Linear regression model (vector form)\n",
    "\n",
    "$$\\hat{y} =\\theta^{\\rm T} x = h_\\theta(x)$$\n",
    "\n",
    "- $x$ is the instance feature vector $x=(x_0, x_1, ..., x_n)^{\\rm T}$, where $x_0 = 1$.\n",
    "- $\\theta$ is the vector of model parameters $\\theta=(\\theta_0, \\theta_1, ..., \\theta_n)^{\\rm T}$.\n",
    "- $\\cdot^{\\rm T}$ denotes transpose.\n",
    "- $h_\\theta(x)$ is the hypothesis function, with parameters $\\theta$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Train linear regression\n",
    "\n",
    "Given training data $\\{ x^{(i)}, y^{(i)} \\}$, for $m$ instances $i=1,2,...,m$.\n",
    "\n",
    "Minimise the mean square error (MSE):\n",
    "\n",
    "$$\\text{MSE} = \\frac{1}{m} \\sum_{i=1}^{m} \\left(\\theta^{\\rm T} x^{(i)} - y^{(i)}\\right)^2 .$$"
   ]
  },
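  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The MSE is straightforward to evaluate directly. Below is a minimal self-contained NumPy sketch (the data and the candidate parameter vector `theta` are made up purely for illustration): it prepends the bias feature $x_0 = 1$ to each instance and computes the vectorised MSE."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Toy data for illustration only: m=5 instances, n=1 feature.\n",
    "X = np.array([[0.0], [0.5], [1.0], [1.5], [2.0]])\n",
    "y = np.array([[4.1], [5.4], [7.2], [8.3], [10.1]])\n",
    "\n",
    "# Prepend the bias feature x_0 = 1 to each instance.\n",
    "X_b = np.c_[np.ones((X.shape[0], 1)), X]\n",
    "\n",
    "# A candidate parameter vector (bias, weight), chosen arbitrarily.\n",
    "theta = np.array([[4.0], [3.0]])\n",
    "\n",
    "# MSE = (1/m) * sum_i (theta^T x^(i) - y^(i))^2, vectorised.\n",
    "residuals = X_b @ theta - y\n",
    "mse = np.mean(residuals**2)\n",
    "print(mse)"
   ]
  },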
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Concise matrix-vector notation\n",
    "\n",
    "Recall:\n",
    "- $x_j^{(i)}$ is the (scalar) value of feature $j$ in the $i$th training example.\n",
    "- $x^{(i)}$ is the (column) vector of features of the $i$th training example.\n",
    "- $y^{(i)}$ is the (scalar) value of the target of the $i$th training example.\n",
    "- There are $n$ features and $m$ training instances.\n",
    "\n",
    "Define:\n",
    "- Feature matrix: $X_{m \\times (n+1)} = [ x^{(1)},\\ x^{(2)},\\ ...,\\ x^{(m)}]^{\\rm T}$ (each row includes the bias feature $x_0 = 1$).\n",
    "- Target vector: $y_{m \\times 1} = [ y^{(1)},\\ y^{(2)},\\ ...,\\ y^{(m)}]^{\\rm T}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Recall feature matrix and target vector\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/astro-informatics/course_mlbd_images/master/Lecture05_Images/data-layout.png\" width=\"500\" style=\"display:block; margin:auto\"/>\n",
    "\n",
    "[Image source](https://github.com/jakevdp/sklearn_tutorial)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Minimising the MSE is equivalent to minimising the cost function\n",
    "\n",
    "$$C(\\theta) = \\frac{1}{m} (X \\theta - y)^{\\rm T}(X \\theta - y),$$\n",
    "\n",
    "where for notational convenience we denote the dependence on $\\theta$ only and consider $X$ and $y$ fixed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Normal equations\n",
    "\n",
    "Minimise the cost function analytically:\n",
    "\n",
    "$$\\min_\\theta\\ C(\\theta) = \\min_\\theta \\ (X \\theta - y)^{\\rm T}(X \\theta - y).$$\n",
    "\n",
    "The solution is given by\n",
    "\n",
    "$$ \\hat{\\theta} = \\left( X^{\\rm T} X \\right)^{-1} X^{\\rm T} y. $$"
   ]
  },
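  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The closed-form solution can be evaluated directly. Below is a minimal self-contained NumPy sketch (the synthetic data is generated inside the cell only so that it runs on its own): it builds $X$ with the bias column, forms $(X^{\\rm T} X)^{-1} X^{\\rm T} y$, and recovers estimates close to the true parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Synthetic data: y = 4 + 3 x + noise (for illustration only).\n",
    "rng = np.random.default_rng(0)\n",
    "m = 100\n",
    "X = 2 * rng.random((m, 1))\n",
    "y = 4 + 3 * X + rng.standard_normal((m, 1))\n",
    "\n",
    "# Add the bias feature x_0 = 1 to each instance.\n",
    "X_b = np.c_[np.ones((m, 1)), X]\n",
    "\n",
    "# Normal equations: theta_hat = (X^T X)^{-1} X^T y.\n",
    "theta_hat = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y\n",
    "print(theta_hat)  # should be close to [4, 3]"
   ]
  },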
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-01-10T00:19:25.767150Z",
     "iopub.status.busy": "2024-01-10T00:19:25.766614Z",
     "iopub.status.idle": "2024-01-10T00:19:26.182564Z",
     "shell.execute_reply": "2024-01-10T00:19:26.181848Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# Common imports\n",
    "import os\n",
    "import numpy as np\n",
    "np.random.seed(42) # To make this notebook's output stable across runs\n",
    "\n",
    "# To plot pretty figures\n",
    "%matplotlib inline\n",
    "import matplotlib\n",
    "import matplotlib.pyplot as plt\n",
    "plt.rcParams['axes.labelsize'] = 14\n",
    "plt.rcParams['xtick.labelsize'] = 12\n",
    "plt.rcParams['ytick.labelsize'] = 12"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-01-10T00:19:26.186110Z",
     "iopub.status.busy": "2024-01-10T00:19:26.185473Z",
     "iopub.status.idle": "2024-01-10T00:19:26.426019Z",
     "shell.execute_reply": "2024-01-10T00:19:26.425332Z"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "<base64 PNG data truncated in extraction: scatter plot of the generated data>",
      "text/plain": [
       "<Figure size 900x600 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import numpy as np\n",
    "m = 100\n",
    "X = 2 * np.random.rand(m, 1)\n",
    "y = 4 + 3 * X + np.random.randn(m, 1)\n",
    "plt.figure(figsize=(9,6))\n",
    "plt.plot(X, y, \"b.\")\n",
    "plt.xlabel(\"$x_1$\", fontsize=18)\n",
    "plt.ylabel(\"$y$\", rotation=0, fontsize=18)\n",
    "plt.axis([0, 2, 0, 15]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    },
    "tags": [
     "exercise_pointer"
    ]
   },
   "source": [
    "**Exercises:** *You can now complete Exercise 1 in the exercises associated with this lecture.*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Solve using Scikit-Learn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-01-10T00:19:26.430231Z",
     "iopub.status.busy": "2024-01-10T00:19:26.428868Z",
     "iopub.status.idle": "2024-01-10T00:19:26.811959Z",
     "shell.execute_reply": "2024-01-10T00:19:26.811220Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([4.21509616]), array([[2.77011339]]))"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.linear_model import LinearRegression\n",
    "lin_reg = LinearRegression()\n",
    "lin_reg.fit(X, y)\n",
    "lin_reg.intercept_, lin_reg.coef_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-01-10T00:19:26.815257Z",
     "iopub.status.busy": "2024-01-10T00:19:26.814784Z",
     "iopub.status.idle": "2024-01-10T00:19:26.820509Z",
     "shell.execute_reply": "2024-01-10T00:19:26.819869Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[4.21509616],\n",
       "       [9.75532293]])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_new = np.array([[0], [2]])\n",
    "lin_reg.predict(X_new)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Computational complexity\n",
    "\n",
    "Solving the normal equations requires inverting $X^{\\rm T} X$, which is an $(n+1) \\times (n+1)$ matrix (where $n$ is the number of features).\n",
    "\n",
    "Inverting such a matrix is naively of complexity $\\mathcal{O}(n^3)$ ($\\mathcal{O}(n^{2.4})$ if certain fast algorithms are used).\n",
    "\n",
    "It also typically requires all training instances to be held in memory at once.\n",
    "\n",
    "Other methods are therefore required to handle large datasets, i.e. when considering large numbers of features and large numbers of training instances."
   ]
  },
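  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "As an aside, even when a direct solve is feasible, explicit matrix inversion is rarely used in practice: least-squares solvers based on matrix factorisations are cheaper and numerically better behaved. A minimal sketch using NumPy's `np.linalg.lstsq` (reusing the `X`, `y` and `m` defined above):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solve the least-squares problem min_theta ||X_b theta - y||^2 directly,\n",
    "# without forming (X^T X)^{-1} explicitly (lstsq uses an SVD internally).\n",
    "X_b = np.c_[np.ones((m, 1)), X]  # add x_0 = 1 to each instance\n",
    "theta_best, residuals, rank, sv = np.linalg.lstsq(X_b, y, rcond=None)\n",
    "print(theta_best)"
   ]
  },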
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Batch gradient descent"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://raw.githubusercontent.com/astro-informatics/course_mlbd_images/master/Lecture05_Images/optimization_gd.png\" width=\"400px\" style=\"display:block; margin:auto\"/>\n",
    "\n",
    "[[Image source](http://www.holehouse.org/mlclass)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The gradient of the cost function is given by\n",
    "\n",
    "$$ \\frac{\\partial}{\\partial \\theta} C(\\theta)\n",
    "=\\frac{2}{m} X^{\\rm T} \\left( X \\theta - y \\right)\n",
    "= \\left[\\frac{\\partial C}{\\partial \\theta_0},\\ \\frac{\\partial C}{\\partial \\theta_1},\\ ...,\\ \\frac{\\partial C}{\\partial \\theta_n} \\right]^{\\rm T} .\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The batch gradient descent algorithm is defined by taking a step of size $\\alpha$ in the direction of the negative gradient:\n",
    "\n",
    "$$\\theta^{(t)} = \\theta^{(t-1)} - \\alpha \\frac{\\partial C}{\\partial \\theta},$$\n",
    "\n",
    "where $t$ denotes the iteration number."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "This algorithm is called *batch* gradient descent since it uses the full *batch* of training data ($X$) at each iteration. We will see other forms of gradient descent soon... A minimal implementation sketch follows."
   ]
  },
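  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The sketch below implements the update rule above with NumPy, reusing the `X`, `y` and `m` generated earlier (the learning rate and iteration count are arbitrary illustrative choices):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "alpha = 0.1          # learning rate (illustrative choice)\n",
    "n_iterations = 1000  # fixed iteration budget (illustrative choice)\n",
    "\n",
    "X_b = np.c_[np.ones((m, 1)), X]  # add x_0 = 1 to each instance\n",
    "theta = np.random.randn(2, 1)    # random initialisation\n",
    "\n",
    "for t in range(n_iterations):\n",
    "    # Full-batch gradient: (2/m) X^T (X theta - y).\n",
    "    gradients = 2 / m * X_b.T @ (X_b @ theta - y)\n",
    "    theta = theta - alpha * gradients\n",
    "\n",
    "print(theta)  # should approach the normal-equations solution"
   ]
  },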
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    },
    "tags": [
     "exercise_pointer"
    ]
   },
   "source": [
    "**Exercises:** *You can now complete Exercise 2 in the exercises associated with this lecture.*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Learning rate\n",
    "\n",
    "The step size $\\alpha$ is also called the *learning rate*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Good learning rate\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/astro-informatics/course_mlbd_images/master/Lecture05_Images/bgd_alpha_good.png\" width=\"600px\" style=\"display:block; margin:auto\"/>\n",
    "\n",
    "[Source: Geron]\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Learning rate too small\n",
    "\n",
    "Convergence will be slow.\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/astro-informatics/course_mlbd_images/master/Lecture05_Images/bgd_alpha_small.png\" width=\"600px\" style=\"display:block; margin:auto\"/>\n",
    "\n",
    "[Source: Geron]\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Learning rate too large\n",
    "\n",
    "May not converge.\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/astro-informatics/course_mlbd_images/master/Lecture05_Images/bgd_alpha_large.png\" alt=\"data-layout\" width=\"600px\" style=\"display:block; margin:auto\"/>\n",
    "\n",
    "[Source: Geron]\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Evolution of fitted curve\n",
    "\n",
    "Consider how the fitted curve, given by the current best-fit parameters, evolves over the first 10 iterations for different learning rates $\\alpha$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-01-10T00:19:26.824434Z",
     "iopub.status.busy": "2024-01-10T00:19:26.823834Z",
     "iopub.status.idle": "2024-01-10T00:19:26.831253Z",
     "shell.execute_reply": "2024-01-10T00:19:26.830646Z"
    }
   },
   "outputs": [],
   "source": [
    "theta_path_bgd = []\n",
    "\n",
    "X_b = np.c_[np.ones((m, 1)), X]  # add x0 = 1 to each instance\n",
    "X_new_b = np.c_[np.ones((2, 1)), X_new]\n",
    "\n",
    "def plot_gradient_descent(theta, alpha, theta_path=None):\n",
    "    m = len(X_b)\n",
    "    plt.plot(X, y, \"b.\")\n",
    "    n_iterations = 1000\n",
    "    for iteration in range(n_iterations):\n",
    "        if iteration < 10:\n",
    "            y_predict = X_new_b.dot(theta)\n",
    "            style = \"b-\" if iteration > 0 else \"r--\"\n",
    "            plt.plot(X_new, y_predict, style)\n",
    "        gradients = 2/m * X_b.T.dot(X_b.dot(theta) - y)\n",
    "        theta = theta - alpha * gradients\n",
    "        if theta_path is not None:\n",
    "            theta_path.append(theta)\n",
    "    plt.xlabel(\"$x_1$\", fontsize=18)\n",
    "    plt.axis([0, 2, 0, 15])\n",
    "    plt.title(r\"$\\alpha = {}$\".format(alpha), fontsize=16)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-01-10T00:19:26.834198Z",
     "iopub.status.busy": "2024-01-10T00:19:26.833760Z",
     "iopub.status.idle": "2024-01-10T00:19:27.369568Z",
     "shell.execute_reply": "2024-01-10T00:19:27.368907Z"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "<base64 PNG data truncated in extraction: fitted-line evolution for three learning rates>",
      "text/plain": [
       "<Figure size 1000x400 with 3 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "np.random.seed(42)\n",
    "theta = np.random.randn(2,1)  # random initialization\n",
    "\n",
    "plt.figure(figsize=(10,4))\n",
    "plt.subplot(131); plot_gradient_descent(theta, alpha=0.02)\n",
    "plt.ylabel(\"$y$\", rotation=0, fontsize=18)\n",
    "plt.subplot(132); plot_gradient_descent(theta, alpha=0.1)\n",
    "plt.subplot(133); plot_gradient_descent(theta, alpha=0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "- $\\alpha=0.02$: learning rate is too small and it will take a long time to reach the global optimum.\n",
    "- $\\alpha=0.1$: learning rate is good and the global optimum will be found quickly.\n",
    "- $\\alpha=0.5$: learning rate is too large and the parameters jump about (and may not converge)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Convergence"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general the cost function may not be so well behaved, with a single local minimum that coincides with the global minimum; it may instead contain multiple local minima and plateaus.\n",
    "\n",
    "<img src=\"https://raw.githubusercontent.com/astro-informatics/course_mlbd_images/master/Lecture05_Images/bgd_pitfalls.png\" alt=\"data-layout\" width=\"600px\" style=\"display:block; margin:auto\"/>\n",
    "\n",
    "[Source: Geron]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Convex objective function\n",
    "\n",
    "The MSE cost function for linear regression is convex (for any two points on the curve, the line segment connecting those points never lies below the curve).\n",
    "\n",
    "Consequently, the cost function has a single local minimum (that therefore coincides with the global minimum).\n",
    "\n",
    "Furthermore, the cost function is smooth, with a slope that never changes abruptly (i.e. its gradient is Lipschitz continuous).\n",
    "\n",
    "Gradient descent (with a suitable learning rate) is therefore guaranteed to converge to the global minimum."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Feature scaling\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For gradient descent to work well it is critical for all features to have a similar scale."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://raw.githubusercontent.com/astro-informatics/course_mlbd_images/master/Lecture05_Images/bgd_feature_scaling.png\" alt=\"data-layout\" width=\"800px\" style=\"display:block; margin:auto\"/>\n",
    "\n",
    "[Source: Geron]\n",
    "\n",
    "In the case on the left, the features have the same scale and even initially the solution moves directly towards the minimum, thus reaching it quickly.\n",
    "\n",
    "In the case on the right, the features have different scales and initially the solution does not move directly towards the minimum. Convergence is therefore slow."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Feature scaling methods\n",
    "\n",
    "In general one should ensure features have a similar scale, since many machine learning methods do not work well when numerical attributes have very different scales. Two common approaches follow, with a short sketch after this cell.\n",
    "\n",
    "#### Min-max scaling\n",
    "\n",
    "Min-max scaling of feature $j$ is given by\n",
    "\\begin{align*}\n",
    "x_j^{(i)} \\rightarrow \\frac{x_j^{(i)} - \\min_{i'} x_j^{(i')}}{\\max_{i'} x_j^{(i')} - \\min_{i'} x_j^{(i')}},\n",
    "\\end{align*}\n",
    "where the minimum and maximum are taken over the training instances.\n",
    "See Scikit-Learn [`MinMaxScaler`](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html).\n",
    "\n",
    "\n",
    "#### Standardization\n",
    "\n",
    "Standardization scaling is given by\n",
    "\\begin{align*}\n",
    "x_j^{(i)} \\rightarrow \\frac{x_j^{(i)} - \\mu_j}{\\sigma_j},\n",
    "\\end{align*}\n",
    "where $\\mu_j$ and $\\sigma_j$ are the mean and standard deviation of feature $j$ computed from the training instances.\n",
    "See Scikit-Learn [`StandardScaler`](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html)."
   ]
  },
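  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "A minimal sketch of both scalers on a toy feature matrix (the data is made up for illustration; note also that, to avoid leaking information from a test set, scalers should be fit on training data only):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.preprocessing import MinMaxScaler, StandardScaler\n",
    "\n",
    "# Toy data with two features on very different scales.\n",
    "X_toy = np.array([[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0]])\n",
    "\n",
    "# Min-max scaling: each feature mapped to [0, 1].\n",
    "print(MinMaxScaler().fit_transform(X_toy))\n",
    "\n",
    "# Standardization: each feature to zero mean and unit variance.\n",
    "print(StandardScaler().fit_transform(X_toy))"
   ]
  }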
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}