spce0038-machine-learning-w.../cw1/spce0038_coursework_sklearn_MCMQ7.ipynb

7842 lines
2.1 MiB
Plaintext
Raw Normal View History

2025-02-21 17:13:13 +00:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": []
},
"source": [
"# Coursework\n",
"# SPCE0038: Machine Learning with Big-Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This coursework is provided as a Jupyter notebook, which you will need to complete. \n",
"\n",
"Throughout the notebook you will need to complete code, analytic exercises (if equations are required please typeset your solutions using latex in the markdown cell provided) and descriptive answers. Much of the grading of the coursework will be performed automatically, so it is critical you name your variables as requested."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before you turn this coursework in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n",
"\n",
"Make sure you fill in any place that says \"YOUR ANSWER HERE\" or `YOUR CODE HERE` and remove remove the `raise NotImplementedError()` exceptions that are thrown before you have added your answers."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Please also:\n",
"- Make sure you use a python environment using the `requirements.txt` files provided by the course.\n",
"- Make sure your notebook executes without errors.\n",
"- Do not add and remove cells but only provide your answers in the spaces given.\n",
"- Do not add or change code in the cells other than the ones marked with `# YOUR CODE HERE`.\n",
"- Do not overwrite or rename any existing variables.\n",
"- Do not install code or packages in the notebooks.\n",
"- Do not import any libraries other than modules from `sklearn`.\n",
"- Always label your plots.\n",
"- Answer the questions concisely and show your work/derivations/reasoning."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Please rename the notebook filename to include your candidate number in the filename. And please also add your candidate number below:**"
]
},
{
"cell_type": "code",
2025-02-22 19:19:02 +00:00
"execution_count": 1,
2025-02-21 17:13:13 +00:00
"metadata": {},
"outputs": [],
"source": [
2025-02-24 22:04:18 +00:00
"CANDIDATE_NUMBER = \"MCMQ7\""
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You will be able to run some basic tests in the notebook to check the basic operation of your code is as expected. Although do not assume your responses are complete or fully correct just because the basic tests pass."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once you have renamed the notebook file and completed the exercises, please upload the notebook to Moodle.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## AstroML\n",
"\n",
"The data used is this coursework is obtained using [AstroML](http://www.astroml.org), a python package for machine learning for astronomy. Although we take data from AstroML, this coursework is not based on standard AstroML examples. So you will *not* find the solutions in AstroML examples!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## SDSS\n",
"\n",
"The data obtained through AstroML was observed by the [Sloan Digital Sky Survey](https://www.sdss.org/) (SDSS), which began observations in 2000. SDSS data have lead to many scientific advances and the experiment is widely seen as one of the most successful surveys in the history of astronomy."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dependencies\n",
"\n",
"- Standard course dependencies (e.g. numpy, scikit-learn, etc.)\n",
"- [AstoML](http://www.astroml.org)\n",
"- [AstroPy](http://www.astropy.org/)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "code",
2025-02-22 19:19:02 +00:00
"execution_count": 2,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ebf563e5f38beef45736bc1921c6c8ca",
"grade": false,
"grade_id": "cell-60b5947d6f57e1e5",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"from matplotlib import pyplot as plt"
]
},
{
"cell_type": "code",
2025-02-22 19:19:02 +00:00
"execution_count": 3,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "62f0bf3c5af939aa8d31f358766e54f1",
"grade": false,
"grade_id": "cell-ea880cd0d16868fc",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"def check_var_defined(var):\n",
" try:\n",
" exec(var)\n",
" except NameError:\n",
" raise NameError(var + \" not defined.\")\n",
" else:\n",
" print(var + \" defined.\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "15eddda2a8295d028fd507afa72dce3b",
"grade": false,
"grade_id": "cell-b2775724006d2978",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"## Part 1: Regression"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "a9cff16a3014318dbe8e070dc4dac087",
"grade": false,
"grade_id": "cell-b3bc03ae580e8edb",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"In these exercises we will consider the regression problem of the astonomical distance modulus vs redshift relationship.\n",
"\n",
"In astronomy, the [distance modulus](https://en.wikipedia.org/wiki/Distance_modulus) specifies the difference between the apparent and absolute magnitudes of an astronomnical object. It provides a way of expressing astrophysical distances. \n",
"\n",
"Astronomical [redshift](https://en.wikipedia.org/wiki/Redshift) specifies the shift in wavelength that astronomical objects undergo due to the expansion of the Universe. Due to Hubble's Law, more distance objects experience a greater redshift.\n"
]
},
{
"cell_type": "code",
2025-02-22 19:19:02 +00:00
"execution_count": 4,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "a6a25d04743d3fe93a33d1b4411a8516",
"grade": false,
"grade_id": "cell-72d05aca43e8358d",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"from astroML.datasets import generate_mu_z"
]
},
{
"cell_type": "code",
2025-02-22 19:19:02 +00:00
"execution_count": 5,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "169eb83a4ad31a60a51d6de794445254",
"grade": false,
"grade_id": "cell-b5b33781a14baffd",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"# Load data\n",
"m = 150\n",
"z_sample, mu_sample, dmu = generate_mu_z(m, random_state=3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "208758e42789bf2d1470c30a5fb356e0",
"grade": false,
"grade_id": "cell-acd86dc17cc2dd53",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Plot the distance modulus ($\\mu$) vs redhift ($z$), including error bars.*"
]
},
{
"cell_type": "code",
2025-02-24 22:04:18 +00:00
"execution_count": 6,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "a6759ec46b4d7447c6b0873e63a52ec2",
"grade": true,
"grade_id": "cell-ae7f16835ab51af3",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true
}
},
2025-02-22 19:19:02 +00:00
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjMAAAHHCAYAAABKudlQAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAWKlJREFUeJzt3Xl4FFX6NuCnk6YTSEjCjghZ2BIim4BA2JdgJMiIMgMyDIuso6iAwwzJjBA2DYwooIOK+ckyiLKKgxoMyCJrWAJRJBBQCARZMopJgAghdH1/5Oue9F7VW1V1P/d19QVdXak+Xd3pevOe95yjEQRBABEREZFKBcjdACIiIiJXMJghIiIiVWMwQ0RERKrGYIaIiIhUjcEMERERqRqDGSIiIlI1BjNERESkagxmiIiISNUYzBAREZGqMZghsmHOnDnQaDRyN8OvFRQUQKPRYPXq1U79vEajwZw5c9zaJl+1evVqaDQaFBQUiP4Zw/uzePFih/ta+32qqKjA3/72NzRp0gQBAQEYMmSIxFYTVWIwQ37B8EVtuAUHB6NRo0ZISkrC22+/jVu3brnlea5evYo5c+YgNzfXLccj8mUrV67EG2+8gd///vdYs2YNpk+fjry8PMyZM0dSUEXEYIb8yrx587B27Vq89957eOmllwAA06ZNQ5s2bfDdd9+Z7Pvqq6/it99+k3T8q1evYu7cuQxmiMxY+33avXs3Hn74YSxZsgSjRo1C7969kZeXh7lz5zKYIUm0cjeAyJsGDhyITp06Ge+npqZi9+7dePLJJ/G73/0OZ86cQfXq1QEAWq0WWi1/RUjZKioqoNfrodPp5G6KXdZ+n4qKihARESFPg8inMDNDfq9fv36YNWsWLl26hI8++si43Vof/86dO9GjRw9EREQgNDQUsbGx+Pvf/w4A2Lt3Lx577DEAwHPPPWfs0jLUe+zfvx9/+MMfEBkZiaCgIDRp0gTTp0+3+Gt17NixCA0NxU8//YQhQ4YgNDQU9erVw4wZM/DgwQOTffV6PZYtW4Y2bdogODgY9erVwxNPPIHjx4+b7PfRRx+hY8eOqF69OmrXro1nn30WhYWFDs+N4RycO3cOf/rTnxAeHo569eph1qxZEAQBhYWFeOqppxAWFoaGDRvizTfftDhGUVERxo8fjwYNGiA4OBjt2rXDmjVrLPYrLi7G2LFjER4ejoiICIwZMwbFxcUW+/Xp0wd9+vSx2D527FhER0fbfT229pH6XtvSunVr9O3b12K7Xq/Hww8/jN///vfGbevXr0fHjh1Rs2ZNhIWFoU2bNli2bJnd41etUVm6dCmaNWuGoKAg5OXlAQDOnj2L3//+96hduzaCg4PRqVMnbNu2zeI4p0+fRr9+/VC9enU0btwYCxYsgF6vt9jv+PHjSEpKQt26dVG9enXExMRg3LhxVtv2wQcfGNvz2GOP4dixYyaPVz3HhtexZ88enD592uR35Q9/+AMAoG/fvsbte/futXteiPhnJxGAUaNG4e9//zt27NiBiRMnWt3n9OnTePLJJ9G2bVvMmzcPQUFB+OGHH3Dw4EEAQKtWrTBv3jzMnj0bkyZNQs+ePQEA3bp1AwBs2rQJZWVleP7551GnTh0cPXoU77zzDq5cuYJNmzaZPNeDBw+QlJSELl26YPHixfj666/x5ptvolmzZnj++eeN+40fPx6rV6/GwIEDMWHCBFRUVGD//v3Izs42ZqBee+01zJo1C8OGDcOECRPw3//+F++88w569eqFkydPivrLePjw4WjVqhUWLlyIL7/8EgsWLEDt2rWxYsUK9OvXD4sWLcK6deswY8YMPPbYY+jVqxcA4LfffkOfPn3www8/4MUXX0RMTAw2bdqEsWPHori4GFOnTgUACIKAp556CgcOHMCf//xntGrVClu3bsWYMWMkvIvu4+i9tmX48OGYM2cOrl+/joYNGxq3HzhwAFevXsWzzz4LoDJQGjFiBPr3749FixYBAM6cOYODBw8az4k9q1atwt27dzFp0iQEBQWhdu3aOH36NLp3746HH34YKSkpCAkJwcaNGzFkyBBs2bIFTz/9NADg+vXr6Nu3LyoqKoz7ffDBB8aMpEFRUREef/xx1KtXDykpKYiIiEBBQQE+/fRTi/Z8/PHHuHXrFiZPngyNRoN//vOfeOaZZ3DhwgVUq1bNYv969eph7dq1eO2113D79m2kp6cDAFq0aIGXX34Zb7/9Nv7+97+jVatWAGD8l8gmgcgPrFq1SgAgHDt2zOY+4eHhwqOPPmq8n5aWJlT9FVmyZIkAQPjvf/9r8xjHjh0TAAirVq2yeKysrMxiW3p6uqDRaIRLly4Zt40ZM0YAIMybN89k30cffVTo2LGj8f7u3bsFAMLLL79scVy9Xi8IgiAUFBQIgYGBwmuvvWby+KlTpwStVmux3ZzhHEyaNMm4raKiQmjcuLGg0WiEhQsXGrf/+uuvQvXq1YUxY8YYty1dulQAIHz00UfGbeXl5UJCQoIQGhoqlJaWCoIgCJ999pkAQPjnP/9p8jw9e/a0OJ+9e/cWevfubdHWMWPGCFFRUSbbAAhpaWl296n6Og3EvNfW5OfnCwCEd955x2T7Cy+8IISGhho/A1OnThXCwsKEiooKSce/ePGiAEAICwsTioqKTB7r37+/0KZNG+Hu3bvGbXq9XujWrZvQokUL47Zp06YJAIQjR44YtxUVFQnh4eECAOHixYuCIAjC1q1bHf7OGNpTp04d4ebNm8bt//nPfwQAwueff27cZn6OBaHyvXzkkUdMtm3atEkAIOzZs8fxCSH6/9jNRPT/hYaG2h3VZMhg/Oc//7Gaknek6l++d+7cwc8//4xu3bpBEAScPHnSYv8///nPJvd79uyJCxcuGO9v2bIFGo0GaWlpFj9rSOd/+umn0Ov1GDZsGH7++WfjrWHDhmjRogX27Nkjqu0TJkww/j8wMBCdOnWCIAgYP368cXtERARiY2NN2piZmYmGDRtixIgRxm3VqlXDyy+/jNu3b+Obb74x7qfVak2yToGBgcYibW9z9r1u2bIl2rdvjw0bNhi3PXjwAJs3b8bgwYONn4GIiAjcuXMHO3fudKp9Q4cORb169Yz3b968id27d2PYsGG4deuW8X3+5ZdfkJSUhPPnz+Onn34CUHmuu3btis6dOxt/vl69ehg5cqTJcxjOwRdffIH79+/bbc/w4cNRq1Yt431DVrLqZ4HIkxjMEP1/t2/fRs2aNW0+Pnz4cHTv3h0TJkxAgwYN8Oyzz2Ljxo2iL3aXL1/G2LFjUbt2bWMdTO/evQEAJSUlJvsa6l+qqlWrFn799Vfj/R9//BGNGjVC7dq1bT7n+fPnIQgCWrRogXr16pnczpw5g6KiIlFtj4yMNLkfHh6O4OBg1K1b12J71TZeunQJLVq0QECA6VeNodvg0qVLxn8feughhIaGmuwXGxsrqn3u5sp7PXz4cBw8eNAYPOzduxdFRUUYPny4cZ8XXngBLVu2xMCBA9G4cWOMGzcOX331lej2xcTEmNz/4YcfIAgCZs2aZfE+G4Jdw3tteE/MmZ/r3r17Y+jQoZg7dy7q1q2Lp556CqtWrcK9e/csftb882EIbKp+Fog8iTUzRACuXLmCkpISNG/e3OY+1atXx759+7Bnzx58+eWX+Oqrr7Bhwwb069cPO3bsQGBgoM2fffDgAQYMGICbN29i5syZiIuLQ0hICH766SeMHTvW4iJp71hS6PV6aDQabN++3eoxzYMHW6z9rK02CoIgrZESaTQaq89hXhxt62etMf9ZV97r4cOHIzU1FZs2bcK0adOwceNGhIeH44knnjDuU79+feTm5iIrKwvbt2/H9u3bsWrVKowePdpqcbQ58/oWw+dnxowZSEpKsvoz9j7b1mg0GmzevBnZ2dn4/PPPkZWVhXHjxuHNN99Edna2yWdHrs8CkQGDGSIAa9euBQCbFwKDgIAA9O/fH/3798dbb72F119/Hf/4xz+wZ88eJCYm2rxYnjp1CufOncOaNWswevRo43ZnuxkAoFmzZsjKysLNmzdtZmeaNWsGQRAQExODli1
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# Plot data\n",
"def plot_dist_mod():\n",
" # YOUR CODE HERE\n",
2025-02-22 19:19:02 +00:00
" plt.scatter(z_sample, mu_sample)\n",
" plt.errorbar(z_sample, mu_sample, yerr=dmu, fmt=\"o\")\n",
" #raise NotImplementedError()\n",
2025-02-21 17:13:13 +00:00
" plt.xlabel('$z$')\n",
" plt.ylabel('$\\mu$')\n",
" plt.title('Distance modulus vs redshift')\n",
" plt.ylim(36, 50)\n",
" plt.xlim(0, 1.5)\n",
"plot_dist_mod()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "596c8005f2265096bb8d4dcfb6756856",
"grade": false,
"grade_id": "cell-cdbb1766c5be9049",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Recall the normal equations for linear regression follow by analytically minimising the cost function: \n",
"\n",
"$$\\min_\\theta\\ C(\\theta) = \\min_\\theta \\ (X \\theta - y)^{\\rm T}(X \\theta - y).$$\n",
"\n",
"Show analytically that the solution is given by \n",
"\n",
"$$ \\hat{\\theta} = \\left( X^{\\rm T} X \\right)^{-1} X^{\\rm T} y. $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "c509e0ca35e2e737d727917ad9699ad8",
"grade": false,
"grade_id": "cell-3b18a50412c27c56",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"[Matrix calculus identities](https://en.wikipedia.org/wiki/Matrix_calculus) may be useful (note that we use the denominator layout convention)."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "3a99014a4147c496f8f9de0535f06e6a",
"grade": false,
"grade_id": "cell-4701b02e60d7683c",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Expand the cost function and drop terms that do not depend on $\\theta$ (use latex mathematics expressions):*"
]
},
{
2025-02-22 20:09:19 +00:00
"attachments": {},
2025-02-21 17:13:13 +00:00
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "b97fe78f2b4bc7502cb9ebdfe1d17874",
"grade": true,
"grade_id": "cell-a114fd93ba74dac5",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true
}
},
"source": [
2025-02-24 22:04:18 +00:00
"Initial cost function:\n",
"\n",
2025-02-22 20:09:19 +00:00
"$$C(\\theta)=(X\\theta-y)^T(X\\theta-y)$$\n",
2025-02-24 22:04:18 +00:00
"\n",
"Using the identity\n",
"\n",
"$$(A+B)^T = A^T+B^T,$$\n",
"\n",
"We perform a quadratic expansion:\n",
"\n",
"$$C(\\theta)=(X\\theta)^T(X\\theta)-(X\\theta)^Ty-y^T(X\\theta)+y^Ty.$$\n",
"\n",
"We use the identity\n",
"\n",
"$$a^Tb=b^Ta$$\n",
"\n",
"to combine the intermediate terms:\n",
"\n",
"$$C(\\theta)=(X\\theta)^T(X\\theta)-2(X\\theta)^Ty + y^Ty.$$"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "2e5cd989e6893a3cefd6a3020e615c8d",
"grade": false,
"grade_id": "cell-9a33de31635ab257",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Calculate the derivative, set it to zero, and solve for $\\theta$ (use latex mathematics expressions):*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "72da939ee35de54049492e263cc444f8",
"grade": true,
"grade_id": "cell-0f0b521f765f4826",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true
}
},
"source": [
2025-02-24 22:04:18 +00:00
"The first term of $C(\\theta)$ can be rewritten as\n",
"$$(X\\theta)^T(X\\theta)=X^T\\theta^TX\\theta.$$\n",
"Since $X^TX$ is symmetric we can apply the formula\n",
"$$\\frac{\\partial}{\\partial\\theta}(\\theta^TA\\theta)=2A\\theta$$\n",
"where $A=X^TX$ to find the derivative of the first term:\n",
"$$\\frac{\\partial}{\\partial\\theta}(X^T\\theta^TX\\theta)=2X^TX\\theta$$\n",
"The second term is linear in $\\theta$ so can be differentiated normally.\n",
"$$\\frac{\\partial}{\\partial\\theta}(-2(X\\theta)^Ty) = -2X^Ty$$\n",
"Combining the individual terms for a complete derivative of $C(\\theta)$:\n",
"$$\\nabla_{\\theta}C(\\theta)=2X^TX\\theta-2X^Ty+0$$\n",
"Set the derivative to zero\n",
"$$0=2X^TX\\theta-2X^Ty$$\n",
"$$X^Ty=X^TX\\theta$$\n",
"$$\\theta=(X^TX)^{-1}X^Ty$$\n"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "37c7febbf20f95d2bd6ef750ed2d11f3",
"grade": false,
"grade_id": "cell-dd5113a0fc89a960",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Solve for $\\theta$ by numerically implementing the analytic solution given above.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 7,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "0993e1f7ab7124438012231f05c51e9a",
"grade": false,
"grade_id": "cell-d71c2644693323b2",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"def compute_theta_lin_reg(X, y):\n",
" # YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
" # X needs to be a 2D array where each row is a sample\n",
" X = X.reshape((-1, 1))\n",
" # Add a column of ones for the bias term\n",
" X = np.c_[np.ones((X.shape[0], 1)), X]\n",
" theta = np.linalg.inv(X.T @ X) @ X.T @ y\n",
" #raise NotImplementedError()\n",
2025-02-21 17:13:13 +00:00
" return theta"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 8,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ce1006c8dcbd4682472e45a7a6e54960",
"grade": true,
"grade_id": "cell-f024710582a10726",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Linear regression parameters recovered analytically: intercept=39.5505, slope=4.9538\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"assert compute_theta_lin_reg(z_sample, mu_sample).shape == (2,)\n",
"theta = compute_theta_lin_reg(z_sample, mu_sample)\n",
"(theta_c, theta_m) = theta\n",
"print(\"Linear regression parameters recovered analytically: intercept={0:.4f}, slope={1:.4f}\".format(theta_c, theta_m))"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 9,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "0276019b59b620bff4ba748f13afb83e",
"grade": true,
"grade_id": "cell-52c5a2bcccb2010d",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"theta_c defined.\n",
"theta_m defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('theta_c')\n",
"check_var_defined('theta_m')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "96ed9fc16f94ad4965a2b03c463b7cd7",
"grade": false,
"grade_id": "cell-883f0e6586934a9f",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Write a method to make a prediction for a given redshift.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 10,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "475f9a392bb5bb1dee3671f1e4e4c5cf",
"grade": false,
"grade_id": "cell-f5341cc7da485877",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"def predict_lin_reg(theta, x):\n",
" # YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
" (c, m) = theta\n",
" y = m*x + c\n",
" #raise NotImplementedError()\n",
2025-02-21 17:13:13 +00:00
" return y"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "3cfc6f1f228d2759b5869de17f4c5754",
"grade": false,
"grade_id": "cell-c71c338c46a8bb93",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Predict the distance modulus for a range of redshift values between 0.01 and 1.5 and plot the predicted curve overlayed on data (make a new plot; do not revise the plot above). Call the variable used to store the predictions for your polynomial model `mu_pred_lin`.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 11,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "cd987dfd33566ad55e8f9bfe8f8997aa",
"grade": false,
"grade_id": "cell-fc6aa680d6528ab1",
"locked": false,
"schema_version": 3,
"solution": true
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"data": {
"text/plain": [
2025-03-02 00:11:57 +00:00
"<matplotlib.legend.Legend at 0x783476165590>"
2025-02-24 22:04:18 +00:00
]
},
2025-02-27 00:32:43 +00:00
"execution_count": 11,
2025-02-24 22:04:18 +00:00
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjMAAAHHCAYAAABKudlQAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAc0JJREFUeJzt3Xd4U2X7B/Bv0jQt3VD2aMtsKVNAoOxdKPKCoCAvljJFRVnyCvUnGyy8oiCIiqhQmTJEUYoFGS/IplBECmUvGRWhLVA60pzfHzGhaTNO0qzTfj/X1QtycnLy5CTNuXs/z3M/MkEQBBARERFJlNzZDSAiIiIqDgYzREREJGkMZoiIiEjSGMwQERGRpDGYISIiIkljMENERESSxmCGiIiIJI3BDBEREUkagxkiIiKSNAYzREbMnDkTMpnM2c0o1a5duwaZTIZVq1ZZ9XiZTIaZM2fatE0l1apVqyCTyXDt2jXRj9G+PwsXLjS7r6HfJ5VKhXfffRc1atSAXC5Hv379LGw1kQaDGSoVtF/U2h9PT09UrVoVkZGRWLJkCR49emST57l9+zZmzpyJ5ORkmxyPqCT75ptv8OGHH+Kll15CfHw8Jk6ciJSUFMycOdOioIqIwQyVKrNnz8bq1avx+eef4+233wYATJgwAY0aNcLvv/+ut+/777+Pp0+fWnT827dvY9asWQxmiAox9Pu0Z88eVKtWDYsWLUJ0dDQ6duyIlJQUzJo1i8EMWUTh7AYQOVKvXr3QokUL3e3Y2Fjs2bMHL7zwAv71r3/h3LlzKFOmDABAoVBAoeCvCLk2lUoFtVoNpVLp7KaYZOj3KS0tDQEBAc5pEJUozMxQqdelSxdMmzYN169fx5o1a3TbDfXx79q1C+3atUNAQAB8fHwQGhqK9957DwCwb98+PP/88wCA4cOH67q0tOM9Dhw4gJdffhlBQUHw8PBAjRo1MHHixCJ/rQ4bNgw+Pj74888/0a9fP/j4+KBChQqYPHky8vPz9fZVq9X45JNP0KhRI3h6eqJChQro2bMnTpw4obffmjVr0Lx5c5QpUwblypXDK6+8gps3b5o9N9pzcOHCBbz66qvw9/dHhQoVMG3aNAiCgJs3b6Jv377w8/ND5cqV8dFHHxU5RlpaGkaOHIlKlSrB09MTTZo0QXx8fJH90tPTMWzYMPj7+yMgIAAxMTFIT08vsl+nTp3QqVOnItuHDRuGkJAQk6/H2D6WvtfGNGzYEJ07dy6yXa1Wo1q1anjppZd02zZs2IDmzZvD19cXfn5+aNSoET755BOTxy84RmXx4sWoXbs2PDw8kJKSAgA4f/48XnrpJZQrVw6enp5o0aIFtm3bVuQ4Z8+eRZcuXVCmTBlUr14dc+fOhVqtLrLfiRMnEBkZifLly6NMmTKoWbMmRowYYbBtX375pa49zz//PI4fP653f8FzrH0de/fuxdmzZ/V+V15++WUAQOfOnXXb9+3bZ/K8EPHPTiIA0dHReO+997Bz506MHj3a4D5nz57FCy+8gMaNG2P27Nnw8PDApUuXcPDgQQBA/fr1MXv2bEyfPh2vvfYa2rdvDwBo06YNAGDTpk3IysrCG2+8gcDAQBw7dgxLly7FrVu3sGnTJr3nys/PR2RkJFq1aoWFCxfi119/xUcffYTatWvjjTfe0O03cuRIrFq1Cr169cKoUaOgUqlw4MABHDlyRJeBmjdvHqZNm4aBAwdi1KhR+Ouvv7B06VJ06NABp06dEvWX8aBBg1C/fn3Mnz8f27dvx9y5c1GuXDksX74cXbp0wYIFC7B27VpMnjwZzz//PDp06AAAePr0KTp16oRLly7hrbfeQs2aNbFp0yYMGzYM6enpGD9+PABAEAT07dsXv/32G15//XXUr18fW7duRUxMjAXvou2Ye6+NGTRoEGbOnIm7d++icuXKuu2//fYbbt++jVdeeQWAJlAaPHgwunbtigULFgAAzp07h4MHD+rOiSkrV65EdnY2XnvtNXh4eKBcuXI4e/Ys2rZti2rVqmHq1Knw9vbGxo0b0a9fP2zZsgUvvvgiAODu3bvo3LkzVCqVbr8vv/xSl5HUSktLQ48ePVChQgVMnToVAQEBuHbtGr7//vsi7Vm3bh0ePXqEMWPGQCaT4b///S/69++PK1euwN3dvcj+FSpUwOrVqzFv3jw8fvwYcXFxAIC6deti3LhxWLJkCd577z3Ur18fAHT/EhklEJUCK1euFAAIx48fN7qPv7+/8Nxzz+luz5gxQyj4K7Jo0SIBgPDXX38ZPcbx48cFAMLKlSuL3JeVlVVkW1xcnCCTyYTr16/rtsXExAgAhNmzZ+vt+9xzzwnNmzfX3d6zZ48AQBg3blyR46rVakEQBOHatWuCm5ubMG/ePL37z5w5IygUiiLbC9Oeg9dee023TaVSCdWrVxdkMpkwf/583faHDx8KZcqUEWJiYnTbFi9eLAAQ1qxZo9uWm5srRERECD4+PkJmZqYgCILwww8/CACE//73v3rP0759+yLns2PHjkLHjh2LtDUmJkYIDg7W2wZAmDFjhsl9Cr5OLTHvtSGpqakCAGHp0qV62998803Bx8dH9xkYP3684OfnJ6hUKouOf/XqVQGA4OfnJ6Slpend17VrV6FRo0ZCdna2bptarRbatGkj1K1bV7dtwoQJAgDh6NGjum1paWmCv7+/AEC4evWqIAiCsHXrVrO/M9r2BAYGCg8ePNBt//HHHwUAwk8//aTbVvgcC4LmvWzQoIHetk2bNgkAhL1795o/IUT/YDcT0T98fHxMzmrSZjB+/PFHgyl5cwr+5fvkyRPcv38fbdq0gSAIOHXqVJH9X3/9db3b7du3x5UrV3S3t2zZAplMhhkzZhR5rDad//3330OtVmPgwIG4f/++7qdy5cqoW7cu9u7dK6rto0aN0v3fzc0NLVq0gCAIGDlypG57QEAAQkND9dqYkJCAypUrY/Dgwbpt7u7uGDduHB4/foz//e9/uv0UCoVe1snNzU03SNvRrH2v69Wrh6ZNm+K7777TbcvPz8fmzZvRp08f3WcgICAAT548wa5du6xq34ABA1ChQgXd7QcPHmDPnj0YOHAgHj16pHuf//77b0RGRuLixYv4888/AWjOdevWrdGyZUvd4ytUqIAhQ4boPYf2HPz888/Iy8sz2Z5BgwahbNmyutvarGTBzwKRPTGYIfrH48eP4evra/T+QYMGoW3bthg1ahQqVaqEV155BRs3bhR9sbtx4waGDRuGcuXK6cbBdOzYEQCQkZGht692/EtBZcuWxcOHD3W3L1++jKpVq6JcuXJGn/PixYsQBAF169ZFhQoV9H7OnTuHtLQ0UW0PCgrSu+3v7w9PT0+UL1++yPaCbbx+/Trq1q0LuVz/q0bbbXD9+nXdv1WqVIGPj4/efqGhoaLaZ2vFea8HDRqEgwcP6oKHffv2IS0tDYMGDdLt8+abb6JevXro1asXqlevjhEjRuCXX34R3b6aNWvq3b506RIEQcC0adOKvM/aYFf7Xmvfk8IKn+uOHTtiwIABmDVrFsqXL4++ffti5cqVyMnJKfLYwp8PbWBT8LNAZE8cM0ME4NatW8jIyECdOnWM7lOmTBns378fe/fuxfbt2/HLL7/gu+++Q5cuXbBz5064ubkZfWx+fj66d++OBw8eYMqUKQgLC4O3tzf+/PNPDBs2rMhF0tSxLKFWqyGTybBjxw6DxywcPBhj6LHG2igIgmWNtJBMJjP4HIUHRxt7rCGFH1uc93rQoEGIjY3Fpk2bMGHCBGzcuBH+/v7o2bOnbp+KFSsiOTkZiYmJ2LFjB3bs2IGVK1di6NChBgdHF1Z4fIv28zN58mRERkYafIypz7YhMpkMmzdvxpEjR/DTTz8hMTERI0aMwEcffYQjR47ofXac9Vkg0mIwQwRg9erVAGD0QqAll8vRtWtXdO3aFR9//DE++OAD/N///R/27t2Lbt26Gb1YnjlzBhcuXEB8fDyGDh2q225tNwMA1K5dG4mJiXjw4IHR7Ezt2rUhCAJq1qyJevXqWf1c1goODsbvv/8OtVqtl505f/687n7
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"z = np.linspace(0.01, 1.5, 1000)\n",
"plot_dist_mod()\n",
"# YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
"mu_pred_lin = np.array([predict_lin_reg(theta, x) for x in z])\n",
"plt.plot(z, mu_pred_lin, label=\"mu_pred_lin\")\n",
"plt.legend()\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 12,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "08cf8664e67c3940e4b3acc27b0d325e",
"grade": true,
"grade_id": "cell-f8e904006f960080",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mu_pred_lin defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('mu_pred_lin')\n",
"assert mu_pred_lin.shape == (len(z),), \"Make sure the shape of your predictions is correct\""
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "78269320b540a1d943b94e85f8887bdf",
"grade": false,
"grade_id": "cell-c55d51874c74517a",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Solve for the parameters $\\theta$ using Scikit-Learn.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 13,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "dcb822e86c7cfb2cc413af696ebf6b12",
"grade": false,
"grade_id": "cell-21b18d58a127b96c",
"locked": false,
"schema_version": 3,
"solution": true
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"data": {
"text/html": [
2025-02-27 00:32:43 +00:00
"<style>#sk-container-id-1 {\n",
2025-02-24 22:04:18 +00:00
" /* Definition of color scheme common for light and dark mode */\n",
" --sklearn-color-text: #000;\n",
" --sklearn-color-text-muted: #666;\n",
" --sklearn-color-line: gray;\n",
" /* Definition of color scheme for unfitted estimators */\n",
" --sklearn-color-unfitted-level-0: #fff5e6;\n",
" --sklearn-color-unfitted-level-1: #f6e4d2;\n",
" --sklearn-color-unfitted-level-2: #ffe0b3;\n",
" --sklearn-color-unfitted-level-3: chocolate;\n",
" /* Definition of color scheme for fitted estimators */\n",
" --sklearn-color-fitted-level-0: #f0f8ff;\n",
" --sklearn-color-fitted-level-1: #d4ebff;\n",
" --sklearn-color-fitted-level-2: #b3dbfd;\n",
" --sklearn-color-fitted-level-3: cornflowerblue;\n",
"\n",
" /* Specific color for light theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-icon: #696969;\n",
"\n",
" @media (prefers-color-scheme: dark) {\n",
" /* Redefinition of color scheme for dark theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-icon: #878787;\n",
" }\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 pre {\n",
2025-02-24 22:04:18 +00:00
" padding: 0;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 input.sk-hidden--visually {\n",
2025-02-24 22:04:18 +00:00
" border: 0;\n",
" clip: rect(1px 1px 1px 1px);\n",
" clip: rect(1px, 1px, 1px, 1px);\n",
" height: 1px;\n",
" margin: -1px;\n",
" overflow: hidden;\n",
" padding: 0;\n",
" position: absolute;\n",
" width: 1px;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-dashed-wrapped {\n",
2025-02-24 22:04:18 +00:00
" border: 1px dashed var(--sklearn-color-line);\n",
" margin: 0 0.4em 0.5em 0.4em;\n",
" box-sizing: border-box;\n",
" padding-bottom: 0.4em;\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-container {\n",
2025-02-24 22:04:18 +00:00
" /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
" but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
" so we also need the `!important` here to be able to override the\n",
" default hidden behavior on the sphinx rendered scikit-learn.org.\n",
" See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
" display: inline-block !important;\n",
" position: relative;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-text-repr-fallback {\n",
2025-02-24 22:04:18 +00:00
" display: none;\n",
"}\n",
"\n",
"div.sk-parallel-item,\n",
"div.sk-serial,\n",
"div.sk-item {\n",
" /* draw centered vertical line to link estimators */\n",
" background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
" background-size: 2px 100%;\n",
" background-repeat: no-repeat;\n",
" background-position: center center;\n",
"}\n",
"\n",
"/* Parallel-specific style estimator block */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-parallel-item::after {\n",
2025-02-24 22:04:18 +00:00
" content: \"\";\n",
" width: 100%;\n",
" border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
" flex-grow: 1;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-parallel {\n",
2025-02-24 22:04:18 +00:00
" display: flex;\n",
" align-items: stretch;\n",
" justify-content: center;\n",
" background-color: var(--sklearn-color-background);\n",
" position: relative;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-parallel-item {\n",
2025-02-24 22:04:18 +00:00
" display: flex;\n",
" flex-direction: column;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-parallel-item:first-child::after {\n",
2025-02-24 22:04:18 +00:00
" align-self: flex-end;\n",
" width: 50%;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-parallel-item:last-child::after {\n",
2025-02-24 22:04:18 +00:00
" align-self: flex-start;\n",
" width: 50%;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-parallel-item:only-child::after {\n",
2025-02-24 22:04:18 +00:00
" width: 0;\n",
"}\n",
"\n",
"/* Serial-specific style estimator block */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-serial {\n",
2025-02-24 22:04:18 +00:00
" display: flex;\n",
" flex-direction: column;\n",
" align-items: center;\n",
" background-color: var(--sklearn-color-background);\n",
" padding-right: 1em;\n",
" padding-left: 1em;\n",
"}\n",
"\n",
"\n",
"/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
"clickable and can be expanded/collapsed.\n",
"- Pipeline and ColumnTransformer use this feature and define the default style\n",
"- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
"*/\n",
"\n",
"/* Pipeline and ColumnTransformer style (default) */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-toggleable {\n",
2025-02-24 22:04:18 +00:00
" /* Default theme specific background. It is overwritten whether we have a\n",
" specific estimator or a Pipeline/ColumnTransformer */\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"/* Toggleable label */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" cursor: pointer;\n",
" display: flex;\n",
" width: 100%;\n",
" margin-bottom: 0;\n",
" padding: 0.5em;\n",
" box-sizing: border-box;\n",
" text-align: center;\n",
" align-items: start;\n",
" justify-content: space-between;\n",
" gap: 0.5em;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 label.sk-toggleable__label .caption {\n",
2025-02-24 22:04:18 +00:00
" font-size: 0.6rem;\n",
" font-weight: lighter;\n",
" color: var(--sklearn-color-text-muted);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 label.sk-toggleable__label-arrow:before {\n",
2025-02-24 22:04:18 +00:00
" /* Arrow on the left of the label */\n",
" content: \"▸\";\n",
" float: left;\n",
" margin-right: 0.25em;\n",
" color: var(--sklearn-color-icon);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"/* Toggleable content - dropdown */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-toggleable__content {\n",
2025-02-24 22:04:18 +00:00
" max-height: 0;\n",
" max-width: 0;\n",
" overflow: hidden;\n",
" text-align: left;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-toggleable__content.fitted {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-toggleable__content pre {\n",
2025-02-24 22:04:18 +00:00
" margin: 0.2em;\n",
" border-radius: 0.25em;\n",
" color: var(--sklearn-color-text);\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-toggleable__content.fitted pre {\n",
2025-02-24 22:04:18 +00:00
" /* unfitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
2025-02-24 22:04:18 +00:00
" /* Expand drop-down */\n",
" max-height: 200px;\n",
" max-width: 100%;\n",
" overflow: auto;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
2025-02-24 22:04:18 +00:00
" content: \"▾\";\n",
"}\n",
"\n",
"/* Pipeline/ColumnTransformer-specific style */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator-specific style */\n",
"\n",
"/* Colorize estimator box */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-label label.sk-toggleable__label,\n",
"#sk-container-id-1 div.sk-label label {\n",
2025-02-24 22:04:18 +00:00
" /* The background is the default theme color */\n",
" color: var(--sklearn-color-text-on-default-background);\n",
"}\n",
"\n",
"/* On hover, darken the color of the background */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-label:hover label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"/* Label box, darken color on hover, fitted */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator label */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-label label {\n",
2025-02-24 22:04:18 +00:00
" font-family: monospace;\n",
" font-weight: bold;\n",
" display: inline-block;\n",
" line-height: 1.2em;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-label-container {\n",
2025-02-24 22:04:18 +00:00
" text-align: center;\n",
"}\n",
"\n",
"/* Estimator-specific */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-estimator {\n",
2025-02-24 22:04:18 +00:00
" font-family: monospace;\n",
" border: 1px dotted var(--sklearn-color-border-box);\n",
" border-radius: 0.25em;\n",
" box-sizing: border-box;\n",
" margin-bottom: 0.5em;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-estimator.fitted {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"/* on hover */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-estimator:hover {\n",
2025-02-24 22:04:18 +00:00
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 div.sk-estimator.fitted:hover {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
"\n",
"/* Common style for \"i\" and \"?\" */\n",
"\n",
".sk-estimator-doc-link,\n",
"a:link.sk-estimator-doc-link,\n",
"a:visited.sk-estimator-doc-link {\n",
" float: right;\n",
" font-size: smaller;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1em;\n",
" height: 1em;\n",
" width: 1em;\n",
" text-decoration: none !important;\n",
" margin-left: 0.5em;\n",
" text-align: center;\n",
" /* unfitted */\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted,\n",
"a:link.sk-estimator-doc-link.fitted,\n",
"a:visited.sk-estimator-doc-link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"/* Span, style for the box shown on hovering the info icon */\n",
".sk-estimator-doc-link span {\n",
" display: none;\n",
" z-index: 9999;\n",
" position: relative;\n",
" font-weight: normal;\n",
" right: .2ex;\n",
" padding: .5ex;\n",
" margin: .5ex;\n",
" width: min-content;\n",
" min-width: 20ex;\n",
" max-width: 50ex;\n",
" color: var(--sklearn-color-text);\n",
" box-shadow: 2pt 2pt 4pt #999;\n",
" /* unfitted */\n",
" background: var(--sklearn-color-unfitted-level-0);\n",
" border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted span {\n",
" /* fitted */\n",
" background: var(--sklearn-color-fitted-level-0);\n",
" border: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link:hover span {\n",
" display: block;\n",
"}\n",
"\n",
"/* \"?\"-specific style due to the `<a>` HTML tag */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 a.estimator_doc_link {\n",
2025-02-24 22:04:18 +00:00
" float: right;\n",
" font-size: 1rem;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1rem;\n",
" height: 1rem;\n",
" width: 1rem;\n",
" text-decoration: none;\n",
" /* unfitted */\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 a.estimator_doc_link.fitted {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 a.estimator_doc_link:hover {\n",
2025-02-24 22:04:18 +00:00
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-1 a.estimator_doc_link.fitted:hover {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
"}\n",
2025-02-27 00:32:43 +00:00
"</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LinearRegression()</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow\"><div><div>LinearRegression</div></div><div><a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.6/modules/generated/sklearn.linear_model.LinearRegression.html\">?<span>Documentation for LinearRegression</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></div></label><div class=\"sk-toggleable__content fitted\"><pre>LinearRegression()</pre></div> </div></div></div></div>"
2025-02-24 22:04:18 +00:00
],
"text/plain": [
"LinearRegression()"
]
},
2025-02-27 00:32:43 +00:00
"execution_count": 13,
2025-02-24 22:04:18 +00:00
"metadata": {},
"output_type": "execute_result"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"from sklearn.linear_model import LinearRegression\n",
"lin_reg = LinearRegression()\n",
"# YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
"lin_reg.fit(z_sample.reshape(-1, 1), mu_sample)\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 14,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "2dc255e53a5d0c973dfca577078d77c7",
"grade": true,
"grade_id": "cell-7ac2cec6a7505062",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Linear regression parameters recovered by scikit-learn: intercept=39.5505, slope=4.9538\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"assert lin_reg.coef_.shape == (1,), \"Make sure your features have the right shape, such that we have 1 fitted coefficient\"\n",
"print(\"Linear regression parameters recovered by scikit-learn: intercept={0:.4f}, slope={1:.4f}\"\n",
" .format(lin_reg.intercept_, lin_reg.coef_[0]))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "7e98e11722002c12cbf55f1e5039e6dd",
"grade": false,
"grade_id": "cell-59443e08cb021f49",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Extend your model to include polynomial features up to degree 15 (using Scikit-Learn). Use variable `lin_reg_poly` for your revised model.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 15,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c48074367c1bb4a2704f931153372c1b",
"grade": false,
"grade_id": "cell-8ea9ea3be9d57ce5",
"locked": false,
"schema_version": 3,
"solution": true
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"data": {
"text/html": [
2025-02-27 00:32:43 +00:00
"<style>#sk-container-id-2 {\n",
2025-02-24 22:04:18 +00:00
" /* Definition of color scheme common for light and dark mode */\n",
" --sklearn-color-text: #000;\n",
" --sklearn-color-text-muted: #666;\n",
" --sklearn-color-line: gray;\n",
" /* Definition of color scheme for unfitted estimators */\n",
" --sklearn-color-unfitted-level-0: #fff5e6;\n",
" --sklearn-color-unfitted-level-1: #f6e4d2;\n",
" --sklearn-color-unfitted-level-2: #ffe0b3;\n",
" --sklearn-color-unfitted-level-3: chocolate;\n",
" /* Definition of color scheme for fitted estimators */\n",
" --sklearn-color-fitted-level-0: #f0f8ff;\n",
" --sklearn-color-fitted-level-1: #d4ebff;\n",
" --sklearn-color-fitted-level-2: #b3dbfd;\n",
" --sklearn-color-fitted-level-3: cornflowerblue;\n",
"\n",
" /* Specific color for light theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-icon: #696969;\n",
"\n",
" @media (prefers-color-scheme: dark) {\n",
" /* Redefinition of color scheme for dark theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-icon: #878787;\n",
" }\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 pre {\n",
2025-02-24 22:04:18 +00:00
" padding: 0;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 input.sk-hidden--visually {\n",
2025-02-24 22:04:18 +00:00
" border: 0;\n",
" clip: rect(1px 1px 1px 1px);\n",
" clip: rect(1px, 1px, 1px, 1px);\n",
" height: 1px;\n",
" margin: -1px;\n",
" overflow: hidden;\n",
" padding: 0;\n",
" position: absolute;\n",
" width: 1px;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-dashed-wrapped {\n",
2025-02-24 22:04:18 +00:00
" border: 1px dashed var(--sklearn-color-line);\n",
" margin: 0 0.4em 0.5em 0.4em;\n",
" box-sizing: border-box;\n",
" padding-bottom: 0.4em;\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-container {\n",
2025-02-24 22:04:18 +00:00
" /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
" but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
" so we also need the `!important` here to be able to override the\n",
" default hidden behavior on the sphinx rendered scikit-learn.org.\n",
" See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
" display: inline-block !important;\n",
" position: relative;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-text-repr-fallback {\n",
2025-02-24 22:04:18 +00:00
" display: none;\n",
"}\n",
"\n",
"div.sk-parallel-item,\n",
"div.sk-serial,\n",
"div.sk-item {\n",
" /* draw centered vertical line to link estimators */\n",
" background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
" background-size: 2px 100%;\n",
" background-repeat: no-repeat;\n",
" background-position: center center;\n",
"}\n",
"\n",
"/* Parallel-specific style estimator block */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-parallel-item::after {\n",
2025-02-24 22:04:18 +00:00
" content: \"\";\n",
" width: 100%;\n",
" border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
" flex-grow: 1;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-parallel {\n",
2025-02-24 22:04:18 +00:00
" display: flex;\n",
" align-items: stretch;\n",
" justify-content: center;\n",
" background-color: var(--sklearn-color-background);\n",
" position: relative;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-parallel-item {\n",
2025-02-24 22:04:18 +00:00
" display: flex;\n",
" flex-direction: column;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-parallel-item:first-child::after {\n",
2025-02-24 22:04:18 +00:00
" align-self: flex-end;\n",
" width: 50%;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-parallel-item:last-child::after {\n",
2025-02-24 22:04:18 +00:00
" align-self: flex-start;\n",
" width: 50%;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-parallel-item:only-child::after {\n",
2025-02-24 22:04:18 +00:00
" width: 0;\n",
"}\n",
"\n",
"/* Serial-specific style estimator block */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-serial {\n",
2025-02-24 22:04:18 +00:00
" display: flex;\n",
" flex-direction: column;\n",
" align-items: center;\n",
" background-color: var(--sklearn-color-background);\n",
" padding-right: 1em;\n",
" padding-left: 1em;\n",
"}\n",
"\n",
"\n",
"/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
"clickable and can be expanded/collapsed.\n",
"- Pipeline and ColumnTransformer use this feature and define the default style\n",
"- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
"*/\n",
"\n",
"/* Pipeline and ColumnTransformer style (default) */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-toggleable {\n",
2025-02-24 22:04:18 +00:00
" /* Default theme specific background. It is overwritten whether we have a\n",
" specific estimator or a Pipeline/ColumnTransformer */\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"/* Toggleable label */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" cursor: pointer;\n",
" display: flex;\n",
" width: 100%;\n",
" margin-bottom: 0;\n",
" padding: 0.5em;\n",
" box-sizing: border-box;\n",
" text-align: center;\n",
" align-items: start;\n",
" justify-content: space-between;\n",
" gap: 0.5em;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 label.sk-toggleable__label .caption {\n",
2025-02-24 22:04:18 +00:00
" font-size: 0.6rem;\n",
" font-weight: lighter;\n",
" color: var(--sklearn-color-text-muted);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 label.sk-toggleable__label-arrow:before {\n",
2025-02-24 22:04:18 +00:00
" /* Arrow on the left of the label */\n",
" content: \"▸\";\n",
" float: left;\n",
" margin-right: 0.25em;\n",
" color: var(--sklearn-color-icon);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 label.sk-toggleable__label-arrow:hover:before {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"/* Toggleable content - dropdown */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-toggleable__content {\n",
2025-02-24 22:04:18 +00:00
" max-height: 0;\n",
" max-width: 0;\n",
" overflow: hidden;\n",
" text-align: left;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-toggleable__content.fitted {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-toggleable__content pre {\n",
2025-02-24 22:04:18 +00:00
" margin: 0.2em;\n",
" border-radius: 0.25em;\n",
" color: var(--sklearn-color-text);\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-toggleable__content.fitted pre {\n",
2025-02-24 22:04:18 +00:00
" /* unfitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
2025-02-24 22:04:18 +00:00
" /* Expand drop-down */\n",
" max-height: 200px;\n",
" max-width: 100%;\n",
" overflow: auto;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
2025-02-24 22:04:18 +00:00
" content: \"▾\";\n",
"}\n",
"\n",
"/* Pipeline/ColumnTransformer-specific style */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator-specific style */\n",
"\n",
"/* Colorize estimator box */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-label label.sk-toggleable__label,\n",
"#sk-container-id-2 div.sk-label label {\n",
2025-02-24 22:04:18 +00:00
" /* The background is the default theme color */\n",
" color: var(--sklearn-color-text-on-default-background);\n",
"}\n",
"\n",
"/* On hover, darken the color of the background */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-label:hover label.sk-toggleable__label {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"/* Label box, darken color on hover, fitted */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
2025-02-24 22:04:18 +00:00
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator label */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-label label {\n",
2025-02-24 22:04:18 +00:00
" font-family: monospace;\n",
" font-weight: bold;\n",
" display: inline-block;\n",
" line-height: 1.2em;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-label-container {\n",
2025-02-24 22:04:18 +00:00
" text-align: center;\n",
"}\n",
"\n",
"/* Estimator-specific */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-estimator {\n",
2025-02-24 22:04:18 +00:00
" font-family: monospace;\n",
" border: 1px dotted var(--sklearn-color-border-box);\n",
" border-radius: 0.25em;\n",
" box-sizing: border-box;\n",
" margin-bottom: 0.5em;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-estimator.fitted {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"/* on hover */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-estimator:hover {\n",
2025-02-24 22:04:18 +00:00
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 div.sk-estimator.fitted:hover {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
"\n",
"/* Common style for \"i\" and \"?\" */\n",
"\n",
".sk-estimator-doc-link,\n",
"a:link.sk-estimator-doc-link,\n",
"a:visited.sk-estimator-doc-link {\n",
" float: right;\n",
" font-size: smaller;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1em;\n",
" height: 1em;\n",
" width: 1em;\n",
" text-decoration: none !important;\n",
" margin-left: 0.5em;\n",
" text-align: center;\n",
" /* unfitted */\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted,\n",
"a:link.sk-estimator-doc-link.fitted,\n",
"a:visited.sk-estimator-doc-link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"/* Span, style for the box shown on hovering the info icon */\n",
".sk-estimator-doc-link span {\n",
" display: none;\n",
" z-index: 9999;\n",
" position: relative;\n",
" font-weight: normal;\n",
" right: .2ex;\n",
" padding: .5ex;\n",
" margin: .5ex;\n",
" width: min-content;\n",
" min-width: 20ex;\n",
" max-width: 50ex;\n",
" color: var(--sklearn-color-text);\n",
" box-shadow: 2pt 2pt 4pt #999;\n",
" /* unfitted */\n",
" background: var(--sklearn-color-unfitted-level-0);\n",
" border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted span {\n",
" /* fitted */\n",
" background: var(--sklearn-color-fitted-level-0);\n",
" border: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link:hover span {\n",
" display: block;\n",
"}\n",
"\n",
"/* \"?\"-specific style due to the `<a>` HTML tag */\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 a.estimator_doc_link {\n",
2025-02-24 22:04:18 +00:00
" float: right;\n",
" font-size: 1rem;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1rem;\n",
" height: 1rem;\n",
" width: 1rem;\n",
" text-decoration: none;\n",
" /* unfitted */\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 a.estimator_doc_link.fitted {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 a.estimator_doc_link:hover {\n",
2025-02-24 22:04:18 +00:00
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
2025-02-27 00:32:43 +00:00
"#sk-container-id-2 a.estimator_doc_link.fitted:hover {\n",
2025-02-24 22:04:18 +00:00
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
"}\n",
2025-02-27 00:32:43 +00:00
"</style><div id=\"sk-container-id-2\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LinearRegression()</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-2\" type=\"checkbox\" checked><label for=\"sk-estimator-id-2\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow\"><div><div>LinearRegression</div></div><div><a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.6/modules/generated/sklearn.linear_model.LinearRegression.html\">?<span>Documentation for LinearRegression</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></div></label><div class=\"sk-toggleable__content fitted\"><pre>LinearRegression()</pre></div> </div></div></div></div>"
2025-02-24 22:04:18 +00:00
],
"text/plain": [
"LinearRegression()"
]
},
2025-02-27 00:32:43 +00:00
"execution_count": 15,
2025-02-24 22:04:18 +00:00
"metadata": {},
"output_type": "execute_result"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"degree = 15\n",
"bias = False\n",
"from sklearn.preprocessing import PolynomialFeatures\n",
"def compute_poly_features(degree, bias):\n",
" # Return polynomial features of samples and class\n",
" # YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
" poly_features = PolynomialFeatures(degree=degree, include_bias=bias)\n",
" z_sample_poly = poly_features.fit_transform(z_sample.reshape(-1, 1))\n",
" \n",
" #raise NotImplementedError()\n",
2025-02-21 17:13:13 +00:00
" return z_sample_poly, poly_features\n",
"z_sample_poly, poly_features = compute_poly_features(degree, bias)\n",
"# Train model\n",
"# YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
"lin_reg_poly = LinearRegression()\n",
"lin_reg_poly.fit(z_sample_poly, mu_sample)\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 16,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "01cfbed87f8a26ed0ec8be34e437801a",
"grade": true,
"grade_id": "cell-d46184658a6b600b",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"lin_reg_poly defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('lin_reg_poly')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "735fd8a02427e9a36d74685def71342c",
"grade": false,
"grade_id": "cell-5ccd955a752b81ea",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Plot the data and the predictions of your models considered so far (linear and polynomial regression). Call the variable used to store the predictions for your polynomial model `mu_pred_poly`.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 17,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "dd2a4e0385cd8ed4673e4aad5b632280",
"grade": false,
"grade_id": "cell-24f6bf2cfd9a7f71",
"locked": false,
"schema_version": 3,
"solution": true
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"data": {
"text/plain": [
2025-03-02 00:11:57 +00:00
"<matplotlib.legend.Legend at 0x783475e94750>"
2025-02-24 22:04:18 +00:00
]
},
2025-02-27 00:32:43 +00:00
"execution_count": 17,
2025-02-24 22:04:18 +00:00
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjMAAAHHCAYAAABKudlQAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAmG9JREFUeJzs3Xd4k+X6wPFvRtM9oOxRWjZlCggUBNllyAHxCHIQQRG3oMg5Un+ggHrAI4qKi4MKCAhuXCAiyGHvTaHsPcrsoDPN+/sjJDRt0r5pkyZt78919dK8efPmaVKau/dzP/ejURRFQQghhBCilNJ6egBCCCGEEMUhwYwQQgghSjUJZoQQQghRqkkwI4QQQohSTYIZIYQQQpRqEswIIYQQolSTYEYIIYQQpZoEM0IIIYQo1SSYEUIIIUSpJsGMEA5MmTIFjUbj6WGUa6dOnUKj0TB//vwiPV6j0TBlyhSXjqmsmj9/PhqNhlOnTql+jOX9mTlzZqHn2vv3ZDQa+de//kXt2rXRarUMGjTIyVELYSbBjCgXLL+oLV9+fn7UqFGD2NhYPvjgA1JSUlzyPBcuXGDKlCns2bPHJdcToiz74osvePvtt/n73//OggULePHFF4mPj2fKlClOBVVCSDAjypVp06axcOFCPvnkE55//nkAXnjhBZo3b86+fftszp00aRLp6elOXf/ChQtMnTpVghkh8rD372nNmjXUrFmTWbNmMWLECO69917i4+OZOnWqBDPCKXpPD0CIktS3b1/atm1rvR0XF8eaNWu47777+Nvf/sahQ4fw9/cHQK/Xo9fLPxHh3YxGIyaTCYPB4OmhFMjev6fExETCwsI8MyBRpkhmRpR73bt3Z/LkyZw+fZpFixZZj9ub41+1ahX33HMPYWFhBAUF0ahRI1555RUA1q5dy9133w3Ao48+ap3SstR7rF+/ngcffJCIiAh8fX2pXbs2L774Yr6/VkeNGkVQUBDnz59n0KBBBAUFUblyZSZMmEBOTo7NuSaTiffff5/mzZvj5+dH5cqV6dOnDzt27LA5b9GiRbRp0wZ/f38qVqzIQw89xNmzZwt9bSyvwZEjR3j44YcJDQ2lcuXKTJ48GUVROHv2LAMHDiQkJIRq1arxzjvv5LtGYmIio0ePpmrVqvj5+dGyZUsWLFiQ77ybN28yatQoQkNDCQsLY+TIkdy8eTPfeV27dqVr1675jo8aNYrIyMgCvx9H5zj7XjvSrFkzunXrlu+4yWSiZs2a/P3vf7ceW7p0KW3atCE4OJiQkBCaN2/O+++/X+D1c9eovPfee9SrVw9fX1/i4+MBOHz4MH//+9+pWLEifn5+tG3blp9//jnfdQ4ePEj37t3x9/enVq1avPHGG5hMpnzn7dixg9jYWCpVqoS/vz9RUVE89thjdsf23//+1zqeu+++m+3bt9vcn/s1tnwff/31FwcPHrT5t/Lggw8C0K1bN+vxtWvXFvi6CCF/dgoBjBgxgldeeYU//viDMWPG2D3n4MGD3HfffbRo0YJp06bh6+vLsWPH2LhxIwBNmjRh2rRpvPrqqzzxxBN07twZgI4dOwLw7bffkpaWxtNPP014eDjbtm1j9uzZnDt3jm+//dbmuXJycoiNjaV9+/bMnDmTP//8k3feeYd69erx9NNPW88bPXo08+fPp2/fvjz++OMYjUbWr1/Pli1brBmoN998k8mTJzNkyBAef/xxrly5wuzZs+nSpQu7d+9W9Zfx0KFDadKkCTNmzOC3337jjTfeoGLFisyZM4fu3bvz1ltvsXjxYiZMmMDdd99Nly5dAEhPT6dr164cO3aM5557jqioKL799ltGjRrFzZs3GTduHACKojBw4EA2bNjAU089RZMmTfjxxx8ZOXKkE++i6xT2XjsydOhQpkyZwqVLl6hWrZr1+IYNG7hw4QIPPfQQYA6Uhg0bRo8ePXjrrbcAOHToEBs3brS+JgWZN28eGRkZPPHEE/j6+lKxYkUOHjxIp06dqFmzJhMnTiQwMJBvvvmGQYMG8f3333P//fcDcOnSJbp164bRaLSe99///teakbRITEykd+/eVK5cmYkTJxIWFsapU6f44Ycf8o3nq6++IiUlhSeffBKNRsN//vMfBg8ezIkTJ/Dx8cl3fuXKlVm4cCFvvvkmqampTJ8+HYAGDRowduxYPvjgA1555RWaNGkCYP2vEA4pQpQD8+bNUwBl+/btDs8JDQ1V7rrrLuvt1157Tcn9T2TWrFkKoFy5csXhNbZv364Ayrx58/Ldl5aWlu/Y9OnTFY1Go5w+fdp6bOTIkQqgTJs2zebcu+66S2nTpo319po1axRAGTt2bL7rmkwmRVEU5dSpU4pOp1PefPNNm/v379+v6PX6fMfzsrwGTzzxhPWY0WhUatWqpWg0GmXGjBnW4zdu3FD8/f2VkSNHWo+99957CqAsWrTIeiwrK0uJiYlRgoKClOTkZEVRFGXZsmUKoPznP/+xeZ7OnTvnez3vvfde5d5778031pEjRyp16tSxOQYor732WoHn5P4+LdS81/YkJCQogDJ79myb488884wSFBRk/RkYN26cEhISohiNRqeuf/LkSQVQQkJClMTERJv7evTooTRv3lzJyMiwHjOZTErHjh2VBg0aWI+98MILCqBs3brVeiwxMVEJDQ1VAOXkyZOKoijKjz/+WOi/Gct4wsPDlevXr1uP//TTTwqg/PLLL9ZjeV9jRTG/l02bNrU59u233yqA8tdffxX+gghxm0wzCXFbUFBQgauaLBmMn376yW5KvjC5//K9desWV69epWPHjiiKwu7du/Od/9RTT9nc7ty5MydOnLDe/v7779FoNLz22mv5HmtJ5//www+YTCaGDBnC1atXrV/VqlWjQYMG/PXXX6rG/vjjj1v/X6fT0bZtWxRFYfTo0dbjYWFhNGrUyGaMy5cvp1q1agwbNsx6zMfHh7Fjx5Kamsr//vc/63l6vd4m66TT6axF2iWtqO91w4YNadWqFV9//bX1WE5ODt999x0DBgyw/gyEhYVx69YtVq1aVaTxPfDAA1SuXNl6+/r166xZs4YhQ4aQkpJifZ+vXbtGbGwsR48e5fz584D5te7QoQPt2rWzPr5y5coMHz7c5jksr8Gvv/5KdnZ2geMZOnQoFSpUsN62ZCVz/ywI4U4SzAhxW2pqKsHBwQ7vHzp0KJ06deLxxx+natWqPPTQQ3zzzTeqP+zOnDnDqFGjqFixorUO5t577wUgKSnJ5lxL/UtuFSpU4MaNG9bbx48fp0aNGlSsWNHhcx49ehRFUWjQoAGVK1e2+Tp06BCJiYmqxh4REWFzOzQ0FD8/PypVqpTveO4xnj59mgYNGqDV2v6qsUwbnD592vrf6tWrExQUZHNeo0aNVI3P1YrzXg8dOpSNGzdag4e1a9eSmJjI0KFDrec888wzNGzYkL59+1KrVi0ee+wxfv/9d9Xji4qKsrl97NgxFEVh8uTJ+d5nS7Brea8t70leeV/re++9lwceeICpU6dSqVIlBg4cyLx588jMzMz32Lw/H5bAJvfPghDuJDUzQgDnzp0jKSmJ+vXrOzzH39+fdevW8ddff/Hbb7/x+++/8/XXX9O9e3f++OMPdDqdw8fm5OTQq1cvrl+/zssvv0zjxo0JDAzk/PnzjBo1Kt+HZEHXcobJZEKj0bBixQq718wbPDhi77GOxqgoinODdJJGo7H7HHmLox091p68jy3Oez106FDi4uL49ttveeGFF/jmm28IDQ2lT58+1nOqVKnCnj17WLlyJStWrGDFihXMmzePRx55xG5xdF5561ssPz8TJkwgNjbW7mMK+tm2R6PR8N1337FlyxZ++eUXVq5cyWOPPcY777zDli1bbH52PPWzIISFBDNCAAsXLgRw+EFgodVq6dGjBz169ODdd9/l3//+N//3f//HX3/9Rc+ePR1+WO7fv58jR46wYMECHnnkEevxok4zANSrV4+VK1dy/fp1h9mZevXqoSgKUVFRNGzYsMjPVVR16tRh3759mEwmm+zM4cOHrfdb/rt69WpSU1NtPiQTEhLyXbN
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
"plot_dist_mod()\n",
"plt.plot(z, mu_pred_lin, label=\"mu_pred_lin\")\n",
"\n",
"mu_pred_poly = lin_reg_poly.predict(poly_features.transform(z.reshape(-1,1)))\n",
"plt.plot(z, mu_pred_poly, label=\"mu_pred_poly\")\n",
"plt.legend()\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 18,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "11efe12f85bf5edc3719a470150c480e",
"grade": true,
"grade_id": "cell-55abac99185aa2df",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mu_pred_poly defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('mu_pred_poly')\n",
"assert mu_pred_poly.shape == (len(z),)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0dfd352eae6049c8302271e9173b5b86",
"grade": false,
"grade_id": "cell-dc45782c437fb3b3",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Comment on the accuracy of your models.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "7a9b6ddcab190716cd17c327abb8da2a",
"grade": true,
"grade_id": "cell-c37fd48a9be1bd84",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true
}
},
"source": [
2025-02-24 22:04:18 +00:00
"The basic linear regression seems to fit the data better than the polynomial model, which looks overfitted. A small number of data points at the extreme end of the x-axis have a significant impact on the shape of the curve, far more than most of the points which are clustered towards 0 on the x-axis."
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "7b3d585af5547dd71593690acb89f54a",
"grade": false,
"grade_id": "cell-642b8f293c37d087",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Think about methods that could be used to improve the performance of your models. Improve your polynomial model and use the improved model to make predictions. Call the variable used to store the polynomial model `ridge_reg_poly`. Call the variable used to store the predictions for your polynomial model `mu_pred_poly_improved`.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 19,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "809a92a774fec689d2048a896f1f4357",
"grade": true,
"grade_id": "cell-249f8b263075ba25",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
"from sklearn.pipeline import Pipeline\n",
"from sklearn.preprocessing import StandardScaler\n",
2025-02-24 22:04:18 +00:00
"from sklearn.linear_model import Ridge\n",
"\n",
"ridge_reg_poly = Pipeline((\n",
" (\"poly_features\", PolynomialFeatures(degree=30, include_bias=True)),\n",
" (\"std_scaler\", StandardScaler()),\n",
" (\"regul_reg\", Ridge(1e-20)),\n",
" #(\"lin_reg\", LinearRegression())\n",
2025-02-24 22:04:18 +00:00
"))\n",
"ridge_reg_poly.fit(z_sample.reshape(-1, 1), mu_sample)\n",
"mu_pred_poly_improved = ridge_reg_poly.predict(z.reshape(-1, 1))\n",
"\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 20,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "23cbdcf0308d7d2f474ccecf681ddd73",
"grade": false,
"grade_id": "cell-5f2d7d6ea319189b",
"locked": true,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ridge_reg_poly defined.\n",
"mu_pred_poly_improved defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('ridge_reg_poly')\n",
"check_var_defined('mu_pred_poly_improved')\n",
"assert mu_pred_poly_improved.shape == (len(z),), \"Make sure the shape of your predictions is correct\""
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "8bb3a9403b75f4dc8b3a3e38664cd50a",
"grade": false,
"grade_id": "cell-d7ee6a9b415c62a9",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Plot the predictions made with new model and all previous models considered.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 21,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c882d9ba097863626e9c85736d3dab30",
"grade": true,
"grade_id": "cell-14a8a955921c3d9c",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"data": {
"text/plain": [
2025-03-02 00:11:57 +00:00
"<matplotlib.legend.Legend at 0x783473d66710>"
2025-02-24 22:04:18 +00:00
]
},
2025-02-27 00:32:43 +00:00
"execution_count": 21,
2025-02-24 22:04:18 +00:00
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjMAAAHHCAYAAABKudlQAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAumJJREFUeJzsnXd8U/X6x98nSdM9oC2UVcoueypLFAFFQMSJchHBvcXBvdr7c6Je9KrXPbhcFUQU90YRFQRZslfZtmwoUNoCHWma8/sjTZq0GSdt0qTt8369Cs053/M93+SkOU8+z1JUVVURBEEQBEGoo+iCvQBBEARBEISaIMaMIAiCIAh1GjFmBEEQBEGo04gxIwiCIAhCnUaMGUEQBEEQ6jRizAiCIAiCUKcRY0YQBEEQhDqNGDOCIAiCINRpxJgRBEEQBKFOI8aMILjhySefRFGUYC+jQZOdnY2iKMyePbtaxyuKwpNPPunXNdVXZs+ejaIoZGdnaz7Gdn1efPFFr2Nd/T2ZzWb+8Y9/0KpVK3Q6HZdffrmPqxYEK2LMCA0C2we17SciIoLmzZszcuRIXnvtNU6fPu2X8xw+fJgnn3ySjRs3+mU+QajPvPfee7zwwgtcffXVzJkzhwceeIDMzEyefPJJn4wqQRBjRmhQTJ8+nblz5/L2229z7733AnD//ffTvXt3Nm/e7DT20UcfpaioyKf5Dx8+zFNPPSXGjCBUwtXf02+//UaLFi14+eWXmTRpEhdccAGZmZk89dRTYswIPmEI9gIEoTYZNWoU/fr1sz/OyMjgt99+49JLL+Wyyy5j+/btREZGAmAwGDAY5E9ECG3MZjMWiwWj0RjspXjE1d9TTk4OCQkJwVmQUK8QZUZo8AwbNozHHnuMffv28eGHH9q3u/LxL1q0iPPOO4+EhARiYmLo1KkT//znPwFYsmQJ55xzDgA33nij3aVli/dYtmwZ11xzDampqYSHh9OqVSseeOCBKt9Wp0yZQkxMDIcOHeLyyy8nJiaG5ORkpk2bRllZmdNYi8XCq6++Svfu3YmIiCA5OZlLLrmEtWvXOo378MMP6du3L5GRkTRu3JjrrruOAwcOeH1tbK/Brl27uP7664mPjyc5OZnHHnsMVVU5cOAA48aNIy4ujpSUFF566aUqc+Tk5HDzzTfTtGlTIiIi6NmzJ3PmzKkyLi8vjylTphAfH09CQgKTJ08mLy+vyrihQ4cydOjQKtunTJlCWlqax+fjboyv19od3bp148ILL6yy3WKx0KJFC66++mr7tvnz59O3b19iY2OJi4uje/fuvPrqqx7nd4xReeWVV2jXrh3h4eFkZmYCsGPHDq6++moaN25MREQE/fr149tvv60yz7Zt2xg2bBiRkZG0bNmSZ555BovFUmXc2rVrGTlyJElJSURGRtKmTRtuuukml2v773//a1/POeecw5o1a5z2O77GtuexePFitm3b5vS3cs011wBw4YUX2rcvWbLE4+siCPK1UxCASZMm8c9//pOff/6ZW2+91eWYbdu2cemll9KjRw+mT59OeHg4e/bsYfny5QB07tyZ6dOn8/jjj3PbbbcxZMgQAAYNGgTAZ599RmFhIXfeeSeJiYn8+eefvP766xw8eJDPPvvM6VxlZWWMHDmS/v378+KLL/LLL7/w0ksv0a5dO+688077uJtvvpnZs2czatQobrnlFsxmM8uWLWPVqlV2BerZZ5/lscceY/z48dxyyy0cP36c119/nfPPP58NGzZo+mZ87bXX0rlzZ5577jl++OEHnnnmGRo3bszMmTMZNmwYzz//PPPmzWPatGmcc845nH/++QAUFRUxdOhQ9uzZwz333EObNm347LPPmDJlCnl5eUydOhUAVVUZN24cf/zxB3fccQedO3fmq6++YvLkyT5cRf/h7Vq749prr+XJJ5/k6NGjpKSk2Lf/8ccfHD58mOuuuw6wGkoTJkxg+PDhPP/88wBs376d5cuX218TT7z//vsUFxdz2223ER4eTuPGjdm2bRuDBw+mRYsWPPLII0RHR/Ppp59y+eWX88UXX3DFFVcAcPToUS688ELMZrN93H//+1+7ImkjJyeHiy++mOTkZB555BESEhLIzs7myy+/rLKejz76iNOnT3P77bejKAr//ve/ufLKK/nrr78ICwurMj45OZm5c+fy7LPPcubMGWbMmAFAhw4duO+++3jttdf45z//SefOnQHs/wuCW1RBaAC8//77KqCuWbPG7Zj4+Hi1d+/e9sdPPPGE6vgn8vLLL6uAevz4cbdzrFmzRgXU999/v8q+wsLCKttmzJihKoqi7tu3z75t8uTJKqBOnz7daWzv3r3Vvn372h//9ttvKqDed999Vea1WCyqqqpqdna2qtfr1WeffdZp/5YtW1SDwVBle2Vsr8Ftt91m32Y2m9WWLVuqiqKozz33nH37qVOn1MjISHXy5Mn2ba+88ooKqB9++KF9m8lkUgcOHKjGxMSoBQUFqqqq6tdff60C6r///W+n8wwZMqTK63nBBReoF1xwQZW1Tp48WW3durXTNkB94oknPI5xfJ42tFxrV+zcuVMF1Ndff91p+1133aXGxMTY3wNTp05V4+LiVLPZ7NP8WVlZKqDGxcWpOTk5TvuGDx+udu/eXS0uLrZvs1gs6qBBg9QOHTrYt91///0qoK5evdq+LScnR42Pj1cBNSsrS1VVVf3qq6+8/s3Y1pOYmKjm5ubat3/zzTcqoH733Xf2bZVfY1W1XsuuXbs6bfvss89UQF28eLH3F0QQyhE3kyCUExMT4zGryaZgfPPNNy4leW84fvM9e/YsJ06cYNCgQaiqyoYNG6qMv+OOO5weDxkyhL/++sv++IsvvkBRFJ544okqx9rk/C+//BKLxcL48eM5ceKE/SclJYUOHTqwePFiTWu/5ZZb7L/r9Xr69euHqqrcfPPN9u0JCQl06tTJaY0LFiwgJSWFCRMm2LeFhYVx3333cebMGX7//Xf7OIPB4KQ66fV6e5B2bVPda92xY0d69erFJ598Yt9WVlbG559/ztixY+3vgYSEBM6ePcuiRYuqtb6rrrqK5ORk++Pc3Fx+++03xo8fz+nTp+3X+eTJk4wcOZLdu3dz6NAhwPpaDxgwgHPPPdd+fHJyMhMnTnQ6h+01+P777yktLfW4nmuvvZZGjRrZH9tUScf3giAEEjFmBKGcM2fOEBsb63b/tddey+DBg7nlllto2rQp1113HZ9++qnmm93+/fuZMmUKjRs3tsfBXHDBBQDk5+c7jbXFvzjSqFEjTp06ZX+8d+9emjdvTuPGjd2ec/fu3aiqSocOHUhOTnb62b59Ozk5OZrWnpqa6vQ4Pj6eiIgIkpKSqmx3XOO+ffvo0KEDOp3zR43NbbBv3z77/82aNSMmJsZpXKdOnTStz9/U5Fpfe+21LF++3G48LFmyhJycHK699lr7mLvuuouOHTsyatQoWrZsyU033cRPP/2keX1t2rRxerxnzx5UVeWxxx6rcp1txq7tWtuuSWUqv9YXXHABV111FU899RRJSUmMGzeO999/n5KSkirHVn5/2Awbx/eCIAQSiZkRBODgwYPk5+fTvn17t2MiIyNZunQpixcv5ocffuCnn37ik08+YdiwYfz888/o9Xq3x5aVlXHRRReRm5vLww8/THp6OtHR0Rw6dIgpU6ZUuUl6mssXLBYLiqLw448/upyzsvHgDlfHulujqqq+LdJHFEVxeY7KwdHujnVF5WNrcq2vvfZaMjIy+Oyzz7j//vv59NNPiY+P55JLLrGPadKkCRs3bmThwoX8+OOP/Pjjj7z//vvccMMNLoOjK1M5vsX2/pk2bRojR450eYyn97YrFEXh888/Z9WqVXz33XcsXLiQm266iZdeeolVq1Y5vXeC9V4QBBtizAgCMHfuXAC3NwIbOp2O4cOHM3z4cP7zn//wr3/9i//7v/9j8eLFjBgxwu3NcsuWLezatYs5c+Zwww032LdX180A0K5dOxYuXEhubq5bdaZdu3aoqkqbNm3o2LFjtc9VXVq3bs3mzZuxWCxO6syOHTvs+23
2025-02-24 22:04:18 +00:00
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
"plot_dist_mod()\n",
"plt.plot(z, mu_pred_lin, label=\"mu_pred_lin\")\n",
"plt.plot(z, mu_pred_poly, label=\"mu_pred_poly\")\n",
"plt.plot(z, mu_pred_poly_improved, label=\"mu_pred_poly_improved\")\n",
"plt.legend()\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "180a54403a99d155688f8a290fc1015b",
"grade": false,
"grade_id": "cell-62604dceb287f6e4",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Compute the RMS error between your predictions and the data samples.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 22,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "6e268ea0e1117f31d0a367335c3ab18a",
"grade": false,
"grade_id": "cell-2ee5e0675e003f12",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# Define a general function to compute the RMS error\n",
"def compute_rms(mu_1, mu_2):\n",
" # YOUR CODE HERE\n",
" return np.sqrt(np.mean([(x[0]-x[1])**2 for x in zip(mu_1, mu_2)]))\n",
2025-02-24 22:04:18 +00:00
" #raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 23,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "6f8072de8469e1f06df363cc72011cc8",
"grade": true,
"grade_id": "cell-946369d338039825",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"assert np.isclose(compute_rms(mu_pred_lin, mu_pred_lin), 0.0)"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 24,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "7621e2e41683215f9f485a1161168a5c",
"grade": false,
"grade_id": "cell-9c52009b6ad3fd17",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# Compute the RMS error between the data and the predictions for each model.\n",
"# Use variables rms_sample_lin, rms_sample_poly and rms_sample_poly_improved.\n",
"# YOUR CODE HERE\n",
"\n",
"# Generate a new array based on the x positions of the sample data\n",
"mu_sample_pred_lin = np.interp(z_sample, z, mu_pred_lin)\n",
"mu_sample_pred_poly = np.interp(z_sample, z, mu_pred_poly)\n",
"mu_sample_pred_poly_improved = np.interp(z_sample, z, mu_pred_poly_improved)\n",
"\n",
"rms_sample_lin = compute_rms(mu_sample, mu_sample_pred_lin)\n",
"rms_sample_poly = compute_rms(mu_sample, mu_sample_pred_poly)\n",
"rms_sample_poly_improved = compute_rms(mu_sample, mu_sample_pred_poly_improved)\n",
2025-02-24 22:04:18 +00:00
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 25,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f309e5fc58b77e4cda8bc8f3ee78ce1d",
"grade": false,
"grade_id": "cell-579ae5f1089bfb46",
"locked": true,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rms_sample_lin = 1.0084\n",
"rms_sample_poly = 0.8855\n",
"rms_sample_poly_improved = 0.8771\n"
2025-02-24 22:04:18 +00:00
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# Print RMS values computed.\n",
"print(\"rms_sample_lin = {0:.4f}\".format(rms_sample_lin))\n",
"print(\"rms_sample_poly = {0:.4f}\".format(rms_sample_poly))\n",
"print(\"rms_sample_poly_improved = {0:.4f}\".format(rms_sample_poly_improved))"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 26,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "7d71e55b0de3ea49b8e285df5a75e860",
"grade": true,
"grade_id": "cell-09beb7cceccb40be",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rms_sample_lin defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('rms_sample_lin')"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 27,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "3c12f33d96207e0b339d5a7d7e28146a",
"grade": true,
"grade_id": "cell-6000d96ec4b0de43",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rms_sample_poly defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('rms_sample_poly')"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 28,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c61d68d34d7d1760459d619d00f2a03c",
"grade": true,
"grade_id": "cell-06d3a7838b6ad7e7",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rms_sample_poly_improved defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('rms_sample_poly_improved')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "d2cfab16a3b881eb9666957d282df2eb",
"grade": false,
"grade_id": "cell-a4632d53a04563a6",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Comment on what models you believe are best.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "65df151fe8f001ee61ec5607e15a0b3d",
"grade": true,
"grade_id": "cell-d34e017b234e387c",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true
}
},
"source": [
"Despite the better RMSE on the training data I believe the linear model is best, as the polynomial models are overfitted. Slight changes to the RMSE can be achieved by changing the degree of the polynomial and the regularization parameter, but this would not transfer well to new data, particularly when extrapolating beyond the given dataset."
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "92e241c8e04b9942db80a362a82b6429",
"grade": false,
"grade_id": "cell-d9ef9bfe07f359d0",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Using our cosmological concordance model we can predict the theoretical distance modulus vs redshift relationship using our understanding of the physics."
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 29,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9df4b3203c1281eb061f9a3d159457f7",
"grade": false,
"grade_id": "cell-19fd3ac1f2d371eb",
"locked": true,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING: AstroMLDeprecationWarning: The Cosmology class is deprecated and may be removed in a future version.\n",
" Use astropy.cosmology instead. [warnings]\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"from astroML.cosmology import Cosmology\n",
"cosmo = Cosmology()\n",
"mu_cosmo = np.array(list(map(cosmo.mu, z)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "463b8b27c0408d32b519fd702dff0a63",
"grade": false,
"grade_id": "cell-22c310e26cfe2d41",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Plot the data, predictions made with all regression models, and the values predicted by the cosmological model.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 30,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "27b490f88536997467462e8d6ce09aa7",
"grade": true,
"grade_id": "cell-62cecf73f9f9a228",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"data": {
"text/plain": [
2025-03-02 00:11:57 +00:00
"<matplotlib.legend.Legend at 0x783473c56fd0>"
2025-02-24 22:04:18 +00:00
]
},
2025-02-27 00:32:43 +00:00
"execution_count": 30,
2025-02-24 22:04:18 +00:00
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjMAAAHHCAYAAABKudlQAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAy1NJREFUeJzsnXd4FHX+x1+zLb0nJLSQUAPSQWmiNEVR1LNzimA/K+pxp9zPinro2bvInYqIYu8gIqIgvbfQIRBqICG9bJvfH5vdZLO7yW6ym03I5/U8+yQ7+52Z7+7szrznUxVVVVUEQRAEQRCaKZpgT0AQBEEQBKEhiJgRBEEQBKFZI2JGEARBEIRmjYgZQRAEQRCaNSJmBEEQBEFo1oiYEQRBEAShWSNiRhAEQRCEZo2IGUEQBEEQmjUiZgRBEARBaNaImBEEDzz55JMoihLsabRosrKyUBSFDz/8sF7rK4rCk08+6dc5nal8+OGHKIpCVlaW1+vYj8+LL75Y51h3vyez2cw///lP2rdvj0aj4YorrvBx1oJgQ8SM0CKwn6jtj9DQUNq0acPYsWN5/fXXKSoq8st+jh49ypNPPsmmTZv8sj1BOJN5//33eeGFF7j66quZPXs2Dz74IJmZmTz55JM+iSpBEDEjtCimT5/OnDlzeOedd7jvvvsAeOCBB+jVqxdbtmxxGvvoo49SVlbm0/aPHj3KU089JWJGEGrg7vf022+/0bZtW1555RUmTpzI+eefT2ZmJk899ZSIGcEndMGegCA0JhdffDEDBw50PJ82bRq//fYbl156KZdddhk7duwgLCwMAJ1Oh04nPxGhaWM2m7FarRgMhmBPpVbc/Z5ycnKIjY0NzoSEMwqxzAgtnlGjRvHYY49x8OBBPv74Y8dydz7+RYsWce655xIbG0tkZCTdunXjX//6FwC///47Z599NgA333yzw6Vlj/dYtmwZ11xzDampqYSEhNC+fXsefPBBl7vVyZMnExkZyZEjR7jiiiuIjIwkKSmJqVOnYrFYnMZarVZee+01evXqRWhoKElJSVx00UWsW7fOadzHH3/MgAEDCAsLIz4+nuuvv57s7Ow6Pxv7Z7B7925uvPFGYmJiSEpK4rHHHkNVVbKzs7n88suJjo4mJSWFl156yWUbOTk53HrrrSQnJxMaGkqfPn2YPXu2y7j8/HwmT55MTEwMsbGxTJo0ifz8fJdxI0aMYMSIES7LJ0+eTFpaWq3vx9MYX4+1J3r27MnIkSNdllutVtq2bcvVV1/tWDZv3jwGDBhAVFQU0dHR9OrVi9dee63W7VePUXn11Vfp1KkTISEhZGZmArBz506uvvpq4uPjCQ0NZeDAgXz//fcu29m+fTujRo0iLCyMdu3a8cwzz2C1Wl3GrVu3jrFjx5KYmEhYWBjp6enccsstbuf23nvvOeZz9tlns3btWqfXq3/G9vexZMkStm/f7vRbueaaawAYOXKkY/nvv/9e6+ciCHLbKQjAxIkT+de//sUvv/zC7bff7nbM9u3bufTSS+nduzfTp08nJCSEvXv3snz5cgC6d+/O9OnTefzxx7njjjsYPnw4AEOHDgXgiy++oLS0lLvuuouEhATWrFnDG2+8weHDh/niiy+c9mWxWBg7diyDBg3ixRdf5Ndff+Wll16iU6dO3HXXXY5xt956Kx9++CEXX3wxt912G2azmWXLlrFq1SqHBerZZ5/lscce49prr+W2227j5MmTvPHGG5x33nls3LjRqzvj6667ju7du/Pcc8/x008/8cwzzxAfH8/MmTMZNWoUzz//PHPnzmXq1KmcffbZnHfeeQCUlZUxYsQI9u7dy7333kt6ejpffPEFkydPJj8/nylTpgCgqiqXX345f/75J3/729/o3r0733zzDZMmTfLhKPqPuo61J6677jqefPJJjh8/TkpKimP5n3/+ydGjR7n++usBm1CaMGECo0eP5vnnnwdgx44dLF++3PGZ1MYHH3xAeXk5d9xxByEhIcTHx7N9+3aGDRtG27ZteeSRR4iIiODzzz/niiuu4KuvvuIvf/kLAMePH2fkyJGYzWbHuPfee89hkbSTk5PDhRdeSFJSEo888gixsbFkZWXx9ddfu8znk08+oaioiDvvvBNFUfjPf/7DlVdeyf79+9Hr9S7jk5KSmDNnDs8++yzFxcXMmDEDgC5dunD//ffz+uuv869//Yvu3bsDOP4KgkdUQWgBfPDBByqgrl271uOYmJgYtV+/fo7nTzzxhFr9J/LKK6+ogHry5EmP21i7dq0KqB988IHLa6WlpS7LZsyYoSqKoh48eNCxbNKkSSqgTp8+3Wlsv3791AEDBjie//bbbyqg3n///S7btVqtqqqqalZWlqrVatVnn33W6fWtW7eqOp3OZXlN7J/BHXfc4VhmNpvVdu3aqYqiqM8995xj+enTp9WwsDB10qRJjmWvvvqqCqgff/yxY5nRaFSHDBmiRkZGqoWFhaqqquq3336rAup//vMfp/0MHz7c5fM8//zz1fPPP99lrpMmTVI7dOjgtAxQn3jiiVrHVH+fdrw51u7YtWuXCqhvvPGG0/K7775bjYyMdHwHpkyZokZHR6tms9mn7R84cEAF1OjoaDUnJ8fptdGjR6u9evVSy8vLHcusVqs6dOhQtUuXLo5lDzzwgAqoq1evdizLyclRY2JiVEA9cOCAqqqq+s0339T5m7HPJyEhQc3Ly3Ms/+6771RA/eGHHxzLan7Gqmo7lmeddZbTsi+++EIF1CVLltT9gQhCJeJmEoRKIiMja81qslswvvvuO7cm+bqofudbUlLCqVOnGDp0KKqqsnHjRpfxf/vb35yeDx8+nP379zuef/XVVyiKwhNPPOGyrt2c//XXX2O1Wrn22ms5deqU45GSkkKXLl1YsmSJV3O/7bbbHP9rtVoGDhyIqqrceuutjuWxsbF069bNaY7z588nJSWFCRMmOJbp9Xruv/9+iouL+eOPPxzjdDqdk9VJq9U6grQbm/oe665du9K3b18+++wzxzKLxcKXX37J+PHjHd+B2NhYSkpKWLRoUb3md9VVV5GUlOR4npeXx2+//ca1115LUVGR4zjn5uYyduxY9uzZw5EjRwDbZz148GDOOeccx/pJSUnccMMNTvuwfwY//vgjJpOp1vlcd911xMXFOZ7brZLVvwuCEEhEzAhCJcXFxURFRXl8/brrrmPYsGHcdtttJCcnc/311/P55597fbE7dOgQkydPJj4+3hEHc/755wNQUFDgNNYe/1KduLg4Tp8+7Xi+b98+2rRpQ3x8vMd97tmzB1VV6dKlC0lJSU6PHTt2kJOT49XcU1NTnZ7HxMQQGhpKYmKiy/Lqczx48CBdunRBo3E+1djdBgcPHnT8bd26NZGRkU7junXr5tX8/E1DjvV1113H8uXLHeLh999/Jycnh+uuu84x5u6776Zr165cfPHFtGvXjltuuYWff/7Z6/mlp6c7Pd+7dy+qqvLYY4+5HGe72LUfa/sxqUnNz/r888/nqquu4qmnniIxMZHLL7+cDz74gIqKCpd1a34/7MKm+ndBEAKJxMwIAnD48GEKCgro3LmzxzFhYWEsXbqUJUuW8NNPP/Hzzz/z2WefMWrUKH755Re0Wq3HdS0WCxdccAF5eXk8/PDDZGRkEBERwZEjR5g8ebLLRbK2bfmC1WpFURQWLFjgdps1xYMn3K3raY6qqvo2SR9RFMXtPmoGR3ta1x01123Isb7uuuuYNm0aX3zxBQ888ACff/45MTExXHTRRY4xrVq1YtOmTSxcuJAFCxawYMECPvjgA2666Sa3wdE1qRnfYv/+TJ06lbFjx7pdp7bvtjsUReHLL79k1apV/PDDDyxcuJBbbrmFl156iVWrVjl9d4L1XRAEOyJmBAGYM2cOgMcLgR2NRsPo0aMZPXo0L7/8Mv/+97/5v//7P5YsWcKYMWM8Xiy3bt3K7t27mT17NjfddJNjeX3dDACdOnVi4cKF5OXlebTOdOrUCVVVSU9Pp2vXrvXeV33p0KEDW7ZswWq1Ollndu7c6Xjd/nfx4sUUFxc7XSR
2025-02-24 22:04:18 +00:00
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
"plot_dist_mod()\n",
"plt.plot(z, mu_pred_lin, label=\"mu_pred_lin\")\n",
"plt.plot(z, mu_pred_poly, label=\"mu_pred_poly\")\n",
"plt.plot(z, mu_pred_poly_improved, label=\"mu_pred_poly_improved\")\n",
"plt.plot(z, mu_cosmo, label=\"mu_cosmo\")\n",
"plt.legend()\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "b277d71c70a41ca213e62c916b98207d",
"grade": false,
"grade_id": "cell-285160edd2e418e2",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Compute the RMS error between the predictions made by the cosmological model and each of the regression models, over the sample array `z`.*"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 31,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "0034720cf8c44e319df90de05466ef84",
"grade": false,
"grade_id": "cell-939141aa04822e33",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# Compute the RMS error between the data and the predictions for each model.\n",
"# Use variables rms_cosmo_lin, rms_cosmo_poly and rms_cosmo_poly_improved.\n",
"# YOUR CODE HERE\n",
2025-02-24 22:04:18 +00:00
"rms_cosmo_lin = compute_rms(mu_cosmo, mu_pred_lin)\n",
"rms_cosmo_poly = compute_rms(mu_cosmo, mu_pred_poly)\n",
"rms_cosmo_poly_improved = compute_rms(mu_cosmo, mu_pred_poly_improved)\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 32,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f4a351596164e691fb7bb6359e3de55b",
"grade": false,
"grade_id": "cell-770a716893bdb639",
"locked": true,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rms_cosmo_lin = 1.1252\n",
"rms_cosmo_poly = 20.6652\n",
"rms_cosmo_poly_improved = 163.5297\n"
2025-02-24 22:04:18 +00:00
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# Print RMS values computed.\n",
"print(\"rms_cosmo_lin = {0:.4f}\".format(rms_cosmo_lin))\n",
"print(\"rms_cosmo_poly = {0:.4f}\".format(rms_cosmo_poly))\n",
"print(\"rms_cosmo_poly_improved = {0:.4f}\".format(rms_cosmo_poly_improved))"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 33,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "6f439fe717a169e6d0658974b1dcbc7b",
"grade": true,
"grade_id": "cell-e8a9f757965b2069",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rms_cosmo_lin defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('rms_cosmo_lin')"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 34,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "33956cf40ac1a6064c599cca42bb3520",
"grade": true,
"grade_id": "cell-30ed009401260759",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rms_cosmo_poly defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('rms_cosmo_poly')"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 35,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "4a4fc0aef5b79d11ac43990c71b12d93",
"grade": true,
"grade_id": "cell-e6ee872c00472aa2",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-24 22:04:18 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rms_cosmo_poly_improved defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('rms_cosmo_poly_improved')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "f67fca72235a566438e6daed262ecc04",
"grade": false,
"grade_id": "cell-cfa4d93afc081e93",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Comment on the RMS values computed and the implications for the accuracy of the different regression models considered.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "f25a554b6d3d42a22139116a83b3c1c7",
"grade": true,
"grade_id": "cell-a993842383ec9778",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true
}
},
"source": [
"Both the `poly` and `poly_improved` models perform worse than `lin` across the domain of `mu_cosmo`. This suggests these models would be poor at extrapolating beyond this domain. `poly` performs particularly poorly as it dives off the high end of the domain to diverge very quickly from the theoretical model."
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "e28b2898e145c2fb9c843928795b54f2",
"grade": false,
"grade_id": "cell-c8ac035dcf2c47fc",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "900204de52da9d3c8b85d7604102a518",
"grade": false,
"grade_id": "cell-2d2591b309fcc3d2",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"## Part 2: Classification"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0e5cd46344bce92e33149bb5f42b9485",
"grade": false,
"grade_id": "cell-bb13c563fd6784c9",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"In these exercises we will consider classification of [RR Lyrae](https://en.wikipedia.org/wiki/RR_Lyrae_variable) variable stars. RR Lyrae variables are often used as standard candles to measure astronomical distances since their period of pulsation can be related to their absolute magnitude.\n",
"\n",
"Observations of star magnitudes are made in each [SDSS filter band](http://skyserver.sdss.org/dr2/en/proj/advanced/color/sdssfilters.asp): u, g, r, i, z.\n",
"\n",
"We will consider the space of astronomical \"colours\" to distinguish RR Lyraes from background stars. Astronomical colours are simply differences in magnitudes between bands, e.g. u-g, g-r, r-i, i-z. You can find further background [here](https://en.wikipedia.org/wiki/Color%E2%80%93color_diagram)."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "c8a74d7f460b94e02dde82bb72bb5eaf",
"grade": false,
"grade_id": "cell-4f6b1f1dc074f5cc",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"First, download the data. (This may take some time on first execution. Subsequently executions will read from cached data on your system.)"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 36,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "df9d97d24e9496cc877839350420247b",
"grade": false,
"grade_id": "cell-73597701131bc8e2",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"# Load data\n",
"from astroML.datasets import fetch_rrlyrae_combined\n",
"X, y = fetch_rrlyrae_combined()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "8bf835a66bdebf03f21f25c2038156eb",
"grade": false,
"grade_id": "cell-2b739257efd6fbdf",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"You can learn more about the format of the returned data [here](http://www.astroml.org/modules/generated/astroML.datasets.fetch_rrlyrae_combined.html). In particular, note that the columns of `X` are u-g, g-r, r-i, i-z."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "cbba628ec0398b2d517363e7ba5b669b",
"grade": false,
"grade_id": "cell-1d6b876f89c05942",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Construct a Pandas DataFrame for the `X` data and a Series for the `y` data. Call your Pandas objects `X_pd` and `y_pd` respectively.*\n",
"\n",
"Be sure to give your colums the correct colour name, e.g. 'u-g'."
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 37,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "17f47902f34b6b5623f6d387902a536e",
"grade": false,
"grade_id": "cell-7250404d5b9e0c13",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
2025-02-21 17:13:13 +00:00
"source": [
"import pandas as pd\n",
"cols=['u-g', 'g-r', 'r-i', 'i-z']\n",
"# YOUR CODE HERE\n",
2025-02-27 00:32:43 +00:00
"X_pd = pd.DataFrame({\n",
" \"u-g\": [c[0] for c in X],\n",
" \"g-r\": [c[1] for c in X],\n",
" \"r-i\": [c[2] for c in X],\n",
" \"i-z\": [c[3] for c in X],\n",
"})\n",
"# raise NotImplementedError"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 38,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "32251036f16850a2b13b1f78e0cf6ca8",
"grade": true,
"grade_id": "cell-a913d8acabdfba5c",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"X_pd defined.\n",
" u-g g-r r-i i-z\n",
"0 1.250999 0.394000 0.137000 0.061999\n",
"1 1.048000 0.339001 0.151999 0.023001\n",
"2 1.008001 0.341999 0.129000 0.203001\n",
"3 0.965000 0.392000 0.149000 0.150000\n",
"4 1.040001 0.333000 0.125999 0.101999\n",
"... ... ... ... ...\n",
"93136 0.962999 0.059000 -0.025999 -0.025000\n",
"93137 1.059999 0.185001 0.050999 -0.023998\n",
"93138 1.044001 0.212000 0.035000 0.002001\n",
"93139 1.064999 0.172001 0.042000 0.003000\n",
"93140 1.125999 0.065001 -0.017000 -0.057999\n",
"\n",
"[93141 rows x 4 columns]\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('X_pd')\n",
"print(X_pd)"
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 39,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "1a57773c7e968f9c71d9be3229d9e2a8",
"grade": false,
"grade_id": "cell-0b438d5f8dcac8e9",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
2025-02-21 17:13:13 +00:00
"source": [
"# YOUR CODE HERE\n",
2025-02-27 00:32:43 +00:00
"y_pd = pd.Series(y)\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-02-27 00:32:43 +00:00
"execution_count": 40,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "097b3a38b8aec4ddf4a649990924e628",
"grade": true,
"grade_id": "cell-d1392b89a707b35a",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"y_pd defined.\n",
"0 0.0\n",
"1 0.0\n",
"2 0.0\n",
"3 0.0\n",
"4 0.0\n",
" ... \n",
"93136 1.0\n",
"93137 1.0\n",
"93138 1.0\n",
"93139 1.0\n",
"93140 1.0\n",
"Length: 93141, dtype: float64\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('y_pd')\n",
"print(y_pd)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "a2a814f95845ae40b8d2e46d3f3b02d2",
"grade": false,
"grade_id": "cell-dba1c66617cc789e",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Combine your data and targets into a single Pandas DataFrame, labelling the target column 'target'. Call the resulting Pandas DataFrame `X_pd_all`.*"
]
},
{
"cell_type": "code",
"execution_count": 41,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c9bc7cc9a6cb4270148403c68e9cd022",
"grade": false,
"grade_id": "cell-f80ca9f2573d06fb",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
2025-02-27 00:32:43 +00:00
"# Copy original dataframe\n",
"X_pd_all = pd.DataFrame(X_pd)\n",
"X_pd_all[\"target\"] = y_pd\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 42,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9b150964210efff48af195349090fff3",
"grade": true,
"grade_id": "cell-694c65584675c6c6",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"X_pd_all defined.\n",
" u-g g-r r-i i-z target\n",
"0 1.250999 0.394000 0.137000 0.061999 0.0\n",
"1 1.048000 0.339001 0.151999 0.023001 0.0\n",
"2 1.008001 0.341999 0.129000 0.203001 0.0\n",
"3 0.965000 0.392000 0.149000 0.150000 0.0\n",
"4 1.040001 0.333000 0.125999 0.101999 0.0\n",
"... ... ... ... ... ...\n",
"93136 0.962999 0.059000 -0.025999 -0.025000 1.0\n",
"93137 1.059999 0.185001 0.050999 -0.023998 1.0\n",
"93138 1.044001 0.212000 0.035000 0.002001 1.0\n",
"93139 1.064999 0.172001 0.042000 0.003000 1.0\n",
"93140 1.125999 0.065001 -0.017000 -0.057999 1.0\n",
"\n",
"[93141 rows x 5 columns]\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('X_pd_all')\n",
"print(X_pd_all)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "2a59e1592037847daee2a1d59e66e82e",
"grade": false,
"grade_id": "cell-b2cdd1e8ff443c4f",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Add a 'target description' column to your existing `X_pd_all` DataFrame, with fields 'Background' and 'RR Lyrae' to specify the target type.*"
]
},
{
"cell_type": "code",
"execution_count": 43,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "0175b69fda7663e0c7f1e816703b8273",
"grade": false,
"grade_id": "cell-f94161f729fadf8c",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
2025-02-27 00:32:43 +00:00
"X_pd_all['target description'] = X_pd_all.apply(lambda r : 'Background' if r['target'] == 0 else 'RR Lyrae', axis=1)\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 44,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9b02148198742613e8a4ca524e073b71",
"grade": true,
"grade_id": "cell-cb2480e79d82c641",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" u-g g-r r-i i-z target target description\n",
"0 1.250999 0.394000 0.137000 0.061999 0.0 Background\n",
"1 1.048000 0.339001 0.151999 0.023001 0.0 Background\n",
"2 1.008001 0.341999 0.129000 0.203001 0.0 Background\n",
"3 0.965000 0.392000 0.149000 0.150000 0.0 Background\n",
"4 1.040001 0.333000 0.125999 0.101999 0.0 Background\n",
"... ... ... ... ... ... ...\n",
"93136 0.962999 0.059000 -0.025999 -0.025000 1.0 RR Lyrae\n",
"93137 1.059999 0.185001 0.050999 -0.023998 1.0 RR Lyrae\n",
"93138 1.044001 0.212000 0.035000 0.002001 1.0 RR Lyrae\n",
"93139 1.064999 0.172001 0.042000 0.003000 1.0 RR Lyrae\n",
"93140 1.125999 0.065001 -0.017000 -0.057999 1.0 RR Lyrae\n",
"\n",
"[93141 rows x 6 columns]\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"print(X_pd_all)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "696a530a5fa11fd74e886aad53c4730a",
"grade": false,
"grade_id": "cell-add4e81373265098",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*How many RR Lyrae variable stars are there in the dataset (i.e compute `n_rrlyrae`)?*"
]
},
{
"cell_type": "code",
"execution_count": 45,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9e45a92c9efcd4a5dc60ae91cfaece89",
"grade": false,
"grade_id": "cell-753a59a39e3df18c",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"n_rrlyrae = (X_pd_all[\"target description\"] == \"RR Lyrae\").sum()\n",
2025-02-27 00:32:43 +00:00
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 46,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e4fa379f2a74a6855884a165bc422e87",
"grade": true,
"grade_id": "cell-c7fa425ec227dd04",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"n_rrlyrae defined.\n",
"n_rrlyrae = 483\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('n_rrlyrae')\n",
"print(\"n_rrlyrae = {0}\".format(n_rrlyrae))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "8836626b58900a14fc8d5cca59c33d20",
"grade": false,
"grade_id": "cell-a267bf2d5be875a6",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*How many background stars are there in the dataset (i.e. compute `n_background`)?*"
]
},
{
"cell_type": "code",
"execution_count": 47,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9390ffb5aa4c435a65609e8d9fbbea17",
"grade": false,
"grade_id": "cell-f902e74120d04b39",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"n_background = (X_pd_all[\"target description\"] == \"Background\").sum()\n",
2025-02-27 00:32:43 +00:00
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 48,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "766f1ee43bdca36d9c5b2eba3f20be3e",
"grade": true,
"grade_id": "cell-dd77ae406ebc1e36",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"n_background defined.\n",
"n_background = 92658\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('n_background')\n",
"print(\"n_background = {0}\".format(n_background))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0ad0f46079adada0deefc29a592e1c23",
"grade": false,
"grade_id": "cell-494facc20b7778b6",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Plot scatter plot pairs for all colour combinations using `seaborn`. Colour the points by target type. Make sure the distribution plots are normalised to have an area of 1 under the curve for each of the classes.*"
]
},
{
"cell_type": "code",
"execution_count": 49,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "2198169c65059d8e1fd6d52582651c9f",
"grade": false,
"grade_id": "cell-ef57ad845334bbaa",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import seaborn as sns; sns.set()"
]
},
{
"cell_type": "code",
"execution_count": 50,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9a5b0a8a06c988cba3bac3fb41ee6ca6",
"grade": true,
"grade_id": "cell-6f8c0ce750628d0e",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABIAAAAPXCAYAAABThDCbAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQABAABJREFUeJzsnWd4HNXZhu+Z2aZd9d671SzJcpN7ww3TMWDA9F4SICEkXxJSgQQSSAhpBELv3WBswLgbV8mWrGLZ6r33un3m+7Hy2mtJBgIBAnNfl394NX3mzJzznPd9XkFRFAUVFRUVFRUVFRUVFRUVFRUVlW8t4td9ACoqKioqKioqKioqKioqKioq/11UAUhFRUVFRUVFRUVFRUVFRUXlW44qAKmoqKioqKioqKioqKioqKh8y1EFIBUVFRUVFRUVFRUVFRUVFZVvOaoApKKioqKioqKioqKioqKiovItRxWAVFRUVFRUVFRUVFRUVFRUVL7lqAKQioqKioqKioqKioqKioqKyrccVQBSUVFRUVFRUVFRUVFRUVFR+ZajCkAqKioqKioqKioqKioqKioq33I0X/cBfNtwOmV6eoa/7sP4TIiiQGCgiZ6eYWRZ+boP50vj23he38ZzgvHPKyTE5wtt83+pDf43+LY+K5+V7/r5w5dzDdR2+M1Ffca/Xr6q6/9F2yB8ue1Qfe4+P+o1+/x8067Zl9EOVVS+aagRQN9hRFFAEAREUfi6D+VL5dt4Xt/Gc4Jv73l9nXzXr+l3/fxBvQbfdtT7+/XyXb3+39Xz/iKo1+zzo14zFZX/PqoApKKioqKioqKioqKioqKiovItRxWAVFRUVFRUVFRUVFRUVFRUVL7lqAKQioqKioqKioqKioqKioqKyrccVQBSUVFRUVH5BiCKAqIkgigiSSKCIKAIAlYZbAogigiqLYLKV4gggKgRQRSQpG9ml1HSiNgVsDgVHIBG8808TpXvLpIk4kTAIivYFb6RbcndztX2o6LyrUetAqaioqLyP4YkidicCha7E0kUMGhFBEVB+foLZqj8hzgFgeqWAT7YV4fDobBkejTpCYE8s76U/KMdiALkTg7n6rMyMGqEb0R1FBUXkkbEYpexOWS0GhG9RkRxyl/3YX1hZEGga8DCht219AxYmJoayoIpkRikb87zJ4siHx9o4P1Pahi2OAgJ8OKqM9PJiA9AVF+IKl8yggCKIGCxyzhlBYNWQicJOE/T3odG7JTU9vDcxjLaukfw0ms4a248q+bEI8pf/3tCEQT6hm1s2FNLe88IkxMCOWNGLF5aAdmptiEVlW8jqgCkoqKi8r+EIFDXMcS/1pXQ2uUq7zs1JYSbLsjEqM7c/U/iROCJd0s5eLTd/VtJdRcxYT5cfVY6+Uc7kBXYX9rG0boe/vi9+WgF0GolABwOpyr+fU3IgsDBox28vOkY/UM2NJLAoqnRXLY8BQ3/u6KsIghsL2zm5Y+OuX8rq+3h3Z3VPHT7PLx1Esp/+eQ0GlcUnNMpjys4yYLAsxuPsKeo1f1bZ6+ZP79awG2rs5idEYrT8T96A1S+cYiiwIhD5t/vllBY0QlARLCJWy/MIibYxHiNXZQEDh5t40+vFLh/M1sdvL29irrWAW67MAsJ17OuKAoOx1csCAkCBRWd/OPtYvdPZbU9rN9dy+9vnUuQt+4bI/aqqKh8eaijBRUVFZWvAUkSsQO9I3b6zA7sCIjS6fN7BEGga9DKb57a7xZ/AAorOrn38b1YT+o7arWSR5j58RB0B66B05ddYlUQQKfToNNpENQ8pc+MKAo0dQ55iD/HaWwfpLy+l+ykYPdv/UM22npGGLA62bi/nnd319I5aENWr/kXQpIEnAgMWJz0DNuxK7jS8U67jsjB8g7++XYx/UM2ABxOha0HG/nTK4U4cd0TQQBhtL07wCPFQpJEdDoNWq30jWpDFrvsIf4cZ9hs5+n1R1D+m4cnCFicCjuKWnlrZzW1HUM4BWFM+qPZ5vQQf07mpY+OYbGrA1eV/xxXm3V9M0WNhNmpcO/jeymr62FmehhzsyNwOGR+89R+ugat47ZZs03mmfePjLv9IzXd2J0KLb1m3tlVw+ZDzYw4ZJQJ2r4oCqPvB+lLO0ebrPCvdSVjfrfanPztzSJU/VRF5duJGgGkoqKi8lUjCFS2DPC3Nw+7B47+3nruvDSH+FBv90yiKAlYHQr9wzZEAXy99bzw4dFxowp6B620dw8T4GvgvT1ldPQOMz0tlIz4ICQRSmp7eG1LBW3dI8SF+3DlmWnEhnojfAmz+LIg0NFnZtvBRhQFlkyPJizQiDTBtiVJxOKUUWTXwFsngvM7GmouiCIf7a+f8O+7i5r58RUz6OgbYeOeWrKSgjlQ1sYHe+rcy7y9vYrZmeHceO5kvryhwXcHURToHLTx8EuHCPDVc+78RHoGRYwGDaH+RrQiKOPMglscMq9sKh93m8fqe+gbthHsrWPIJvPa5jLyy9rRaUVW5MZx9rx47E6FoqouCso7CAs0MX9KJMfKOymv72XxtGgig0xIfP52IWlErA4ZWQZJFNBrhM8VWSBJIqUVHRP+/XBlJ1aHjP5LEJEFUcDqVOgbtCKKAgHeeiob+3jk5UMcv+Tv7aohJtSbX14/C83o9RAEgaaOoQm3Ozhix2xzoNWr3VyViXFpLYI7mk0UXYKP1S7z7rYKdhe1IAgwLzuSs+clcO6CRKJDfTh4tA2LzcmaZSloNSJvbK3ge6uzsTsV+odsOGWFAB89iqLQO2h17y8+wpeVs+Pw99ETGezNY68Xcqy+1/33Fz88ys0XZDIrIwxRAUVREASwI9DQOsgnh1sweWk4Y0YM/ibdF0pzFASBhrYBnBNE+NS1DmC2OTFp1VgBFZVvG+qXUeUbQ0VjHy9vrqBnwMKczHAuWZyEVqMOZ1S+OUijETpfVKwYMDv43XN5HkJO35CVB57N49EfLMJH5zJjrG0fZmjEhp+3HoNeYsTq4JqzMvhwby07C5txygo+Ri3LZ8UxIy2M1u5hfv3Ufvd2PzncwrkLEjHoJN7cWuneV3VzP799+gB3rMkhNy0Eu835H5+LLAg8s7GMvcUnZuK3HmxkRnoYt16QOaaDKosim/Ia2DDq2REb7sP150wmJsT0ucQoQXDt2+pwCUl6rYjmfyAIRpRELA4Zh1NGpxFd/hGnCbF3ygpldd28+nE5N1+QhclLy++ezSMpyo+Vs+PwMekQEMg/2saR2h78ffQE+RnQjNi+wrP6ZiCKgjtl6PNgcSr84l97yUoOZl52JPtKW1k8LZr2HjPtPWZSYvzRakVkp4JeK7oHZla7k4Fh13XOSAhkWW4sXnoNigJ7S1po6hjCS+/Pj//2CWarAwCr3clHB+qZlh7KIy8d8hgcvr29kpsuyGLY4uDX/95PVlIQd67JmVBIHQ9ZFNhb3Mpb26voG7QSFmjk6lXppMcHYnM43c+dXhJPe52cn+JN8mVkfymCwKGKLp56rxSr3Ymft46fXDWDR14p4NQm0dgxxBvbKrlyeQqyU0YUITLEm59dMxNwvdM+3l9P39CJ66mV1P7DdxmNRsRsl7E7ZTSSiJfmxDMvSi7hsavXwsCwjehQb4x6icqWAXQ6DY++UuB+lsICjRyt6+FwRSc/uWoGP/7bJ+597CpsJj7Clx9ePpUBs5227hHsDplPDjdTWNHB1Wdl8MAtcxk02wj286KhfZA3t1YQGexNbLiPh/hznCffLSUtLpCeATNRIT5oNCIPPZ9HTfOAe5kP9tZx0ZJkVs2O+49FIEH49L7Ml5nmKUki5tFvn1YS8dKKX33Km4qKCqAKQCrfEKqa+3n41UIigoxkxAeyo7CZzj4zd6zO/tJTVVRUPi+KIGBxyByr6QYgLS4Qg0b8j6JnREnknZ3l4w6gZFnh/U9quHJFCsM2J4qi0NVvpn/Ixjs7qugZsKDTiCzLjeWn18zklU3HuObsDN7ZUUVmYhD/eKtozHanpoTwwLN54x7LM+8fISFiDj5G7af6lej0GqwOmcERO3qt5Daerm4Z8BB/jnPwaDvHpkWTFe/v7mTKgsA/3y6
2025-02-27 00:32:43 +00:00
"text/plain": [
"<Figure size 1160.35x1000 with 20 Axes>"
2025-02-27 00:32:43 +00:00
]
},
"metadata": {},
"output_type": "display_data"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# YOUR CODE HERE\n",
"sns.pairplot(X_pd_all.drop(\"target\", axis=1), hue=\"target description\", diag_kind=\"kde\")\n",
2025-02-27 00:32:43 +00:00
"plt.show()\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "cf6915bf6aa3ce7f12e619edc09b06e1",
"grade": false,
"grade_id": "cell-149d6b589054b26b",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Let's separate the data into training and test sets, keeping 25% of the data for testing. "
]
},
{
"cell_type": "code",
"execution_count": 51,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "386593a707ee1226d8721ea92970f653",
"grade": false,
"grade_id": "cell-22b31f7602338d7f",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "55ce997145aece349fa6ae7401d4cb1d",
"grade": false,
"grade_id": "cell-34fc23b040e948f7",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"First let's consider 1D classification for the zeroth colour, i.e. $u-g$. "
]
},
{
"cell_type": "code",
"execution_count": 52,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "7c77a8433e4898de59b775624cf05767",
"grade": false,
"grade_id": "cell-53b81bcac2b85a55",
"locked": true,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"data": {
"text/plain": [
"'u-g'"
]
},
"execution_count": 52,
2025-02-27 00:32:43 +00:00
"metadata": {},
"output_type": "execute_result"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"ind = 0\n",
"col=cols[ind]\n",
"col"
]
},
{
"cell_type": "code",
"execution_count": 53,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "b3578a7b38fba9d5a52458bf8df86c7f",
"grade": false,
"grade_id": "cell-7a87a3946325c16c",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"X_train_1d = X_train[:, ind]\n",
"X_train_1d = X_train_1d.reshape(-1,1)\n",
"X_test_1d = X_test[:, ind]\n",
"X_test_1d = X_test_1d.reshape(-1,1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "838b5dad42b1e778c11cd408b6987daf",
"grade": false,
"grade_id": "cell-bb6c2985470ef60d",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"To get some further intuition about the 1D classiciation problem consider a 1D plot of\n",
"class against colour."
]
},
{
"cell_type": "code",
"execution_count": 54,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c32035c4ae5b83d6b7f73e39bda0d7a6",
"grade": false,
"grade_id": "cell-aac19aaa4019fefc",
"locked": true,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"data": {
"text/plain": [
2025-03-02 00:11:57 +00:00
"<matplotlib.legend.Legend at 0x783473d50490>"
2025-02-27 00:32:43 +00:00
]
},
"execution_count": 54,
2025-02-27 00:32:43 +00:00
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA1gAAAHICAYAAABTZkvCAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAX+NJREFUeJzt3Xd8FNX6x/Hv7IY0woYkdJASlGIBVCwIBhULqBQRFQuCFbxRKeJPsGAXxHalKKIoyFUR9YqiEAVUIqBYaFIVQlGQACFkE0jdnd8fuawsaZvNhGTg8369eMGenXPm2Xl2lzw5M2cM0zRNAQAAAAAqzFHVAQAAAADA8YICCwAAAAAsQoEFAAAAABahwAIAAAAAi1BgAQAAAIBFKLAAAAAAwCIUWAAAAABgEQosAAAAALBISFUHUJ2Zpimvt3rch9nhMKpNLAgMObMX8mUv5Mt+yJm9kC/7IWeVy+EwZBhGQNtSYJXC6zW1f//Bqg5DISEOxcTUlNt9SAUF3qoOBwEgZ/ZCvuyFfNkPObMX8mU/5KzyxcbWlNMZWIHFKYIAAAAAYBEKLAAAAACwCAUWAAAAAFiEAgsAAAAALEKBBQAAAAAWYRVBC3i9Xnk8BZU4vqGcHKfy8nLl8bD8ph0EkjOnM0QOB7/jAAAAOJ5QYFWAaZpyu/crOzur0ve1b59DXi/LbtpJIDmLiIiSyxUb8H0VAAAAUL1RYFXA4eIqKipGoaFhlfpDstNpMHtlM6XlzDRN5eXlKisrXZIUHR13LEMDAABAJaHACpLX6/EVV1FRrkrfX0iIgxvH2UxZOQsNDZMkZWWlq1atGE4XBAAAOA7wE12QPB6PpH9+SAaCcfj9U5nX8AEAAODYqVYF1vbt2zVmzBj17t1bp556qq6++uqA+pmmqalTp+qiiy5Su3btdMMNN2jVqlWVG+z/cO0MKoL3DwAAwPGlWp0i+Mcff2jx4sVq3769vF6vTDOwa47efPNNTZgwQSNHjlTr1q313nvv6fbbb9dnn32mk046qZKjBlBdmaapvbszVbdBLV8xW1zbkdvv+dtd7HPB7q86K+n47Pk7U5Kpeg1dAR23ksaz4nh4vV79vi5VrU6rL8MwtHd3pho2iQ7otZS2jaRi/11ZeTv6uAayz9LiTd3lVnraQbU+vYHf6cVer1eb1qYqOiZcGenZan16A0nyHUOHw+Ebt079KO35O9M3jmEY2vO3W5Kheg0L4/J4PPp12Q6deX4Tbdm4z5eH1F1u7d93ULVjI3Qg7ZAkQzF1ImUYhgxDRd47f/91QNt+3ydnqPTL0h2KqRMpp9OhqFphynRnyzAciq4doTbtGvryXKd+lPalZql2XLi+nrNBYeFOndq+ofbvO6Rtm9MUV6+mWpxSV3UbROmXJduUlZmnug0jdSizQAUFBdq5PUOdLmmhndsylL4vW+3ObaisjALF1IlUnfo19fP327R3d6bCwkPkNT06mJmvlm3jtHlDmsLCQnTmeU20c4dbTZrX1s7tbp3VqYk2r9+n3Nwcrfn5b+Xl56uWK0Lh4SFqfkqcdqSkq8N5TbRze4aaNI/W72v36ryLmurz99eqSYto5RwqUGTNUBkOQ4ZhqEnz2lr9019q1jJWhsMh94EcRceEy+v1KisjV+6MbOXletXunEbasmG/Wp1eR+70PJmmV+6MHDU/uY4aNI6WaZratDZVMXGRqtsgSn+s36OWbepoxQ9/qUnzaB1IO6Qsd77O7NRYK3/8S03jY/93zHZIhqmzL2iqTWtTtf2P/bqsTxul78uR5NWqH//UGec0ksfj1a/LduisTicpbc9BSYbq1K+p39elqnZspCTpwP5stTqtnvalHlRx3x2puzKUnpat1qfXD+h0+MI+br/3ZrDfLUeO1eq0+tqXmuX3Hg+E3b7bUfUMM9Aq5hjwer2+D96oUaO0du1affHFF6X2yc3N1QUXXKCbb75ZI0aMkCTl5eWpe/fuSkhI0BNPPBF0PB6PV/v3Hyz2ufz8PKWl/a24uIaqUSM06H0Eimuw7CeQnB3r99GJZtPaVH3zxUZ1u7qNWp1ev8S2kBCHYmJq6ofFW7Tg8w1+z1V0f9VZacdHUpnHrazxrDge3yX9rg2r/tapHRqqQZNoffPFRl3Wu606JbRUevpB32esvPGZUrH/rqy8HX1cA9lnWfFK0qkdGqpr91a+PoeP12GndmgoU/Idw67dW/nGPbVDQ63/37ZHHt/DMbY6vb4+e3+Vdu3IUGRUDR3Kyi+yXUlKek+V5cjxD8cXGuZUXq6nxD614yJ0IC07oPEr0ufwMahOul3dRrv+yvDlvFHTaL98HenI43jk6z9y26OPdZNmteU1Te3akeHX5/B+jnRkW0n5P/r9WpKj+xx+TwTz3XLkWCXFGGg81fm7/fD/Y0d+L8JasbE15XQGdvJftTpFMJiL/FesWKGsrCz16NHD1xYaGqrLLrtMycnJVoZ33Jo27Q116dLR9+fKK7vpnnvu0A8/LCmybb9+PX3bde16nvr166lnnnlcqam7y9zPvHlz1aVLRx04cKASXgXgz+s19fP32yRJPy/ZJq/XLLbNt73Hq+XJW4t9Ltj9VWdlHR9J+un7rWUet5LGKyjwVvh4FBR4tXF14XfLhtV/66f/5Wf54q3yev75AaK88f30/TbfWEf+u7LydvRxXZ68tcx9+se71f/f/+srSetX/+37YarweP3tN876VX/7fvjesGa38vI8vrE2rN7tt93yxSm+xz99v005OQW+H0YP/wC+/og8lObI986R45Zlw+p/4tiwpjC+0oorSeUulILtU92KK6nwvXRkQX10vo505HE88vUfue3Rx/qv7Qd8Yx7Z5+ji6ui2kvK/YfXuMn/4L3zvH/EeX/W373F5v1uOHss/xsA+73b7bkf1UK0KrGCkpBR+cOPj4/3aW7ZsqV27diknJ6cqwrKdsLAwTZnyjqZMeUcPPfSI8vLy9NBDI/Tbb6uLbHvRRd00Zco7mjBhivr1u0GLF3+r//u/YSooYKEGVB9/rN+jzIzCz7/7QI42r99TbNtha1fukvtA8c8Fu7/qrKzjI0mZGbllHreSxlu6cHOFj8eShZt9p4qbppTlzvWNt3bVrlJfS2nxZWbk+MY68t+Vlbejj2uWO7fMffrHm+v/7//1lSSZ0tKFmyUdPl4lx2F6Tc3/+DffWEefwHIwM8/378yMHH06c0Uxg/yTh9Ic+d45ctyymOY/cZj8IFumQHJRFUrKv2mavvdrSQrf+/6v6/Dj8n63FDfWP2MG9nm323c7qodqdQ1WMNxut0JDQxUW5r+an8vlkmmaysjIUHh4eNDjh4QUX4N6vdafg1vSOb6H/2kYKvU/z4pwOBw6/fQzfI9PPfV09e17lebP/0JnnNHeb9vY2Fjftu3bn6m8vDxNnfqaNm5cr9NPb1c5AZYgNzdHYWHB57eylDdnTqdR4nsN5ef1evXLkm1+bT99v1WG/D+3Py/Zpjbt6sswpO++2lTsc4HMrBe3v/L0P9ZKire4H2h/+r7obMXRr6248Q7PPpTUpywFBR5tXFPyzPjipN81IPE8ORwq89gXF19JrM6b1+vVL0fMXgWyz/LEKxXOKnW+tGWR2aviFDfzUJJgZnmOVDiLwalKJ6rlySlSMf//bVizW127n6KQEGeR57xer99sb3EC/W4JZKyyPu92+m4/fOpaoKewoXLZvsCqTA6HoZiYmsU+l5Pj1L59Dkt/MN64ZrcWfL5Bl/VuqzZnNCjyfGV9aByOwh86j3wdDRs2UO3aMdqzJ7XI63M4/F9zmzZtJUl79xbdtvj9+Pc/cCBdPXt21wMPPKQ+ffr69bn99lvVsGFDPfvs8/rii8/1zDNP6M0339HUqVP022+rdfXVvTRy5Ci9995MLVz4lXbs2KHQ0Bo69dTTNHToA2ratJnfeL/9tlpTpkzWunVr5XSGqHPnLho2bKRiY2PLc8gCVlbOvF5DDodD0dGRFfpFAPyt+eUv32zUYcX9FtN9IEc7t2VIZuFF2sU91+7sJkHtrzz9j7WS4i1Oacft8Gsrbryji7XyHo8
"text/plain": [
"<Figure size 1000x500 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"def plot_scatter():\n",
" plt.figure(figsize=(10,5))\n",
" plt.scatter(X_train_1d[y_train==1], y_train[y_train==1], c='m', marker='^', label='RR Lyrae')\n",
" plt.scatter(X_train_1d[y_train==0], y_train[y_train==0], c='c', marker='v', label='Background')\n",
" plt.xlabel('$' + col + '$')\n",
" plt.ylabel('Probability of type RR Lyrae')\n",
"plot_scatter() \n",
"plt.legend()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "e9cb694a42ffb3138b18c2e405645ce1",
"grade": false,
"grade_id": "cell-bd01fa3c7086288f",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Given the plot shown above, comment on how well you expect logistic regression to perform.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "cf842b131e1b494992c17edc41569b9d",
"grade": true,
"grade_id": "cell-1cad643fb7816037",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true
}
},
"source": [
2025-02-27 00:32:43 +00:00
"I think logistic regression would perform badly because the background class extends beyond both the minimum and maximum instances of the RR Lyrae class. There is nowhere to place a boundary in one dimension which separates the RR Lyrae group from the Background group."
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "639ffbefef37614d5671b1df92842700",
"grade": false,
"grade_id": "cell-00dca71454bb5330",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Where would you guess the decision bounary should lie? Set the variable `decision_boundary_guess` to your guess.*"
]
},
{
"cell_type": "code",
"execution_count": 55,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "1f8f91ff116877b36b5f768c4e7bbf53",
"grade": false,
"grade_id": "cell-5eef717d4fab1828",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
2025-02-27 00:32:43 +00:00
"decision_boundary_guess = .88\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 56,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "59d1034042c8531f42243eb3de1e6de7",
"grade": true,
"grade_id": "cell-00ef975a7880050f",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-02-27 00:32:43 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"decision_boundary_guess defined.\n",
"decision_boundary_guess = 0.8800\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('decision_boundary_guess')\n",
"print(\"decision_boundary_guess = {0:.4f}\".format(decision_boundary_guess))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "f6e3c3c0ceece7871bf22c9255a51295",
"grade": false,
"grade_id": "cell-4c704d78b7b22e68",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Use Scikit-Learn to perform logistic regression to classify the two classes for this 1D problem."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "1acac14798c5dc7542fc6ea9406938bc",
"grade": false,
"grade_id": "cell-618989081fddad31",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"First, set the inverse regularation strength `C` such that regularisation is effecitvely not performed."
]
},
{
"cell_type": "code",
"execution_count": 57,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "58f98ba91242ad172c6817655bedd385",
"grade": false,
"grade_id": "cell-d7b94ebcadcc6111",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"C = 1e10"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "c9845c6ff70b8fbeaf82ce21e36fc34b",
"grade": false,
"grade_id": "cell-70bcad6835ff868d",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Second, fit the model using Scikit-Learn. Use the variable `clf` for your classification model.*"
]
},
{
"cell_type": "code",
"execution_count": 58,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c1620a833ae3b05de4b1a57a1ce9ab03",
"grade": false,
"grade_id": "cell-f1790c24720c07d8",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-container-id-3 {\n",
" /* Definition of color scheme common for light and dark mode */\n",
" --sklearn-color-text: #000;\n",
" --sklearn-color-text-muted: #666;\n",
" --sklearn-color-line: gray;\n",
" /* Definition of color scheme for unfitted estimators */\n",
" --sklearn-color-unfitted-level-0: #fff5e6;\n",
" --sklearn-color-unfitted-level-1: #f6e4d2;\n",
" --sklearn-color-unfitted-level-2: #ffe0b3;\n",
" --sklearn-color-unfitted-level-3: chocolate;\n",
" /* Definition of color scheme for fitted estimators */\n",
" --sklearn-color-fitted-level-0: #f0f8ff;\n",
" --sklearn-color-fitted-level-1: #d4ebff;\n",
" --sklearn-color-fitted-level-2: #b3dbfd;\n",
" --sklearn-color-fitted-level-3: cornflowerblue;\n",
"\n",
" /* Specific color for light theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-icon: #696969;\n",
"\n",
" @media (prefers-color-scheme: dark) {\n",
" /* Redefinition of color scheme for dark theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-icon: #878787;\n",
" }\n",
"}\n",
"\n",
"#sk-container-id-3 {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"#sk-container-id-3 pre {\n",
" padding: 0;\n",
"}\n",
"\n",
"#sk-container-id-3 input.sk-hidden--visually {\n",
" border: 0;\n",
" clip: rect(1px 1px 1px 1px);\n",
" clip: rect(1px, 1px, 1px, 1px);\n",
" height: 1px;\n",
" margin: -1px;\n",
" overflow: hidden;\n",
" padding: 0;\n",
" position: absolute;\n",
" width: 1px;\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-dashed-wrapped {\n",
" border: 1px dashed var(--sklearn-color-line);\n",
" margin: 0 0.4em 0.5em 0.4em;\n",
" box-sizing: border-box;\n",
" padding-bottom: 0.4em;\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-container {\n",
" /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
" but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
" so we also need the `!important` here to be able to override the\n",
" default hidden behavior on the sphinx rendered scikit-learn.org.\n",
" See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
" display: inline-block !important;\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-text-repr-fallback {\n",
" display: none;\n",
"}\n",
"\n",
"div.sk-parallel-item,\n",
"div.sk-serial,\n",
"div.sk-item {\n",
" /* draw centered vertical line to link estimators */\n",
" background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
" background-size: 2px 100%;\n",
" background-repeat: no-repeat;\n",
" background-position: center center;\n",
"}\n",
"\n",
"/* Parallel-specific style estimator block */\n",
"\n",
"#sk-container-id-3 div.sk-parallel-item::after {\n",
" content: \"\";\n",
" width: 100%;\n",
" border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
" flex-grow: 1;\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-parallel {\n",
" display: flex;\n",
" align-items: stretch;\n",
" justify-content: center;\n",
" background-color: var(--sklearn-color-background);\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-parallel-item {\n",
" display: flex;\n",
" flex-direction: column;\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-parallel-item:first-child::after {\n",
" align-self: flex-end;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-parallel-item:last-child::after {\n",
" align-self: flex-start;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-parallel-item:only-child::after {\n",
" width: 0;\n",
"}\n",
"\n",
"/* Serial-specific style estimator block */\n",
"\n",
"#sk-container-id-3 div.sk-serial {\n",
" display: flex;\n",
" flex-direction: column;\n",
" align-items: center;\n",
" background-color: var(--sklearn-color-background);\n",
" padding-right: 1em;\n",
" padding-left: 1em;\n",
"}\n",
"\n",
"\n",
"/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
"clickable and can be expanded/collapsed.\n",
"- Pipeline and ColumnTransformer use this feature and define the default style\n",
"- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
"*/\n",
"\n",
"/* Pipeline and ColumnTransformer style (default) */\n",
"\n",
"#sk-container-id-3 div.sk-toggleable {\n",
" /* Default theme specific background. It is overwritten whether we have a\n",
" specific estimator or a Pipeline/ColumnTransformer */\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"/* Toggleable label */\n",
"#sk-container-id-3 label.sk-toggleable__label {\n",
" cursor: pointer;\n",
" display: flex;\n",
" width: 100%;\n",
" margin-bottom: 0;\n",
" padding: 0.5em;\n",
" box-sizing: border-box;\n",
" text-align: center;\n",
" align-items: start;\n",
" justify-content: space-between;\n",
" gap: 0.5em;\n",
"}\n",
"\n",
"#sk-container-id-3 label.sk-toggleable__label .caption {\n",
" font-size: 0.6rem;\n",
" font-weight: lighter;\n",
" color: var(--sklearn-color-text-muted);\n",
"}\n",
"\n",
"#sk-container-id-3 label.sk-toggleable__label-arrow:before {\n",
" /* Arrow on the left of the label */\n",
" content: \"▸\";\n",
" float: left;\n",
" margin-right: 0.25em;\n",
" color: var(--sklearn-color-icon);\n",
"}\n",
"\n",
"#sk-container-id-3 label.sk-toggleable__label-arrow:hover:before {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"/* Toggleable content - dropdown */\n",
"\n",
"#sk-container-id-3 div.sk-toggleable__content {\n",
" max-height: 0;\n",
" max-width: 0;\n",
" overflow: hidden;\n",
" text-align: left;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-toggleable__content.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-toggleable__content pre {\n",
" margin: 0.2em;\n",
" border-radius: 0.25em;\n",
" color: var(--sklearn-color-text);\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-toggleable__content.fitted pre {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-3 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
" /* Expand drop-down */\n",
" max-height: 200px;\n",
" max-width: 100%;\n",
" overflow: auto;\n",
"}\n",
"\n",
"#sk-container-id-3 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
" content: \"▾\";\n",
"}\n",
"\n",
"/* Pipeline/ColumnTransformer-specific style */\n",
"\n",
"#sk-container-id-3 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator-specific style */\n",
"\n",
"/* Colorize estimator box */\n",
"#sk-container-id-3 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-label label.sk-toggleable__label,\n",
"#sk-container-id-3 div.sk-label label {\n",
" /* The background is the default theme color */\n",
" color: var(--sklearn-color-text-on-default-background);\n",
"}\n",
"\n",
"/* On hover, darken the color of the background */\n",
"#sk-container-id-3 div.sk-label:hover label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"/* Label box, darken color on hover, fitted */\n",
"#sk-container-id-3 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator label */\n",
"\n",
"#sk-container-id-3 div.sk-label label {\n",
" font-family: monospace;\n",
" font-weight: bold;\n",
" display: inline-block;\n",
" line-height: 1.2em;\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-label-container {\n",
" text-align: center;\n",
"}\n",
"\n",
"/* Estimator-specific */\n",
"#sk-container-id-3 div.sk-estimator {\n",
" font-family: monospace;\n",
" border: 1px dotted var(--sklearn-color-border-box);\n",
" border-radius: 0.25em;\n",
" box-sizing: border-box;\n",
" margin-bottom: 0.5em;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-estimator.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"/* on hover */\n",
"#sk-container-id-3 div.sk-estimator:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-3 div.sk-estimator.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
"\n",
"/* Common style for \"i\" and \"?\" */\n",
"\n",
".sk-estimator-doc-link,\n",
"a:link.sk-estimator-doc-link,\n",
"a:visited.sk-estimator-doc-link {\n",
" float: right;\n",
" font-size: smaller;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1em;\n",
" height: 1em;\n",
" width: 1em;\n",
" text-decoration: none !important;\n",
" margin-left: 0.5em;\n",
" text-align: center;\n",
" /* unfitted */\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted,\n",
"a:link.sk-estimator-doc-link.fitted,\n",
"a:visited.sk-estimator-doc-link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"/* Span, style for the box shown on hovering the info icon */\n",
".sk-estimator-doc-link span {\n",
" display: none;\n",
" z-index: 9999;\n",
" position: relative;\n",
" font-weight: normal;\n",
" right: .2ex;\n",
" padding: .5ex;\n",
" margin: .5ex;\n",
" width: min-content;\n",
" min-width: 20ex;\n",
" max-width: 50ex;\n",
" color: var(--sklearn-color-text);\n",
" box-shadow: 2pt 2pt 4pt #999;\n",
" /* unfitted */\n",
" background: var(--sklearn-color-unfitted-level-0);\n",
" border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted span {\n",
" /* fitted */\n",
" background: var(--sklearn-color-fitted-level-0);\n",
" border: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link:hover span {\n",
" display: block;\n",
"}\n",
"\n",
"/* \"?\"-specific style due to the `<a>` HTML tag */\n",
"\n",
"#sk-container-id-3 a.estimator_doc_link {\n",
" float: right;\n",
" font-size: 1rem;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1rem;\n",
" height: 1rem;\n",
" width: 1rem;\n",
" text-decoration: none;\n",
" /* unfitted */\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
"}\n",
"\n",
"#sk-container-id-3 a.estimator_doc_link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"#sk-container-id-3 a.estimator_doc_link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"#sk-container-id-3 a.estimator_doc_link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"</style><div id=\"sk-container-id-3\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LogisticRegression()</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-3\" type=\"checkbox\" checked><label for=\"sk-estimator-id-3\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow\"><div><div>LogisticRegression</div></div><div><a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.6/modules/generated/sklearn.linear_model.LogisticRegression.html\">?<span>Documentation for LogisticRegression</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></div></label><div class=\"sk-toggleable__content fitted\"><pre>LogisticRegression()</pre></div> </div></div></div></div>"
],
"text/plain": [
"LogisticRegression()"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.linear_model import LogisticRegression\n",
"# YOUR CODE HERE\n",
"clf = LogisticRegression()\n",
"clf.fit(X_train_1d, y_train)\n",
"\n",
"#raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "4af20705b1b5e8ba2ccb810b4ac4a8f6",
"grade": true,
"grade_id": "cell-f6edcb4e5f610518",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"clf defined.\n"
]
}
],
"source": [
"check_var_defined('clf')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "8241cadb1bf5fa6ca1e650cf83a16181",
"grade": false,
"grade_id": "cell-1aafef5deaf49404",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Compute the decision boundary of the logistic regression model fitted by Scikit-Learn. User variable `decision_boundary_sklearn` for your result.*\n",
"\n",
"(Ensure your result is a scalar and not an array of length 1.)"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "3ada1a115b287bb9f88afb2c54886805",
"grade": false,
"grade_id": "cell-01cd8a3ebc69de43",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"decision_boundary_sklearn = -clf.intercept_[0]/clf.coef_[0][0]\n",
"#raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "d8f8933baaf6b9eec389763501544d46",
"grade": true,
"grade_id": "cell-0ed39065189e2fae",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"decision_boundary_sklearn = 1.4335\n"
]
}
],
"source": [
"assert not hasattr(decision_boundary_sklearn, \"__len__\")\n",
"print(\"decision_boundary_sklearn = {0:.4f}\".format(decision_boundary_sklearn))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "61747e5a45a3f44ed6fecb3d4840739f",
"grade": false,
"grade_id": "cell-b634a6057f675df8",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Evaluate the probabilities predicted by your logistic regression model over the domain specified by the variable `X_1d_new`. Use variable `y_1d_proba` for your computed probabilities.*"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "713631c608bc415a882d456ebd580a72",
"grade": false,
"grade_id": "cell-b3f7cdf5d4698ad6",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"X_1d_new = np.linspace(0.3, 2.0, 1000).reshape(-1, 1)\n",
"# YOUR CODE HERE\n",
"y_1d_proba = clf.predict_proba(X_1d_new)\n",
"#raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "8d32cd84ae52ee452b0b5d3d026c81e7",
"grade": true,
"grade_id": "cell-bb76289d4e36fcb0",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"y_1d_proba defined.\n"
]
}
],
"source": [
"check_var_defined('y_1d_proba')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "d257001f04351ccc61372e6e3ef4ed24",
"grade": false,
"grade_id": "cell-84d06c82f79d0657",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Plot the probability of a star being of type RR Lyrae against the colour variable considered. Also plot the probability of being a Background star. Overlay these plots on the scatter plot of class types. Also plot the decision boundary that you guessed previously and the one computed by Scikit-Learn.*"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9406aeb0f224d10e25c85c23229913ed",
"grade": true,
"grade_id": "cell-1c623b6df631aa69",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true
}
},
"outputs": [
{
"data": {
"text/plain": [
2025-03-02 00:11:57 +00:00
"<matplotlib.legend.Legend at 0x783471bcec90>"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA1gAAAHICAYAAABTZkvCAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAuz9JREFUeJzs3Xd4FNXXwPHvzLb0TgKEXkLoXXpRFCw0y6tYUKyggIIVRUUsgKioCIoFQbGi/ixIEQEBKSK999BrSK/bZt4/FlZCAkk2ZVPO53ny7O7MnZmTvdnNnr1N0XVdRwghhBBCCCFEkaneDkAIIYQQQgghKgpJsIQQQgghhBCimEiCJYQQQgghhBDFRBIsIYQQQgghhCgmkmAJIYQQQgghRDGRBEsIIYQQQgghiokkWEIIIYQQQghRTCTBEkIIIYQQQohiYvR2AGWZrutommfrMKuq4vGxomRJ3ZRtUj9ll9RN2Sb1U7ZJ/ZRdUjdlW1mpH1VVUBSlQGUlwboCTdNJTMwo9HFGo0poqD+pqZk4HFoJRCY8JXVTtkn9lF1SN2Wb1E/ZJvVTdkndlG1lqX7CwvwxGAqWYEkXQSGEEEIIIYQoJpJgCSGEEEIIIUQxkQRLCCGEEEIIIYqJJFhCCCGEEEIIUUwkwRJCCCGEEEKIYiIJlhBCCCGEEEIUE0mwhBBCCCGEEKKYSIIlhBBCCCGEEMVEEiwhhBBCCCGEKCaSYAkhhBBCCCFEMSlTCdaRI0d4+eWXGTBgAE2aNKFv374FOk7XdT755BN69uxJixYtuOOOO9iyZUvJBiuEEEIIIYQQlzB6O4CL7d+/nxUrVtCyZUs0TUPX9QId9+mnnzJ16lSefvppGjVqxNdff80DDzzAr7/+Ss2aNUs4aiGEKBxd14k/nUZEVADnzqRTpWogAPGn06hSNRBFUXKVvXS7J9cLj/Rn/66zNGwSScLZjBzXvTiWi6+jaRr7dp6hcYuqnDiajI+/gbOnUt3ldF3n7Kk0dF1HUSCyWtBl48yr7MXXP3MylaNxSbTpVIP9O8+SkpxNUIiFRs2qsn9nPE7NiUE10KBJBBvXHMU/wExqSjaJ8RnUbRhOeGQguq5xcM9Zdm09Q1iEBV8/H1p1qIHT4eTP3/ZiMOmYjCaMJgWHw0lmqhP/YCOqqpKZZsdmdcUWVcsHW6ZOYoIVNDBZILK6HycOZ+LjD04bBAZb8A80cfp4Ok4NgkN9CA33Q1EgMMgH/0ALZ06kkJnpwOJjJC0li/qNI4jbk0hgsJnAYF8Cgy2gK5w9lU7jFlGoBhVdh+TELBo1i0JVVXRd58zJVBLPpRMWEUBktcAcdeV0Oln/9yGiqgdRp2F4sf7tCCGEKDxFL2gWUwo0TUNVXY1qY8aMYceOHfz+++9XPMZqtdK5c2fuvvtunnzySQBsNhvXX3893bt355VXXvE4HqdTIzExo9DHGY0qoaH+JCVl4HBoHl9fFD+pm7KtstTP3h1nWPb7Hpq0qsauLafo1TcWHVj2+x569Y0lpllUrrKXbvfketVrBXPyaIr79uLrXhzLxddZvmgfu7ecIrp2CCeOJNOsdXV2bD7pLnfh3BdcKc68yl56fQC/ABOZ6XZ3uZBwX5ITstyPL91fUTVpVY0e18fket4uratfv9nCyaMpADRrU51uvRsCxfO3I4pHZXlvK4+kbsq2slQ/YWH+GAwF6/xXproIXkiuCmPTpk2kp6dzww03uLeZzWauu+46Vq5cWZzhCSFEkWmazvq/DwOwe9tpAP79+zD/rjwEwPpVh9E0PVfZi7d7er0LH8Iv3F583QuxXHwdh0Njz1bX9hNHkgHYufWUu5zDobnPfcG/fx/KM86L47hg3cpD7utfSK6AXMnTxclVXvsrqt1bT2OzuVqncm7/rw6ysx3u+gTYsekkDodWLH87QgghPFOmugh6Ii4uDoB69erl2F6/fn2++OILsrOz8fHx8UZoxSbL6mD/8ZQrdpm8cu+Py+/Mr9eIh6dFufJOz66XbzxX+D3P3xoMCoEJWaSlZeF06vkdli9Pu91c6bCSeu48rS9Pn5/8/7ZyFzAYFBIy7K76ceT99+55PAqK4noa3Pcvvj1/bvX8BXKVOb9NVc5Hf6GsQp7bLpS/1P5dZ0lLyQZAP/+h98JjgNTkbA7sOktMs6gcZS/eXhgXn+NSF2+/EMvF11m15ECu952Ly61eciDXudNSrHnGmVcc6anWQv0ulY2u6yz8cTtpKdZLtrtuU5Oz+XnOplzHrV5ygKo1gov8tyOEEKXJ9f9Gd73J6TroGjoq4O/t0Aqt3CdYqampmM1mLBZLju1BQUHouk5KSkqREiyjsfCtaheaDwvajJifmf/bzaZ98cVyLiFE6Tmfd6GeT9JAQXNqOfZf+Ln48c4Fuwj95zApiVloaO7t+xfupvqOkxhUFVVVMJz/UfO4NRlUDAaF3ZtPYT9/DvX8ef67VfLcvmJFHGE1A9mx9RQKOuplEvALLSmX+vfvQ8S2iHL3StA0jQ2XtF6Jgrm4dSovl7bugas18FhcYo5t61cdzlEnonQV9+cCUXwqWt3omhMcNvTzP677VnSHHS7cag50pwM0Bzid6JoDnA73LZrTvT9XOc2B7nSC7gRdA01zJULnb13bnOgX7btQ7r9tzpzHXFwur9+pY39M7W8v5WeyaMp9glWSVFUhNNTzrDkoyLdY4ri6fS3Ss+1ctodHPsPorrQ33xF4V9ivX2FnUUb25TcssEi/TxFivvL+fGK+4vPo8WmLVAdXvm5J/k1dIWbPD73i0bp+fq8Omq6j6/r5L8j0868r162uuZ5R93b3fs/oAIoDc7slAGRtuJYCve1qOpnn8hj/6dRIPpzkeUB5R5hbWhZrP/rH/fBC8qUChotvdVeSZrhovxE4m5LFwhWHaNKkKv6+Rk4cTCQhJet8OZlsoTSkXdI6mJqczYnDKbRoW8NLEZUdzuxs/rnjbgA6fv81hlLs5VJcnwtE8fNm3WgOG1pmGlp2Opo18/xPluvWlnXR4yx020X77Nnodhu63YrusKHZba5kqMJRyt1rp9wnWEFBQdhsNqxWa45WrNTUVBRFITg42ONza5pOampmoY8zGFSCgnxJTc3C6Sz6gLyWdUNpWbddkc8jir9uRPEqq/Wj6/8lXu6eC5fZpp3PAvXzyZzNYeOlf10JVsdAXzJTnO605uLbS39Q/ksM3dvQ8fEz07lXPTTd9R7l1PRct05Nx+5wsm3jCbKtTrTzx7tu9UseX/724vic539yj37KO0k7svoQrD6Ua/uF5Mx4/seEcv7W9WNEOX/remyQhKzYLFuwh+g6wZW+FUuz/pd8JidnolqcJX7NsvreJkqmbnSnHS0jGT0zGS0jGS0zGT0rDT07HS07HT07HT37v8c4Sqi7tNGMYjS7bxWjGQwmFIMJVCOKwQCqEQxGFPetAQymi+7/t899jGIA1YCiqqBc9KOqKO77BlBUFFVxlT+/37XtkjKK4r7v+lFAUVAUFYPRSHB4SJl47QQF+Ra4pbPcJ1gXxl4dOnSI2NhY9/a4uDiqV69e5PFXRZmxxOnUvD7jicib1E3ZVtbrJ8f4qnwGhVmd/yUg9jQbloK+7eaZtyiQ6SBcV/MdT7N3xxnOWHWKMpeRdj4Z03AlV7nv6+77F986zm/3DfIh2+4kI8uO031O189/idqlv2jOxwbA7P5R8rwvrWIFk5qczZ5tZyr9WCztovcWh0NDNZTee01Zf2+rzApaN7rmQE9PQks/h54aj5Z+Di09AT0jGT0zBS0zCayFn4EaRUWx+IPFD8Xki2L2RTH5gNnX/Rizz3/3TT4oJp//EiiDOUdChcFUZpdnuPK7fk7K+S+Eyttrp9wnWG3atCEgIICFCxe6Eyy73c7ixYvp3r27l6MTQojit37VYRo0iURV8/7nmdeMfZ5QUfJJz678zzsABcVsIi1LQz+fdLkSsAs/OnZcyZYDsKOfv3U9vpCwZZ3/yflv+L/7ZsAHsKBccivJ16Xy+9sRQrjGMempZ3Emn0JLPomWfBo9LR4t7Rx6RtJlxwrloBpR/ENQ/EJ
"text/plain": [
"<Figure size 1000x500 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_scatter()\n",
"# YOUR CODE HERE\n",
"# Probabilities are listed in order of classes, so 0, 1 -> Background, RR Lyrae\n",
"#print(clf.classes_)\n",
"plt.plot(X_1d_new, y_1d_proba, label=[\"Background\", \"RR Lyrae\"])\n",
"plt.plot([decision_boundary_guess, decision_boundary_guess], [0, 1], linestyle=\"dashed\", label=\"guess\")\n",
"plt.plot([decision_boundary_sklearn, decision_boundary_sklearn], [0, 1], linestyle=\"dashed\", label=\"sklearn\")\n",
"plt.legend()\n",
"#raise NotImplementedError()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "38c50ceb30eb397312f8c8781e61c8ce",
"grade": false,
"grade_id": "cell-47b93e984622610a",
2025-02-21 17:13:13 +00:00
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*From inspection of your plot, how would all objects in the training set be classified?*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "935b844dc811a7bb002362530ea1382c",
"grade": true,
"grade_id": "cell-dff437e03665f571",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true
}
},
"source": [
"All objects would be classified as background stars."
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "57c4ff384e0f250284b57a31fdd4d676",
"grade": false,
"grade_id": "cell-8bd241aeb91446bd",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Use your logistic regression model fitted by Scikit-Learn to predict the class of all objects in the test set. Use variable `y_test_1d_pred` to specify your answer.*"
]
},
{
"cell_type": "code",
"execution_count": 65,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "2c6b24344bcddc3f7b486346219e61af",
"grade": false,
"grade_id": "cell-bf444b0d8690c876",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"y_test_1d_pred = clf.predict(X_train_1d)\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 66,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "78f7aec829aca85dea759adc10e7c1dc",
"grade": true,
"grade_id": "cell-d69905ed477cb96f",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"y_test_1d_pred defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('y_test_1d_pred')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "a6947196762b67730c6fc399918e0f5b",
"grade": false,
"grade_id": "cell-71d78cb3b65a5d2d",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*How many objects are classified as of type RR Lyrae? Use variable `n_rrlyrae_pred` to specify your answer.*"
]
},
{
"cell_type": "code",
"execution_count": 67,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "8346a9ea30670eef34617e29278b7e57",
"grade": false,
"grade_id": "cell-16f880b76044c462",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"n_rrlyrae_pred = len([y for y in y_test_1d_pred if y == 1])\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 68,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "63011f49012188bb2c43c31b5583d0ab",
"grade": true,
"grade_id": "cell-357fb80562d278c5",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"n_rrlyrae_pred defined.\n",
"n_rrlyrae_pred = 0\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('n_rrlyrae_pred')\n",
"assert n_rrlyrae_pred % 1 == 0 # check integer\n",
"print(\"n_rrlyrae_pred = {0}\".format(n_rrlyrae_pred))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "60515433771f989e3f319af605095ce3",
"grade": false,
"grade_id": "cell-1c52ae8c8d62b5c1",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*How many objects are classified as of type Background? Use variable `n_background_pred` to specify your answer.*"
]
},
{
"cell_type": "code",
"execution_count": 69,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "d27d505271b0382875458a6a858a5a99",
"grade": false,
"grade_id": "cell-ba43adb513abebcd",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"n_background_pred = len([y for y in y_test_1d_pred if y == 0])\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 70,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "92dc794912b509ba3c7666f5adfc5121",
"grade": true,
"grade_id": "cell-5280ae78f8605c96",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"n_background_pred defined.\n",
"n_background_pred = 69855\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('n_background_pred')\n",
"assert n_background_pred % 1 == 0 # check integer\n",
"print(\"n_background_pred = {0}\".format(n_background_pred))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "5a4bf6f4b7b42873c50a01847c6a3ccd",
"grade": false,
"grade_id": "cell-8c852f5f04910102",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Let's check the Scikit-Learn result by solving the logistic regression problem (without regularisation) manually."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "80cfc52c440f87820d3df9ce756d0420",
"grade": false,
"grade_id": "cell-297b86c040caaaa6",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Recall that the cost function for logistic regression is given by\n",
"$$\n",
"C(\\theta) = -\\frac{1}{m} \\sum_{i=1}^m \n",
"\\left [ \n",
"y^{(i)} \\log(\\hat{p}^{(i)})\n",
"+\n",
"(1 - y^{(i)}) \\log(1 - \\hat{p}^{(i)})\n",
"\\right],\n",
"$$\n",
"\n",
"\n",
"where\n",
"\n",
"$$\\hat{p} = \\sigma(\\theta^\\text{T} x) = \\frac{1}{1+\\exp{(-\\theta^\\text{T} x)}}. $$"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0be2a2a142b58454e9eee6c6a629faf3",
"grade": false,
"grade_id": "cell-4f5546691b1d0ce4",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Show analytically that the derivative of the cost function is given by\n",
"$$\\begin{eqnarray}\n",
"\\frac{\\partial C}{\\partial \\theta} \n",
"&=& \n",
"\\frac{1}{m} \\sum_{i=1}^m \n",
"\\left[ \\sigma\\left(\\theta^{\\rm T} x^{(i)} \\right) - y^{(i)} \\right]\n",
"x^{(i)}\\\\\n",
"&=&\n",
"\\frac{1}{m} \n",
"X^{\\rm T}\n",
"\\left[ \\sigma\\left(X \\theta \\right) - y \\right]\n",
"\\end{eqnarray}$$\n",
"\n",
"(use latex mathematics expressions)."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "aac5b115296dd81b6f6b134cdbfc8ab9",
"grade": false,
"grade_id": "cell-e9f16916c6a0b264",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*First, simplify the cost function terms $\\log(\\hat{p})$ and $\\log(1-\\hat{p})$ to express in terms linear in $\\log\\left({1+{\\rm e}^{-\\theta^{\\rm T}x}}\\right)$.*\n",
"\n",
"(You may drop $i$ superscripts for notational brevity.)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "460abf3f41632681b67e865162d4cef3",
"grade": true,
"grade_id": "cell-5fc3a8343ec24488",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true
}
},
"source": [
"Simplify $\\log(\\hat{p})$:\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$\\begin{eqnarray}\n",
"\\log(\\hat{p}) &=& \\log\\left(\\frac{1}{1+e^{-\\theta^Tx}}\\right)\\\\\n",
"&=& -\\log\\left(1+e^{-\\theta^Tx}\\right)\n",
"\\end{eqnarray}$$\n",
"\n",
"Simplify $\\log(1-\\hat{p})$:\n",
"\n",
"*Reduce terms...*\n",
"\n",
"$$1-\\hat{p} = 1-\\frac{1}{1+e^{-\\theta^Tx}}$$\n",
"$$=\\frac{1+e^{-\\theta^Tx}}{1+e^{-\\theta^Tx}}-\\frac{1}{1+e^{-\\theta^Tx}}$$\n",
"$$=\\frac{e^{-\\theta^Tx}}{1+e^{-\\theta^Tx}}$$\n",
"\n",
"*Take the logarithm...*\n",
"\n",
"$$\\log{(1-\\hat{p})}=\\log{\\left(\\frac{e^{-\\theta^Tx}}{1+e^{-\\theta^Tx}}\\right)}$$\n",
"$$=\\log{\\left(e^{-\\theta^Tx}\\right)}-\\log{\\left(1+e^{-\\theta^Tx}\\right)}$$\n",
"$$=-\\theta^Tx-\\log{\\left(1+e^{-\\theta^Tx}\\right)}$$"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "460d773d236b0ccb40e769762703284e",
"grade": false,
"grade_id": "cell-171037df1a01a3f4",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Next, substitute these terms into the cost function and simplify to also express the cost function in terms linear in $\\log\\left({1+{\\rm e}^{-\\theta^{\\rm T}x}}\\right)$.*"
]
},
2025-03-02 00:11:57 +00:00
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## "
]
},
2025-02-21 17:13:13 +00:00
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "e9a37a1c8dc9250fa7070f97ba5adf58",
"grade": true,
"grade_id": "cell-dcdb0de863dc8931",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true
}
},
"source": [
"Substitute terms into cost function:\n",
"$$\n",
"C(\\theta) = -\\frac{1}{m} \\sum_{i=1}^m \n",
"\\left [ \n",
2025-03-02 00:11:57 +00:00
"y^{(i)} (-\\log(1+e^{-\\theta^Tx^{(i)}}))\n",
"+\n",
"(1 - y^{(i)}) (-\\theta^Tx^{(i)}-\\log{(1+e^{-\\theta^Tx^{(i)}})})\n",
"\\right]\n",
"$$\n",
"\n",
2025-03-02 00:11:57 +00:00
"We extract the variable $u$, consider only the inner expression and drop the $(i)$ notation for brevity:\n",
"\n",
"$$ u = \\theta^Tx^{(i)} $$\n",
"\n",
"$$\n",
2025-03-02 00:11:57 +00:00
"y (-\\log(1+e^{-u}))\n",
"+\n",
2025-03-02 00:11:57 +00:00
"(1 - y) (-u-\\log{(1+e^{-u})})\n",
"$$\n",
"\n",
2025-03-02 00:11:57 +00:00
"*Expand the terms...*\n",
"\n",
"$$\n",
2025-03-02 00:11:57 +00:00
"-y\\log(1+e^{-u})\n",
"-u -\\log(1+e^{-u}) + yu +y\\log(1+e^{-u})\n",
"$$\n",
"\n",
2025-03-02 00:11:57 +00:00
"*Delete cancelling terms...*\n",
"\n",
"$$\n",
"-u -\\log(1+e^{-u}) + yu\n",
"$$\n",
"\n",
"*Simplify...*\n",
"\n",
"$$\n",
2025-03-02 00:11:57 +00:00
"(y-1)u - \\log(1+e^{-u})\n",
"$$\n",
"\n",
"Yielding a simplified cost function:\n",
"\n",
"$$\n",
"C(\\theta) = -\\frac{1}{m} \\sum_{i=1}^m \n",
"\\left [ \n",
"(y^{(i)}-1)\\theta^Tx^{(i)} - \\log{(1+e^{-\\theta^Tx^{(i)}})}\n",
"\\right]\n",
2025-03-02 00:11:57 +00:00
"$$\n"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "e1fdb7cfcd2cb7f1b88deb585ec01297",
"grade": false,
"grade_id": "cell-6f608ac000ec6c3b",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Now compute the derivative of the cost function with respect to variable $\\theta_j$, i.e. compute $\\partial C / \\partial \\theta_j$.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "6d001523ce8324ab2ed170b153b96003",
"grade": true,
"grade_id": "cell-c386ea220c086ace",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true
}
},
"source": [
2025-03-02 00:11:57 +00:00
"We will differentiate $C(\\theta)$ term-by-term, writing only the inner expression and dropping the $(i)$ notation for brevity.\n",
"\n",
"\n",
2025-03-02 00:11:57 +00:00
"First term: $ (y - 1) \\theta^Tx $\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ \\frac{\\partial}{\\partial\\theta_j} \\left[ (y-1) \\theta^Tx \\right] = (y-1) \\frac{\\partial}{\\partial\\theta_j} \\theta^Tx $$ \n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ \\frac{\\partial}{\\partial\\theta_j} \\theta^Tx = x_j $$\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ \\frac{\\partial}{\\partial\\theta_j} \\left[ (y-1) \\theta^Tx \\right] = (y-1) \\cdot x_j $$ \n",
"\n",
"\n",
"Second term: $ -\\log{(1+e^{-\\theta^Tx^{(i)}})} $\n",
"\n",
2025-03-02 00:11:57 +00:00
"Extract the negation to simplify the differentiation:\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ \\frac{\\partial}{\\partial\\theta_j} \\left[ -\\log(1+e^{-\\theta^Tx}) \\right] = -\\frac{\\partial}{\\partial\\theta_j} \\left[ \\log(1+e^{-\\theta^Tx}) \\right] $$\n",
"\n",
2025-03-02 00:11:57 +00:00
"Extract functions for the chain rule:\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ f(g) = \\log(g) $$\n",
"$$ g(u) = 1+e^{-u} $$\n",
"$$ u(\\theta_j) = \\theta^Tx $$\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ -\\frac{\\partial}{\\partial\\theta_j} \\left[ \\log(1+e^{-\\theta^Tx}) \\right] = -\\frac{d}{d\\theta_j} f(g(u(\\theta_j))) $$\n",
"\n",
2025-03-02 00:11:57 +00:00
"Differentiate those functions:\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ \\frac{df}{dg} = \\frac{1}{g} $$\n",
"$$ \\frac{dg}{du} = -e^{-u} $$\n",
"$$ \\frac{du}{d\\theta_j} = x $$\n",
"\n",
2025-03-02 00:11:57 +00:00
"Combine the derivatives and return the negation:\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ \\frac{\\partial}{\\partial\\theta_j} \\left[ -\\log(1+e^{-\\theta^Tx}) \\right] = -\\frac{df}{dg} \\cdot \\frac{dg}{du} \\cdot \\frac{du}{d\\theta_j} = -\\frac{1}{1+e^{-\\theta^Tx}} \\cdot -e^{-\\theta^Tx} \\cdot x $$\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ = \\frac{e^{-\\theta^Tx}}{1+e^{-\\theta^Tx}} \\cdot x $$\n",
"\n",
2025-03-02 00:11:57 +00:00
"At this point we recognise $ \\sigma(x) = \\frac{1}{1+e^{-x}} $ and show that $ \\frac{e^{-x}}{1+e^{-x}} = 1 -\\sigma(x) $:\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ 1 -\\sigma(x) = 1 -\\frac{1}{1+e^{-x}} = \\frac{1+e^{-x}}{1+e^{-x}} -\\frac{1}{1+e^{-x}} = \\frac{1+e^{-x}-1}{1+e^{-x}} = \\frac{e^{-x}}{1+e^{-x}} $$\n",
"\n",
2025-03-02 00:11:57 +00:00
"Thus:\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ \\frac{\\partial}{\\partial\\theta_j} \\left[ -\\log(1+e^{-\\theta^Tx}) \\right] = (1 -\\sigma(\\theta^Tx)) \\cdot x $$\n",
"\n",
2025-03-02 00:11:57 +00:00
"We can now recombine the two terms:\n",
"\n",
2025-03-02 00:11:57 +00:00
"$$ \\frac{\\partial C}{\\partial\\theta_j} = (y-1)x + (1-\\sigma\\left(\\theta^Tx\\right))x = \\left[ y -\\sigma(\\theta^Tx) \\right]x $$\n"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0a496a1ff30721c92959260555798802",
"grade": false,
"grade_id": "cell-89b9177d7dde5e70",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Combine terms for all $\\theta_j$ to give the overall derivative with respect to $\\theta$, i.e. $\\partial C / \\partial \\theta$.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "d456331004b76307432816c30e8c1fdf",
"grade": true,
"grade_id": "cell-331a74ac412db42b",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true
}
},
"source": [
2025-03-02 00:11:57 +00:00
"We re-include the outer expression and $(i)$ notation:\n",
"\n",
"$$\n",
"\\frac{\\partial C}{\\partial \\theta} = \n",
"-\\frac{1}{m} \\sum_{i=1}^m \n",
"\\left[ \\sigma\\left(\\theta^{\\rm T} x^{(i)} \\right) - y^{(i)} \\right]\n",
"x^{(i)} =\n",
"\\frac{1}{m} \\sum_{i=1}^m \n",
"\\left[ y^{(i)} - \\sigma\\left(\\theta^{\\rm T} x^{(i)} \\right) \\right]\n",
"x^{(i)}\n",
"$$"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "54da6d5b63818d49cf17905b39e8bfab",
"grade": false,
"grade_id": "cell-722790463e0f0312",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Using the analytically expression for the derivative of the cost function, we will solve the logistic regression problem by implementing a gradient descent algorithm."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "56b427758004808ef8810be4f28e3f57",
"grade": false,
"grade_id": "cell-c4321f09bf73ba33",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*First, define the sigmoid function.*"
]
},
{
"cell_type": "code",
"execution_count": 71,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "529913786cc0781f24e5c75054c8261b",
"grade": false,
"grade_id": "cell-e12fc0aa65b673b1",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"def sigmoid(x):\n",
" # YOUR CODE HERE\n",
" return 1/(1+np.exp(-x))\n",
" #raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 72,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "797ba69ea2672a081856123e1d4cab2f",
"grade": true,
"grade_id": "cell-a5a50f4ec07d05fd",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"assert np.isclose(sigmoid(0), 0.5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "915639b69935959b525f62d5bd0b3c00",
"grade": false,
"grade_id": "cell-ba2bb8821f4e75ad",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Next, extend the training data to account for a bias term in your model. Use variable `X_train_1d_b` to specify your result.*"
]
},
{
"cell_type": "code",
"execution_count": 73,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e23f14ea753621e64da00f591268a851",
"grade": false,
"grade_id": "cell-463d94ffced62fba",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"X_train_1d_b = np.c_[np.ones((X_train_1d.shape[0], 1)), X_train_1d]\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
"execution_count": 74,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "b3f12428f02c09090a800fe5d46ef8f5",
"grade": true,
"grade_id": "cell-c6f32c5137a9f302",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"X_train_1d_b defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('X_train_1d_b')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "9e30b76258ae09d0d57ee57525cab5fa",
"grade": false,
"grade_id": "cell-15322bd5d7e6c8bf",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Implement batch gradient descent to fit the parameters of your logistic regression model. Consider `n_iterations = 4000` iterations and a learning rate of `alpha = 100.0`. Consider a starting point of $\\theta_0 = (1, 1)$, i.e. `theta = np.array([[1], [1]])`. Use variable `theta` to specify your estimated parameters.*\n",
"\n",
"*(Make sure your implementation is reasonably efficient. If it takes longer than 2 minutes to execute when running on our server it may not complete and you will not be awarded grades. The solution answer runs in under 10 seconds.)*"
]
},
{
"cell_type": "code",
"execution_count": 75,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e09887c1b2b8d4411a068063095582ec",
"grade": false,
"grade_id": "cell-aee503a999e27cf6",
"locked": false,
"schema_version": 3,
"solution": true
}
},
2025-03-02 00:11:57 +00:00
"outputs": [],
2025-02-21 17:13:13 +00:00
"source": [
"n_iterations = 4000\n",
"alpha = 100.0\n",
"theta = np.array([[1], [1]])\n",
"\n",
"# YOUR CODE HERE\n",
2025-03-02 00:11:57 +00:00
"m = len(X_train_1d_b)\n",
"\n",
2025-03-02 00:11:57 +00:00
"for n in range(n_iterations):\n",
" residuals = sigmoid(X_train_1d_b @ theta) - y_train.reshape(-1, 1)\n",
" gradients = 1/m * X_train_1d_b.T @ residuals\n",
" theta = theta - alpha * gradients\n"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 76,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "b29368af60c38637709a67ba63372233",
"grade": true,
"grade_id": "cell-db0d3866ab6ed5aa",
"locked": true,
"points": 4,
"schema_version": 3,
"solution": false
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"theta defined.\n",
"theta[0] = -21.7012\n",
"theta[1] = 15.8177\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('theta')\n",
"print(\"theta[0] = {0:.4f}\".format(theta[0][0]))\n",
"print(\"theta[1] = {0:.4f}\".format(theta[1][0]))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "2c3275c68a087f4b57825087f2c0e024",
"grade": false,
"grade_id": "cell-d6efe104a72bb532",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Compute the difference between the logistic regression model intercept computed by Scikit-Learn and manually. Use variable `intercept_diff` for your result.*"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 77,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "609ee4a21d8b86b1bbf931c8313b38d9",
"grade": false,
"grade_id": "cell-b761dbdc7668fb7d",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
2025-03-02 00:11:57 +00:00
"intercept_diff = (theta[0] - clf.coef_[0])[0]\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 78,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "77cebdb4b7cf0653da556a19347d7921",
"grade": true,
"grade_id": "cell-eda44b051be24b4c",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"intercept_diff defined.\n",
"intercept_diff = -3.4476E+01\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('intercept_diff')\n",
"print(\"intercept_diff = {0:.4E}\".format(intercept_diff))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "59efb6665ec993c52e2262eee838fc25",
"grade": false,
"grade_id": "cell-3ff8e6906407e9fe",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Compute the difference between the logistic regression model* slope *(i.e. coefficient) computed by Scikit-Learn and manually. Use variable `coeff_diff` for your result.*"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 79,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ec3e8455318b44f96b4cc4ff259aaaad",
"grade": false,
"grade_id": "cell-17bd3970318abda0",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
2025-03-02 00:11:57 +00:00
"coeff_diff = (theta[1] - clf.coef_[0])[0]\n",
"######raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 80,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "bcc12d94a1fa9b22a5710d5cc64b9399",
"grade": true,
"grade_id": "cell-830185c3c51f3f91",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"coeff_diff defined.\n",
"coeff_diff = 3.0429E+00\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('coeff_diff')\n",
"print(\"coeff_diff = {0:.4E}\".format(coeff_diff))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "8eebb530913c2d78ea43e8d6cfccb9a7",
"grade": false,
"grade_id": "cell-0d8a45598ebad1aa",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"You should find that the solution from your gradient descent algorithm is close (although not identical) to that recovered by Scikit-Learn. "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "ebb30f7db5829cfb68123d51796994cd",
"grade": false,
"grade_id": "cell-8b404d163f645ffd",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Both fitted logistic regression models, however, are not effective. The reason for this is because of class imbalance. *Describe the class imbalance problem in your own words and how it manifests itself in the classification problem at hand.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "bbf0e320d4f9d168daa3b4b6397a6a6b",
"grade": true,
"grade_id": "cell-73126eae7fcd4d45",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true
}
},
"source": [
2025-03-02 00:11:57 +00:00
"There are far more elements in one class than the other which causes a bias in the model which makes it difficult to detect istances of the smaller class. In the classification problem at hand, we find that the decision boundary, given by $\\beta_0/\\beta_1$, is 1.4. All training data lies on one side of this boundary, and thus every prediction is a background star."
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "d2c5242206ed7f763289951a7d5b1acd",
"grade": false,
"grade_id": "cell-32339ef70667c4de",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"The class imbalance problem can be addressed by weighting the training data in a manner that is inversely proportional to their frequency.\n",
"\n",
"*Repeat the fitting of your linear regression model but this time perform class weighting. Use variable `clf_balanced` for your new model.*\n",
"\n",
"See the `class_weight` argument of the Scikit-Learn [Logistic Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) classifier for further details on how to perform class weighting."
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 81,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ca93af45fe489f2970d9465be9f1f227",
"grade": false,
"grade_id": "cell-043f89d606f8da67",
"locked": false,
"schema_version": 3,
"solution": true
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-container-id-4 {\n",
" /* Definition of color scheme common for light and dark mode */\n",
" --sklearn-color-text: #000;\n",
" --sklearn-color-text-muted: #666;\n",
" --sklearn-color-line: gray;\n",
" /* Definition of color scheme for unfitted estimators */\n",
" --sklearn-color-unfitted-level-0: #fff5e6;\n",
" --sklearn-color-unfitted-level-1: #f6e4d2;\n",
" --sklearn-color-unfitted-level-2: #ffe0b3;\n",
" --sklearn-color-unfitted-level-3: chocolate;\n",
" /* Definition of color scheme for fitted estimators */\n",
" --sklearn-color-fitted-level-0: #f0f8ff;\n",
" --sklearn-color-fitted-level-1: #d4ebff;\n",
" --sklearn-color-fitted-level-2: #b3dbfd;\n",
" --sklearn-color-fitted-level-3: cornflowerblue;\n",
"\n",
" /* Specific color for light theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-icon: #696969;\n",
"\n",
" @media (prefers-color-scheme: dark) {\n",
" /* Redefinition of color scheme for dark theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-icon: #878787;\n",
" }\n",
"}\n",
"\n",
"#sk-container-id-4 {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"#sk-container-id-4 pre {\n",
" padding: 0;\n",
"}\n",
"\n",
"#sk-container-id-4 input.sk-hidden--visually {\n",
" border: 0;\n",
" clip: rect(1px 1px 1px 1px);\n",
" clip: rect(1px, 1px, 1px, 1px);\n",
" height: 1px;\n",
" margin: -1px;\n",
" overflow: hidden;\n",
" padding: 0;\n",
" position: absolute;\n",
" width: 1px;\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-dashed-wrapped {\n",
" border: 1px dashed var(--sklearn-color-line);\n",
" margin: 0 0.4em 0.5em 0.4em;\n",
" box-sizing: border-box;\n",
" padding-bottom: 0.4em;\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-container {\n",
" /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
" but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
" so we also need the `!important` here to be able to override the\n",
" default hidden behavior on the sphinx rendered scikit-learn.org.\n",
" See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
" display: inline-block !important;\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-text-repr-fallback {\n",
" display: none;\n",
"}\n",
"\n",
"div.sk-parallel-item,\n",
"div.sk-serial,\n",
"div.sk-item {\n",
" /* draw centered vertical line to link estimators */\n",
" background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
" background-size: 2px 100%;\n",
" background-repeat: no-repeat;\n",
" background-position: center center;\n",
"}\n",
"\n",
"/* Parallel-specific style estimator block */\n",
"\n",
"#sk-container-id-4 div.sk-parallel-item::after {\n",
" content: \"\";\n",
" width: 100%;\n",
" border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
" flex-grow: 1;\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-parallel {\n",
" display: flex;\n",
" align-items: stretch;\n",
" justify-content: center;\n",
" background-color: var(--sklearn-color-background);\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-parallel-item {\n",
" display: flex;\n",
" flex-direction: column;\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-parallel-item:first-child::after {\n",
" align-self: flex-end;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-parallel-item:last-child::after {\n",
" align-self: flex-start;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-parallel-item:only-child::after {\n",
" width: 0;\n",
"}\n",
"\n",
"/* Serial-specific style estimator block */\n",
"\n",
"#sk-container-id-4 div.sk-serial {\n",
" display: flex;\n",
" flex-direction: column;\n",
" align-items: center;\n",
" background-color: var(--sklearn-color-background);\n",
" padding-right: 1em;\n",
" padding-left: 1em;\n",
"}\n",
"\n",
"\n",
"/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
"clickable and can be expanded/collapsed.\n",
"- Pipeline and ColumnTransformer use this feature and define the default style\n",
"- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
"*/\n",
"\n",
"/* Pipeline and ColumnTransformer style (default) */\n",
"\n",
"#sk-container-id-4 div.sk-toggleable {\n",
" /* Default theme specific background. It is overwritten whether we have a\n",
" specific estimator or a Pipeline/ColumnTransformer */\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"/* Toggleable label */\n",
"#sk-container-id-4 label.sk-toggleable__label {\n",
" cursor: pointer;\n",
" display: flex;\n",
" width: 100%;\n",
" margin-bottom: 0;\n",
" padding: 0.5em;\n",
" box-sizing: border-box;\n",
" text-align: center;\n",
" align-items: start;\n",
" justify-content: space-between;\n",
" gap: 0.5em;\n",
"}\n",
"\n",
"#sk-container-id-4 label.sk-toggleable__label .caption {\n",
" font-size: 0.6rem;\n",
" font-weight: lighter;\n",
" color: var(--sklearn-color-text-muted);\n",
"}\n",
"\n",
"#sk-container-id-4 label.sk-toggleable__label-arrow:before {\n",
" /* Arrow on the left of the label */\n",
" content: \"▸\";\n",
" float: left;\n",
" margin-right: 0.25em;\n",
" color: var(--sklearn-color-icon);\n",
"}\n",
"\n",
"#sk-container-id-4 label.sk-toggleable__label-arrow:hover:before {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"/* Toggleable content - dropdown */\n",
"\n",
"#sk-container-id-4 div.sk-toggleable__content {\n",
" max-height: 0;\n",
" max-width: 0;\n",
" overflow: hidden;\n",
" text-align: left;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-toggleable__content.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-toggleable__content pre {\n",
" margin: 0.2em;\n",
" border-radius: 0.25em;\n",
" color: var(--sklearn-color-text);\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-toggleable__content.fitted pre {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-4 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
" /* Expand drop-down */\n",
" max-height: 200px;\n",
" max-width: 100%;\n",
" overflow: auto;\n",
"}\n",
"\n",
"#sk-container-id-4 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
" content: \"▾\";\n",
"}\n",
"\n",
"/* Pipeline/ColumnTransformer-specific style */\n",
"\n",
"#sk-container-id-4 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator-specific style */\n",
"\n",
"/* Colorize estimator box */\n",
"#sk-container-id-4 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-label label.sk-toggleable__label,\n",
"#sk-container-id-4 div.sk-label label {\n",
" /* The background is the default theme color */\n",
" color: var(--sklearn-color-text-on-default-background);\n",
"}\n",
"\n",
"/* On hover, darken the color of the background */\n",
"#sk-container-id-4 div.sk-label:hover label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"/* Label box, darken color on hover, fitted */\n",
"#sk-container-id-4 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator label */\n",
"\n",
"#sk-container-id-4 div.sk-label label {\n",
" font-family: monospace;\n",
" font-weight: bold;\n",
" display: inline-block;\n",
" line-height: 1.2em;\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-label-container {\n",
" text-align: center;\n",
"}\n",
"\n",
"/* Estimator-specific */\n",
"#sk-container-id-4 div.sk-estimator {\n",
" font-family: monospace;\n",
" border: 1px dotted var(--sklearn-color-border-box);\n",
" border-radius: 0.25em;\n",
" box-sizing: border-box;\n",
" margin-bottom: 0.5em;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-estimator.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"/* on hover */\n",
"#sk-container-id-4 div.sk-estimator:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-4 div.sk-estimator.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
"\n",
"/* Common style for \"i\" and \"?\" */\n",
"\n",
".sk-estimator-doc-link,\n",
"a:link.sk-estimator-doc-link,\n",
"a:visited.sk-estimator-doc-link {\n",
" float: right;\n",
" font-size: smaller;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1em;\n",
" height: 1em;\n",
" width: 1em;\n",
" text-decoration: none !important;\n",
" margin-left: 0.5em;\n",
" text-align: center;\n",
" /* unfitted */\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted,\n",
"a:link.sk-estimator-doc-link.fitted,\n",
"a:visited.sk-estimator-doc-link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"/* Span, style for the box shown on hovering the info icon */\n",
".sk-estimator-doc-link span {\n",
" display: none;\n",
" z-index: 9999;\n",
" position: relative;\n",
" font-weight: normal;\n",
" right: .2ex;\n",
" padding: .5ex;\n",
" margin: .5ex;\n",
" width: min-content;\n",
" min-width: 20ex;\n",
" max-width: 50ex;\n",
" color: var(--sklearn-color-text);\n",
" box-shadow: 2pt 2pt 4pt #999;\n",
" /* unfitted */\n",
" background: var(--sklearn-color-unfitted-level-0);\n",
" border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted span {\n",
" /* fitted */\n",
" background: var(--sklearn-color-fitted-level-0);\n",
" border: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link:hover span {\n",
" display: block;\n",
"}\n",
"\n",
"/* \"?\"-specific style due to the `<a>` HTML tag */\n",
"\n",
"#sk-container-id-4 a.estimator_doc_link {\n",
" float: right;\n",
" font-size: 1rem;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1rem;\n",
" height: 1rem;\n",
" width: 1rem;\n",
" text-decoration: none;\n",
" /* unfitted */\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
"}\n",
"\n",
"#sk-container-id-4 a.estimator_doc_link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"#sk-container-id-4 a.estimator_doc_link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"#sk-container-id-4 a.estimator_doc_link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"</style><div id=\"sk-container-id-4\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LogisticRegression(class_weight=&#x27;balanced&#x27;)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-4\" type=\"checkbox\" checked><label for=\"sk-estimator-id-4\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow\"><div><div>LogisticRegression</div></div><div><a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.6/modules/generated/sklearn.linear_model.LogisticRegression.html\">?<span>Documentation for LogisticRegression</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></div></label><div class=\"sk-toggleable__content fitted\"><pre>LogisticRegression(class_weight=&#x27;balanced&#x27;)</pre></div> </div></div></div></div>"
],
"text/plain": [
"LogisticRegression(class_weight='balanced')"
]
},
"execution_count": 81,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.linear_model import LogisticRegression\n",
"# YOUR CODE HERE\n",
"clf_balanced = LogisticRegression(class_weight=\"balanced\")\n",
"clf_balanced.fit(X_train_1d, y_train)\n",
"#raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "1fea7d2b3c6c89155867794f7c43b1b3",
"grade": true,
"grade_id": "cell-5dea5e84c6b3f90f",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"clf_balanced defined.\n"
]
}
],
"source": [
"check_var_defined('clf_balanced')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "56698708c8d07b85c23e96e74a5f7d02",
"grade": false,
"grade_id": "cell-0e177c0c3236c200",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Compute the decision boundary of the logistic regression model fitted by Scikit-Learn when weighting classes.* \n",
"\n",
"(Ensure your result is a scalar and not an array of length 1.)"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "4e68b25af62c3c3b9ce3d88fe30e7246",
"grade": false,
"grade_id": "cell-8789a822ce94928b",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"decision_boundary_sklearn_balanced = (clf_balanced.intercept_/clf_balanced.coef_)[0][0]\n",
"#raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "b1830b90407271f2c1a3d024750d1368",
"grade": true,
"grade_id": "cell-6d9e2c731edfff2f",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"decision_boundary_sklearn_balanced defined.\n",
"decision_boundary_sklearn_balanced = -1.0591\n"
]
}
],
"source": [
"check_var_defined('decision_boundary_sklearn_balanced')\n",
"assert not hasattr(decision_boundary_sklearn_balanced, \"__len__\")\n",
"print(\"decision_boundary_sklearn_balanced = {0:.4f}\".format(decision_boundary_sklearn_balanced))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "5c9b36e7743bfde53557b5426c4fb77f",
"grade": false,
"grade_id": "cell-3361e275fac9beac",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Evaluate the probabilities prediced by your new logistic regression model over the domain specified by the variable `X_1d_new`. Use variable `y_1d_proba_balanced` for your computed probabilities.*"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e9f94c6ad6d1ed961f39c11ae5b8ac9e",
"grade": false,
"grade_id": "cell-3db9585a121a321b",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"y_1d_proba_balanced = clf_balanced.predict_proba(X_1d_new)\n",
"#raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": 86,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "344d65536b5d89cf8b1fd6e93f527f66",
"grade": true,
"grade_id": "cell-b5df6903e536bc3e",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"y_1d_proba_balanced defined.\n"
]
}
],
"source": [
"check_var_defined('y_1d_proba_balanced')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "acc5329742bc3574c6acb59d20110cb6",
"grade": false,
"grade_id": "cell-cef66593b1ed2e90",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*For your new balanced model, plot the probability of a star being of type RR Lyrae against the colour variable considered. Also plot the probability of being a Background star. Overlay these plots on the scatter plot of class types. Also plot the decision boundary that you guessed previously, the one computed by Scikit-Learn initially, and the one computed by Scikit-Learn for your new balanced model.*"
]
},
{
"cell_type": "code",
"execution_count": 87,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "48606796161d04d156d3cba41a082cba",
"grade": true,
"grade_id": "cell-9ce627c16d3996e6",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true
}
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.legend.Legend at 0x783471a78f10>"
]
},
"execution_count": 87,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA1gAAAHICAYAAABTZkvCAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAlrtJREFUeJzs3Xd8VFX6x/HPnZlkUie90QktIE1BBUVAUUAFsf0UK6xl0cW+umIXV4XFiopiB9HVZde1YEEFFRTUVVGUKiV0EtJ7m5n7+yNkJIaSTCaZSfJ9v14hzJ1zz32Sk/bMOee5hmmaJiIiIiIiItJoFn8HICIiIiIi0loowRIREREREfERJVgiIiIiIiI+ogRLRERERETER5RgiYiIiIiI+IgSLBERERERER9RgiUiIiIiIuIjSrBERERERER8xObvAAKZaZq43d7dh9liMbw+V5qWxiawaXwCl8YmsGl8ApvGJ3BpbAJboIyPxWJgGEa92irBOgy32yQ3t6TB59lsFmJiwiksLMXpdDdBZOItjU1g0/gELo1NYNP4BDaNT+DS2AS2QBqf2NhwrNb6JVhaIigiIiIiIuIjSrBERERERER8RAmWiIiIiIiIjyjBEhERERER8RElWCIiIiIiIj6iBEtERERERMRHlGCJiIiIiIj4iBIsERERERERH1GCJSIiIiIi4iNKsERERERERHwkoBKs7du3c++99zJhwgT69OnDuHHj6nWeaZq88MILjBw5kv79+3PhhRfy888/N22wIiIiIiIif2DzdwAH2rRpE8uWLWPAgAG43W5M06zXeS+++CJPPfUUt956K7169eKNN97giiuu4L333qNjx45NHLWISMOYpklWRhHxSRFkZxaTkBwJQFZGEQnJkRiGUaftH497c724xHA2rdtHjz6J5OwrqXXdA2M58Dput5vf1mbSu38yu3fkExJuZd/eQk870zTZt7cI0zQxDEhMcRwyzoO1PfD6mXsK2bE1j2OGdmDT2n0U5JfjiLbTq28ym9Zm4XK7sFqsdO8Tz48rdxAeEUxhQTm5WSV07RFHXGIkpulmy4Z9rFudSWy8ndCwEAYe3wGX08Vn72/EGmQSZAvCFmTgdLooLXQRHmXDYrFQWlRFZUV1bEmdQqgsNcnNqQA3BNkhsV0Yu7eVEhIOrkqIjLITHhlExq5iXG6IigkhJi4Mw4BIRwjhkXYydxdQWurEHmKjqKCMbr3j2bohl8ioYCKjQomMsoNpsG9vMb37J2GxWjBNyM8to1ffJCwWC6ZpkrmnkNzsYmLjI0hMiaw1Vi6Xi++/SiepnYMuPeJ8+rUjIiINZ5j1zWKagdvtxmKpnlSbNm0aa9as4YMPPjjsORUVFZxwwglccskl3HLLLQBUVlYyduxYhg8fzv333+91PC6Xm9zckgafZ7NZiIkJJy+vBKfT7fX1xfc0NoGtrYzPxjWZfP7BBvoMTGHdz3sZNS4NE/j8gw2MGpdGz75Jddr+8bg312vXKYo9Owo87w+87oGxHHidLxf/xvqf99K+czS7t+fT9+h2rPlpj6ddTd81Dhfnwdr+8foAYRFBlBZXedpFx4WSn1PmefzH51urPgNTGDG2Z53P2x/H6r1//syeHQUA9D2mHSeN7gH45mtHfKOt/GxriTQ2gS2Qxic2NhyrtX6L/wJqiWBNctUQq1atori4mNNPP91zLDg4mNNOO43ly5f7MjwRkUZzu02+/2obAOt/yQDgf19t43/L0wH4/uttuN1mnbYHHvf2ejV/hNe8P/C6NbEceB2n082G1dXHd2/PB2Dt6r2edk6n29N3jf99lX7QOA+Mo8Z3y9M9169JroA6ydOBydXBnm+t1q/OoLKyenaq9vHfx6C83OkZT4A1q/bgdLp98rUjIiLeCaglgt7YunUrAKmpqbWOd+vWjfnz51NeXk5ISIg/QvOZsgonm3YVHHLJ5KFXfhx6ScihzvGiK4xDPXnYcxp6/cN8LA04xWq1EJlbRlFROS5X3VdCGrqM5nDNG/p5OeyVD3nOwZ/wZjXQob8mGt7ZIfsyDCwGWCwGFsPAsOx/bBhYLAbBQVaswUGUlFfhdpmedhZLdZvWsMxp07p9FBWUA2Du/6O35jFAYX45m9fto2ffpFptDzzu7fX+6MDjNbEceJ2vl2yu83PnwHYrlmyu03dRQcVB4zxYHMWFFQ36WNoa0zT5+D+/UlRQ8Yfj1e8L88t5Z8GqOuetWLKZ5A5Rjf7aaQqmaYLp/v3NXfN/E9NzzOU5htv9+3HTBMzfPwHVPYLJIY6bNRfFPMRxz7EDHx/qeGM+bqtBaW4IVcXluFx/iLOpNGlO3ZSdN++LAabFQmn2/rFxawYr0JhWK+6wo/0dRoO1+ASrsLCQ4OBg7HZ7reMOhwPTNCkoKGhUgmWzNXxWrWb6sL7TiEfy8n/Xs+q3LJ/0JdKSGXBA0mVgsxoE2SzYrBaCbAe8WS3YDvh/9XEr9mArocFWQu02QoJthNpr/z/MbsMRHkxwkLVJ4ne73fzw9bYjtvv+62307JtYp+33X28jrX9SvWf763u9g12/e58ENuyf1TqUmpmUP/rfV+m14nS73fzwh9krqZ8DZ6cO5o+ze1A9G7hza26tY0f62jFNE6rKcZcXY5YXY1aUYFaWYVZVYDrLoari98dV5ZhVFeCswHQ5we2sfu+q8rzH7cJ0VYHLWf3e7apOppr5j+dAUuzvAOSQNDaBzb3pOMLG3ODvMBqkxSdYTcliMYiJCff6fIcj1CdxnHxsJ4rLqzjoCo/DvLJ2qGcO+2LcYZ4zD/GkNy/uHW7r32HDO+STh+nvEE8dPu6Gf6xedHfIz+nhrnX46/j66+EQnwevQjBxm9VLxdxuE7dp1v3/Eb6WTMDlNnFhggsqmmilWEiwlagIO1ERwTjC7URH2Ilx2EmIDiUhJozEmFASY8IIsTfsR+gvP+yiMP/gs0kHKswv57tl2+q0LcwvZ/e2AvoP6uDT6x3s+p+8vc4zW3UohxrrooKKWnH+8sMuCg8xiyZNo8gzO2gSapQRVJRFxrfldEgAZ1EuzqIcXEV5uMoKcZcW4SorBrfTrzEDYFjAsGBYqt9jsWB43ht4pvNr/X//P8b++XbPTLdR6znPJWrONfhDf7/3+3tzo875LUqTxt10fbfUT7c0AcNCRO8TiPDR39TNpcUnWA6Hg8rKSioqKmrNYhUWFmIYBlFRUV737XabFBaWNvg8q9WCwxFKYWHZQZehNdSArjEM6Dq40f2I78dGfMtiMYiICCG/oIwqp6tW8mXWJGf7H7vcJk63idPppsrpxulyU+l0U+V04XSZVDldVDlNqlwunE6TSqeLiioXZRUuyiuclFU6Ka9wUVbhpKyy+lhJeRVOl0l5pYvy3FIycw///R8RGkRSbBjt48NpnxBOu/hw2seHExcVUmc5o9vt5vOPNxyip7p++m7nQY9//tEG2neJOuIsVkOv90fbtuR4fS7A0g/X075L9c/fzz/yPg6pHwM3DmsBMdZcoqz5RFoKcVgLibQWEmTsT5x+hiOOqjUIIyQCiz0CgkMwgmre7BjBIRi2EKh5bAsGqw3DGlTrPRYbhtVW3VfNe4sVapIm4/ekqeZx9VvrWAZ8MPrdE7g0NoHNarUQESDj43CE1nt1WotPsGr2XqWnp5OWluY5vnXrVtq1a9fo/VeNqVjicrn9XvFEDk5jE5hsNgtWqwWLAVbDwGpt3j+2TLM6uSosraSotIqiA97nF1WSU1hOdkE5OYXllFU4KS6ronh3AVt2117GFR5io2s7B6kpDlLbRdGtvYNdm3MbNJt0qNmjwvxyNvySecT9NBvXZHo1e+UrRQUVbPglExM0e9UEwowSEoMySLRlEmvLIdqah81wHbSt2zQodYdR5g4jIjkZR7t2WMKjMcJiMEIiMUIifn+z2Q/ah7f+sJvp0I3MWv9ptfS7J3BpbAJbSxufFp9gHXPMMURERPDxxx97Eqyqqio+/fRThg8f7ufoRKQlMQyDULuNULuNpJjDty0td5JdUEZmXhm7s4rZk1PK3uwSMnJLKSl3smZrLmv
"text/plain": [
"<Figure size 1000x500 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_scatter()\n",
"# YOUR CODE HERE\n",
"plt.plot(X_1d_new, y_1d_proba_balanced, label=[\"Background\", \"RR Lyrae\"])\n",
"plt.legend()\n",
"#raise NotImplementedError()"
]
2025-02-21 17:13:13 +00:00
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "a29c1deb02f416bb5859ea868d5f1ecf",
"grade": false,
"grade_id": "cell-b91a8e6743a0e23f",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Comment on the decision boundary of the balanced model compared to the unbalanced models fitted previously.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "963c36d881b371659186c7739c8a1a43",
"grade": true,
"grade_id": "cell-45725a6e2c43d0f1",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true
}
},
"source": [
2025-03-02 00:11:57 +00:00
"Now half the data would be predicted as RR Lyrae, which is still incorrect as there are far more background stars. Predicting in 1D will not work as the RR Lyrae class is intermixed in this dimension with bacground stars."
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "f1f1759a78aff4f424286f2f577be5ec",
"grade": false,
"grade_id": "cell-a5126bea92958ffe",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Now that we've built up good intuition surrounding the subtleties of the classification problem at hand in 1D, let's consider the 2D problem (we will keep to 2D for plotting convenience)."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "b853b8b820ed7b06e86b709642fba4c3",
"grade": false,
"grade_id": "cell-46205b5da6dd0e77",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"For the 2D case we consider the following colours."
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 88,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "92a30f4f2db3010902ab7d521054bbff",
"grade": false,
"grade_id": "cell-11a94502070606e6",
"locked": true,
"schema_version": 3,
"solution": false
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"data": {
"text/plain": [
"['u-g', 'g-r']"
]
},
"execution_count": 88,
"metadata": {},
"output_type": "execute_result"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"ind = 1\n",
"cols[:ind+1]"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "24955f013f80e7be059375dcbbecec30",
"grade": false,
"grade_id": "cell-ea66a8b2540e3455",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Consider the following training and test data for the 2D problem."
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 89,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "62e64cbf875c25da99467cb26c1f6c7a",
"grade": false,
"grade_id": "cell-374dd2ec4c108d9c",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"X_train_2d = X_train[:, :ind+1]\n",
"X_train_2d = X_train_2d.reshape(-1,ind+1)\n",
"X_test_2d = X_test[:, :ind+1]\n",
"X_test_2d = X_test_2d.reshape(-1,ind+1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "be00d6c3427598d2d4a80466e56e4221",
"grade": false,
"grade_id": "cell-ead79764fa5bdc91",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Train a logistic regression model for this 2D problem. Use variable `clf_2d_logistic` for your classifier.*"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 90,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "084e3730086edc9f3a88b357a32f13b3",
"grade": false,
"grade_id": "cell-807bce2068c8c513",
"locked": false,
"schema_version": 3,
"solution": true
}
2025-03-02 00:11:57 +00:00
},
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-container-id-5 {\n",
" /* Definition of color scheme common for light and dark mode */\n",
" --sklearn-color-text: #000;\n",
" --sklearn-color-text-muted: #666;\n",
" --sklearn-color-line: gray;\n",
" /* Definition of color scheme for unfitted estimators */\n",
" --sklearn-color-unfitted-level-0: #fff5e6;\n",
" --sklearn-color-unfitted-level-1: #f6e4d2;\n",
" --sklearn-color-unfitted-level-2: #ffe0b3;\n",
" --sklearn-color-unfitted-level-3: chocolate;\n",
" /* Definition of color scheme for fitted estimators */\n",
" --sklearn-color-fitted-level-0: #f0f8ff;\n",
" --sklearn-color-fitted-level-1: #d4ebff;\n",
" --sklearn-color-fitted-level-2: #b3dbfd;\n",
" --sklearn-color-fitted-level-3: cornflowerblue;\n",
"\n",
" /* Specific color for light theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-icon: #696969;\n",
"\n",
" @media (prefers-color-scheme: dark) {\n",
" /* Redefinition of color scheme for dark theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-icon: #878787;\n",
" }\n",
"}\n",
"\n",
"#sk-container-id-5 {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"#sk-container-id-5 pre {\n",
" padding: 0;\n",
"}\n",
"\n",
"#sk-container-id-5 input.sk-hidden--visually {\n",
" border: 0;\n",
" clip: rect(1px 1px 1px 1px);\n",
" clip: rect(1px, 1px, 1px, 1px);\n",
" height: 1px;\n",
" margin: -1px;\n",
" overflow: hidden;\n",
" padding: 0;\n",
" position: absolute;\n",
" width: 1px;\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-dashed-wrapped {\n",
" border: 1px dashed var(--sklearn-color-line);\n",
" margin: 0 0.4em 0.5em 0.4em;\n",
" box-sizing: border-box;\n",
" padding-bottom: 0.4em;\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-container {\n",
" /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
" but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
" so we also need the `!important` here to be able to override the\n",
" default hidden behavior on the sphinx rendered scikit-learn.org.\n",
" See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
" display: inline-block !important;\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-text-repr-fallback {\n",
" display: none;\n",
"}\n",
"\n",
"div.sk-parallel-item,\n",
"div.sk-serial,\n",
"div.sk-item {\n",
" /* draw centered vertical line to link estimators */\n",
" background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
" background-size: 2px 100%;\n",
" background-repeat: no-repeat;\n",
" background-position: center center;\n",
"}\n",
"\n",
"/* Parallel-specific style estimator block */\n",
"\n",
"#sk-container-id-5 div.sk-parallel-item::after {\n",
" content: \"\";\n",
" width: 100%;\n",
" border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
" flex-grow: 1;\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-parallel {\n",
" display: flex;\n",
" align-items: stretch;\n",
" justify-content: center;\n",
" background-color: var(--sklearn-color-background);\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-parallel-item {\n",
" display: flex;\n",
" flex-direction: column;\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-parallel-item:first-child::after {\n",
" align-self: flex-end;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-parallel-item:last-child::after {\n",
" align-self: flex-start;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-parallel-item:only-child::after {\n",
" width: 0;\n",
"}\n",
"\n",
"/* Serial-specific style estimator block */\n",
"\n",
"#sk-container-id-5 div.sk-serial {\n",
" display: flex;\n",
" flex-direction: column;\n",
" align-items: center;\n",
" background-color: var(--sklearn-color-background);\n",
" padding-right: 1em;\n",
" padding-left: 1em;\n",
"}\n",
"\n",
"\n",
"/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
"clickable and can be expanded/collapsed.\n",
"- Pipeline and ColumnTransformer use this feature and define the default style\n",
"- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
"*/\n",
"\n",
"/* Pipeline and ColumnTransformer style (default) */\n",
"\n",
"#sk-container-id-5 div.sk-toggleable {\n",
" /* Default theme specific background. It is overwritten whether we have a\n",
" specific estimator or a Pipeline/ColumnTransformer */\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"/* Toggleable label */\n",
"#sk-container-id-5 label.sk-toggleable__label {\n",
" cursor: pointer;\n",
" display: flex;\n",
" width: 100%;\n",
" margin-bottom: 0;\n",
" padding: 0.5em;\n",
" box-sizing: border-box;\n",
" text-align: center;\n",
" align-items: start;\n",
" justify-content: space-between;\n",
" gap: 0.5em;\n",
"}\n",
"\n",
"#sk-container-id-5 label.sk-toggleable__label .caption {\n",
" font-size: 0.6rem;\n",
" font-weight: lighter;\n",
" color: var(--sklearn-color-text-muted);\n",
"}\n",
"\n",
"#sk-container-id-5 label.sk-toggleable__label-arrow:before {\n",
" /* Arrow on the left of the label */\n",
" content: \"▸\";\n",
" float: left;\n",
" margin-right: 0.25em;\n",
" color: var(--sklearn-color-icon);\n",
"}\n",
"\n",
"#sk-container-id-5 label.sk-toggleable__label-arrow:hover:before {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"/* Toggleable content - dropdown */\n",
"\n",
"#sk-container-id-5 div.sk-toggleable__content {\n",
" max-height: 0;\n",
" max-width: 0;\n",
" overflow: hidden;\n",
" text-align: left;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-toggleable__content.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-toggleable__content pre {\n",
" margin: 0.2em;\n",
" border-radius: 0.25em;\n",
" color: var(--sklearn-color-text);\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-toggleable__content.fitted pre {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-5 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
" /* Expand drop-down */\n",
" max-height: 200px;\n",
" max-width: 100%;\n",
" overflow: auto;\n",
"}\n",
"\n",
"#sk-container-id-5 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
" content: \"▾\";\n",
"}\n",
"\n",
"/* Pipeline/ColumnTransformer-specific style */\n",
"\n",
"#sk-container-id-5 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator-specific style */\n",
"\n",
"/* Colorize estimator box */\n",
"#sk-container-id-5 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-label label.sk-toggleable__label,\n",
"#sk-container-id-5 div.sk-label label {\n",
" /* The background is the default theme color */\n",
" color: var(--sklearn-color-text-on-default-background);\n",
"}\n",
"\n",
"/* On hover, darken the color of the background */\n",
"#sk-container-id-5 div.sk-label:hover label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"/* Label box, darken color on hover, fitted */\n",
"#sk-container-id-5 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator label */\n",
"\n",
"#sk-container-id-5 div.sk-label label {\n",
" font-family: monospace;\n",
" font-weight: bold;\n",
" display: inline-block;\n",
" line-height: 1.2em;\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-label-container {\n",
" text-align: center;\n",
"}\n",
"\n",
"/* Estimator-specific */\n",
"#sk-container-id-5 div.sk-estimator {\n",
" font-family: monospace;\n",
" border: 1px dotted var(--sklearn-color-border-box);\n",
" border-radius: 0.25em;\n",
" box-sizing: border-box;\n",
" margin-bottom: 0.5em;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-estimator.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"/* on hover */\n",
"#sk-container-id-5 div.sk-estimator:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-5 div.sk-estimator.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
"\n",
"/* Common style for \"i\" and \"?\" */\n",
"\n",
".sk-estimator-doc-link,\n",
"a:link.sk-estimator-doc-link,\n",
"a:visited.sk-estimator-doc-link {\n",
" float: right;\n",
" font-size: smaller;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1em;\n",
" height: 1em;\n",
" width: 1em;\n",
" text-decoration: none !important;\n",
" margin-left: 0.5em;\n",
" text-align: center;\n",
" /* unfitted */\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted,\n",
"a:link.sk-estimator-doc-link.fitted,\n",
"a:visited.sk-estimator-doc-link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"/* Span, style for the box shown on hovering the info icon */\n",
".sk-estimator-doc-link span {\n",
" display: none;\n",
" z-index: 9999;\n",
" position: relative;\n",
" font-weight: normal;\n",
" right: .2ex;\n",
" padding: .5ex;\n",
" margin: .5ex;\n",
" width: min-content;\n",
" min-width: 20ex;\n",
" max-width: 50ex;\n",
" color: var(--sklearn-color-text);\n",
" box-shadow: 2pt 2pt 4pt #999;\n",
" /* unfitted */\n",
" background: var(--sklearn-color-unfitted-level-0);\n",
" border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted span {\n",
" /* fitted */\n",
" background: var(--sklearn-color-fitted-level-0);\n",
" border: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link:hover span {\n",
" display: block;\n",
"}\n",
"\n",
"/* \"?\"-specific style due to the `<a>` HTML tag */\n",
"\n",
"#sk-container-id-5 a.estimator_doc_link {\n",
" float: right;\n",
" font-size: 1rem;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1rem;\n",
" height: 1rem;\n",
" width: 1rem;\n",
" text-decoration: none;\n",
" /* unfitted */\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
"}\n",
"\n",
"#sk-container-id-5 a.estimator_doc_link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"#sk-container-id-5 a.estimator_doc_link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"#sk-container-id-5 a.estimator_doc_link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"</style><div id=\"sk-container-id-5\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LogisticRegression(class_weight=&#x27;balanced&#x27;)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-5\" type=\"checkbox\" checked><label for=\"sk-estimator-id-5\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow\"><div><div>LogisticRegression</div></div><div><a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.6/modules/generated/sklearn.linear_model.LogisticRegression.html\">?<span>Documentation for LogisticRegression</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></div></label><div class=\"sk-toggleable__content fitted\"><pre>LogisticRegression(class_weight=&#x27;balanced&#x27;)</pre></div> </div></div></div></div>"
],
"text/plain": [
"LogisticRegression(class_weight='balanced')"
]
},
"execution_count": 90,
"metadata": {},
"output_type": "execute_result"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# YOUR CODE HERE\n",
2025-03-02 00:11:57 +00:00
"clf_2d_logistic = LogisticRegression(class_weight=\"balanced\")\n",
"clf_2d_logistic.fit(X_train_2d, y_train)\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 91,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e54724d2562c049fb17447ee0955fdcd",
"grade": true,
"grade_id": "cell-6d3421df624a8839",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false
}
2025-03-02 00:11:57 +00:00
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"clf_2d_logistic defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('clf_2d_logistic')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "7b0cbcffdf52227ee932131fe0594f4d",
"grade": false,
"grade_id": "cell-d8af8aa1d07b0c2e",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Compute the precision and recall of your 2D logistic regression model. Use variables `precision_logistic` and `recall_logistic` for your results.*"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 97,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "a497d2e0107dba9c7bf6b8d6890b5d75",
"grade": false,
"grade_id": "cell-5aa20025d9dcd3de",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
2025-03-02 00:11:57 +00:00
"\n",
"# Make a Pandas dataframe of the test data and predictions\n",
"df = pd.DataFrame({\n",
" \"u-g\": [x[0] for x in X_test_2d],\n",
" \"g-r\": [x[1] for x in X_test_2d],\n",
" \"target\": pd.Series(y_test)\n",
"})\n",
"\n",
"# Make a prediction from the test data\n",
"y_pred_2d = clf_2d_logistic.predict(X_test_2d)\n",
"\n",
"df[\"prediction\"] = y_pred_2d\n",
"#print(test_df)\n",
"\n",
"# Precision = Tp/(Tp+Fp)\n",
"# Count number of true positives\n",
"positives = df[df[\"prediction\"] == 1]\n",
"true_positives = positives[positives[\"target\"] == 1]\n",
"false_positives = positives[positives[\"target\"] == 0]\n",
"#print(positives, true_positives, false_positives)\n",
"\n",
"Tp = len(true_positives)\n",
"Fp = len(false_positives)\n",
"precision_logistic = Tp/(Tp+Fp)\n",
"#print(precision_logistic)\n",
"\n",
"# Recall = Tp/(Tp+Fn)\n",
"negatives = df[df[\"prediction\"] == 0]\n",
"false_negatives = df[df[\"target\"] == 1]\n",
"\n",
"Fn = len(false_negatives)\n",
"recall_logistic = Tp/(Tp+Fn)\n",
"#print(recall_logistic)\n",
"\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 98,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "89d58d5042e1fc392b449b7f0cb03e3b",
"grade": true,
"grade_id": "cell-901a95a71017f77a",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"precision_logistic defined.\n",
"precision_logistic = 0.136511\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('precision_logistic')\n",
"print(\"precision_logistic = {0:.6f}\".format(precision_logistic))"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 99,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "a8c0b5936c41a742cddef03152ffa57a",
"grade": true,
"grade_id": "cell-c829df4fb5b5646d",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"recall_logistic defined.\n",
"recall_logistic = 0.498024\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('recall_logistic')\n",
"print(\"recall_logistic = {0:.6f}\".format(recall_logistic))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "9ba63808879ae626d04958a07df7be2e",
"grade": false,
"grade_id": "cell-e7757c51155e4dfc",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Consider the following meshgrid defining the u-g and g-r colour domain of interest."
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 100,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "936ae179349f4afb9f2e84f901b1ac39",
"grade": false,
"grade_id": "cell-c1045a44e953f715",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"xlim = (0.7, 1.45) # u-g\n",
"ylim = (-0.15, 0.4) # g-r\n",
"xx, yy = np.meshgrid(np.linspace(xlim[0], xlim[1], 100),\n",
" np.linspace(ylim[0], ylim[1], 100))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "f4172020874b0ca0808cef832529bb1c",
"grade": false,
"grade_id": "cell-7b3fd3dd9fb85196",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Over the domain specified above plot the predicted classification probability. Overlay on your plot the data instances, highlighting whether a RR Lyrae or background star, and the decision boundary.*"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 148,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f86ab2a5e0c7bbae351fa52f20f7f928",
"grade": true,
"grade_id": "cell-cb93ed5cd3864d37",
"locked": false,
"points": 5,
"schema_version": 3,
"solution": true
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAkQAAAHPCAYAAACyf8XcAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQABAABJREFUeJzsnXd8U+X+x9/nnCRNkzRdlFUohWLLHrJFUVHkIqiIqDgA98KFOO/Q63VxcYte9edVQdwDvQxFwauirOsARDaUXSilK23TrHPO74+0adMkbTqAFp7368WL5jnPec6TNM355DslXdd1BAKBQCAQCE5i5OO9AYFAIBAIBILjjRBEAoFAIBAITnqEIBIIBAKBQHDSIwSRQCAQCASCkx4hiAQCgUAgEJz0CEEkEAgEAoHgpEcIIoFAIBAIBCc9QhAJBAKBQCA46RGCSCAQCAQCwUmPEESCFkVWVhb/+Mc/mmy9+fPnk5WVxYYNG+qcO3nyZCZPnhx4vH//frKyspg/f35gbPbs2WRlZTXZ/pqCI0eOcOeddzJkyBCysrKYM2fO8d6S4CiwZs0asrKyWLNmTWDswQcfZOTIkU12jcq/l/379zfZmgJBc8FwvDcgaPnMnz+fhx56KPDYZDLRvn17hg8fzm233UarVq2O4+6OP6+99hpdu3bl3HPPPS7Xf+qpp/jxxx+5/fbbadWqFb169Yo4t6aYs1qt9OjRgxtuuIGzzjor6FjN37uiKCQnJzN8+HCmT59OmzZt6tzb7Nmzefnll1m1ahVJSUn1e2KCo8bxfs8KBMcDIYgETcadd95Jhw4d8Hg8/Prrr3zwwQf88MMPLFq0iNjY2OO9vUbz5ptv1jnn1ltv5aabbgoae/311xk9evRxu7msXr2ac845h+uvvz6q+cOHD+eiiy5C13VycnL44IMPuOWWW3jjjTc444wzQuZX/72vW7eOzz//nF9//ZVFixYRExPT1E9HUA8ee+wxGtKuMtJ79qKLLmLs2LGYTKam2qJA0GwQgkjQZIwYMYLevXsDcOmll5KQkMDbb7/Nt99+y7hx48Ke43Q6sVgsx3KbDSaam4DBYMBgaF5/Vvn5+djt9qjnp6enc9FFFwUejx49mvPPP5933nknrCCq+XtPTEzkjTfe4Ntvv+X8889v/BNoAJqm4fV6W4QgO5p7NRqNTbqeoigoitKkawoEzQURQyQ4agwdOhQgEG/w4IMP0r9/f/bu3cuNN95I//79uffeewG/MJo5cyZnnnkmvXr1YvTo0bz55psRv90uWLCA0aNH07t3byZMmMDPP/8cdPzAgQP8/e9/Z/To0fTp04chQ4Zw5513Rox9cLlcPPzwwwwZMoRTTz2V+++/n+Li4qA5NWOIwlEzhigrKwun08nnn39OVlYWWVlZPPjgg6xevZqsrCyWLl0assbChQvJyspi7dq1tV5r37593HnnnQwePJi+ffty2WWX8f333weOV8Z76LrOe++9F7h+fcnIyCAxMZG9e/dGNX/gwIGB/TWGBx54gCFDhuD1ekOOXXfddYwePTrwuDK2bMGCBYwdO5bevXvz448/An7L3qRJkxgyZAh9+vRhwoQJLFmyJOw1//Of/zBhwgT69OnD4MGDmT59OgcPHqxzr5W/9507d3LXXXdx6qmnMmTIEB5//HHcbnfQ3Nr2mpuby0MPPcRpp51Gr169GDt2LJ9++mnI9Q4dOsRtt91Gv379GDZsGE8++SQejydkXrgYIk3TmDt3LhdccAG9e/dm6NChXH/99YE4ukjvWYgcQ/Tee+8xduxYevXqxemnn86jjz6Kw+EImjN58mTGjRvHjh07mDx5Mn379uWMM87gjTfeqPP1FQiOBc3rq6zghKLyBpqQkBAY8/l8XH/99QwYMIAHHngAs9mMruvceuutrFmzhokTJ9K9e3d+/PFHZs2aRW5uLn/+85+D1v3555/58ssvmTx5MiaTiQ8++IAbbriBTz75hMzMTAA2bNjA2rVrGTt2LG3btuXAgQN88MEHTJkyhcWLF4e48P7xj39gt9u5/fbb2bVrFx988AE5OTnMmzcPSZIa/BrMmjWLv/71r/Tp04fLLrsMgLS0NPr160e7du1YuHAho0aNCjpn4cKFpKWl0b9//4jrHjlyhEmTJlFeXs7kyZNJTEzk888/59Zbb+Wll15i1KhRDBo0iFmzZnH//fcH3GANoaSkBIfDQVpaWlTzDxw4AFAvq1Q4LrroIr744gt++uknzj777MB4Xl4eq1evZtq0aUHzV69ezVdffcVVV11FYmIiqampALzzzjuMHDmSCy64AK/Xy+LFi7nrrrt4/fXXg+KiXn31VV588UXGjBnDxIkTKSgo4N133+Wqq67iiy++iOr53H333aSmpjJjxgzWrVvHvHnzcDgczJo1q869HjlyhMsuuwxJkrjqqqtISkpi+fLl/OUvf6G0tJRrrrkG8Iv3qVOncvDgQSZPnkzr1q35z3/+w+rVq6N6Xf/yl78wf/58RowYwcSJE1FVlV9++YX169fTu3fviO/ZSFTGgZ122mlcccUVgb+fDRs28MEHHwRZqYqLi7nhhhsYNWoUY8aM4euvv+aZZ54hMzOTM888M6r9CwRHDV0gaCSfffaZnpmZqa9cuVLPz8/XDx48qC9evFgfPHiw3qdPH/3QoUO6ruv6Aw88oGdmZurPPPNM0PlLly7VMzMz9X/9619B43fccYeelZWl79mzJzCWmZmpZ2Zm6hs2bAiMHThwQO/du7c+bdq0wFh5eXnIPteuXatnZmbqn3/+ecjeL774Yt3j8QTG33jjDT0zM1NftmxZYOzqq6/Wr7766sDjffv26ZmZmfpnn30WGHvppZf0zMzMoOv269dPf+CBB0L28+yzz+q9evXSHQ5HYCw/P1/v0aOH/tJLL4XMr84TTzyhZ2Zm6j///HNgrLS0VB85cqR+9tln66qqBsYzMzP1Rx99tNb1qs/985//rOfn5+v5+fn6hg0b9Ouvv17PzMzU//3vfwfNDfd7X7JkiT506FC9V69e+sGDB+u8XuXrlZ+fH3JMVVV9xIgR+t133x00/vbbb+tZWVn63r17g/bdrVs3ffv27SHr1HwveDwefdy4cfqUKVMCY/v379e7d++uv/rqq0Fzt27dqvfo0SNkPNLzuOWWW4LG//73v+uZmZn65s2b69zrn//8Z3348OF6QUFB0Pj06dP1AQMGBJ7HnDlz9MzMTP3LL78MzHE6nfqoUaP0zMxMffXq1YHxBx54QD/77LMDj1etWqVnZmbqjz32WMhz0DQt8HOk92zl73zfvn26rvvfrz179tSvu+66oPfcu+++q2dmZuqffvppYOzqq68O+ftzu9368OHD9TvuuCPkWgLBsUa4zARNxjXXXMOwYcM488wzmT59OlarlZdffjkk2+iKK64Ierx8+XIURQlxR1133XXous7y5cuDxvv37x+UKdW+fXvOOeccfvrpJ1RVBcBsNgeOe71eCgsLSUtLw263s2nTppC9X3755UHfZK+44goMBgM//PBDPV+F6LnooovweDxB7psvv/wSn8/HhRdeWOu5P/zwA3369Am4p8CfEXb55Zdz4MABduzY0eB9ffrppwwbNoxhw4ZxySWXsHr1am644QauvfbasPOr/97vvPNOYmNjefXVV2nbtm2D9wAgyzIXXHAB//3vfyktLQ2ML1iwgP79+9OxY8eg+YMGDaJr164h61R/LxQXF1NSUsKAAQOC3gdLly5F0zTGjBlDQUFB4F+rVq3o1KlTUCp7bVx11VVBj6+++mqAkPdwzb3qus4333zDyJEj0XU9aA+nn346JSUlbNy4MbBWSkoKf/rTnwLnx8bGBqw5tfHNN98gSRK33357yLGGWEJXrlyJ1+tlypQpyHLV7eTSSy/FZrOF/P1YLJYgS6XJZKJ3796Ndq8KBE2BcJkJmoyHH36Yzp07oygKrVq1onPnzkEfkuAPOq55ozxw4ACtW7fGZrMFjWdkZASOV6dTp04h105PT6e8vJyCggJSUlJwuVy8/vrrzJ8/n9zc3KBYpJKSkpD
"text/plain": [
"<Figure size 640x480 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# YOUR CODE HERE\n",
2025-03-02 00:11:57 +00:00
"from matplotlib import cm\n",
"\n",
"x_in = np.c_[xx.ravel(), yy.ravel()]\n",
"\n",
"def plot_classification_probability(probas):\n",
"\n",
" contour = plt.contourf(xx, yy, np.array([p[1] for p in probas]).reshape(xx.shape), levels=100, cmap=\"coolwarm\")\n",
" plt.xlabel(\"u-g\")\n",
" plt.ylabel(\"g-r\")\n",
" \n",
" background = df[df[\"target\"] == 0]\n",
" plt.scatter(background[\"u-g\"], background[\"g-r\"], label=\"Background\")\n",
" \n",
" rrlyrae = df[df[\"target\"] == 1]\n",
" plt.scatter(rrlyrae[\"u-g\"], rrlyrae[\"g-r\"], label=\"RR Lyrae\")\n",
" \n",
" plt.colorbar(contour)\n",
" \n",
" plt.ylim(ylim)\n",
" plt.title(\"Probability of RR Lyrae prediction\")\n",
" plt.legend()\n",
" \n",
"B0 = clf_2d_logistic.intercept_\n",
"B1 = clf_2d_logistic.coef_[0][0]\n",
"B2 = clf_2d_logistic.coef_[0][1]\n",
"\n",
"# Plot decision boundary\n",
"c = -B0/B2\n",
"m = -B1/B2\n",
"y0=m*xlim[0]+c\n",
"y1=m*xlim[1]+c\n",
"plt.plot(list(xlim), [y0, y1], color=\"yellow\", label=\"Decision boundary\")\n",
"\n",
"plot_classification_probability(clf_2d_logistic.predict_proba(x_in))\n",
"plt.show()\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "9d7c9da1a39851de5856c8e6353c3aa7",
"grade": false,
"grade_id": "cell-a39275a441142214",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Now train an SVM classifier that can support a non-linear decision boundary on the same problem. Use the variable `clf_2d_svm` for your model.*"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 147,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e95438410123c8be26e3caeee498b988",
"grade": true,
"grade_id": "cell-c15b2b7d9de2fc9c",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true
}
2025-03-02 00:11:57 +00:00
},
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-container-id-10 {\n",
" /* Definition of color scheme common for light and dark mode */\n",
" --sklearn-color-text: #000;\n",
" --sklearn-color-text-muted: #666;\n",
" --sklearn-color-line: gray;\n",
" /* Definition of color scheme for unfitted estimators */\n",
" --sklearn-color-unfitted-level-0: #fff5e6;\n",
" --sklearn-color-unfitted-level-1: #f6e4d2;\n",
" --sklearn-color-unfitted-level-2: #ffe0b3;\n",
" --sklearn-color-unfitted-level-3: chocolate;\n",
" /* Definition of color scheme for fitted estimators */\n",
" --sklearn-color-fitted-level-0: #f0f8ff;\n",
" --sklearn-color-fitted-level-1: #d4ebff;\n",
" --sklearn-color-fitted-level-2: #b3dbfd;\n",
" --sklearn-color-fitted-level-3: cornflowerblue;\n",
"\n",
" /* Specific color for light theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-icon: #696969;\n",
"\n",
" @media (prefers-color-scheme: dark) {\n",
" /* Redefinition of color scheme for dark theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-icon: #878787;\n",
" }\n",
"}\n",
"\n",
"#sk-container-id-10 {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"#sk-container-id-10 pre {\n",
" padding: 0;\n",
"}\n",
"\n",
"#sk-container-id-10 input.sk-hidden--visually {\n",
" border: 0;\n",
" clip: rect(1px 1px 1px 1px);\n",
" clip: rect(1px, 1px, 1px, 1px);\n",
" height: 1px;\n",
" margin: -1px;\n",
" overflow: hidden;\n",
" padding: 0;\n",
" position: absolute;\n",
" width: 1px;\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-dashed-wrapped {\n",
" border: 1px dashed var(--sklearn-color-line);\n",
" margin: 0 0.4em 0.5em 0.4em;\n",
" box-sizing: border-box;\n",
" padding-bottom: 0.4em;\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-container {\n",
" /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
" but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
" so we also need the `!important` here to be able to override the\n",
" default hidden behavior on the sphinx rendered scikit-learn.org.\n",
" See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
" display: inline-block !important;\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-text-repr-fallback {\n",
" display: none;\n",
"}\n",
"\n",
"div.sk-parallel-item,\n",
"div.sk-serial,\n",
"div.sk-item {\n",
" /* draw centered vertical line to link estimators */\n",
" background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
" background-size: 2px 100%;\n",
" background-repeat: no-repeat;\n",
" background-position: center center;\n",
"}\n",
"\n",
"/* Parallel-specific style estimator block */\n",
"\n",
"#sk-container-id-10 div.sk-parallel-item::after {\n",
" content: \"\";\n",
" width: 100%;\n",
" border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
" flex-grow: 1;\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-parallel {\n",
" display: flex;\n",
" align-items: stretch;\n",
" justify-content: center;\n",
" background-color: var(--sklearn-color-background);\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-parallel-item {\n",
" display: flex;\n",
" flex-direction: column;\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-parallel-item:first-child::after {\n",
" align-self: flex-end;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-parallel-item:last-child::after {\n",
" align-self: flex-start;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-parallel-item:only-child::after {\n",
" width: 0;\n",
"}\n",
"\n",
"/* Serial-specific style estimator block */\n",
"\n",
"#sk-container-id-10 div.sk-serial {\n",
" display: flex;\n",
" flex-direction: column;\n",
" align-items: center;\n",
" background-color: var(--sklearn-color-background);\n",
" padding-right: 1em;\n",
" padding-left: 1em;\n",
"}\n",
"\n",
"\n",
"/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
"clickable and can be expanded/collapsed.\n",
"- Pipeline and ColumnTransformer use this feature and define the default style\n",
"- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
"*/\n",
"\n",
"/* Pipeline and ColumnTransformer style (default) */\n",
"\n",
"#sk-container-id-10 div.sk-toggleable {\n",
" /* Default theme specific background. It is overwritten whether we have a\n",
" specific estimator or a Pipeline/ColumnTransformer */\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"/* Toggleable label */\n",
"#sk-container-id-10 label.sk-toggleable__label {\n",
" cursor: pointer;\n",
" display: flex;\n",
" width: 100%;\n",
" margin-bottom: 0;\n",
" padding: 0.5em;\n",
" box-sizing: border-box;\n",
" text-align: center;\n",
" align-items: start;\n",
" justify-content: space-between;\n",
" gap: 0.5em;\n",
"}\n",
"\n",
"#sk-container-id-10 label.sk-toggleable__label .caption {\n",
" font-size: 0.6rem;\n",
" font-weight: lighter;\n",
" color: var(--sklearn-color-text-muted);\n",
"}\n",
"\n",
"#sk-container-id-10 label.sk-toggleable__label-arrow:before {\n",
" /* Arrow on the left of the label */\n",
" content: \"▸\";\n",
" float: left;\n",
" margin-right: 0.25em;\n",
" color: var(--sklearn-color-icon);\n",
"}\n",
"\n",
"#sk-container-id-10 label.sk-toggleable__label-arrow:hover:before {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"/* Toggleable content - dropdown */\n",
"\n",
"#sk-container-id-10 div.sk-toggleable__content {\n",
" max-height: 0;\n",
" max-width: 0;\n",
" overflow: hidden;\n",
" text-align: left;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-toggleable__content.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-toggleable__content pre {\n",
" margin: 0.2em;\n",
" border-radius: 0.25em;\n",
" color: var(--sklearn-color-text);\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-toggleable__content.fitted pre {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-10 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
" /* Expand drop-down */\n",
" max-height: 200px;\n",
" max-width: 100%;\n",
" overflow: auto;\n",
"}\n",
"\n",
"#sk-container-id-10 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
" content: \"▾\";\n",
"}\n",
"\n",
"/* Pipeline/ColumnTransformer-specific style */\n",
"\n",
"#sk-container-id-10 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator-specific style */\n",
"\n",
"/* Colorize estimator box */\n",
"#sk-container-id-10 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-label label.sk-toggleable__label,\n",
"#sk-container-id-10 div.sk-label label {\n",
" /* The background is the default theme color */\n",
" color: var(--sklearn-color-text-on-default-background);\n",
"}\n",
"\n",
"/* On hover, darken the color of the background */\n",
"#sk-container-id-10 div.sk-label:hover label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"/* Label box, darken color on hover, fitted */\n",
"#sk-container-id-10 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator label */\n",
"\n",
"#sk-container-id-10 div.sk-label label {\n",
" font-family: monospace;\n",
" font-weight: bold;\n",
" display: inline-block;\n",
" line-height: 1.2em;\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-label-container {\n",
" text-align: center;\n",
"}\n",
"\n",
"/* Estimator-specific */\n",
"#sk-container-id-10 div.sk-estimator {\n",
" font-family: monospace;\n",
" border: 1px dotted var(--sklearn-color-border-box);\n",
" border-radius: 0.25em;\n",
" box-sizing: border-box;\n",
" margin-bottom: 0.5em;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-estimator.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"/* on hover */\n",
"#sk-container-id-10 div.sk-estimator:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-10 div.sk-estimator.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
"\n",
"/* Common style for \"i\" and \"?\" */\n",
"\n",
".sk-estimator-doc-link,\n",
"a:link.sk-estimator-doc-link,\n",
"a:visited.sk-estimator-doc-link {\n",
" float: right;\n",
" font-size: smaller;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1em;\n",
" height: 1em;\n",
" width: 1em;\n",
" text-decoration: none !important;\n",
" margin-left: 0.5em;\n",
" text-align: center;\n",
" /* unfitted */\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted,\n",
"a:link.sk-estimator-doc-link.fitted,\n",
"a:visited.sk-estimator-doc-link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"/* Span, style for the box shown on hovering the info icon */\n",
".sk-estimator-doc-link span {\n",
" display: none;\n",
" z-index: 9999;\n",
" position: relative;\n",
" font-weight: normal;\n",
" right: .2ex;\n",
" padding: .5ex;\n",
" margin: .5ex;\n",
" width: min-content;\n",
" min-width: 20ex;\n",
" max-width: 50ex;\n",
" color: var(--sklearn-color-text);\n",
" box-shadow: 2pt 2pt 4pt #999;\n",
" /* unfitted */\n",
" background: var(--sklearn-color-unfitted-level-0);\n",
" border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted span {\n",
" /* fitted */\n",
" background: var(--sklearn-color-fitted-level-0);\n",
" border: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link:hover span {\n",
" display: block;\n",
"}\n",
"\n",
"/* \"?\"-specific style due to the `<a>` HTML tag */\n",
"\n",
"#sk-container-id-10 a.estimator_doc_link {\n",
" float: right;\n",
" font-size: 1rem;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1rem;\n",
" height: 1rem;\n",
" width: 1rem;\n",
" text-decoration: none;\n",
" /* unfitted */\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
"}\n",
"\n",
"#sk-container-id-10 a.estimator_doc_link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"#sk-container-id-10 a.estimator_doc_link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"#sk-container-id-10 a.estimator_doc_link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"</style><div id=\"sk-container-id-10\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>SVC(class_weight=&#x27;balanced&#x27;, probability=True)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-10\" type=\"checkbox\" checked><label for=\"sk-estimator-id-10\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow\"><div><div>SVC</div></div><div><a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.6/modules/generated/sklearn.svm.SVC.html\">?<span>Documentation for SVC</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></div></label><div class=\"sk-toggleable__content fitted\"><pre>SVC(class_weight=&#x27;balanced&#x27;, probability=True)</pre></div> </div></div></div></div>"
],
"text/plain": [
"SVC(class_weight='balanced', probability=True)"
]
},
"execution_count": 147,
"metadata": {},
"output_type": "execute_result"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# YOUR CODE HERE\n",
2025-03-02 00:11:57 +00:00
"from sklearn.svm import SVC\n",
"\n",
"clf_2d_svm = SVC(probability=True, class_weight=\"balanced\")\n",
"clf_2d_svm.fit(X_train_2d, y_train)\n"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 149,
2025-02-21 17:13:13 +00:00
"metadata": {},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"clf_2d_svm defined.\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('clf_2d_svm')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "fb4bec941b61ff878b444ad7d0845e2e",
"grade": false,
"grade_id": "cell-5b337b0e95730f71",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Replicate for the SVM your plot above for the 2D logistic regression model. Over the domain specified above plot the decision function score. Overlay on your plot the data instances, highlighting whether a RR Lyrae or background star, and the decision boundary.*"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 150,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "139c53d2625a5b080ec97d77d07c64d4",
"grade": true,
"grade_id": "cell-3f1d0097c1a9a231",
"locked": false,
"points": 4,
"schema_version": 3,
"solution": true
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAk4AAAHPCAYAAAClXVUVAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQABAABJREFUeJzsvXd8HNXV//+eme2SdtW7ZFmSJVnuHWOwAVPiUAOEkBBKQgIhpADhl+QhCYRACA88hICdAF/yPAEMocQhFIeYDgaDwb1b3eq9rNr2md8fK620Wq20klaWZc/79XKC7ty5c2d3duYz55x7jqAoioKKioqKioqKisqoiFM9ARUVFRUVFRWV6YIqnFRUVFRUVFRUQkQVTioqKioqKioqIaIKJxUVFRUVFRWVEFGFk4qKioqKiopKiKjCSUVFRUVFRUUlRFThpKKioqKioqISIqpwUlFRUVFRUVEJEVU4qaioqKioqKiEiCqcVKYV+fn5/O53vwvbeK+++ir5+fkcOHBg1L7XXnst1157re/vmpoa8vPzefXVV31t69evJz8/P2zzCwctLS385Cc/YcWKFeTn5/PMM89M9ZRUJoEvvviC/Px8vvjiC1/bL3/5S84555ywHaP/91JTUxO2MVVUphuaqZ6AyvTn1Vdf5b/+6798f+t0OlJTU1m1ahU//OEPiY+Pn8LZTT1PPvkkubm5nHvuuVNy/D/84Q988skn/OhHPyI+Pp65c+cG7TtU9EVERFBYWMj3vvc9zjrrLL9tQ793SZKIi4tj1apV3H777SQlJY06t/Xr17NhwwY+//xzYmNjx3ZiKpPGVF+zKionMqpwUgkbP/nJT0hPT8fpdLJr1y5efPFFPv74YzZv3ozRaJzq6U2Y//3f/x21zy233MJNN93k1/bUU09xwQUXTNlDaPv27axdu5Ybb7wxpP6rVq3i0ksvRVEU6urqePHFF/nBD37A008/zZlnnhnQf/D3vnfvXv71r3+xa9cuNm/ejF6vD/fpqIyB++67j/GUIw12zV566aVceOGF6HS6cE1RRWXaoQonlbCxevVq5s2bB8DXv/51oqOj+dvf/sb777/PRRddNOw+vb29mEym4znNcRPKw0Kj0aDRnFg/q9bWVsxmc8j9s7KyuPTSS31/X3DBBXz1q1/lueeeG1Y4Df3eY2JiePrpp3n//ff56le/OvETGAeyLONyuaaFcJvMuWq12rCOJ0kSkiSFdUwVlemGGuOkMmmcdtppAL54iF/+8pcsWrSIqqoqvv/977No0SLuvPNOwCugHnzwQdasWcPcuXO54IIL+N///d+gb8tvvPEGF1xwAfPmzePyyy9nx44dfttra2v57W9/ywUXXMD8+fNZsWIFP/nJT4LGZtjtdu6++25WrFjB4sWL+fnPf47VavXrMzTGaTiGxjjl5+fT29vLv/71L/Lz88nPz+eXv/wl27dvJz8/n3fffTdgjDfffJP8/Hz27Nkz4rGqq6v5yU9+wvLly1mwYAFXXXUVH330kW97fzyKoii88MILvuOPlZycHGJiYqiqqgqp/9KlS33zmwi/+MUvWLFiBS6XK2Dbd7/7XS644ALf3/2xb2+88QYXXngh8+bN45NPPgG8lsKrr76aFStWMH/+fC6//HK2bNky7DFff/11Lr/8cubPn8/y5cu5/fbbqa+vH3Wu/d97WVkZP/3pT1m8eDErVqzg/vvvx+Fw+PUdaa6NjY3813/9F6effjpz587lwgsvZNOmTQHHa2ho4Ic//CELFy5k5cqVPPDAAzidzoB+w8U4ybLMs88+y8UXX8y8efM47bTTuPHGG31xfsGuWQge4/TCCy9w4YUXMnfuXM444wzuvfdeOjs7/fpce+21XHTRRZSWlnLttdeyYMECzjzzTJ5++ulRP18VlROJE+vVWOWkov9BGx0d7Wtzu93ceOONLFmyhF/84hcYDAYUReGWW27hiy++4Morr2T27Nl88sknPPTQQzQ2NnLXXXf5jbtjxw7eeustrr32WnQ6HS+++CLf+973+Mc//kFeXh4ABw4cYM+ePVx44YUkJydTW1vLiy++yHXXXce///3vANfh7373O8xmMz/60Y+oqKjgxRdfpK6ujo0bNyIIwrg/g4ceeohf//rXzJ8/n6uuugqAzMxMFi5cSEpKCm+++SbnnXee3z5vvvkmmZmZLFq0KOi4LS0tXH311dhsNq699lpiYmL417/+xS233MLjjz/Oeeedx7Jly3jooYf4+c9/7nO/jYeuri46OzvJzMwMqX9tbS3AmKxcw3HppZfy2muv8emnn3L22Wf72pubm9m+fTu33nqrX//t27fzn//8h2uuuYaYmBjS0tIAeO655zjnnHO4+OKLcblc/Pvf/+anP/0pTz31lF/c1hNPPMFjjz3GunXruPLKK2lra+P555/nmmuu4bXXXgvpfG677TbS0tL42c9+xt69e9m4cSOdnZ089NBDo861paWFq666CkEQuOaaa4iNjWXr1q386le/oru7mxtuuAHwivzrr7+e+vp6rr32WhITE3n99dfZvn17SJ/rr371K1599VVWr17NlVdeicfjYefOnezbt4958+YFvWaD0R+ndvrpp/PNb37T9/s5cOAAL774op/Vy2q18r3vfY/zzjuPdevW8fbbb/M///M/5OXlsWbNmpDmr6Iy5SgqKhPkn//8p5KXl6d89tlnSmtrq1JfX6/8+9//VpYvX67Mnz9faWhoUBRFUX7xi18oeXl5yv/8z//47f/uu+8qeXl5yl/+8he/9h//+MdKfn6+UllZ6WvLy8tT8vLylAMHDvjaamtrlXnz5im33nqrr81mswXMc8+ePUpeXp7yr3/9K2DuX/va1xSn0+lrf/rpp5W8vDzlvffe87V9+9vfVr797W/7/q6urlby8vKUf/7zn762xx9/XMnLy/M77sKFC5Vf/OIXAfN55JFHlLlz5yqdnZ2+ttbWVqWwsFB5/PHHA/oP5ve//72Sl5en7Nixw9fW3d2tnHPOOcrZZ5+teDweX3teXp5y7733jjje4L533XWX0traqrS2tioHDhxQbrzxRiUvL0/561//6td3uO99y5YtymmnnabMnTtXqa+vH/V4/Z9Xa2trwDaPx6OsXr1aue222/za//a3vyn5+flKVVWV37wLCgqUkpKSgHGGXgtOp1O56KKLlOuuu87XVlNTo8yePVt54okn/PoWFRUphYWFAe3BzuMHP/iBX/tvf/tbJS8vTzly5Mioc73rrruUVatWKW1tbX7tt99+u7JkyRLfeTzzzDNKXl6e8tZbb/n69Pb2Kuedd56Sl5enbN++3df+i1/8Qjn77LN9f3/++edKXl6ect999wWcgyzLvv8Ods32f+fV1dWKoniv1zlz5ijf/e53/a65559/XsnLy1M2bdrka/v2t78d8PtzOBzKqlWrlB//+McBx1JROVFRXXUqYeOGG25g5cqVrFmzhttvv52IiAg2bNgQsLrqm9/8pt/fW7duRZKkADfYd7/7XRRFYevWrX7tixYt8lsZlpqaytq1a/n000/xeDwAGAwG33aXy0V7ezuZmZmYzWYOHz4cMPdvfOMbfm/G3/zmN9FoNHz88cdj/BRC59JLL8XpdPq5jd566y3cbjeXXHLJiPt+/PHHzJ8/3+cWA+8KuG984xvU1tZSWlo67nlt2rSJlStXsnLlSq644gq2b9/O9773Pb7zne8M23/w9/6Tn/wEo9HIE088QXJy8rjnACCKIhdffDEffPAB3d3dvvY33niDRYsWkZGR4dd/2bJl5ObmBowz+FqwWq10dXWxZMkSv+vg3XffRZZl1q1bR1tbm+9ffHw8M2bM8FviPxLXXHON39/f/va3AQKu4aFzVRSFd955h3POOQdFUfzmcMYZZ9DV1cWhQ4d8YyUkJPCVr3zFt7/RaPRZh0binXfeQRAEfvSjHwVsG49l9bPPPsPlcnHdddchigOPk69//etERkYG/H5MJpOf5VOn0zFv3rwJu3VVVI4nqqtOJWzcfffdzJw5E0mSiI+PZ+bMmX43U/AGTw99oNb
"text/plain": [
"<Figure size 640x480 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
2025-02-21 17:13:13 +00:00
"source": [
"# YOUR CODE HERE\n",
2025-03-02 00:11:57 +00:00
"\n",
"decision = clf_2d_svm.decision_function(x_in).reshape(xx.shape)\n",
"plt.contour(xx, yy, decision)\n",
"\n",
"plot_classification_probability(clf_2d_svm.predict_proba(x_in))\n",
"plt.show()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "1d44b6983a7391f6baec9783e8f9c3b9",
"grade": false,
"grade_id": "cell-9396defb6a2fd540",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Compute the precision and recall of your 2D SVM model. Use variables `precision_svm` and `recall_svm` for your results.*"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 161,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "676c64172c9ee6735a3554f3a19e2787",
"grade": false,
"grade_id": "cell-4bb00fa2958c34b5",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
2025-03-02 00:11:57 +00:00
"# Make a Pandas dataframe of the test data and predictions\n",
"df_svm = pd.DataFrame({\n",
" \"u-g\": [x[0] for x in X_test_2d],\n",
" \"g-r\": [x[1] for x in X_test_2d],\n",
" \"target\": pd.Series(y_test)\n",
"})\n",
"\n",
"# Make a prediction from the test data\n",
"y_pred_svm = clf_2d_svm.predict(X_test_2d)\n",
"\n",
"df_svm[\"prediction\"] = y_pred_svm\n",
"#print(test_df)\n",
"\n",
"# Precision = Tp/(Tp+Fp)\n",
"# Count number of true positives\n",
"positives = df_svm[df_svm[\"prediction\"] == 1]\n",
"true_positives = positives[positives[\"target\"] == 1]\n",
"false_positives = positives[positives[\"target\"] == 0]\n",
"#print(positives, true_positives, false_positives)\n",
"\n",
"Tp = len(true_positives)\n",
"Fp = len(false_positives)\n",
"precision_svm = Tp/(Tp+Fp)\n",
"#print(precision_logistic)\n",
"\n",
"# Recall = Tp/(Tp+Fn)\n",
"negatives = df_svm[df_svm[\"prediction\"] == 0]\n",
"false_negatives = df_svm[df_svm[\"target\"] == 1]\n",
"\n",
"Fn = len(false_negatives)\n",
"recall_svm = Tp/(Tp+Fn)\n",
"#print(recall_logistic)\n",
"\n",
"#raise NotImplementedError()"
2025-02-21 17:13:13 +00:00
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 158,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "242d477110906928300aca9370ecae60",
"grade": true,
"grade_id": "cell-8a0f183fa2697432",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"precision_svm defined.\n",
"precision_svm = 0.132154\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('precision_svm')\n",
"print(\"precision_svm = {0:.6f}\".format(precision_svm))"
]
},
{
"cell_type": "code",
2025-03-02 00:11:57 +00:00
"execution_count": 159,
2025-02-21 17:13:13 +00:00
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "4b9d01eaa4ee806dc2c07205c2d46660",
"grade": true,
"grade_id": "cell-c88d680adad83470",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false
}
},
2025-03-02 00:11:57 +00:00
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"recall_svm defined.\n",
"recall_svm = 0.500000\n"
]
}
],
2025-02-21 17:13:13 +00:00
"source": [
"check_var_defined('recall_svm')\n",
"print(\"recall_svm = {0:.6f}\".format(recall_svm))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "991b3f11093789127b5b6fbf01e75552",
"grade": false,
"grade_id": "cell-6e323ca87d06c0b4",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"*Comment on the difference in decision boundary between your logistic regression and SVM models and how this impacts the effectiveness of the models.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "3d6cb3a273d55b60497f2ed2047bfc2d",
"grade": true,
"grade_id": "cell-5429ab62dc4857e2",
"locked": false,
"points": 4,
"schema_version": 3,
"solution": true
}
},
"source": [
2025-03-02 00:11:57 +00:00
"The SVM model has a non-linear boundary which should be a better fit because the data is easier to separate in two dimensions. However judging by the precision and recall metrics it has not actually improved that much..."
2025-02-21 17:13:13 +00:00
]
2025-03-02 00:11:57 +00:00
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
2025-02-21 17:13:13 +00:00
}
],
"metadata": {
"kernelspec": {
2025-02-22 19:19:02 +00:00
"display_name": "Python 3 (ipykernel)",
2025-02-21 17:13:13 +00:00
"language": "python",
2025-02-22 19:19:02 +00:00
"name": "python3"
2025-02-21 17:13:13 +00:00
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
2025-02-22 19:19:02 +00:00
"version": "3.11.11"
2025-02-21 17:13:13 +00:00
}
},
"nbformat": 4,
"nbformat_minor": 4
}