spce0038-machine-learning-w.../cw2/spce0038_coursework_tf_MCMQ7.ipynb

2025-03-14 17:57:07 +00:00
{
"cells": [
{
"cell_type": "markdown",
"id": "69c73990",
"metadata": {},
"source": [
"# Coursework TensorFlow\n",
"# SPCE0038: Machine Learning with Big-Data"
]
},
{
"cell_type": "markdown",
"id": "67cacfd6",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "1d428273",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This coursework is provided as a Jupyter notebook, which you will need to complete. \n",
"\n",
"Throughout the notebook you will need to complete code, analytic exercises (if equations are required please typeset your solutions using latex in the markdown cell provided) and descriptive answers. Much of the grading of the coursework will be performed automatically, so it is critical you name your variables as requested."
]
},
{
"cell_type": "markdown",
"id": "90b39499",
"metadata": {},
"source": [
"Before you turn this coursework in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n",
"\n",
"Make sure you fill in any place that says \"YOUR ANSWER HERE\" or `YOUR CODE HERE` and remove the `raise NotImplementedError()` exceptions that are thrown before you have added your answers. Do not add or remove cells; instead, provide your answers in the spaces given."
]
},
{
"cell_type": "markdown",
"id": "150e1000",
"metadata": {},
"source": [
"Please also:\n",
"\n",
"- Make sure you use a Python environment built from the `requirements.txt` files provided by the course.\n",
"- Make sure your notebook executes without errors.\n",
"- Do not add or remove cells; only provide your answers in the spaces given.\n",
"- Do not add or change code in the cells other than the ones marked with `# YOUR CODE HERE`.\n",
"- Do not overwrite or rename any existing variables.\n",
"- Do not install code or packages in the notebooks.\n",
"- Do not import any libraries other than modules from `sklearn` or `tensorflow`.\n",
"- Always label your plots.\n",
"- Answer the questions concisely and show your work/derivations/reasoning."
]
},
{
"cell_type": "markdown",
"id": "f92476d1",
"metadata": {},
"source": [
"**Please rename the notebook filename to include your candidate number in the filename. And please also add your candidate number below:**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3593c649",
"metadata": {},
"outputs": [],
"source": [
"CANDIDATE_NUMBER = \"MCMQ7\""
]
},
{
"cell_type": "markdown",
"id": "322a38c5",
"metadata": {},
"source": [
"You will be able to run some basic tests in the notebook to check that the basic operation of your code is as expected. However, do not assume your responses are complete or fully correct just because the basic tests pass."
]
},
{
"cell_type": "markdown",
"id": "9871e0f5",
"metadata": {},
"source": [
"Once you have renamed the notebook file and completed the exercises, please upload the notebook to Moodle.\n"
]
},
{
"cell_type": "markdown",
"id": "d38e1133",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "719f3952",
"metadata": {},
"source": [
"## Dependencies\n",
"\n",
"- Standard course dependencies (e.g. numpy, scikit-learn, etc.)\n",
"- [TensorFlow](https://www.tensorflow.org/)\n",
"- [TensorFlow DataSets](https://www.tensorflow.org/datasets)"
]
},
{
"cell_type": "markdown",
"id": "07758eeb",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "a2231590",
"metadata": {},
"source": [
"Notes for people running the notebook in Google Colab:\n",
"\n",
"- This notebook uses quite a bit of RAM (the solutions run with ~8.5 GB of RAM), so memory-inefficient code may not fit within Colab's limit of 12 GB. If this happens, try restarting the runtime to free the memory held by variables you no longer use, or rewrite your code to use memory more efficiently.\n",
"- You can enable a Runtime with GPU acceleration for faster training (Runtime -> Change runtime type)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "a32c7c90",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "b1a7b7723a7cd8fac9ab0856025a6d81",
"grade": false,
"grade_id": "cell-418daabc8f9aac61",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2025-03-14 14:07:37.610956: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.\n",
"2025-03-14 14:07:37.788469: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.\n",
"2025-03-14 14:07:37.950994: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n",
"WARNING: All log messages before absl::InitializeLog() is called are written to STDERR\n",
"E0000 00:00:1741961258.106221 71564 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n",
"E0000 00:00:1741961258.153214 71564 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
"2025-03-14 14:07:38.479758: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
"To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
]
}
],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"import tensorflow as tf\n",
"import tensorflow_datasets as tfds"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "fad96611",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "83978659708fbe27b7f96f0c006862a5",
"grade": false,
"grade_id": "cell-0be768f33f772611",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"def check_var_defined(var):\n",
" try:\n",
" exec(var)\n",
" except NameError:\n",
" raise NameError(var + \" not defined.\")\n",
" else:\n",
" print(var + \" defined.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e45caa0-f9fb-4f83-88df-52d5a5e649d7",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "3fbf8aa9005ee23d0ab5544a5d026671",
"grade": true,
"grade_id": "cell-59343806849c27ff",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "0c0e7927",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "71eeea15b0b692ba25c290608f294ae1",
"grade": false,
"grade_id": "cell-9ae5247a66cdf986",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"# Part 1: Data pre-processing\n",
"\n",
"\n",
"In these exercises we will look at the classification of flowers into 5 different classes using convolutional neural networks (CNNs). The implementation of this will be done using TensorFlow (TF).\n",
"\n",
"The dataset can be loaded using the [TensorFlow Datasets](https://www.tensorflow.org/datasets) package. Below you can see how we load the data and convert it from a TF generator object into a list of the images and a list of the targets."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f9c193a0",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c2460b788cc541a7382bee075395dbb7",
"grade": false,
"grade_id": "cell-5a158c5e75738fc8",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2025-03-14 14:44:13.164833: W external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with \"NOT_FOUND: Could not locate the credentials file.\". Retrieving token from GCE failed with \"FAILED_PRECONDITION: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata.google.internal\".\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mDownloading and preparing dataset 218.21 MiB (download: 218.21 MiB, generated: 221.83 MiB, total: 440.05 MiB) to /home/ktyl/tensorflow_datasets/tf_flowers/3.0.1...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/ktyl/.conda/envs/mlbd/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
" from .autonotebook import tqdm as notebook_tqdm\n",
"Dl Completed...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:31<00:00, 6.22s/ file]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mDataset tf_flowers downloaded and prepared to /home/ktyl/tensorflow_datasets/tf_flowers/3.0.1. Subsequent calls will reuse this data.\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"W0000 00:00:1741963485.913699 71564 gpu_device.cc:2344] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.\n",
"Skipping registering GPU devices...\n"
]
}
],
"source": [
"# Load data\n",
"data = tfds.load('tf_flowers', split=[\"train\"], as_supervised=True)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "e438b1ef",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "d0e89c00c9c0df44f1b834b9d4391b72",
"grade": false,
"grade_id": "cell-1aefc40e0c7fe096",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2025-03-14 14:44:46.016395: I tensorflow/core/kernels/data/tf_record_dataset_op.cc:376] The default buffer size is 262144, which is overridden by the user specified `buffer_size` of 8388608\n",
"2025-03-14 14:44:46.984183: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence\n"
]
}
],
"source": [
"images, targets = zip(*[i for i in data[0]])\n",
"labels = [\"dandelion\", \"daisy\", \"tulips\", \"sunflowers\", \"roses\"]"
]
},
{
"cell_type": "markdown",
"id": "ad3c8560",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "71525d2c752790a621a96df6f831cb50",
"grade": false,
"grade_id": "cell-955200c05b4bfd90",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Plot the first two images and their classes by writing a function `show_image`. Set the title of the images to be the class (use the actual label, not the number) it belongs to._ "
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "69bf403d",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "044d25f825f21b72888f98026cbb419d",
"grade": true,
"grade_id": "cell-2ad6ec61e28d4327",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"def show_image(image, target):\n",
" # YOUR CODE HERE\n",
"\n",
" # Plot each image in a separate figure\n",
" plt.figure()\n",
" \n",
" # Plot their classes\n",
" # Set the title of the images to be the class the image belongs to\n",
" label = labels[target]\n",
" \n",
" plt.imshow(image)\n",
" plt.axis(\"off\")\n",
" plt.title(label)\n",
" #raise NotImplementedError()\n",
"\n",
"for i in range(2):\n",
" show_image(images[i], targets[i])"
]
},
{
"cell_type": "markdown",
"id": "c091a6b0",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "39fbb3182de77dc1dfefd62803c45add",
"grade": false,
"grade_id": "cell-b8ee9d31572d4ef8",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Before we can use the data to train neural networks, we need to pre-process the data such that:\n",
" \n",
"- the images are all the same shape (224,224)\n",
"- the images, originally with values (0, 255), are mapped to have values between (0, 1)\n",
"- the labels are represented as one-hot vectors"
]
},
{
"cell_type": "markdown",
"id": "bb856fda",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "fefdc012c93c2ff67ef9a229628f8c8c",
"grade": false,
"grade_id": "cell-6b5b2f4a3e9c2f2e",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Write a function to rescale and resize the images._"
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "2f2da49d",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "cbefaa1f473a3192c47b509da825bf06",
"grade": false,
"grade_id": "cell-e3182b662219760c",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAGbCAYAAAAr/4yjAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQABAABJREFUeJzs/UusbV1234f9xpxzrbX3ed3Hd79XsapYpEjKjkzRpiTEAQSYghUQBCLDkGiFnQgQQPVsNQwYRmhDoGEYdsewOhIE2A1CbhC2GCSAFFFSw2AohL1ACmFbjkiZIotFVtX3uvecsx9rzTnHSGPMtfY6+96veBkWRVE542Lfc87ee73mY7zHf4iZGY/0SI/0SI/0SED4vb6BR3qkR3qkR/rnhx6FwiM90iM90iMt9CgUHumRHumRHmmhR6HwSI/0SI/0SAs9CoVHeqRHeqRHWuhRKDzSIz3SIz3SQo9C4ZEe6ZEe6ZEWehQKj/RIj/RIj7TQo1B4pEd6pEd6pIUehcIj/f8l/dRP/RQiwj/9p/90ee+HfuiH+KEf+qHfs3t6pEf654EehcIj/b6kX/iFX+Anf/Inefny5e/1rTzSI/0LRY9C4ZF+X9Iv/MIv8J/8J//Jt1Uo/L2/9/f4e3/v733bzvdIj/T7kdLv9Q080iP980J93/9e38IjPdLvOT1aCo/0+45+8id/kv/gP/gPAPiu7/ouRGSJD4gIP/VTP/XaMSLCT/7kT37L857HFH7u534OEeG/++/+O37iJ36CDz74gMvLS/6tf+vf4qtf/eqDY3/pl36JP/Nn/gwffPABm82GL37xi/zYj/0Yr169+p0+7iM90j9TerQUHun3Hf3pP/2n+cf/+B/z0z/90/xX/9V/xYsXLwDY7Xa/K9f7z/6z/wwR4T/8D/9DvvnNb/KX//Jf5k/+yT/JP/yH/5Dtdss0TfzwD/8w4zjy7/17/x4ffPABX/va1/hbf+tv8fLlS548efK7cl+P9Ei/G/QoFB7p9x394T/8h/nBH/xBfvqnf5p/+9/+t/nKV74C8CCT6NtJn376Kf/oH/0jrq+vAfjBH/xB/uyf/bP81//1f81f/It/kf/5f/6f+ZVf+RX+xt/4G/zoj/7octxf+kt/6Xflfh7pkX436dF99EiP9FvQn/tzf24RCAA/+qM/yocffsjf/tt/G2CxBP7u3/277Pf735N7fKRH+nbRo1B4pEf6Leh7v/d7H/wtInzP93zPYpl813d9F//+v//v89/8N/8NL1684Id/+If5K3/lrzzGEx7p9yU9CoVH+heGROSN79daf9ev/V/+l/8lv/iLv8hP/MRPcDgc+It/8S/yh/7QH+LXf/3Xf9ev/UiP9O2kR6HwSL8v6U0C4NmzZwCv1S786q/+6u/oWr/0S7/04G8z45d/+ZeXWMZM3//9389//B//x/z8z/88f//v/32+9rWv8df+2l/7HV37kR7pnzU9CoVH+n1Jl5eXwEMBcHNzw4sXL/j5n//5B9/9q3/1r/6OrvXX//pf5+7ubvn7Z37mZ/jN3/xNfuRHfgSA29tbSikPjvn+7/9+QgiM4/g7uvYjPdI/a3rMPnqk35f0R/7IHwHgP/qP/iN+7Md+jK7r+FN/6k/x4z/+4/wX/8V/wY//+I/zR//oH+Xnf/7n+cf/+B//jq71/Plz/vgf/+P8+T//5/nGN77BX/7Lf5nv+Z7v4S/8hb8AwP/wP/wP/Lv/7r/Lv/Pv/Dt83/d9H6UU/tv/9r8lxsif+TN/5nf8rI/0SP8s6VEoPNLvS/pjf+yP8Z/+p/8pf+2v/TX+zt/5O6gqv/Irv8Jf+kt/iY8++oif+Zmf4b//7/97fuRHfoSf/dmf5b333vv/+Vo/8RM/wS/+4i/yn//n/zl3d3f8m//mv8lf/at/lYuLCwB+4Ad+gB/+4R/mb/7Nv8nXvvY1Li4u+IEf+AF+9md/ln/9X//Xv12P/EiP9M+ExM
zs9/omHumR/nmkn/u5n+NP/Ik/8Vr9wSM90r/I9BhTeKRHeqRHeqSFHoXCIz3SIz3SIy30KBQe6ZEe6ZEeaaHHmMIjPdIjPdIjLfRoKTzSIz3SIz3SQo9C4ZEe6ZEe6ZEWeus6hf/T//k7CUHoElykS/rYEXtFTclauJtGihYCilgg1ESZCrXCNAaqKqqVu+NErpWpZGJvpN548XzD5WbLO9sP2KYNQ+rZpisCgVCNUiZUK9fXlWGobC8KTy6EiwE+uAa0UkqhcKBa5jhOfLw78Ouv7skTlBK4H2+YpsA4CkVBCWTrCRYQE0LtiAadKT/0yy/5Y//0lpvf3BGLYalDQkCCIClgplStiBgiQEwQEiYd02c76m7isFeyQQZm/1wAKlDaS9vfN8AzgYPBBNy192v7jp39DrDtI9/x7IKLH/pBNt//vaQX/xJyv4df+SXs61/DXn6CfuMb2GGE3QGmjJXKcYRcYaqQ7XQ/I7DHr29tYczX23aJIQjvREFyhVy5F1CBHuHq+RO2N5d89OolUy7krNzWytGML7/3HjfvPOXFd38ROSqMBfvoN+GwQ3e31GlEc+V4MKpCMXjyJ/80/Ze+G+mvkHFE9nt0fAm1+CiagSk6FqRWQjlg5YjVEbl8ChdX2IsP4Pk1XAyE/+l/xOqBsi382j/4n/jsq7/Jl774B9l+9/ex/Tf+98T/5/8d+af/C7bfo3lCxyN1f0Bz4Vh8nKY2fwL04uOiBll8nGTlhN3cXKHAy7t7xnZsbMcCDJLoJHKwzNGMVxhjm4cLAtsoPOuEKVdyNT7F19GD9SAR9/yqr4kghJsN9U/8Qcr/4Q8zhUylUvLEYTpwzHv2+Y5cM8fxyHEcmfIEgGplzCMiShAj9ZEQIkESGgKKYLvAuKvcf3rk7jNl2it6mLi8ibzzQc9733nD9slAugyEAEGgVqGqcDzCeITDTvnkN27Z34+8+vTI9VR5lis//jF85wHeuxe2avQGxwix67i4viZePSFsL5HtC+g32PYKywXLBY47rBQsZ8ZypGghV+WYJ3bjkVf7e3ItxNjT9QPDsEUBNaWUkTIeyeOBfR3JGOuOHEP7Oe/T2tZAaPN5BWyBiwTTB5HPfrCj/GsvsC9dEC8zMRxJckvNR7RUchYOGe5G49Md7Cb4bAeafeP9H/9v8JXfhKdAv+3pLzfwzgfw9B3s+/4VQpwQ23P3//j7HD95ycv7IwcLjAhHKhk4AiAE4D3g+fvv8+Uf+NeQTYekQPi//F8/h8Of6K2FQggQI6QUSFGIQQjihkYKiRQKaoApgiABkICIEaJgEjCEFH0RF6uIKJhRtVC0UCxTLVE1oVp9+RuoNuFTK6EqvcGklViNXTaCVayUtiAKL/cTr/aZ+2Ol5kBVIVeoBojfm5gQRRCDYIIoCEJC2t++880MVOdRwErFcKaEgIkQIpgpIpWQItb32P64jN18tNCYSftpnBiF8lBYzJ+F9resvpsCDJdbNt/znXTvfoBcPIWXL+H2Drt7Cfs7OOxhmpCSsapoNbRCUR+Hag8Fz/qeZiG0/N1+ETOEJgjbzaiBqlGrYmb+akcKglZFc8EOB2yfYcwwHpz5lkothurD+7EywTTBICDygOMKgrV/hIBZxWrBTDEEygTTAY73MA0wdNAlkA4JQr8Z2F5tSU+fEfse+fQTmEY/3rQ9bFt62p5vNW/hbJzUToJ6/o60wZnHkNX3fa4NRZdjH5wPoxpMamSDIqdbmtfQw9/nEQFTAzVE1dc4LM+lqpiaf2e1tqxNvqkPtQU/j6FYPM0jQZzZByEGZ/xq/t1aDTNfFKF1wfMLNIUrCiEYaeYfKRKCoAEmgbG9CkYBoj8GYobWimj1Z2I1EPj96jxy5i87e80ja2a+TlXbvPnv1RRdVqzf9rxf1z/X+2Pej8vZm4KkIVBFUJNlTStCLeJ7rxq5QqmQC+TcBGdbYzUYOp9YDJPqcxwEtgmCgfZ0z55SqyK7EcMerNF5tY
kIF2lgSAMSIqhiRXkbemuhkJLRpcDQRzoJpOALgBAwYCyKWaEUQ1AkmGvXBqkLUH2D9wihRiqKSQY1pjoRizDpgVQ
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"def rescale_and_resize(image):\n",
" # YOUR CODE HERE\n",
" \n",
" # Rescale to (0, 1) range\n",
" preprocessed_image = image / 255\n",
" \n",
" # Resize to (224, 224)\n",
" preprocessed_image = tf.image.resize(preprocessed_image, [224, 224])\n",
" \n",
" return preprocessed_image\n",
" #raise NotImplementedError()\n",
"\n",
"show_image(rescale_and_resize(images[0]), targets[0])"
]
},
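{
"cell_type": "markdown",
"id": "sketch-rescale-resize",
"metadata": {},
"source": [
"The function above divides by 255 and then resizes. As a hedged alternative sketch (not the coursework's reference solution), the same two steps can be written with `tf.image.convert_image_dtype`, which maps `uint8` pixels to `float32` values in [0, 1]. The name `rescale_and_resize_sketch` and its `size` parameter are illustrative assumptions, not names the notebook requires.\n",
"\n",
"```python\n",
"import tensorflow as tf\n",
"\n",
"\n",
"def rescale_and_resize_sketch(image, size=(224, 224)):\n",
"    # Assumes `image` is a uint8 tensor of shape (height, width, channels).\n",
"    # Converting uint8 to float32 rescales the values from [0, 255] to [0, 1].\n",
"    image = tf.image.convert_image_dtype(image, tf.float32)\n",
"    # Bilinear interpolation is the default resizing method.\n",
"    return tf.image.resize(image, size)\n",
"```\n",
"\n",
"Because bilinear resizing is a weighted average of pixel values, rescaling before or after resizing should give the same result up to floating-point rounding."
]
},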
{
"cell_type": "code",
"execution_count": 65,
"id": "5b886c27",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "1c2e2853426a640227d69892a38e8250",
"grade": true,
"grade_id": "cell-cfd89fb8cbcec231",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rescale_and_resize defined.\n"
]
}
],
"source": [
"check_var_defined('rescale_and_resize')"
]
},
{
"cell_type": "markdown",
"id": "053649d0",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "4f93e8f3413c7f585d48368707148246",
"grade": false,
"grade_id": "cell-75388a1741e14c88",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Now use the function to pre-process the data in the variable `images` and save the preprocessed images in an np.ndarray `images_preprocessed`._"
]
},
{
"cell_type": "code",
"execution_count": 67,
"id": "a880cd0d",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9403f9bc6c73d10b1e8066f7d493c251",
"grade": false,
"grade_id": "cell-db4f53cb45712358",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"images_preprocessed = np.array([rescale_and_resize(img) for img in images])\n",
"#raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "f767e9cf",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ab073c4d8d13a773b25c7593eb989630",
"grade": true,
"grade_id": "cell-3511cf050a781aab",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"images_preprocessed defined.\n"
]
}
],
"source": [
"check_var_defined('images_preprocessed')\n",
"assert type(images_preprocessed) == np.ndarray, \"Make sure to store your answer as a np.ndarray\""
]
},
{
"cell_type": "markdown",
"id": "94226e02",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "077387018e16644826fd4e066af0bac9",
"grade": false,
"grade_id": "cell-0825cee2278d9239",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"One approach to representing the image labels so that a neural network can be trained on them is to represent them as one-hot vectors. \n",
"\n",
"_Write a function `one_hot_encoding` that takes the integer label and returns a one-hot vector of the label._"
]
},
{
"cell_type": "code",
"execution_count": 74,
"id": "e40efe42",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "2a20a5b6a598460960720ee49678a575",
"grade": false,
"grade_id": "cell-f8acfd909ba340a6",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"target: 2, encoding: [0. 0. 1. 0. 0.]\n"
]
}
],
"source": [
"def one_hot_encoding(target):\n",
" # YOUR CODE HERE\n",
" v = np.zeros(len(labels))\n",
" v[target] = 1\n",
" return v\n",
" #raise NotImplementedError()\n",
"\n",
"print(f\"target: {targets[0]}, encoding: {one_hot_encoding(targets[0])}\")"
]
},
{
"cell_type": "markdown",
"id": "8736ec68",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0adb27dc2f6de0791a6b8c2d3da175ec",
"grade": false,
"grade_id": "cell-48b78919122984cd",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Now use the function `one_hot_encoding` to create an np.ndarray of the one-hot representations of all the labels in `targets` and save them in `targets_preprocessed`._"
]
},
{
"cell_type": "code",
"execution_count": 77,
"id": "b91925ad",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c1b73325cca5ea5c87c016fa8679da87",
"grade": false,
"grade_id": "cell-66f4d1b4865c393b",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"targets_preprocessed = np.array([one_hot_encoding(t) for t in targets])\n",
"#raise NotImplementedError()"
]
},
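{
"cell_type": "markdown",
"id": "sketch-one-hot-batch",
"metadata": {},
"source": [
"The list comprehension above encodes one label at a time. A vectorised sketch of the same idea, assuming the labels are plain integers in `range(num_classes)`, indexes the rows of an identity matrix. The helper name `one_hot_batch` is hypothetical and not part of the coursework.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def one_hot_batch(targets, num_classes):\n",
"    # Row i of the identity matrix is the one-hot vector for class i,\n",
"    # so indexing with an integer array encodes every label at once.\n",
"    targets = np.asarray(targets, dtype=int)\n",
"    return np.eye(num_classes)[targets]\n",
"\n",
"\n",
"encoded = one_hot_batch([2, 0, 4], num_classes=5)\n",
"print(encoded.shape)  # (3, 5)\n",
"```\n",
"\n",
"Keras also ships `tf.keras.utils.to_categorical` for this purpose; whether it is preferable here depends on what the automatic tests expect."
]
},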
{
"cell_type": "code",
"execution_count": 78,
"id": "008bf36d",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "77ce70a606155c9bd692ea157f2ba4c5",
"grade": true,
"grade_id": "cell-b7bd3e108b492d1a",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"targets_preprocessed defined.\n"
]
}
],
"source": [
"check_var_defined('targets_preprocessed')\n",
"assert type(targets_preprocessed) == np.ndarray, \"Make sure to store your answer as a np.ndarray\""
]
},
{
"cell_type": "markdown",
"id": "93a82e2e",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "41c81734a6061adb3960393c731fbcb8",
"grade": false,
"grade_id": "cell-064a192e04bbe563",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"In this notebook you will train different models and compare them against each other. \n",
"\n",
"Now that the data is pre-processed. We will split the data in three datasets, a train, validation and test set. \n",
"\n",
"_Why do we need these three sets and what do we use them for._"
]
},
{
"cell_type": "markdown",
"id": "a070cee1",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "e411d567e9977776b601d7b2b60e317f",
"grade": true,
"grade_id": "cell-a9728f3a37f9b12d",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"The training dataset is for initially fitting the models to the data.\n",
"\n",
"The test dataset is for performing final, unbiased verification of the model once training is complete.\n",
"\n",
"The validation set is somewhat of a halfway point between the two. It is used for tuning the model while training, for example by detecting overfitting. Overfitting can be detected by measuring the model's error on the validation set per epoch. If the validation error increases between epochs, the model could be overfitting and this can be used to stop the training early before it gets worse."
]
},
{
"cell_type": "markdown",
"id": "ffd06ef7",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "e8ede38fcd2fe8b3353140b6185a939b",
"grade": false,
"grade_id": "cell-5f7c36786ea1ac16",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Now split the data such that we have a train set with 80\\% of the samples and a validation and test set each with 10\\% of the samples. Save the results in `x_train`, `y_train`, `x_val`, `y_val`, `x_test`, and `y_test`._"
]
},
{
"cell_type": "code",
"execution_count": 86,
"id": "5fb75b70",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9cc2c69bb2fa3aa04c24a14b0c7c0c4e",
"grade": false,
"grade_id": "cell-0d2a7323600586f0",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Train samples: 2936, Validation samples: 367, Test samples: 367\n"
]
}
],
"source": [
"tf.keras.utils.set_random_seed(371947)\n",
"# YOUR CODE HERE\n",
"length = len(images_preprocessed)\n",
"x_train, x_val, x_test = np.split(images_preprocessed, [int(length*.8), int(length*.9)])\n",
"y_train, y_val, y_test = np.split(targets_preprocessed, [int(length*.8), int(length*.9)])\n",
"#raise NotImplementedError()\n",
"\n",
"print(f\"Train samples: {len(x_train)}, Validation samples: {len(x_test)}, Test samples: {len(x_val)}\")"
]
},
{
"cell_type": "code",
"execution_count": 87,
"id": "caaff270",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "14729a3cfe23681c3c0310bcf17af528",
"grade": true,
"grade_id": "cell-2e10a0d0c102a096",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x_train defined.\n",
"y_train defined.\n",
"x_val defined.\n",
"y_val defined.\n",
"x_test defined.\n",
"y_test defined.\n"
]
}
],
"source": [
"check_var_defined('x_train')\n",
"check_var_defined('y_train')\n",
"\n",
"check_var_defined('x_val')\n",
"check_var_defined('y_val')\n",
"\n",
"check_var_defined('x_test')\n",
"check_var_defined('y_test')"
]
},
{
"cell_type": "markdown",
"id": "cc327e32",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "6274d96a91edd7e1292a5b6355ea3bef",
"grade": false,
"grade_id": "cell-383c6a2fa62bb715",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"WARNING: Before we continue we delete the variable `images_preprocessed` (you don't need that anymore) to manage our RAM consumption. If you want to use that variable again you will have to rerun the cell that creates it."
]
},
{
"cell_type": "code",
"execution_count": 88,
"id": "156ef4a4",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "7aefd78f3a200aeea2cb44ab96d4578a",
"grade": false,
"grade_id": "cell-97bcc5cb67d2816b",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Freeing up RAM by deleting this variable\n",
"del images_preprocessed"
]
},
{
"cell_type": "markdown",
"id": "cccda08d",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "5c9e2a2efad571fc5c110c912ee46805",
"grade": false,
"grade_id": "cell-30894c26978e5d77",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"# Part 2: Training a basic CNN model \n",
"\n",
"Now that we have pre-processed the data and split it into different parts for training, validation and testing, you can start training some neural networks. "
]
},
{
"cell_type": "markdown",
"id": "7a19e6a8",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "f28ec881e0bb1e54f3f45c022b8774be",
"grade": false,
"grade_id": "cell-bff4b784586115f7",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Our model will take as input the $224 \\times 224$ rgb (3-channel) images and will give as output a length five vector of which the different elements correspond to the five different classes. \n",
"\n",
"The model will start with convolutional layers followed by a hidden dense layer and then the final dense layer that gives us our output. \n",
"\n",
"_What kind of activation function should we use on the convolutional, dense and output layers and why these specific activation functions? (motivate your answers)_"
]
},
{
"cell_type": "markdown",
"id": "409a0e94",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "77b3bb801e2b3da0731abff4924aec74",
"grade": true,
"grade_id": "cell-50455d0a455cb88d",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "a35b7d12",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "5a5e9aca09fced3bed74485bd77db7d6",
"grade": false,
"grade_id": "cell-bfa3eeeb3542219a",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Create a model using the `tf.keras.models.Sequential()` model and add to that:_\n",
"\n",
"- Input layer: A 2D convolutional layer with 8 filters, a 3x3 kernel and the ReLU activation function. Specify that this layer has an `input_shape` of (224, 224, 3)_ \n",
"- A 2D MaxPool layer with strides 2x2\n",
"\n",
"- A 2D convolutional layer with 16 filters, a 3x3 kernel and a the ReLU activation function\n",
"- A 2D MaxPool layer with strides 2x2\n",
"\n",
"- A 2D convolutional layer with 32 filters, a 3x3 kernel and a the ReLU activation function\n",
"- A 2D MaxPool layer with strides 2x2\n",
"\n",
"- A 2D convolutional layer with 32 filters, a 3x3 kernel and a the ReLU activation function\n",
"- A 2D MaxPool layer with strides 2x2\n",
"\n",
"- A Flatten layer to flatten the filters to a single vector\n",
"- A Dense layer with 32 nodes and your chosen activation\n",
"\n",
"- Output layer: A Dense layer with 5 nodes and your chosen activation\n",
"\n",
"_Store the model in the variable `model_basic`._"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b869ea33",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9260f5cd8f6f4294a65d990d4b4811d8",
"grade": true,
"grade_id": "cell-07bac90b1f60520f",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"tf.keras.backend.clear_session()\n",
"tf.keras.utils.set_random_seed(93612)\n",
"\n",
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dfd1b4b1",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "31e5c487515b6eb896b76f01274fa4e1",
"grade": true,
"grade_id": "cell-6f80d53f202270f5",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('model_basic')"
]
},
{
"cell_type": "markdown",
"id": "732cd91b",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0a0467c488cd028d8b4f31654da06172",
"grade": false,
"grade_id": "cell-fb2a62d393542d0d",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"To compile the model we need to specify a loss function. \n",
"\n",
"_What loss function would be appropriate for this multi-class classfication problem?_\n",
"\n",
"_Also, during training we would like to monitor how well our model performs on predicting the targets. What would be a good metric to track? Motivate your answers._ "
]
},
{
"cell_type": "markdown",
"id": "7fcc8c8c",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "2c0d8befa01ec00ced7f85f0e9716399",
"grade": true,
"grade_id": "cell-0cdde96c1310bf02",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "99fcaf9a",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "4efc96f0206afd42ee2b8b2b28952a9f",
"grade": false,
"grade_id": "cell-4a033537b6ee9406",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Now compile the model using the `Adam` optimiser (with a `learning_rate` of 1e-4), your chosen loss, and your chosen metric to track. (Tip: TF has some loss function and metrics implemented in [tf.keras.losses](https://www.tensorflow.org/api_docs/python/tf/keras/losses) and [tf.keras.metrics](https://www.tensorflow.org/api_docs/python/tf/keras/metrics).)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0daef6f2",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "65e0a8df26061b87c55af408082d9018",
"grade": true,
"grade_id": "cell-82f41439014ffdad",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b3f30c12",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "49c1362aa1b61db1c6ba1e0b660622da",
"grade": true,
"grade_id": "cell-05e3602641ad48ce",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('model_basic')"
]
},
{
"cell_type": "markdown",
"id": "9414b548",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "1f382259b7899502ac4aac68a8ac4998",
"grade": false,
"grade_id": "cell-a109b929fd00512d",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Now fit the train data for 10 epochs and save the training history in the variable `history_basic`. Also specify the `validation_data` and a `batch_size` of 32._"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f355c96",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "56cd98d186929b568e2f9f81368720a7",
"grade": true,
"grade_id": "cell-33eed7f2f5756706",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"tf.keras.utils.set_random_seed(47290)\n",
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d123574b",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "df7229f0fa928a3cc4c45fb2f2759d54",
"grade": true,
"grade_id": "cell-ca82ca44cd2514f0",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('history_basic')"
]
},
{
"cell_type": "markdown",
"id": "8aae7b43",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "ef84465233211abf5df74bdd82bbdead",
"grade": false,
"grade_id": "cell-5e75bbc569fe9852",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Now to see how the model performs, write a function `plot_metrics` that plots the loss for the train and validation set. In the same function also create a separate plot that plots the other metric for the train and validation set._"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "38bb58e0",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "a5bdb9abb70f3a6d27b0a5a96972a350",
"grade": true,
"grade_id": "cell-61240c7d5919394e",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"def plot_metrics(history):\n",
" # YOUR CODE HERE\n",
" raise NotImplementedError()\n",
" \n",
"plot_metrics(history_basic)"
]
},
{
"cell_type": "markdown",
"id": "92bba19a",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "bd1e870f79c2e6d81c41ace194b0ebe4",
"grade": false,
"grade_id": "cell-19742c74e422d586",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Comment on the performance of the model based the tracked loss and metric_"
]
},
{
"cell_type": "markdown",
"id": "7e27cebe",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "8e94b63bf3b595eb26cdf1b881cc8abc",
"grade": true,
"grade_id": "cell-e6ba8b200c4b3147",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "1d15de46",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "2e655dbc985e6392b5ddbcc72ca6e229",
"grade": false,
"grade_id": "cell-151e07a3bf171490",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"*What happens if we were to train the model for more epochs? What will happen to the performance of the model?*"
]
},
{
"cell_type": "markdown",
"id": "fc6353ec",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "793762314007eaee2230eb75a100a0c0",
"grade": true,
"grade_id": "cell-9b5a09ea6043f78b",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "af62740d",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "08f400f65ad158d2f87611c7a71ba058",
"grade": false,
"grade_id": "cell-738ce98653d1a570",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"*Write a function `model_predict` that takes the model and some image data and returns the predicted targets (as integers corresponding to the predicted labels).* "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f22e89cb",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e40d60ca339a73ff568d0961f7e900ff",
"grade": false,
"grade_id": "cell-cfb685b6f75c6006",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"def model_predict(model, x):\n",
" # YOUR CODE HERE\n",
" raise NotImplementedError()\n",
" return predicted_targets"
]
},
{
"cell_type": "markdown",
"id": "aacd284a",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "f19da898555f0263ef177dc0ed28374e",
"grade": false,
"grade_id": "cell-e8629ecfe3f00676",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Gather the targets of the test set (scalar integer value corresponding to the labels) and save them in `test_targets`._\n",
"\n",
"_Compute the targets for `model_basic` and store them in the variable `test_targets_basic`._"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3565c7b9",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e9a5b17685ffee2ab1de566b7ba2b168",
"grade": false,
"grade_id": "cell-cbe466cfa34d7e4b",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7f97fbb",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "aa0abefb84844c72e579327d68fd75ee",
"grade": true,
"grade_id": "cell-9e9b4d1d290afa50",
"locked": true,
"points": 2,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('test_targets')\n",
"check_var_defined('test_targets_basic')\n",
"\n",
"assert test_targets.ndim == 1, \"Make sure you are only predicting the scalar label value not the one hot vectors\"\n",
"assert test_targets_basic.ndim == 1, \"Make sure you are only predicting the scalar label value not the one hot vectors\""
]
},
{
"cell_type": "markdown",
"id": "841af931",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "c3fefe19959c788d51c8741c70763997",
"grade": false,
"grade_id": "cell-a46cb8c0103a4c02",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Calculate the precision and recall averaged over the 5 classes. (Precision and recall for each classes and then avaraged in one score)_"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c5047120",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "fa62df2d562bac9c7210862873ba902d",
"grade": false,
"grade_id": "cell-2dc0bd5540e9f26b",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"def average_recall_precision(y, y_predict):\n",
" # YOUR CODE HERE\n",
" raise NotImplementedError()\n",
"\n",
" print(f\"Average Recall: {recall:.3f}, Average Precision {precision:0.3f}\")\n",
" return recall, precision\n",
"\n",
"recall_basic, precision_basic = average_recall_precision(test_targets, test_targets_basic)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5d9aae79",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "8d7838e29e632bcfa1a9ce7da5218100",
"grade": true,
"grade_id": "cell-da4b4b840b2dd7b5",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('recall_basic')\n",
"check_var_defined('precision_basic')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "49508417-b4f1-4880-9554-2015bc064a18",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "244174442f031c3dc061fa1552ae8919",
"grade": true,
"grade_id": "cell-0817deec60dce891",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "04eb1d82",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "91c5670a7abf30249e2812eb51b47444",
"grade": false,
"grade_id": "cell-2b085c7cf1510472",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Show the predicted targets in a confusion matrix. Show the predicted labels in percentages (percentage of the samples from true class labeled as predicted class) and add the labels to the axes._"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1b4795f7",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "913d7788ac41f1a1a351e0730056d72c",
"grade": true,
"grade_id": "cell-2399b15ba35796cd",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay\n",
"\n",
"def plot_confusion_matrix(y, y_pred, title=\"\"):\n",
" # YOUR CODE HERE\n",
" raise NotImplementedError()\n",
" plt.title(title)\n",
" plt.show()\n",
"\n",
"plot_confusion_matrix(test_targets, test_targets_basic, \"basic model\")"
]
},
{
"cell_type": "markdown",
"id": "cd514aca",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "879d47a1563a4574537052e39ade0729",
"grade": false,
"grade_id": "cell-42cbefa4adf96a73",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Comment on the recall and precision of the model, as well as the predictions in the confusion matrix._"
]
},
{
"cell_type": "markdown",
"id": "89866352",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "3e2ac185cd5e069336a59850767c5c95",
"grade": true,
"grade_id": "cell-6a7fca3e2bf83f7f",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "3edf45c3",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "5b767c57e15f18abc0089a5e5650afe3",
"grade": false,
"grade_id": "cell-d356a7145292ec49",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"# Part 3: Improving the model\n",
"\n",
"To improve on the model we will include some Dropout layers. \n",
"\n",
"_What do dropout layers do and why might this increase the performance of our models?_"
]
},
{
"cell_type": "markdown",
"id": "b4b05027",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "e91f9ddbac41c7048beebaa26453963a",
"grade": true,
"grade_id": "cell-d51957783a8b8f11",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "5c99ebc0",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "dd2566932baca09a6f679c828996194c",
"grade": false,
"grade_id": "cell-7f4c0f0200535322",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Another method to improve the model is by using a technique called data augmentation. \n",
"\n",
"_Explain the concept data augmentation and explain how it might increase the performance of our model._"
]
},
{
"cell_type": "markdown",
"id": "e73d6db6",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "f11e0ec2ffa6c21658d7bcf6bbd17dec",
"grade": true,
"grade_id": "cell-8490281b269b4500",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "af5368bf",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "b07bc2ef06fc01df17799136bea01510",
"grade": false,
"grade_id": "cell-fac7ac8abd544790",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Use the exact same model as for the `model_basic`, but play around with adding in a data augmentation layer (e.g. [tf.keras.layers.RandomFlip](https://www.tensorflow.org/api_docs/python/tf/keras/layers/RandomFlip) ) and adding in dropout layers (dropout is typically only added in the dense part of the network). Store the new model in the variable `model_dropout` and compile it using the same metrics and loss as before._"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cae90829",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "5a19d58043b584bf8350c53034028644",
"grade": true,
"grade_id": "cell-25d3daacfe968aa2",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"tf.keras.backend.clear_session()\n",
"tf.keras.utils.set_random_seed(48263)\n",
"\n",
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "77868aae",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "b072fbdbe92be0f3926ceaf0935a6de0",
"grade": true,
"grade_id": "cell-824a00bec8ece0ac",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('model_dropout')\n",
"model_dropout.summary()"
]
},
{
"cell_type": "markdown",
"id": "6778a133",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "f825eb30514de109277305d52676cf9d",
"grade": false,
"grade_id": "cell-e6b6f0ea7ef2d37a",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Fit the new model in the same way as before and save the history in `history_dropout`. However, train for 20 epochs instead of 10._"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "92ec2689",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c89d459ee4ecefb767369c54929ecfb0",
"grade": true,
"grade_id": "cell-82fcca9814ec3c0e",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"tf.keras.utils.set_random_seed(103745)\n",
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5048652c",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "35d4e0daf253b3a62a9f19ff3dac107b",
"grade": true,
"grade_id": "cell-df6e944e1dfa0984",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('history_dropout')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "128cc76b",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "98d15e93390af85cc8b0e007b9dc242a",
"grade": false,
"grade_id": "cell-0ddc85bc444c422c",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"plot_metrics(history_dropout)"
]
},
{
"cell_type": "markdown",
"id": "c10bf8c3",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "979f4033a7940ff00c6f7ca954637940",
"grade": false,
"grade_id": "cell-b670310c7765e587",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Comment on the performance of the improved model based on the loss and metrics during training._"
]
},
{
"cell_type": "markdown",
"id": "6b3faf05",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "00191b16cf0e011c9d03a1129754e555",
"grade": true,
"grade_id": "cell-6927b7d95cf333bb",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "4bc94e80",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "efc719533371854eb0212f0e5c2b10ec",
"grade": false,
"grade_id": "cell-ce1c537bdeaf042b",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Why did we have to train the model for more epochs than the basic model?_"
]
},
{
"cell_type": "markdown",
"id": "d81fb816",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "7532010e993a0c2cf68ea538cc1828cb",
"grade": true,
"grade_id": "cell-ed5fd1fcd6d7edae",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "d43af512",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "0019529f078ddad35635d799818a93a4",
"grade": false,
"grade_id": "cell-6fe45193734a98f1",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Now we evaluate our model on the test set using the functions you wrote before"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b5d2ea24",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "a8a74e97055b77988de9ab2d2bbe584d",
"grade": false,
"grade_id": "cell-dbb41e724cc87ef4",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"test_targets_dropout = model_predict(model_dropout, x_test)\n",
"recall_dropout, precision_dropout = average_recall_precision(test_targets, test_targets_dropout) \n",
"plot_confusion_matrix(test_targets, test_targets_dropout, \"Dropout model\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dbae38a3",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "1ff7f78be2332484fcb2a4f42c88024b",
"grade": true,
"grade_id": "cell-38adf79a9623097a",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('test_targets_dropout')\n",
"check_var_defined('recall_dropout')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "98f7bbc4",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ef2773bada37ac2e989c915fd1c202ac",
"grade": true,
"grade_id": "cell-893d3a09e7d217a5",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('test_targets_dropout')\n",
"check_var_defined('precision_dropout')"
]
},
{
"cell_type": "markdown",
"id": "f1997b4c",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "45af3ed742ffc985d3ae0d1bb0642ec9",
"grade": false,
"grade_id": "cell-ee9f450cfd4f28c1",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Comment on the precision and recall of the model as well as the predictions in the confusion matrix_"
]
},
{
"cell_type": "markdown",
"id": "c67b7709",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "4fff78a473cf809da4746a10024bf93a",
"grade": true,
"grade_id": "cell-5e65073f0a33499f",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "916a0268",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "16a0f02889f362263892d78966b2cdf7",
"grade": false,
"grade_id": "cell-05958d87bacb98bb",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"# Part 4: Transfer learning\n",
"\n",
"In order to improve our model even further, we will make use of transfer learning. \n",
"\n",
"_Explain in your own words what tranfer learning means and why it would help in our particular case._"
]
},
{
"cell_type": "markdown",
"id": "8beb5bd3",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "008803ee61ecba25492a7b6ab82d276c",
"grade": true,
"grade_id": "cell-308cc1a86688dfa3",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "7c876e43",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "9738191e10f9792a81f9d65df11ab118",
"grade": false,
"grade_id": "cell-471888928c62b0bd",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"Below you can see how we load in a pre-trained MobileNet that is trained on the ImageNet datastet. By not including the top part of the network we get only the convolutional layers and can add our own dense layers after that. We set all the layers of the MobileNet as not trainable, since this would be computationally expensive to do and we also want to avoid overfitting. Instead we will only be training the dense part. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "152fe58b",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f6cb7992b095564ec69734777844f1f2",
"grade": false,
"grade_id": "cell-cb1e989dcd98ff48",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"mobilenet = tf.keras.applications.mobilenet.MobileNet(\n",
" input_shape=(224, 224, 3),\n",
" include_top=False, \n",
" weights='imagenet'\n",
")\n",
"\n",
"for layer in mobilenet.layers:\n",
" layer.trainable = False\n",
"\n",
"print(\"Output shape of the MobileNet: \", mobilenet.output_shape)"
]
},
{
"cell_type": "markdown",
"id": "864f49e7",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "c8249bc8280f0c194b5224a4898b45d0",
"grade": false,
"grade_id": "cell-40d065f39b3e5f02",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Use a sequential model and add the MobileNet, followed by a MaxPool2D layer, and then the dense part of the network which you can use the same as we have used in the previous models. Store the model in the variable `model_mobilenet`. Compile the model using the same metrics, loss and optimiser as before._"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6427f088",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "8fbbe772986c6b73371072f2bdaa32f3",
"grade": true,
"grade_id": "cell-0e5534072707a1a9",
"locked": false,
"points": 3,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"tf.keras.backend.clear_session()\n",
"tf.keras.utils.set_random_seed(387453)\n",
"\n",
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "27b1a82a",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "3908bbc5d4a03396a323b818481eff71",
"grade": true,
"grade_id": "cell-1901a93a5517baf3",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('model_mobilenet')\n",
"model_mobilenet.summary()"
]
},
{
"cell_type": "markdown",
"id": "32486ee9",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "74a789206fadd2fe414bb72553f5cd94",
"grade": false,
"grade_id": "cell-f779ec7bb8b27b20",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"*Train the model in the same way as before, for 10 epochs.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b06557e3",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "b5a02272f53a882dc78ea8dfc863738f",
"grade": true,
"grade_id": "cell-aacaa24fc5614424",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"tf.keras.utils.set_random_seed(9673)\n",
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6268f779",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ca257ea8df19cb9c0c69b5791331d558",
"grade": true,
"grade_id": "cell-0afec72710b80268",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('history_mobilenet')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6cbcc8b8",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "fbb6ea9e46c5fb28ddabec590a316660",
"grade": false,
"grade_id": "cell-d159d268c6e2d3b7",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"plot_metrics(history_mobilenet)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fd842596",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "0496a72dd46f2c72d22da989a15c2476",
"grade": false,
"grade_id": "cell-d87a64b79f1549a7",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"test_targets_mobilenet = model_predict(model_mobilenet, x_test)\n",
"recall_mobilenet, precision_mobilenet = average_recall_precision(test_targets, test_targets_mobilenet) \n",
"plot_confusion_matrix(test_targets, test_targets_mobilenet, \"Mobilenet model\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e2342a8b",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "42cb5f8e28e56b7043c36d5df75d2983",
"grade": true,
"grade_id": "cell-40f99ea46ff86e1b",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('test_targets_mobilenet')\n",
"check_var_defined('recall_mobilenet')\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2bb96e04",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "d14981d8b3ee887ec733ccdf2a5c4087",
"grade": true,
"grade_id": "cell-9cf521b9f236b452",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"check_var_defined('precision_mobilenet')\n"
]
},
{
"cell_type": "markdown",
"id": "59328ebf",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "5c1cb00105466e9970116015677dc3c1",
"grade": false,
"grade_id": "cell-8ef0296961e07f6f",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Comment on the results from the transfer-learned model and why the results differ to the case considered previously without transfer learning._"
]
},
{
"cell_type": "markdown",
"id": "81595522",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "61f54494e95241953588077a482d2287",
"grade": true,
"grade_id": "cell-d7f4cdf7f9265a07",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"id": "765888af",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "4e1abcc1a230cf42061638a716dde7e9",
"grade": false,
"grade_id": "cell-0f6125b686f4ee8e",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"source": [
"_Suggest some ways the model could be improved further._"
]
},
{
"cell_type": "markdown",
"id": "835f9579",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "29e376668ed2a5736535480ca1085599",
"grade": true,
"grade_id": "cell-316415b7bac4ae5f",
"locked": false,
"points": 2,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}