{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Lecture 16: Decision trees"
]
},
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"\n",
"[Run in colab](https://colab.research.google.com/drive/1P9IoqXN9dbjJ3TN50wa8wwDdvn9P6hX7)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"editable": true,
"execution": {
"iopub.execute_input": "2025-02-27T23:21:19.492309Z",
"iopub.status.busy": "2025-02-27T23:21:19.492086Z",
"iopub.status.idle": "2025-02-27T23:21:19.498527Z",
"shell.execute_reply": "2025-02-27T23:21:19.497958Z"
},
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Last executed: 2025-02-27 23:21:19\n"
]
}
],
"source": [
"import datetime\n",
"now = datetime.datetime.now()\n",
"print(\"Last executed: \" + now.strftime(\"%Y-%m-%d %H:%M:%S\"))"
]
},
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"A decision tree can be viewed conceptually as a *flow diagram*: a tree of decisions made by inspecting properties of the data set.\n",
"\n",
"- Can perform both classification and regression.\n",
"- A fundamental component of random forests (a powerful machine learning algorithm covered in the next lecture).\n",
"- We will learn how to visualise decision trees and use them to make predictions.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"## Conceptual example"
]
},
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": []
},
"source": [
"\n",
"\n",
"[[Image source](https://inside-machinelearning.com/en/decision-tree-and-hyperparameters/)]"
]
},
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"## Walk-through of decision tree\n",
"\n",
"Let's consider an illustration using the Iris data set (introduced in Lecture 3)."
]
},
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"### Images of different Iris species\n",
"\n",
"#### Iris Setosa\n",
"\n",
"\n",
"\n",
"#### Iris Versicolor\n",
"\n",
"\n",
"\n",
"#### Iris Virginica\n",
"\n",
"\n",
"\n",
"[[Image source](https://github.com/jakevdp/sklearn_tutorial)]\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"Load the feature matrix, where each row corresponds to an observed (*sampled*) flower described by a number of *features*, together with the corresponding target vector.\n",
"\n",
"Consider two features only for now (petal length and width)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"editable": true,
"execution": {
"iopub.execute_input": "2025-02-27T23:21:19.532095Z",
"iopub.status.busy": "2025-02-27T23:21:19.531889Z",
"iopub.status.idle": "2025-02-27T23:21:20.377758Z",
"shell.execute_reply": "2025-02-27T23:21:20.377159Z"
},
"slideshow": {
"slide_type": ""
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<pre>DecisionTreeClassifier(max_depth=2, random_state=42)</pre>\n",
"<pre>DecisionTreeClassifier(criterion='entropy', max_depth=2)</pre>\n",
"<pre>DecisionTreeRegressor(max_depth=2, random_state=42)</pre>\n",
"<pre>DecisionTreeRegressor(max_depth=3, random_state=42)</pre>"