Step 2.2 – Hands-On: Decision Trees

What Is a Decision Tree?

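A decision tree is a supervised learning algorithm used for both classification and regression. It recursively splits the data on feature values: each internal node tests one feature, each branch is an outcome of that test, and each leaf gives a prediction, much like a flowchart of decisions.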
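
To make the flowchart analogy concrete, here is a hand-written sketch of what a small trained tree amounts to: a chain of threshold tests ending in leaves. The function name classify_iris and the thresholds are hypothetical, chosen for illustration only and not learned from data.

```python
def classify_iris(petal_length_cm: float, petal_width_cm: float) -> str:
    """Hand-written stand-in for a depth-2 decision tree (illustrative thresholds)."""
    # Root node: test one feature against a threshold
    if petal_length_cm <= 2.5:
        return "setosa"        # leaf
    # Internal node on the other branch: test a second feature
    if petal_width_cm <= 1.7:
        return "versicolor"    # leaf
    return "virginica"         # leaf


print(classify_iris(1.4, 0.2))
print(classify_iris(4.5, 1.4))
print(classify_iris(6.0, 2.2))
```

A learned tree has the same shape; the training algorithm picks which feature and threshold to test at each node.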

Why Use It?

Easy to visualize and interpret
Handles both numerical and categorical data (scikit-learn's implementation expects categorical features to be numerically encoded first)
No need for feature scaling
Great for understanding decision boundaries, since each split is a simple threshold on a single feature
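
The "no feature scaling" point can be checked with a small experiment. The sketch below assumes scikit-learn and NumPy are installed; the scaling factor of 100 and the random_state value are arbitrary choices for illustration. Because a tree only compares feature values against thresholds, multiplying every feature by a positive constant should leave the predictions unchanged.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Rescale every feature by an arbitrary positive factor (illustrative choice)
X_scaled = X * 100.0

# Same hyperparameters and seed, so the only difference is the feature scale
tree_raw = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
tree_scaled = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_scaled, y)

# Compare the two models' predictions on their respective inputs
same = np.array_equal(tree_raw.predict(X), tree_scaled.predict(X_scaled))
print(f"Predictions identical after rescaling: {same}")
```

Distance-based or gradient-based models would generally not pass this kind of check without a scaling step.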

What You’ll Build

A Python script that:

Loads a sample dataset
Trains a decision tree classifier
Visualizes the tree structure
Evaluates accuracy

Sample Code

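from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Hold out 30% of the samples so accuracy is measured on unseen data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Train model (max_depth=3 keeps the tree small enough to read)
model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)

# Predict on the held-out test set
y_pred = model.predict(X_test)

# Evaluate
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")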

# Visualize
plt.figure(figsize=(12, 8))
plot_tree(model, feature_names=iris.feature_names, class_names=iris.target_names, filled=True)
plt.title("Decision Tree Visualization")
plt.show()
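
When no display is available (for example on a remote server), the tree structure can also be printed as text. The sketch below is an optional alternative to the plot, assuming scikit-learn's export_text helper; it refits a small tree so that it runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Fit a small tree so the printed rules stay short
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(iris.data, iris.target)

# Render the learned splits as indented if/else style rules
rules = export_text(clf, feature_names=iris.feature_names)
print(rules)
```

Each indented line is one test on a feature, and lines starting with "class:" are the leaves.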

Output Visualization

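Running the script opens the decision tree plot. Each node shows its split condition, Gini impurity, sample count, and per-class counts, and the fill color indicates the majority class: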