In the previous lesson, you learned that data is the fuel of AI. But fuel alone doesn't drive a car; you need an engine. In AI, that engine is called an algorithm.
By the end of this lesson, you'll understand three fundamental algorithms and know when to use each one.
An algorithm is simply a set of step-by-step instructions to solve a problem.
You already follow algorithms every day: a recipe, for example, is an algorithm for cooking a dish.
In AI, algorithms are the step-by-step instructions a computer follows to find patterns in data and make predictions.
Think about how you decide what to wear each morning. You probably check the weather, think about your plans, consider what's clean: that's an algorithm! You follow a series of steps (even if unconsciously) to reach a decision. AI algorithms do the same thing, just with data instead of gut feelings.
A decision tree is one of the most intuitive algorithms in AI. It makes decisions by asking a series of yes/no questions, just like the game "20 Questions."
Imagine you're deciding whether to play outside:
```
Is it raining?
├── Yes → Stay inside
└── No → Is it above 15°C?
    ├── Yes → Play outside! ⚽
    └── No → Wear a jacket and play outside 🧥
```
That's a decision tree! Each node asks a question, each branch follows an answer, and each leaf gives a final decision.
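The play-outside tree above can be written directly as a tiny Python function. This is a hand-coded sketch of the same logic, not a tree learned from data:

```python
def play_outside(raining: bool, temperature_c: float) -> str:
    """Follow the decision tree: each `if` is a node, each return a leaf."""
    if raining:
        return "Stay inside"
    if temperature_c > 15:
        return "Play outside!"
    return "Wear a jacket and play outside"

print(play_outside(raining=False, temperature_c=20))  # Play outside!
```

Real decision-tree algorithms work out which questions to ask, and in what order, from the training data.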
Suppose we want to predict whether someone will buy a product:
```
Age greater than 25?
├── Yes → Has bought before?
│   ├── Yes → WILL BUY ✓ (95% confidence)
│   └── No → Price under £20?
│       ├── Yes → WILL BUY ✓ (70% confidence)
│       └── No → WON'T BUY ✗ (80% confidence)
└── No → Student discount available?
    ├── Yes → WILL BUY ✓ (60% confidence)
    └── No → WON'T BUY ✗ (75% confidence)
```
Random Forests, one of the most powerful ML algorithms, simply combine hundreds of decision trees and let them "vote" on the answer. It's like asking 500 people for directions and going with the majority. This simple idea dramatically improves accuracy!
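In scikit-learn, going from one tree to a forest is essentially a one-line change. A sketch using the bundled iris flower dataset as stand-in data:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: the classic iris flower dataset
X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.25, random_state=42
)

# One tree on its own...
tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# ...versus 500 trees, each trained on a random slice of the
# data, voting together on every prediction
forest = RandomForestClassifier(n_estimators=500, random_state=42)
forest.fit(X_train, y_train)

print(f"Single tree:   {tree.score(X_test, y_test):.2%}")
print(f"Forest of 500: {forest.score(X_test, y_test):.2%}")
```

On a toy dataset like iris the gap may be small; the forest's advantage shows up on larger, noisier problems.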
K-Nearest Neighbors (KNN) is the "ask your neighbors" algorithm. Its logic is beautifully simple: things that are similar tend to be close together.
Imagine you move to a new city and want to find a good restaurant. What do you do? You ask your nearest neighbors for recommendations! If 3 out of 5 neighbors recommend Italian food, you'd probably try Italian.
KNN works exactly the same way: to classify a new data point, it finds the K most similar points it has seen before and lets them vote.
Imagine a graph with red dots (cats) and blue dots (dogs), based on weight and height:
```
Height
  |   🔴      🔴
  |      🔴  ★  🔴    ← What is ★?
  |  🔴          🔵
  |      🔵       🔵
  |   🔵    🔵   🔵
  +----------------→ Weight
```
If K=3, we find the 3 nearest neighbors to ★. If 2 are red (cat) and 1 is blue (dog), we predict: cat! 🐱
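Here's what that nearest-neighbour vote looks like from scratch, with made-up (weight, height) points standing in for the dots on the graph:

```python
import math
from collections import Counter

# Made-up (weight, height) points: 'cat' for the red dots, 'dog' for the blue
animals = [
    ((4.0, 25.0), "cat"), ((4.5, 24.0), "cat"), ((5.0, 26.0), "cat"),
    ((20.0, 50.0), "dog"), ((25.0, 55.0), "dog"), ((30.0, 60.0), "dog"),
]

def knn_predict(point, data, k=3):
    """Classify `point` by majority vote among its k nearest neighbours."""
    by_distance = sorted(data, key=lambda item: math.dist(point, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn_predict((5.5, 27.0), animals))  # cat
```

In practice you would use a library implementation, which adds fast neighbour lookup for large datasets.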
In scikit-learn this takes just a few lines (here using the bundled iris dataset as example data):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Example data: the classic iris flower dataset
X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.25, random_state=42
)

# Create and train a KNN model with K=5
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Predict on new data
prediction = knn.predict(X_test)
print(f"Accuracy: {knn.score(X_test, y_test):.2%}")
```
KNN is a "lazy learner": it doesn't actually learn anything during training! It just memorises all the data and does the real work at prediction time by calculating distances. This makes training instant but predictions slow on large datasets.
Linear regression is the art of fitting a line through data points. It's used when you want to predict a number (not a category).
Think about this: the more hours you study, the higher your test score tends to be. If you plotted this on a graph, you'd see data points trending upward. Linear regression draws the best-fitting line through those points.
```
Score
 100 |                      *   *
  80 |               *   *
  60 |        *   *
  40 |     *
  20 |  *
     +-------------------------→ Hours studied
```
The line lets you predict: "If I study for 7 hours, I'll probably score around 85."
Every line can be described as:
y = mx + b
The algorithm finds the best values for m and b so the line is as close as possible to all the data points.
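Finding those best values is a solved maths problem called least squares. As a sketch with some made-up study data, you can compute the slope and intercept directly:

```python
# Least-squares fit by hand:
#   m = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
#   b = y_mean - m * x_mean
xs = [1, 2, 3, 4, 5, 6, 7, 8]          # hours studied (made-up data)
ys = [20, 35, 45, 55, 65, 75, 82, 90]  # test scores

x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)

m = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) \
    / sum((x - x_mean) ** 2 for x in xs)
b = y_mean - m * x_mean

print(f"y = {m:.2f}x + {b:.2f}")  # y = 9.82x + 14.18
```

Plugging in 7 hours gives 9.82 × 7 + 14.18 ≈ 83, close to the "around 85" you'd read off the graph.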
```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Study hours and test scores
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)
scores = np.array([20, 35, 45, 55, 65, 75, 82, 90])

# Fit the model
model = LinearRegression()
model.fit(hours, scores)

# Predict score for 5.5 hours of study
predicted = model.predict([[5.5]])
print(f"Predicted score for 5.5 hours: {predicted[0]:.1f}")
print(f"Slope (m): {model.coef_[0]:.2f}")
print(f"Intercept (b): {model.intercept_:.2f}")
```
| Question | Decision Tree | KNN | Linear Regression |
|----------|:---:|:---:|:---:|
| Predicting a category (cat/dog)? | ✓ | ✓ | ✗ |
| Predicting a number (price, score)? | ✗ | ✗ | ✓ |
| Need to explain the decision? | ✓✓ | ✗ | ✓ |
| Have a very large dataset? | ✓ | ✗ | ✓ |
| Relationship is a straight line? | ✗ | ✗ | ✓✓ |
| Don't know the relationship shape? | ✓ | ✓ | ✗ |
There's no single "best" algorithm. The right choice depends on your data, your problem, and what you need. A doctor diagnosing diseases might prefer a decision tree because it can explain why it made a diagnosis. A weather app predicting temperature might use linear regression because the relationship with historical data is roughly linear.
Let's build a simple decision tree for recommending films! Think about how you choose films:
```
Do you want something funny?
├── Yes → Do you like animated films?
│   ├── Yes → Watch "Inside Out 2" 🎭
│   └── No → Do you want something family-friendly?
│       ├── Yes → Watch "Home Alone"
│       └── No → Watch "The Grand Budapest Hotel" 🏨
└── No → Do you like action?
    ├── Yes → Do you prefer superheroes?
    │   ├── Yes → Watch "Spider-Man: Across the Spider-Verse" 🕷️
    │   └── No → Watch "Top Gun: Maverick" ✈️
    └── No → Do you want a true story?
        ├── Yes → Watch "Hidden Figures"
        └── No → Watch "Interstellar"
```
Try this yourself: Add more branches! What about genre, mood, length, or language? Every question you add makes the tree more personalised. This is exactly how recommendation algorithms start: simple rules that get refined with data.
```python
# A simple decision tree in code
def recommend_movie(funny, animated, action, superhero, true_story):
    if funny:
        if animated:
            return "Inside Out 2 🎭"
        else:
            return "The Grand Budapest Hotel 🏨"
    else:
        if action:
            if superhero:
                return "Spider-Man: Across the Spider-Verse 🕷️"
            else:
                return "Top Gun: Maverick ✈️"
        else:
            if true_story:
                return "Hidden Figures"
            else:
                return "Interstellar"

# Try it!
print(recommend_movie(funny=False, animated=False,
                      action=True, superhero=True,
                      true_story=False))
# Output: Spider-Man: Across the Spider-Verse 🕷️
```
You've now met three classic algorithms. In the next lesson, we'll explore neural networks: the powerful, brain-inspired algorithms behind modern AI breakthroughs like ChatGPT, image generation, and self-driving cars. Get ready to think in layers! 🧠