So far, we've looked at data in lines - arrays, linked lists, stacks, and queues all arrange items in a sequence. But the real world isn't linear. Family trees branch out. Social networks form webs. Road maps create interconnected routes.
Trees and graphs capture these relationships, and they're at the heart of some of AI's most powerful techniques.
A tree is a structure where each item (called a node) can have child nodes, forming a hierarchy. There's one special node at the top called the root, and nodes with no children are called leaves.
CEO
/ \
CTO CFO
/ \ \
Dev1 Dev2 Accountant
A binary tree restricts each node to at most two children - a left child and a right child. This simple constraint enables powerful algorithms.
8
/ \
3 10
/ \ \
1 6 14
A binary search tree adds one rule: for every node, all values in the left subtree are smaller, and all values in the right subtree are larger.
This makes searching fast - at each node, you know whether to go left or right:
Find 6 in the BST above:
Start at 8 → 6 < 8, go left
At 3 → 6 > 3, go right
At 6 → Found it!
Time complexity: O(log n) for a balanced tree - the same logarithmic magic as binary search on a sorted array.
In a balanced binary search tree with 1,000,000 nodes, roughly how many comparisons are needed to find a value?
A graph generalises trees by removing the hierarchy constraint. It consists of nodes (also called vertices) and edges (connections between nodes). Edges can be:
Social network (undirected):
Alice - Bob - Charlie
\ |
Diana - Eve
Road map (weighted, directed):
London →(2h)→ Birmingham →(1.5h)→ Manchester
Facebook's social graph contains over 3 billion nodes (users) and hundreds of billions of edges (friendships). Graph algorithms determine your News Feed, friend suggestions, and ad targeting - all running on one of the largest graphs ever built.
One of the most interpretable AI models is the decision tree. It asks a series of yes/no questions to classify data:
Is temperature > 30°C?
├── Yes: Is humidity > 70%?
│ ├── Yes: "Don't play tennis"
│ └── No: "Play tennis"
└── No: Is it windy?
├── Yes: "Don't play tennis"
└── No: "Play tennis"
Decision trees are popular because humans can read and understand them - crucial in healthcare, finance, and legal AI where explainability matters.
A hospital uses an AI to predict patient risk. Regulators require the AI to explain its decisions. Why might a decision tree be preferred over a deep neural network here, even if the neural network is slightly more accurate?
A random forest builds hundreds of decision trees, each trained on a slightly different subset of the data, then takes a vote. This ensemble approach is more accurate and robust than a single tree - and it's still one of the most widely used AI techniques in industry.
A knowledge graph stores facts as relationships between entities:
(London) --[capital_of]--> (United Kingdom)
(London) --[located_in]--> (England)
(Big Ben) --[located_in]--> (London)
Google's Knowledge Graph powers those information panels you see in search results. When you search "Big Ben," the graph connects it to London, the UK, and related landmarks.
Netflix, Spotify, and Amazon model users and items as a graph. If users A and B both liked items X and Y, and user A also liked item Z, the graph suggests Z to user B. This is collaborative filtering powered by graph structure.
Why are graphs better than simple lists for modelling social networks?
DFS explores as far as possible along one path before backtracking. Think of it as exploring a maze by always taking the leftmost turn until you hit a dead end, then backtracking.
A
/ \
B C
/ \ \
D E F
DFS order: A → B → D → E → C → F
DFS uses a stack (naturally via recursion, or explicitly). It's great for:
BFS explores all neighbours at the current depth before moving deeper. Think of it as ripples spreading outward from a stone dropped in a pond.
A
/ \
B C
/ \ \
D E F
BFS order: A → B → C → D → E → F
BFS uses a queue. It's great for:
Notice how stacks and queues from the previous lesson connect to tree and graph traversal? DFS uses a stack; BFS uses a queue. Data structures build upon each other - that's why learning them in order matters.
You want to find the shortest route between two cities on a map where all roads are the same length. Which traversal should you use?
Social media platforms measure "degrees of separation" - how many friend-of-friend hops connect two people. Which traversal algorithm would efficiently find the fewest hops between two users? How does this relate to the "six degrees of separation" idea?
Google's PageRank algorithm - the original breakthrough that made Google dominant - models the web as a graph. Each web page is a node, each hyperlink is a directed edge, and a page's importance is determined by how many important pages link to it. It's essentially a random walk on a graph.