A heap is a complete binary tree where every parent satisfies the heap property - either always smaller (min-heap) or always larger (max-heap) than its children. "Complete" means every level is full except possibly the last, which fills left to right.
1 Min-heap
/ \
3 5
/ \
7 4
Because it's complete, we store it in a flat array - no pointers needed. For index i: left child = 2i + 1, right child = 2i + 2, parent = (i - 1) // 2.
| Property | Min-Heap | Max-Heap | |----------|----------|----------| | Root | Smallest element | Largest element | | Parent ≤ Children? | ✅ Yes | ❌ No (≥) | | Extract gives | Minimum | Maximum |
Python's heapq is a min-heap by default. For a max-heap, negate values on insert and negate again on extract.
Insert - push to end, then bubble up (swap with parent while smaller):
import heapq
h = [1, 3, 5, 7]
heapq.heappush(h, 2) # h becomes [1, 2, 5, 7, 3]
Extract min - swap root with last element, pop last, then sift down (swap with smaller child):
smallest = heapq.heappop(h) # returns 1
Both operations run in O(log n) because the tree height is log n.
Calling heapq.heapify(arr) converts a list into a heap in-place in O(n), not O(n log n). It works bottom-up, sifting down each non-leaf node. This is one of the most surprising complexity results in CS.
A priority queue lets you always access the highest-priority item first. Heaps are the go-to implementation because both insert and extract are O(log n). Use one any time you need "give me the best item so far."
"Find the K largest elements in an unsorted array."
Keep a min-heap of size K. For each element, push it; if the heap exceeds K, pop the smallest. What remains are the K largest.
import heapq
def top_k(nums, k):
return heapq.nlargest(k, nums)
Time: O(n log k) - much better than sorting when k ≪ n.
You need the 5 largest scores from 1 million entries. What heap type and size?
Push the head of each list into a min-heap. Pop the smallest, then push the next element from that same list. Repeat until all lists are exhausted. Time: O(N log K) where N is total elements.
Maintain two heaps: a max-heap for the lower half and a min-heap for the upper half. Balance their sizes so they differ by at most 1. The median is always at one or both roots.
Lower half (max-heap): [1, 2, 3] ← max = 3
Upper half (min-heap): [4, 5, 6] ← min = 4
Median = (3 + 4) / 2 = 3.5
In the two-heap median approach, where does a new element go first?
Heaps shine in scheduling: meeting rooms (track earliest end-time), task scheduling with cooldowns, or CPU job queues. The key insight is that a heap efficiently tracks "what finishes next."
| Need | Best choice | Why | |------|-------------|-----| | Repeated min/max extraction | Heap | O(log n) insert + extract | | Static data, one-time sort | Sorted array | O(n log n) once, O(1) access | | Range queries + order stats | Balanced BST | O(log n) for everything | | Top-K from a stream | Heap of size K | O(n log k) total |
Which operation is O(1) in a min-heap?
| Operation | Time | |-----------|------| | Insert | O(log n) | | Extract min/max | O(log n) | | Peek min/max | O(1) | | Heapify | O(n) | | Top-K from n items | O(n log k) |