Your Problem-Solving Companion (ItsAweso.me Series)
This guide continues your comprehensive study companion for mastering data structures and algorithms. Building on the sorting foundations from the previous parts, we now shift to how we actually find things efficiently: searching.
How to Use This Guide:
- After reading: Try the boundary drills and templates
- Before practice: Review binary search invariants and pitfalls
- During interviews: Use the selection and “search-on-answer” patterns
- For deeper learning: Implement each variant, then trace with adversarial cases
Remember: Mastery in search comes from writing and debugging your own boundary‑correct solutions.
Table of Contents
- Series Context: Why Search After Sort?
- Framing a Search Problem
- Linear Search: Baseline and Early Exits
- Binary Search: Invariants, Bounds, and Monotone Predicates
- Choosing Containers for Search
- Practical Heuristics (That Feel Like Cheating)
- Common Pitfalls and How to Avoid Them
- Testing and Debugging Search Code
- Exercises and Boundary Drills
- Key Takeaways
- Next in the Series
Series Context: Why Search After Sort?
Sorting gives structure; searching exploits it. With data ordered or indexed, we trade linear scans for logarithmic or constant-time lookups. This post focuses on fundamentals you’ll use daily.
Teaching Flow: Problem → Invariant → Implementation → Adversarial Tests
The Problem: Off‑by‑one errors and unclear invariants cause many binary search bugs and interview failures.
The Solution: Frame the problem precisely, pick the right container, lock down invariants, then test boundaries aggressively.
Framing a Search Problem
- State space: What are you searching over? indices, values, time, capacity
- Goal test: Exact match, first/last position, feasibility
- Objective: Any solution vs optimal (earliest/latest/minimum/maximum)
- Monotonicity: Is there a predicate
P(x)that flips from false→true exactly once? - Determinism: Exact vs approximate (not in this part)
Example 1: Search for Value
// Find if target exists in sorted array
int binarySearch(int arr[], int n, int target) {
int low = 0, high = n - 1;
while (low <= high) {
int mid = low + (high - low) / 2;
if (arr[mid] == target) return mid;
else if (arr[mid] < target) low = mid + 1;
else high = mid - 1;
}
return -1; // Not found
}
Example 2: Search on Answer (Capacity)
// Find minimum ship capacity to deliver all packages within D days
bool canDeliver(int weights[], int n, int capacity, int days) {
int currentWeight = 0, daysUsed = 1;
for (int i = 0; i < n; i++) {
if (currentWeight + weights[i] > capacity) {
daysUsed++;
currentWeight = weights[i];
} else {
currentWeight += weights[i];
}
}
return daysUsed <= days;
}
int minShipCapacity(int weights[], int n, int days) {
if (n == 0) return 0;
// Find min possible capacity (largest single package)
int low = weights[0];
for (int i = 1; i < n; i++) {
if (weights[i] > low) low = weights[i];
}
// Find max possible capacity (sum of all packages to ensure feasibility)
int high = 0;
for (int i = 0; i < n; i++) {
high += weights[i];
}
while (low < high) {
int mid = low + (high - low) / 2;
if (canDeliver(weights, n, mid, days)) {
high = mid; // Try smaller capacity
} else {
low = mid + 1; // Need larger capacity
}
}
return low;
}
Linear Search: Baseline and Early Exits
- When it’s fine: Tiny inputs, unsorted data, one‑off scans
- Techniques: Early exit, sentinel to reduce bound checks
- Complexity: O(n) time, O(1) space
Sentinel Pattern Example:
// Linear search with sentinel to reduce bound checks
int linearSearchWithSentinel(int arr[], int n, int target) {
if (n == 0) return -1; // Handle empty array
// Store original last element
int originalLast = arr[n - 1];
// Set sentinel
arr[n - 1] = target;
int i = 0;
while (arr[i] != target) {
i++;
}
// Restore original element
arr[n - 1] = originalLast;
// Check if we found target or hit sentinel
if (i < n - 1 || originalLast == target) {
return i;
}
return -1; // Not found
}
Why sentinel helps: Eliminates one comparison per iteration (no need to check i < n).
Binary Search: Invariants, Bounds, and Monotone Predicates
- Core idea: Maintain an invariant over
[low, high]and shrink correctly - Compute mid safely:
mid = low + (high - low) / 2(avoid overflow) - Variants:
- Existence test: Does target exist?
- Lower bound: First index where
a[i] >= target - Upper bound: First index where
a[i] > target - Search on answer: Binary search the parameter when
P(x)is monotone
- Duplicates: Prefer bounds‑style implementations
- Termination: While low < high with half‑open intervals is often simpler
Lower Bound Template (First position where arri >= target):
int lowerBound(int arr[], int n, int target) {
int low = 0, high = n; // Note: high = n (exclusive)
while (low < high) {
int mid = low + (high - low) / 2;
if (arr[mid] < target) {
low = mid + 1;
} else {
high = mid; // Keep mid in search space
}
}
return low; // First position where arr[i] >= target
}
Upper Bound Template (First position where arri > target):
int upperBound(int arr[], int n, int target) {
int low = 0, high = n; // Note: high = n (exclusive)
while (low < high) {
int mid = low + (high - low) / 2;
if (arr[mid] <= target) {
low = mid + 1;
} else {
high = mid; // Keep mid in search space
}
}
return low; // First position where arr[i] > target
}
Key insight: Both use high = n (exclusive) and while (low < high). The difference is in the comparison: < vs <=.
Choosing Containers for Search
- Array (sorted): Binary search, range queries with bounds
- Hash set/map: O(1) average lookup, poor ordering/range behavior
- Heap: Efficient top‑k, not general search
- Tree (balanced BST): Ordered set/map, O(log n) lookup + range queries
Guideline: If you need ordering or ranges, prefer sorted arrays or trees. If you need membership only, prefer hashing.
Hash Table Example:
// Find two numbers that sum to target using hash table
int* twoSum(int nums[], int numsSize, int target, int* returnSize) {
int *result = malloc(2 * sizeof(int));
int hash[1000] = {0}; // Simple hash table for demo
for (int i = 0; i < numsSize; i++) {
int complement = target - nums[i];
if (hash[complement] != 0) {
result[0] = hash[complement] - 1; // Convert back to 0-based
result[1] = i;
*returnSize = 2;
return result;
}
hash[nums[i]] = i + 1; // Store 1-based index
}
*returnSize = 0;
return result;
}
Why hash table here: O(1) average lookup vs O(n) linear search for complement.
Practical Heuristics (That Feel Like Cheating)
- Two‑pointers / sliding window: When the structure allows local moves instead of global search
- Precompute + search: Prefix sums, hashing, frequency maps
- Binary search on answer: Capacity, time, threshold problems with monotone feasibility
Binary Search on Time Example:
// Find minimum time to complete all tasks with given workers
bool canComplete(int tasks[], int n, int workers, int timeLimit) {
int currentTime = 0, workersUsed = 1;
for (int i = 0; i < n; i++) {
if (currentTime + tasks[i] > timeLimit) {
workersUsed++;
currentTime = tasks[i];
} else {
currentTime += tasks[i];
}
}
return workersUsed <= workers;
}
int minTimeToComplete(int tasks[], int n, int workers) {
if (n == 0) return 0;
// Find min possible time (largest single task)
int low = tasks[0];
for (int i = 1; i < n; i++) {
if (tasks[i] > low) low = tasks[i];
}
// Find max possible time (sum of all tasks to ensure feasibility)
int high = 0;
for (int i = 0; i < n; i++) {
high += tasks[i];
}
while (low < high) {
int mid = low + (high - low) / 2;
if (canComplete(tasks, n, workers, mid)) {
high = mid; // Try smaller time
} else {
low = mid + 1; // Need more time
}
}
return low;
}
Pattern: Binary search the answer (time) using a feasibility function that checks if the answer is achievable.
Common Pitfalls and How to Avoid Them
- Off‑by‑one errors: Mix of inclusive/exclusive bounds; pick a convention and stick to it
- Infinite loops: Failing to move bounds when
midequals a bound; ensure progress - Mid overflow: Use
low + (high - low) / 2 - Empty/singleton arrays: Handle
n == 0andn == 1 - Duplicates: Incorrect early exit; use bounds to find first/last positions
Testing and Debugging Search Code
- Adversarial sets: empty, one element, all equal, strictly increasing, duplicates clustering at edges
- Property tests: For lower_bound, all indices < answer violate predicate; all ≥ satisfy
- Trace logs: Print
(low, mid, high)for a few crafted cases
5-Case Binary Search Test Checklist:
| Test Case | Array | Target | Expected (lower_bound) | Why It Matters |
|---|---|---|---|---|
| Empty | [] | 5 | 0 | Handles zero-length arrays |
| Single | [3] | 3 | 0 | Single element edge case |
| All Equal | [2,2,2,2] | 2 | 0 | Duplicate handling |
| Not Found | [1,3,5,7] | 4 | 2 | Target between elements |
| Edge Values | [1,3,5,7] | 1 or 7 | 0 or 3 | First/last elements |
Property Test for Lower Bound:
// Verify: all indices < result have arr[i] < target
// all indices >= result have arr[i] >= target
bool verifyLowerBound(int arr[], int n, int target, int result) {
for (int i = 0; i < result; i++) {
if (arr[i] >= target) return false;
}
for (int i = result; i < n; i++) {
if (arr[i] < target) return false;
}
return true;
}
Exercises and Boundary Drills
- Implement
exists(target)on a sorted array (iterative) - Implement
lower_boundandupper_bound(handle duplicates) - Given sorted array, return the range
[first, last]of a target or[-1, -1] - “Search on answer”: Min ship capacity to deliver within
Ddays - Top‑k elements: Compare full sort vs heap approach
Tip: Time yourself. Aim for ≤15 minutes per variant without bugs.
Key Takeaways
- Frame the problem first: state space, goal, monotonicity
- Prefer bounds‑style binary search to handle duplicates cleanly
- Choose containers based on membership vs ordering/range needs
- Test edge cases deliberately to expose boundary errors
Next in the Series
We’ll move to advanced and domain‑specific search techniques: graph search (BFS/DFS, Dijkstra, 0‑1 BFS), string matching (KMP, Boyer‑Moore, Rabin‑Karp), tries, and indexing systems (inverted indexes, B+‑trees).
