Problem-Solving Series #4: Searching Fundamentals - From Linear to Binary and Beyond

Your Problem-Solving Companion (ItsAweso.me Series)

This guide continues your comprehensive study companion for mastering data structures and algorithms. Building on the sorting foundations from the previous parts, we now shift to how we actually find things efficiently: searching.

How to Use This Guide:

After reading: Try the boundary drills and templates
Before practice: Review binary search invariants and pitfalls
During interviews: Use the selection and “search-on-answer” patterns
For deeper learning: Implement each variant, then trace with adversarial cases

Remember: Mastery in search comes from writing and debugging your own boundary‑correct solutions.

Series Context: Why Search After Sort?
Framing a Search Problem
Linear Search: Baseline and Early Exits
Binary Search: Invariants, Bounds, and Monotone Predicates
Choosing Containers for Search
Practical Heuristics (That Feel Like Cheating)
Common Pitfalls and How to Avoid Them
Testing and Debugging Search Code
Exercises and Boundary Drills
Key Takeaways
Next in the Series

Series Context: Why Search After Sort?

Sorting gives structure; searching exploits it. With data ordered or indexed, we trade linear scans for logarithmic or constant-time lookups. This post focuses on fundamentals you’ll use daily.

Teaching Flow: Problem → Invariant → Implementation → Adversarial Tests

The Problem: Off‑by‑one errors and unclear invariants cause many binary search bugs and interview failures.

The Solution: Frame the problem precisely, pick the right container, lock down invariants, then test boundaries aggressively.

Framing a Search Problem

State space: What are you searching over? indices, values, time, capacity
Goal test: Exact match, first/last position, feasibility
Objective: Any solution vs optimal (earliest/latest/minimum/maximum)
Monotonicity: Is there a predicate P(x) that flips from false→true exactly once?
Determinism: Exact vs approximate (not in this part)

Example 1: Search for Value

// Find if target exists in sorted array
int binarySearch(int arr[], int n, int target) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] == target) return mid;
        else if (arr[mid] < target) low = mid + 1;
        else high = mid - 1;
    }
    return -1; // Not found
}

Example 2: Search on Answer (Capacity)

// Find minimum ship capacity to deliver all packages within D days
bool canDeliver(int weights[], int n, int capacity, int days) {
    int currentWeight = 0, daysUsed = 1;
    for (int i = 0; i < n; i++) {
        if (currentWeight + weights[i] > capacity) {
            daysUsed++;
            currentWeight = weights[i];
        } else {
            currentWeight += weights[i];
        }
    }
    return daysUsed <= days;
}

int minShipCapacity(int weights[], int n, int days) {
    if (n == 0) return 0;
    
    // Find min possible capacity (largest single package)
    int low = weights[0];
    for (int i = 1; i < n; i++) {
        if (weights[i] > low) low = weights[i];
    }
    
    // Find max possible capacity (sum of all packages to ensure feasibility)
    int high = 0;
    for (int i = 0; i < n; i++) {
        high += weights[i];
    }
    
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (canDeliver(weights, n, mid, days)) {
            high = mid; // Try smaller capacity
        } else {
            low = mid + 1; // Need larger capacity
        }
    }
    return low;
}

Linear Search: Baseline and Early Exits

When it’s fine: Tiny inputs, unsorted data, one‑off scans
Techniques: Early exit, sentinel to reduce bound checks
Complexity: O(n) time, O(1) space

Sentinel Pattern Example:

// Linear search with sentinel to reduce bound checks
int linearSearchWithSentinel(int arr[], int n, int target) {
    if (n == 0) return -1; // Handle empty array
    
    // Store original last element
    int originalLast = arr[n - 1];
    
    // Set sentinel
    arr[n - 1] = target;
    
    int i = 0;
    while (arr[i] != target) {
        i++;
    }
    
    // Restore original element
    arr[n - 1] = originalLast;
    
    // Check if we found target or hit sentinel
    if (i < n - 1 || originalLast == target) {
        return i;
    }
    return -1; // Not found
}

Why sentinel helps: Eliminates one comparison per iteration (no need to check i < n).

Binary Search: Invariants, Bounds, and Monotone Predicates

Core idea: Maintain an invariant over [low, high] and shrink correctly
Compute mid safely: mid = low + (high - low) / 2 (avoid overflow)
Variants:
- Existence test: Does target exist?
- Lower bound: First index where a[i] >= target
- Upper bound: First index where a[i] > target
- Search on answer: Binary search the parameter when P(x) is monotone
Duplicates: Prefer bounds‑style implementations
Termination: While low < high with half‑open intervals is often simpler

Lower Bound Template (First position where arri >= target):

int lowerBound(int arr[], int n, int target) {
    int low = 0, high = n; // Note: high = n (exclusive)
    
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] < target) {
            low = mid + 1;
        } else {
            high = mid; // Keep mid in search space
        }
    }
    return low; // First position where arr[i] >= target
}

Upper Bound Template (First position where arri > target):

int upperBound(int arr[], int n, int target) {
    int low = 0, high = n; // Note: high = n (exclusive)
    
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] <= target) {
            low = mid + 1;
        } else {
            high = mid; // Keep mid in search space
        }
    }
    return low; // First position where arr[i] > target
}

Key insight: Both use high = n (exclusive) and while (low < high). The difference is in the comparison: < vs <=.

Choosing Containers for Search

Array (sorted): Binary search, range queries with bounds
Hash set/map: O(1) average lookup, poor ordering/range behavior
Heap: Efficient top‑k, not general search
Tree (balanced BST): Ordered set/map, O(log n) lookup + range queries

Guideline: If you need ordering or ranges, prefer sorted arrays or trees. If you need membership only, prefer hashing.

Hash Table Example:

// Find two numbers that sum to target using hash table
int* twoSum(int nums[], int numsSize, int target, int* returnSize) {
    int *result = malloc(2 * sizeof(int));
    int hash[1000] = {0}; // Simple hash table for demo
    
    for (int i = 0; i < numsSize; i++) {
        int complement = target - nums[i];
        if (hash[complement] != 0) {
            result[0] = hash[complement] - 1; // Convert back to 0-based
            result[1] = i;
            *returnSize = 2;
            return result;
        }
        hash[nums[i]] = i + 1; // Store 1-based index
    }
    
    *returnSize = 0;
    return result;
}

Why hash table here: O(1) average lookup vs O(n) linear search for complement.

Practical Heuristics (That Feel Like Cheating)

Two‑pointers / sliding window: When the structure allows local moves instead of global search
Precompute + search: Prefix sums, hashing, frequency maps
Binary search on answer: Capacity, time, threshold problems with monotone feasibility

Binary Search on Time Example:

// Find minimum time to complete all tasks with given workers
bool canComplete(int tasks[], int n, int workers, int timeLimit) {
    int currentTime = 0, workersUsed = 1;
    for (int i = 0; i < n; i++) {
        if (currentTime + tasks[i] > timeLimit) {
            workersUsed++;
            currentTime = tasks[i];
        } else {
            currentTime += tasks[i];
        }
    }
    return workersUsed <= workers;
}

int minTimeToComplete(int tasks[], int n, int workers) {
    if (n == 0) return 0;
    
    // Find min possible time (largest single task)
    int low = tasks[0];
    for (int i = 1; i < n; i++) {
        if (tasks[i] > low) low = tasks[i];
    }
    
    // Find max possible time (sum of all tasks to ensure feasibility)
    int high = 0;
    for (int i = 0; i < n; i++) {
        high += tasks[i];
    }
    
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (canComplete(tasks, n, workers, mid)) {
            high = mid; // Try smaller time
        } else {
            low = mid + 1; // Need more time
        }
    }
    return low;
}

Pattern: Binary search the answer (time) using a feasibility function that checks if the answer is achievable.

Common Pitfalls and How to Avoid Them

Off‑by‑one errors: Mix of inclusive/exclusive bounds; pick a convention and stick to it
Infinite loops: Failing to move bounds when mid equals a bound; ensure progress
Mid overflow: Use low + (high - low) / 2
Empty/singleton arrays: Handle n == 0 and n == 1
Duplicates: Incorrect early exit; use bounds to find first/last positions

Testing and Debugging Search Code

Adversarial sets: empty, one element, all equal, strictly increasing, duplicates clustering at edges
Property tests: For lower_bound, all indices < answer violate predicate; all ≥ satisfy
Trace logs: Print (low, mid, high) for a few crafted cases

5-Case Binary Search Test Checklist:

Test Case	Array	Target	Expected (lower_bound)	Why It Matters
Empty	`[]`	`5`	`0`	Handles zero-length arrays
Single	`[3]`	`3`	`0`	Single element edge case
All Equal	`[2,2,2,2]`	`2`	`0`	Duplicate handling
Not Found	`[1,3,5,7]`	`4`	`2`	Target between elements
Edge Values	`[1,3,5,7]`	`1` or `7`	`0` or `3`	First/last elements

Property Test for Lower Bound:

// Verify: all indices < result have arr[i] < target
//         all indices >= result have arr[i] >= target
bool verifyLowerBound(int arr[], int n, int target, int result) {
    for (int i = 0; i < result; i++) {
        if (arr[i] >= target) return false;
    }
    for (int i = result; i < n; i++) {
        if (arr[i] < target) return false;
    }
    return true;
}

Exercises and Boundary Drills

Implement exists(target) on a sorted array (iterative)
Implement lower_bound and upper_bound (handle duplicates)
Given sorted array, return the range [first, last] of a target or [-1, -1]
“Search on answer”: Min ship capacity to deliver within D days
Top‑k elements: Compare full sort vs heap approach

Tip: Time yourself. Aim for ≤15 minutes per variant without bugs.

Key Takeaways

Frame the problem first: state space, goal, monotonicity
Prefer bounds‑style binary search to handle duplicates cleanly
Choose containers based on membership vs ordering/range needs
Test edge cases deliberately to expose boundary errors

Next in the Series

We’ll move to advanced and domain‑specific search techniques: graph search (BFS/DFS, Dijkstra, 0‑1 BFS), string matching (KMP, Boyer‑Moore, Rabin‑Karp), tries, and indexing systems (inverted indexes, B+‑trees).