Your Problem-Solving Companion (ItsAweso.me Series)

This guide continues your comprehensive study companion for mastering data structures and algorithms. Building on the sorting foundations from the previous parts, we now shift to how we actually find things efficiently: searching.

How to Use This Guide:

  • After reading: Try the boundary drills and templates
  • Before practice: Review binary search invariants and pitfalls
  • During interviews: Use the selection and “search-on-answer” patterns
  • For deeper learning: Implement each variant, then trace with adversarial cases

Remember: Mastery in search comes from writing and debugging your own boundary‑correct solutions.

Table of Contents

Series Context: Why Search After Sort?

Sorting gives structure; searching exploits it. With data ordered or indexed, we trade linear scans for logarithmic or constant-time lookups. This post focuses on fundamentals you’ll use daily.

Teaching Flow: Problem → Invariant → Implementation → Adversarial Tests

The Problem: Off‑by‑one errors and unclear invariants cause many binary search bugs and interview failures.

The Solution: Frame the problem precisely, pick the right container, lock down invariants, then test boundaries aggressively.


Framing a Search Problem

  • State space: What are you searching over? indices, values, time, capacity
  • Goal test: Exact match, first/last position, feasibility
  • Objective: Any solution vs optimal (earliest/latest/minimum/maximum)
  • Monotonicity: Is there a predicate P(x) that flips from false→true exactly once?
  • Determinism: Exact vs approximate (not in this part)

Example 1: Search for Value

// Find if target exists in sorted array
int binarySearch(int arr[], int n, int target) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] == target) return mid;
        else if (arr[mid] < target) low = mid + 1;
        else high = mid - 1;
    }
    return -1; // Not found
}

Example 2: Search on Answer (Capacity)

// Find minimum ship capacity to deliver all packages within D days
bool canDeliver(int weights[], int n, int capacity, int days) {
    int currentWeight = 0, daysUsed = 1;
    for (int i = 0; i < n; i++) {
        if (currentWeight + weights[i] > capacity) {
            daysUsed++;
            currentWeight = weights[i];
        } else {
            currentWeight += weights[i];
        }
    }
    return daysUsed <= days;
}

int minShipCapacity(int weights[], int n, int days) {
    if (n == 0) return 0;
    
    // Find min possible capacity (largest single package)
    int low = weights[0];
    for (int i = 1; i < n; i++) {
        if (weights[i] > low) low = weights[i];
    }
    
    // Find max possible capacity (sum of all packages to ensure feasibility)
    int high = 0;
    for (int i = 0; i < n; i++) {
        high += weights[i];
    }
    
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (canDeliver(weights, n, mid, days)) {
            high = mid; // Try smaller capacity
        } else {
            low = mid + 1; // Need larger capacity
        }
    }
    return low;
}

Linear Search: Baseline and Early Exits

  • When it’s fine: Tiny inputs, unsorted data, one‑off scans
  • Techniques: Early exit, sentinel to reduce bound checks
  • Complexity: O(n) time, O(1) space

Sentinel Pattern Example:

// Linear search with sentinel to reduce bound checks
int linearSearchWithSentinel(int arr[], int n, int target) {
    if (n == 0) return -1; // Handle empty array
    
    // Store original last element
    int originalLast = arr[n - 1];
    
    // Set sentinel
    arr[n - 1] = target;
    
    int i = 0;
    while (arr[i] != target) {
        i++;
    }
    
    // Restore original element
    arr[n - 1] = originalLast;
    
    // Check if we found target or hit sentinel
    if (i < n - 1 || originalLast == target) {
        return i;
    }
    return -1; // Not found
}

Why sentinel helps: Eliminates one comparison per iteration (no need to check i < n).


Binary Search: Invariants, Bounds, and Monotone Predicates

  • Core idea: Maintain an invariant over [low, high] and shrink correctly
  • Compute mid safely: mid = low + (high - low) / 2 (avoid overflow)
  • Variants:
    • Existence test: Does target exist?
    • Lower bound: First index where a[i] >= target
    • Upper bound: First index where a[i] > target
    • Search on answer: Binary search the parameter when P(x) is monotone
  • Duplicates: Prefer bounds‑style implementations
  • Termination: While low < high with half‑open intervals is often simpler

Lower Bound Template (First position where arri >= target):

int lowerBound(int arr[], int n, int target) {
    int low = 0, high = n; // Note: high = n (exclusive)
    
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] < target) {
            low = mid + 1;
        } else {
            high = mid; // Keep mid in search space
        }
    }
    return low; // First position where arr[i] >= target
}

Upper Bound Template (First position where arri > target):

int upperBound(int arr[], int n, int target) {
    int low = 0, high = n; // Note: high = n (exclusive)
    
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] <= target) {
            low = mid + 1;
        } else {
            high = mid; // Keep mid in search space
        }
    }
    return low; // First position where arr[i] > target
}

Key insight: Both use high = n (exclusive) and while (low < high). The difference is in the comparison: < vs <=.


  • Array (sorted): Binary search, range queries with bounds
  • Hash set/map: O(1) average lookup, poor ordering/range behavior
  • Heap: Efficient top‑k, not general search
  • Tree (balanced BST): Ordered set/map, O(log n) lookup + range queries

Guideline: If you need ordering or ranges, prefer sorted arrays or trees. If you need membership only, prefer hashing.

Hash Table Example:

// Find two numbers that sum to target using hash table
int* twoSum(int nums[], int numsSize, int target, int* returnSize) {
    int *result = malloc(2 * sizeof(int));
    int hash[1000] = {0}; // Simple hash table for demo
    
    for (int i = 0; i < numsSize; i++) {
        int complement = target - nums[i];
        if (hash[complement] != 0) {
            result[0] = hash[complement] - 1; // Convert back to 0-based
            result[1] = i;
            *returnSize = 2;
            return result;
        }
        hash[nums[i]] = i + 1; // Store 1-based index
    }
    
    *returnSize = 0;
    return result;
}

Why hash table here: O(1) average lookup vs O(n) linear search for complement.


Practical Heuristics (That Feel Like Cheating)

  • Two‑pointers / sliding window: When the structure allows local moves instead of global search
  • Precompute + search: Prefix sums, hashing, frequency maps
  • Binary search on answer: Capacity, time, threshold problems with monotone feasibility

Binary Search on Time Example:

// Find minimum time to complete all tasks with given workers
bool canComplete(int tasks[], int n, int workers, int timeLimit) {
    int currentTime = 0, workersUsed = 1;
    for (int i = 0; i < n; i++) {
        if (currentTime + tasks[i] > timeLimit) {
            workersUsed++;
            currentTime = tasks[i];
        } else {
            currentTime += tasks[i];
        }
    }
    return workersUsed <= workers;
}

int minTimeToComplete(int tasks[], int n, int workers) {
    if (n == 0) return 0;
    
    // Find min possible time (largest single task)
    int low = tasks[0];
    for (int i = 1; i < n; i++) {
        if (tasks[i] > low) low = tasks[i];
    }
    
    // Find max possible time (sum of all tasks to ensure feasibility)
    int high = 0;
    for (int i = 0; i < n; i++) {
        high += tasks[i];
    }
    
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (canComplete(tasks, n, workers, mid)) {
            high = mid; // Try smaller time
        } else {
            low = mid + 1; // Need more time
        }
    }
    return low;
}

Pattern: Binary search the answer (time) using a feasibility function that checks if the answer is achievable.


Common Pitfalls and How to Avoid Them

  • Off‑by‑one errors: Mix of inclusive/exclusive bounds; pick a convention and stick to it
  • Infinite loops: Failing to move bounds when mid equals a bound; ensure progress
  • Mid overflow: Use low + (high - low) / 2
  • Empty/singleton arrays: Handle n == 0 and n == 1
  • Duplicates: Incorrect early exit; use bounds to find first/last positions

Testing and Debugging Search Code

  • Adversarial sets: empty, one element, all equal, strictly increasing, duplicates clustering at edges
  • Property tests: For lower_bound, all indices < answer violate predicate; all ≥ satisfy
  • Trace logs: Print (low, mid, high) for a few crafted cases

5-Case Binary Search Test Checklist:

Test CaseArrayTargetExpected (lower_bound)Why It Matters
Empty[]50Handles zero-length arrays
Single[3]30Single element edge case
All Equal[2,2,2,2]20Duplicate handling
Not Found[1,3,5,7]42Target between elements
Edge Values[1,3,5,7]1 or 70 or 3First/last elements

Property Test for Lower Bound:

// Verify: all indices < result have arr[i] < target
//         all indices >= result have arr[i] >= target
bool verifyLowerBound(int arr[], int n, int target, int result) {
    for (int i = 0; i < result; i++) {
        if (arr[i] >= target) return false;
    }
    for (int i = result; i < n; i++) {
        if (arr[i] < target) return false;
    }
    return true;
}

Exercises and Boundary Drills

  1. Implement exists(target) on a sorted array (iterative)
  2. Implement lower_bound and upper_bound (handle duplicates)
  3. Given sorted array, return the range [first, last] of a target or [-1, -1]
  4. “Search on answer”: Min ship capacity to deliver within D days
  5. Top‑k elements: Compare full sort vs heap approach

Tip: Time yourself. Aim for ≤15 minutes per variant without bugs.


Key Takeaways

  1. Frame the problem first: state space, goal, monotonicity
  2. Prefer bounds‑style binary search to handle duplicates cleanly
  3. Choose containers based on membership vs ordering/range needs
  4. Test edge cases deliberately to expose boundary errors

Next in the Series

We’ll move to advanced and domain‑specific search techniques: graph search (BFS/DFS, Dijkstra, 0‑1 BFS), string matching (KMP, Boyer‑Moore, Rabin‑Karp), tries, and indexing systems (inverted indexes, B+‑trees).