Efficiently finds a target value by halving the search space each step.
Binary Search is one of the most fundamental and widely used algorithms in all of computer science. It finds a target value within a sorted array by repeatedly dividing the search interval in half, eliminating half of the remaining candidates with each comparison. This simple yet powerful idea underlies countless systems, from database indexing to compiler optimization to the algorithms that route internet traffic.
The concept behind Binary Search is ancient. The idea of narrowing down possibilities by halving appears in number-guessing games that predate modern computing. However, the formal algorithm was first described by John Mauchly in 1946, and despite its conceptual simplicity, it proved remarkably difficult to implement correctly. Jon Bentley, in his classic book "Programming Pearls" (1986), noted that while most professional programmers believe they can write a correct Binary Search, studies showed that roughly 90% of implementations contained bugs. The most notorious bug involved integer overflow in the midpoint calculation: the expression (low + high) / 2 can overflow when low and high are large integers. The safer formulation low + (high - low) / 2 avoids this problem. This bug persisted in the Java standard library implementation for nearly a decade before being discovered and fixed in 2006.
The algorithm proceeds as follows:

1. Initialize low = 0 and high = n - 1, representing the current search range within the sorted array.
2. Compute mid = low + (high - low) / 2 to find the middle index of the current range. Using this formula instead of (low + high) / 2 prevents integer overflow.
3. Examine the element at index mid and compare it to the target value.
4. If the element equals the target, return mid. If the target is smaller than the element, set high = mid - 1. If the target is larger, set low = mid + 1.
5. Repeat until the target is found or low > high, which indicates the target is not present in the array.

The algorithm's power comes from the fact that each comparison eliminates exactly half of the remaining search space. Starting with n elements, after one comparison we have at most n/2, then n/4, then n/8, and so on. After k comparisons, the remaining search space has at most n/2^k elements. Setting this equal to 1 and solving for k gives k = log2(n).
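The steps above can be sketched as follows (a minimal Python version; Python integers cannot overflow, but the overflow-safe midpoint formula is kept for consistency with the discussion):

```python
def binary_search(arr, target):
    """Return the index of target in the sorted list arr, or -1 if absent."""
    low, high = 0, len(arr) - 1
    while low <= high:
        # low + (high - low) // 2 avoids overflow in fixed-width languages;
        # in Python it is equivalent to (low + high) // 2.
        mid = low + (high - low) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1   # target lies in the right half
        else:
            high = mid - 1  # target lies in the left half
    return -1  # low > high: target is not present
```

For example, `binary_search([1, 3, 5, 7, 9, 11], 7)` returns 3, and searching for an absent value returns -1.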
| Case | Time | Space |
|---|---|---|
| Best | O(1) | O(1) |
| Average | O(log n) | O(1) |
| Worst | O(log n) | O(1) |
The logarithmic scaling is extraordinarily powerful. For an array of 1 billion elements, Binary Search needs at most about 30 comparisons. For 1 trillion elements, it needs only about 40. This makes Binary Search feasible even on datasets far too large for linear scanning.
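These comparison counts can be checked directly: the worst case for n elements is floor(log2(n)) + 1 comparisons.

```python
import math

def worst_case_comparisons(n):
    """Worst-case number of comparisons for Binary Search on n elements."""
    return math.floor(math.log2(n)) + 1

print(worst_case_comparisons(10**9))   # 1 billion elements
print(worst_case_comparisons(10**12))  # 1 trillion elements
```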
Binary Search has spawned a rich family of related techniques. Lower bound and upper bound searches find the first or last position where a value could be inserted while maintaining sorted order, which is essential for range queries. Exponential search first finds a range where the target might exist by doubling the index, then applies Binary Search within that range, making it efficient for unbounded or very large sorted sequences. Interpolation search estimates where the target is likely to be based on its value relative to the endpoints, achieving O(log log n) average time on uniformly distributed data. Perhaps most powerfully, binary search on the answer applies the halving principle not to an array but to an answer space: given a monotonic predicate function, Binary Search can find the exact threshold where the predicate flips from false to true. This technique is widely used in competitive programming and optimization.
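The "binary search on the answer" idea can be sketched with a generic helper that finds the threshold of a monotonic predicate; the integer square root below is an illustrative application, not a method prescribed by the text:

```python
def first_true(lo, hi, pred):
    """Smallest x in [lo, hi] for which pred(x) is True.

    Assumes pred is monotonic over the range: False, False, ..., True, True.
    Returns hi + 1 if pred is False everywhere in the range.
    """
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if pred(mid):
            hi = mid - 1  # mid works; look for an earlier threshold
        else:
            lo = mid + 1  # mid fails; threshold is to the right
    return lo

def isqrt(n):
    """Largest x with x*x <= n, found by locating where x*x > n flips to True."""
    return first_true(0, n, lambda x: x * x > n) - 1
```

The same `first_true` skeleton solves many optimization problems where feasibility is monotonic in the answer, such as minimizing a capacity subject to a constraint.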
Beyond the integer overflow bug, common mistakes include using incorrect boundary updates (such as setting high = mid instead of high = mid - 1, which can cause infinite loops), confusing inclusive and exclusive bounds, and applying Binary Search to unsorted data. The off-by-one errors that plague Binary Search implementations are so common that they have become a standard teaching example for the importance of loop invariants.
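One common way to sidestep the boundary confusion is to use half-open bounds [low, high) with an explicit loop invariant. A sketch of a lower-bound search in this style (note that `high = mid` is safe here, because mid is always strictly less than high):

```python
def lower_bound(arr, target):
    """First index at which target could be inserted to keep arr sorted.

    Invariant: arr[i] < target for all i < low,
               arr[i] >= target for all i >= high.
    """
    low, high = 0, len(arr)  # half-open range [low, high)
    while low < high:
        mid = low + (high - low) // 2
        if arr[mid] < target:
            low = mid + 1  # mid is excluded, so the range shrinks
        else:
            high = mid     # mid < high, so the range also shrinks
    return low
```

With inclusive bounds, by contrast, setting `high = mid` when the range has shrunk to a single element leaves the range unchanged, which is exactly the infinite loop described above.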
Binary Search requires sorted or monotonic data as a precondition. It is the algorithm of choice for searching in sorted arrays, finding insertion points, implementing lower and upper bound queries, and solving optimization problems where the feasibility function is monotonic. It appears in standard library functions across virtually every programming language, in B-tree and binary search tree lookups, in debugging techniques like git bisect for finding the commit that introduced a bug, and in system design problems involving partitioning and load balancing.
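In Python, for instance, the lower- and upper-bound queries mentioned above are provided by the standard library's bisect module:

```python
import bisect

scores = [10, 20, 20, 30]

# bisect_left: first position where 20 could be inserted (lower bound)
left = bisect.bisect_left(scores, 20)    # 1
# bisect_right: position just past the last 20 (upper bound)
right = bisect.bisect_right(scores, 20)  # 3
# The difference counts occurrences of 20 in O(log n)
count = right - left                     # 2
```

Combining the two bounds like this is the standard idiom for range queries over sorted sequences.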