algoviz - Radix Sort — Process Individual Digits by Significance

Radix Sort

Sorting

Sorts integers by processing individual digits from least to most significant.

Radix Sort is a non-comparison sorting algorithm that sorts integers (or other data with positional structure) by processing individual digits one at a time. Unlike comparison-based sorts such as Quick Sort and Merge Sort, Radix Sort never directly compares two elements against each other. Instead, it distributes elements into buckets based on digit values, one digit position at a time. The least significant digit (LSD) variant processes digits from right to left, using a stable sub-sort (typically Counting Sort) at each position.

Historical Context

The concept behind Radix Sort dates back to the 1880s, when Herman Hollerith invented a tabulating machine that sorted punched cards by reading one column at a time. The machine was used in the 1890 United States Census, processing data far faster than manual methods. Hollerith's company eventually became IBM, and his card-sorting technique is essentially LSD Radix Sort performed mechanically. The algorithm was formalized for electronic computers in the 1950s and remains one of the most efficient ways to sort integers and fixed-length strings.

How It Works

Find max digits: Determine the maximum number of digits d among all elements. This tells the algorithm how many passes to make.
Sort by least significant digit: Use Counting Sort to sort the array based on the least significant (rightmost) digit. Elements are placed into buckets 0 through 9 based on their digit value at this position, then collected back into the array in bucket order.
Move to next digit: Repeat the Counting Sort pass on the next digit position, working from right to left.
Complete: After processing all d digit positions, the array is fully sorted.

The stability of the sub-sort at each digit position is critical. Because Counting Sort is stable -- elements with the same digit value maintain their relative order from the previous pass -- the sorting work from earlier passes is preserved. Without stability, earlier digit passes would be undone, and the algorithm would fail.

LSD vs MSD Variants

There are two main variants of Radix Sort. The LSD (Least Significant Digit) variant, shown in this visualization, processes digits from right to left. It is simpler to implement because it uses a single flat pass at each digit level and naturally handles variable-length numbers. The MSD (Most Significant Digit) variant processes digits from left to right and recursively sorts sub-groups with the same leading digit. MSD Radix Sort can short-circuit when sub-groups become small, making it faster for certain distributions, but it is more complex to implement and harder to parallelize.

Complexity

Case	Time	Space
All	O(d * (n + k))	O(n + k)

Here d is the number of digits, n is the number of elements, and k is the range of each digit (10 for decimal, 256 for byte-level processing). When d is constant (as it is for fixed-width 32-bit or 64-bit integers), the time complexity simplifies to O(n), making Radix Sort asymptotically faster than any comparison sort. The space complexity of O(n + k) comes from the auxiliary arrays used in Counting Sort.

Comparison with Comparison-Based Sorts

The theoretical lower bound for comparison-based sorting is O(n log n). Radix Sort bypasses this limit because it does not compare elements directly -- it uses the structure of the keys (digit positions) to place elements. However, this comparison is nuanced: for n 32-bit integers, d is 32 bits (or 4 bytes), so Radix Sort runs in O(4n) = O(n) while comparison sorts run in O(n log n). When n is large enough that log n exceeds d, Radix Sort wins. For small n or very long keys (where d is large), comparison sorts may be faster.

When to Use

Radix Sort is ideal for sorting large collections of fixed-length integers, strings, IP addresses, or database keys. It is widely used in suffix array construction, database indexing, GPU-based sorting, and network packet classification. The main limitations are its O(n + k) extra space requirement and its restriction to data that can be decomposed into digit-like components with a bounded range per position.