Builds a trie (prefix tree) by inserting words one character at a time.
A Trie (pronounced "try"), also called a prefix tree or digital tree, is a tree data structure used for efficient storage and retrieval of strings over a known alphabet. Each path from the root to a marked node represents a stored word, and strings that share a common prefix share the same initial path in the trie. The name "trie" was coined by Edward Fredkin in 1960, derived from the word "retrieval," though Fredkin himself pronounced it "tree." To avoid confusion with the general concept of trees, most computer scientists now pronounce it "try." The underlying idea dates back even further to Axel Thue's work on string combinatorics in the early 1900s and was independently explored by Rene de la Briandais in 1959.
Consider inserting the words "CAR," "CARD," and "CARE" into an empty trie. The first word "CAR" creates nodes for C, A, and R, marking R as an end of word. When "CARD" is inserted, the algorithm follows the existing C-A-R path and only creates a new node for D. Similarly, "CARE" shares C-A-R and adds E. The shared prefix "CAR" is stored only once, which is the core efficiency advantage of tries.
| Operation | Time |
|---|---|
| Insert word | O(L) where L = word length |
| Search word | O(L) |
| Prefix search | O(P) where P = prefix length |
| Delete word | O(L) |
| Space | O(N x A) worst case |
Here N is the total number of characters stored and A is the alphabet size (e.g., 26 for lowercase English letters). Each node may have up to A children. In the worst case with no shared prefixes, space consumption can be significant, but in practice natural language data shares many prefixes, yielding substantial compression.
Trie nodes can store children in different ways, each with trade-offs. An array of size A provides O(1) child lookup but wastes space for sparse nodes. A hash map at each node reduces space for sparse alphabets while maintaining average O(1) lookup. A sorted list of children saves space but requires O(log A) lookup via binary search. Production systems choose based on alphabet size and memory constraints. For small alphabets like DNA sequences (A=4), arrays are ideal. For Unicode text, hash maps are practically required.
Compared to hash tables, tries offer several unique capabilities. A hash table can check whether a word exists in O(1) average time, but it cannot efficiently enumerate all words with a given prefix. Tries support prefix queries naturally: navigate to the node representing the prefix, then enumerate all words in that subtree. Compared to balanced BSTs storing strings, tries avoid repeated string comparisons and provide lookup time proportional to the query string length rather than O(L log n).
Tries are the backbone of autocomplete systems in search engines, text editors, and mobile keyboards. Spell checkers use tries to quickly verify whether a word exists in the dictionary and to suggest corrections by exploring nearby paths. IP routers use a specialized form called a Patricia trie (or radix tree) for longest prefix match routing. T9 predictive text input on older mobile phones used tries to map numeric key sequences to words.
Choose a trie when your problem involves prefix-based operations -- autocomplete, prefix search, spell checking, or dictionary storage with prefix queries. If you only need exact key lookups with no prefix operations, a hash table is simpler and more space-efficient.