A digital trie combines aspects of binary searching, radix searching, and hashing. The term “digital trie” refers to the data structure used to hold the items to be searched. It is a multilevel data structure that branches N ways at each level (in the example that follows, each level branches from 0 to 16 ways). The subject of treelike data structures and searching is too broad to describe fully here, but a good book on data structures or algorithms can teach you the concepts involved.
Below example shows a program implementing digital trie searching. You can combine this example with code at the end of the chapter to produce a working program. The concept is not too hard. Suppose that you use a hash function that maps to a full 32-bit integer. The hash value can also be considered to be a concatenation of eight 4-bit hash values. You can use the first 4-bit hash value to index into a 16-entry hash table.
Naturally, there will be many collisions, with only 16 entries. Collisions are resolved by having the table entry point to a second 16-entry hash table, in which the next 4-bit hash value is used as an index.
The tree of hash tables can be up to eight levels deep, after which you run out of hash values and have to search through all the entries that collided. However, such a collision should be very rare because it occurs only when all 32 bits of the hash value are identical, so most searches require only one comparison.
The binary searching aspect of the digital trie is that it is organized as a 16-way tree that is traversed to find the data. The radix search aspect is that successive 4-bit chunks of the hash value are examined at each level in the tree. The hashing aspect is that it is conceptually a hash table with 232 entries.
No comments:
Post a Comment