Trie, also referred to as digital tree or prefix tree, is a tree-like data structure that is used for storage and efficient retrieval of keys, usually strings.
A trie is more efficient than a hashtable and other tree data structures in solving certain string retrieval problems such as looking up a word in a dictionary, looking up an a name in an address book, auto-complete words, etc. In fact, the name 'trie' comes from the word “retrieval”, as the main purpose of using this structure is that it provides fast retrieval.
A trie is an abstract data structure since it internally use other data structures, such as arrays, to provide the trie functionality.
Similar to other data structures, it is important to be thorough with the concepts of trie, the common operations performed on trie, and the efficiency of these operations.
Below interview questions on trie addresses some of these topics. Java programming language will be used for code snippets when applicable to depict the concepts of trie.
Structure
Trie is similar to graph data structure. It consists of nodes, with each node containing zero or more child nodes.
Trie is similar to ordered trees where each of the children can either be null or points to another node
If words are stored in the trie data structure, then each alphabet of the word is represented by a node, with each node being a child of the previous node.
For example, the word 'the' is stored in the trie data structure as three nodes, one for each alphabet. 'h' node is a child node of 't' node, and 'e' node is a child of 'h' node.
Properties: Following are some properties of a trie data structure.
Operations: The key operations supported by trie data structure are:
Structure
A trie node represents an alphabet in a word. So if a word 'hello' is inserted into the trie then five nodes are created, each node representing an alphabet.
A trie node at a minimum contains the following data elements:
Pseudocode
Code
public class TrieNode {
//indicates end of word
public boolean isEndOfWord;
//child nodes
public TrieNode[] children;
//max number of child nodes
public static int maxChildren = 26;
//constructor
public TrieNode() {
this.isEndOfWord = false;
this.children = new TrieNode[maxChildren];
}
public void markAsEnd() {
this.isEndOfWord = true;
}
public void unmarkAsEnd() {
this.isEndOfWord = false;
}
}
Structure
A trie class internally uses the trie node (created in previous question) and provides functions to insert a word, delete a word, and search for a word.
A trie class contains a root note which is null, and has 26 child nodes, each of which may be null or may point to another trie node.
Pseudocode
Code
public class Trie {
//root node
public TrieNode rootNode;
public Trie() {
this.rootNode = new TrieNode();
}
public void insert(String word) {
}
public void search(String word) {
}
public void delete(String word) {
}
}
Here we will implement the logic for the insert() function that we defined in the trie class in the previous question.
Implementation logic
To insert a word into a trie data structure, we will take each character of the word, beginning from the first character, and check if it already exists in the trie path (prefix path). If it does not exist, we will add a new node to the prefix path. If it is the end of the word we will set the end of word flag to true.
Pseudocode
Code
public class Trie {
//root node
public TrieNode rootNode;
public Trie() {
this.rootNode = new TrieNode();
}
public void insert(String word) {
//set current node to root node
TrieNode currentNode = rootNode;
int position = 0;
//for each character of the word
for (int i = 0; i < word.length(); i++) {
//find the index position
position = word.charAt(i) - 'a';
//If current node does not have a child node at the index position
if (null == currentNode.children[position]) {
//create new node
currentNode.children[position] = new TrieNode();
}
//set current node to new child node
currentNode = currentNode.children[position];
}
//mark last node as end of word
currentNode.isEndOfWord = true;
}
public void search(String word) {
}
public void delete(String word) {
}
}
Here we will implement the logic for the search() function that we defined in the trie class in the previous question.
Implementation logic
To search for a word in a trie data structure, we will take each character of the word, beginning from the first character, and check if it exists in the trie path (prefix path). If all characters exits in a prefix path, i.e. each character node is a child of the previous character node then the word exists, else the word does not exist in the trie data structure.
Pseudocode
1. Validate input word is not null
2. Set current node to root node
3. For each character of the word
4. Find the index position of the character (0...25)
5. Check if current node has a child node at the index position
6. If found, set child node as the current node
7. If not found, return false
8. Return true if endOfWord is true for the last node
Code
public boolean search(String word) {
//1. Validate input word is not null
if (word==null) {
return false;
}
//2. Set current node to root node
TrieNode currentNode = rootNode;
int position = 0;
//3. For each character of the word
for (int i = 0; i < word.length(); i++) {
//4. Find the index position of the character with reference to 'a' which is at index 0
position = word.charAt(i) - 'a';
//5. Check if current node has a child node at the index position
if (currentNode.children[position]!=null) {
//6. If found, set child node as the current node
currentNode = currentNode.children[position];
} else {
//7. If not found, return false
return false;
}
}
//8. Return true if endOfWord is true for the last node
return currentNode.isEndOfWord;
}
Here we will implement the logic for the delete() function that we defined in the trie class in the previous question.
Implementation logic
To delete a word in a trie data structure, we will take each character of the word, beginning from the first character, and check if it exists. If it exists and has no child nodes then we delete this node. If it exists and has child nodes, and isEndOfWord falg is true, then we set this to flag is false.
1. Validate input word is not null
2. Set current node to root node
3. For each character of the word
4. Find the index position of the character (0...25)
5. Check if current node has a child node at the index position
6. If found, set child node as the current node
7. If not found, return false
8. If current node does not have child nodes then delete node.
9. If current node has isEndOfWord flag as true, then set it to false.
Code
public boolean search(String word) {
//1. Validate input word is not null
if (word==null) {
return false;
}
//2. Set current node to root node
TrieNode currentNode = rootNode;
int position = 0;
//3. For each character of the word
for (int i = 0; i < word.length(); i++) {
//4. Find the index position of the character with reference to 'a' which is at index 0
position = word.charAt(i) - 'a';
//5. Check if current node has a child node at the index position
if (currentNode.children[position]!=null) {
//6. If found, set child node as the current node
currentNode = currentNode.children[position];
} else {
//7. If not found, return false
return false;
}
}
//8. Return true if endOfWord is true for the last node
return currentNode.isEndOfWord;
}