Tree: A tree is a completed, acyclic graph that consists of n nodes and n-1 edges.
Rooted Tree: One of the nodes is appointed as the root of the tree and all other nodes are placed underneath the root.
Tree traversal & Tree DP
General graph traversal algorithms can be applied to tree traversal. But a tree doesn’t contains cycle. Therefore we don’t need to maintain a visit vector, we just need to note down the parent node during the traversal.
// Tree traversal
void dfs(int parent, int current) {
for (auto next:graph[current]) {
if (next!=parent) dfs(current, next);
}
}
// Tree DP
void dfs(int parent, int current) {
for (auto next:graph[current]) {
// push the current path info to child node
if (next!=parent) dfs(current, next, current_path_info);
}
// aggregate child nodes' information to current node
dp[current] = aggregate(child_info);
}
The diameter of a tree
Question: Find the longest path (diameter) in a tree.
- Algorithm1: calculate following DP values for each node
- ToLeaf[x]: the max length of a path from x to leaf nodes
- ToLeaf[x] = max(ToLeaf[y])+1, for all y in x’s child nodes
- MaxPath[x]: the max length of a path in the subtree from x
- MaxPath[x] = max(MaxPath[y], ToLeafTop1 + ToLeafTop2 + 2)
- Diameter = MaxPath[root]
- ToLeaf[x]: the max length of a path from x to leaf nodes
- Algorithm2: two rounds of DFS
- Starting from any node, use DFS to find the deepest leaf node
- Starting from the leaf node, use DFS to find the the new deepest leaf node. The path between these two leaf nodes is the diameter of the tree.
All longest paths
Question: Find out all longest path that goes through each node.
- MaxPath1[x], MaxPath1child[x]: the longest path from x to leaf node going through a child node, and we note down the child node.
- MaxPath2[x], MaxPath2child[x]: the longest path from x to leaf node going through a different child node, and we note down the child node.
- MaxPath3[x]: the longest path from x to leaf node going through its parent node. The path shouldn’t go back to itself.
- This value can be calculated when we have MaxPath1 and MaxPath2 information
- MaxPath[x] = the sum of the longest two paths among MaxPath1[x], MaxPath2[x] and MaxPath3[x]
Tree query: a query asking for information of paths, subtree in a rooted tree. Examples are,
- The k-th ancestor of a node
- The sum of values in the subtree of a node
- The sum of values on a path between two nodes
- The lowest common ancestor of two nodes
Finding ancestors
Question: Find the k-th ancestor of a node.
Use successor graph / functional graph to solve it in O(log k) time.
Subtree query and path query
Question: Query the information of a subtree or a path from the root node to a certain node.
Use tree traversal array to solve this in O(n) time. Tree traversal array store each node’s information by the traversal order.
// Build tree traversal arrays
// Mapping relation between tree id and array id
unordered_map<TreeNode*, int> tree_id_array_id;
// Three tree traversal arrays
vector<int> tree_size;
vector<int> subtree_sum;
vector<int> root_path_sum;
int dfs(TreeNode* parent, TreeNode* current) {
if (cur==nullptr) return -1;
// Create the id for current node
int node_id = tree_id_node_id.size();
tree_id_array_id[current] = node_id;
// Initialize current node's value in array
tree_size.push_back(1);
subtree_sum.push_back(current->val);
root_path_sum.push_back(root_path_sum[parent_id] + current->val);
// Traverse all children
for (auto next:current->child) {
int child_id = dfs(current, next);
if (child_id == -1) continue;
// Aggregate children's values
tree_size[node_id] += tree_size[child_id];
subtree_sum[node_id] += subtree_sum[child_id];
}
// return the id for the current node
return node_id;
}
// Query tree traversal arrays
int node_id = tree_id_array_id[x];
tree_size[node_id];
subtree_sum[node_id];
root_path_sum[node_id];
//Update tree traversal arrays
// Mapping relation between subtree and array range
int node_x_id = tree_id_array_id[x];
int subtree_x_begin = node_x_id;
int subtree_x_end = node_x_id+tree_size[node_x_id];
// Update subtree sum value
for (int i=0;i<=node_x_id;++i) {
int subtree_i_begin = i;
int subtree_i_end = i+tree_size[i];
// Check if subtree i contains node x
if (node_x_id>=subtree_i_begin
&& node_x_id<subtree_i_end) {
subtree_sum[i] += delta;
}
}
// Update path sum value
for (int i=subtree_x_begin; i<subtree_x_begin;++i) {
root_path_sum[i] += delta;
}
Lowest common ancestor (LCA)
Question: Find the lowest node whose subtree contains target nodes in interest.
Algorithm1: Meet at LCA. Starting from two nodes, always move up the lower node until two nodes meet at their LCA. This method requires us to know the depth of each node.
- For some special graph, this information is known ahead. For example, in complete binary tree, we know parent_id = child_id / 2.
- For other general graphs, we need to pre-calculate the depth information, then use successor graph / functional graph method to find LCA.
// For example, in a binary complete tree,
// we have special rule on node order
while (x!=y) {
if (x<y) x /= 2;
else y /= 2;
}
Algorithm2: Tree traversal array. Build a tree traversal array for depth information.
unordered_map<TreeNode*, int> tree_id_node_id;
vector<int> depth;
// Query LCA of x and y
int node_x_id = tree_id_array_id[x];
int node_y_id = tree_id_array_id[y];
int min_depth = MAX_INIT;
int node_lca_id;
// This range give us a path between x and y,
// the highest node in the path is their LCA.
for (int i=std::min(node_x_id, node_y_id);
i<=std::max(node_x_id, node_y_id);
++i) {
if (depth[i]<min_depth) {
min_depth = depth[i];
node_lca_id = i;
}
}
// Distance between x and y
int dist = depth[node_x_id]+depth[node_y_id]-depth[node_lca_id];
Algorithm3: Tarjan’s offline algorithm.
During the traversal of a tree, if a node is fully traversed (the node itself and all its subtree), we mark it as visited (a post order traversal). For the two target nodes, once both of them are visited, we know their LCA is in the current search tree.
- Case one: one visited node is in the subtree of the other visited node. The newly visited node is the root node of the subtree.
- Case two: two visited nodes are in two subtree of its LCA node. And this subtree rooted at LCA is currently being traversed.
We maintain disjoin set of nodes, connecting children nodes to its parent. Once a node is fully traversed, we first find LCA for it and the other target node. (For case one, the other target node will connect to current node). Then we connect it to its parent. (For case two, if in the future this node is looked up, it will be connected to its parent but not itself).
void dfs(int parent, int current) {
for (int next:graph[current]) {
if (next==parent) continue;
dfs(current, next);
}
// First find LCA for current node
for (int the_other_node:queries[current]) {
if (!vis[the_other_node]) continue;
ans.push_back(std::make_tuple(current, the_other_node, Find(the_other_node));
}
// Second connect current node to its parent and mark as visited
Union(current, parent); // make sure parent is the representitive node.
visited[current] = true;
}