Editorial for IOI '11 P2 - Race
Submitting an official solution before solving the problem yourself is a bannable offence.
Given a tree with nodes, this task asks for a path of length with the minimum number of edges. It looks like a usual dynamic programming task. However, when is large, another approach is required.
The model solution for this task follows a divide-and-conquer approach.
Consider a node in the graph. There are two possible cases: when node belongs to the solution path ; or when node does not.
In the second case, we can delete node from the tree and break it into smaller trees. We can then recurse on each of the resulting trees to find the solution.
With this general approach in mind, we have to answer the following questions.
- How to find the best path that contains node ?
- How to choose to achieve a better running time?
Note that the second question is very important because if we can guarantee that the sizes of all resulting trees are small, we can bound the number of recursion levels.
Finding the solution containing
Consider the case that contains node . Let's consider a simpler case where we only want to find if there exists a path of length exactly that contains .
If is one of the endpoints in , we can find the path using one application of depth first search (DFS).
However, if is "inside" , then two of 's adjacent nodes and must also be in . Thus, we shall find and .
Consider some node adjacent to . With one application of DFS, we can find the set of all path lengths for all paths starting at and containing edge .
Hence, to find and , we need to find two nodes and such that there exists a pair and for which . This can be done by DFS from through every edge for all adjacent nodes with careful bookkeeping using an array of size .
The running time for this step is .
Finding the right node
Our goal is to find node such that after deleting , all resulting trees are sufficiently "small." In this case, we shall find node such that each remaining tree has at most nodes. We shall refer to node as the central node.
It is not clear if such a node exists. So let's argue about that first.
Pick an arbitrary node as a candidate. Denote by the forest obtained by deleting from . For each node adjacent to , denote by the tree containing in . If every tree has at most nodes, we are done and is the required central node.
Otherwise, there exists one tree that contains more than nodes. (Note that there can be only one tree violating our criteria.) In this case, we pick as our new candidate and repeat the process.
This process will eventually stop at some candidate node and that's the required central node. To see this, note that after leaving , we shall never go back to pick again; since there are nodes, the process can repeat at most times.
After knowing that the central node exists, there are many ways to find it. We can follow the process directly as in the argument. But this is too slow to be useful.
The following are two procedures that find the central node in time and time.
Bottom-up approach
We can find node in a bottom-up fashion. We shall keep a priority queue of all "processed" subtrees using their sizes as weights.
We maintain, for each node, its state which can either be new or processed; initially all nodes are new. Every node also has a weight. Initially, every node has a weight of .
We start with all leaf nodes in . Note that each node in is every node which has all but one of its adjacent nodes processed. For each node , we denote by the unique neighbor of which is new.
While there are nodes in , we extract node with the smallest weight. We update 's state to processed and increase the weight of by 's weight. If all but one neighbor of are processed, we insert into .
Using this procedure, the last node inserted to is the desired central node.
DFS with bookkeeping
With DFS and good bookkeeping, we can find the central node in .
We pick an arbitrary node to start a DFS. With this procedure, we can consider as rooted at and the parent-child relationship between adjacent pairs of nodes are clearly defined. While performing DFS, we compute, for each node , the number of its descendants .
With this information, we can figure out if a candidate is the central node. For each node adjacent to , if is one of 's children, the size of the resulting tree containing after deleting is . If is 's parent, the size of the resulting tree containing after deleting is:
where is the set of children of . If the size of each resulting tree is at most , is the desired central node. The time needed to check is proportional to 's degree. Therefore, we can check all nodes in time .
Running time
Let be the worst-case running time when the tree has nodes. We can write the recurrence as:
where is the time for finding , is the size of the new trees, and is some constant.
Since we know that , there are at most levels of the recursion.
If we use -time to find , each level would run in time and the total running time is . If we use a slower -time procedure, the total running time will be .
Notes
There are other heuristics for finding that do not always work. Here are some examples.
- In a divide-and-conquer solution, the highest degree node is picked.
- In a divide-and-conquer solution, the node that minimizes the maximum distance to any node is picked.
Comments