Editorial for APIO '13 P2 - Toll - DMOJ: Modern Online Judge

Editorial for APIO '13 P2 - Toll

Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Problem

This problem wants us to assign weights to some edges and pick a minimum spanning tree (MST) from the resultant graph to maximize a certain revenue function on the MST. For convenience, we'll call the initially weighted edges "default edges" and the edges we need to price (assign weights to) "Mr G's edges".

Too many MSTs

Given that the number of possible MSTs can be exponential in general graphs, the selection of the MST itself would normally already be intractable. However, in this case it turns out that when the edge weights are largely distinct, the number of MSTs is actually not a lot. In fact, notice that if we fix some edges to be in the MST, and guarantee that the remaining edge weights are all distinct, then there is only one unique MST. This leads directly to the following observation.

Observation 1. Suppose we fix some of Mr G's edges to be in the MST, and remove the rest from the graph. Then there's only one possible MST.

Proof. Contract the connected components formed from Mr G's fixed edges to single vertices. All the edge weights in this graph are now distinct, and the only MST of this graph can be extended to the MST of the original graph by adding in Mr G's edges. □

With this observation, it turns out that once we price those fixed Mr G's edges such that they are included in the MST, there will only be one MST to consider, regardless of the weights of Mr G's edges. This implies that, no matter how we price Mr G's edges, there are only at most $2^K$ ~2^K~ possible MSTs: one for each subset of Mr G's edges to be fixed inside.

How to fix?

Now our problem is: How do we price some edges such that they are included in the MST? One trivial way is simply to price all of them at $0$ ~0~. But our revenue will then also be $0$ ~0~. Indeed, we want to price them as high as possible, while keeping them in the MST. Let us first figure out the maximum weight we can price a single edge, $e$ ~e~, to keep it in the MST of any graph.

Observation 2. If an edge $e$ ~e~ is the only edge that has the maximum weight in any cycle of a graph, then $e$ ~e~ cannot be in the MST.

Proof. Suppose the MST contains $e$ ~e~. Remove $e$ ~e~ from the MST, and we get two disjoint trees that can be connected by an edge $e'$ ~e'~ from that cycle. The weight of $e'$ ~e'~ is strictly less than that of $e$ ~e~, thus we can replace $e$ ~e~ by $e'$ ~e'~ to obtain an MST of smaller total weight, a contradiction. □

Thus, with slight modification, we can come up with a necessary condition for any edge belonging to Mr G to be in the MST:

Condition 1. The edge must not have the maximum weight in any cycle of the graph, unless it shares the weight with a default edge in the cycle.

Let us price Mr G's edges so that they will all fulfill Condition 1. Now observe that once they do so, they must be in the MST!

Observation 3. As long as Mr G's edges all fulfill Condition 1, they must be in the MST.

Proof. Continually find any cycle in the graph and remove the edge of maximum weight since it cannot be in the MST (breaking ties by preferring to remove default edges). Since Mr G's edges all fulfill Condition 1, for each cycle they are contained in they will never be removed. Eventually there will be no cycles and the graph will form an MST with all of Mr G's edges. □

Fulfilling Condition 1

Let us first start with the graph with none of Mr G's edges, and find its MST using a standard MST algorithm like Kruskal's. We shall add in Mr G's edges one by one, ensuring that the edges fulfill Condition 1 all the time. For the first edge $e_1$ ~e_1~, consider the cycle formed from adding $e_1$ ~e_1~ to the initial MST (let's call it the MST-cycle for $e_1$ ~e_1~). $e_1$ ~e_1~ cannot be the only edge with maximum weight in this cycle, so we set its weight to be the same as the maximum edge weight in the current MST-cycle, $w_1$ ~w_1~.

Now, what about the other cycles containing $e_1$ ~e_1~? It turns out that there is no need to consider them at all!

Observation 4. The maximum edge weight in any cycle containing $e_1$ ~e_1~ is no less than $w_1$ ~w_1~, the maximum edge weight in the MST-cycle for $e_1$ ~e_1~.

Proof. Suppose the maximum edge weight in a cycle containing $e_1$ ~e_1~ is $w < w_1$ ~w < w_1~. Then $w_1$ ~w_1~ is greater than all edge weights in both cycles. It is easy to see that even without Mr G's edge $e_1$ ~e_1~, there must be a new cycle containing $w_1$ ~w_1~ with edges only from these two cycles. But $w_1$ ~w_1~ is the only edge with maximum weight in this new cycle. Hence it cannot be in the initial MST, a contradiction. □

It follows immediately that by pricing $e_1$ ~e_1~ at $w_1$ ~w_1~, it cannot be the only edge with maximum weight in any cycle, fulfilling Condition 1.

We now price $e_1$ ~e_1~ at $w_1$ ~w_1~ to displace the default edge with weight $w_1$ ~w_1~ from the MST, forming a resultant spanning tree.

Claim. The resultant tree $T$ ~T~ is an MST for the new graph that includes $e_1$ ~e_1~.

Proof. Consider the MST for the graph with only default edges. Every default edge not in this MST must have the maximum edge weight in some cycle and hence cannot be in the MST for the new graph. Since Mr G's edge $e_1$ ~e_1~ fulfills Condition 1, it must be in the MST. Thus the edges in $T$ ~T~ are the only ones that can be in the MST. □

Now that we have a new "initial MST", we can add in the next edge $e_2$ ~e_2~ similarly. However, now the MST-cycle for $e_2$ ~e_2~ might contain $e_1$ ~e_1~, which we also want to keep inside the MST. If $e_1$ ~e_1~ has the maximum weight in this cycle, $e_2$ ~e_2~ will definitely displace it and hence we must lower the weight of $e_1$ ~e_1~. Naturally, $e_1$ ~e_1~ should be set to at most the next-maximum weight – any larger will mean either $e_1$ ~e_1~ or $e_2$ ~e_2~ has to go! $e_2$ ~e_2~ can then also be set to that same weight, to displace the default edge with that weight.

By lowering the weight of $e_1$ ~e_1~, $e_1$ ~e_1~ will still fulfill Condition 1 for all other cycles. The argument that the other cycles containing $e_2$ ~e_2~ do not need to be considered is similar to that for $e_1$ ~e_1~; the only difference is that the initial MST now contains $e_1$ ~e_1~. Thus $e_2$ ~e_2~ also fulfills Condition 1. The resultant tree is now an MST for the graph with $e_1$ ~e_1~ and $e_2$ ~e_2~ added in, by a similar argument.

Hence, in a similar vein, we can go on to price all of Mr G's edges maximally. Eventually, they will all fulfill Condition 1 – and along the way, we have also built up an MST containing all of them!

We now have a simple solution for our original task: Iterate through all subsets of Mr G's edges to fix in the MST by adding in the edges one by one in a depth-first manner. Each time an edge is added in, find its MST-cycle and set its weight to be the maximum default edge weight $w_{max}$ ~w_{max}~ in the cycle. Mr G's edges in the cycle cannot have weights greater than $w_{max}$ ~w_{max}~, so update all their weights too. After adding in the edges for each subset, we have an MST whose revenue can be computed using a simple tree traversal. The runtime is $\mathcal O(M \log M + 2^K N)$ ~\mathcal O(M \log M + 2^K N)~. This should solve subtask 3 to get 56 points.

Compressing the tree

Our initial MST is huge: a total of $N-1$ ~N-1~ edges, with $N$ ~N~ up to $100\,000$ ~100\,000~. However, it turns out that not many of these default edges are important. If we look at our simple solution, the only default edges we need to consider are those that are maximum in some MST-cycle when we add in Mr G's edges. We shall call these "maximal default edges". It turns out that there are only at most $K$ ~K~ of them!

Observation 5. The maximal default edges for a given subset of Mr G's edges must be part of the maximal default edges for the full set of Mr G's edges.

Proof. We can add in Mr G's edges in arbitrary order to obtain the same set of maximal default edges. Thus, when adding in the full set of Mr G's edges, we add in the given subset of Mr G's edges first to obtain the maximal default edges for that subset. □

So how do we find the maximal default edges for the full set of Mr G's edges? We know that they cannot be in the MST with all $K$ ~K~ Mr G's edges, and yet are present in the initial MST with only default edges. Hence, we can easily find them by computing both MSTs and doing a quick comparison between them. This takes $\mathcal O(M \log M)$ ~\mathcal O(M \log M)~ time.

Once the maximal default edges are obtained, the initial MST can now be compressed to at most $K$ ~K~ edges connecting $K+1$ ~K+1~ components, using a simple union-find data structure. Our simple solution now finds the MST-cycle, prices the weights, and computes the revenue in $\mathcal O(K)$ ~\mathcal O(K)~ time. Hence the runtime is $\mathcal O(M \log M + 2^K K)$ ~\mathcal O(M \log M + 2^K K)~. This is expected to obtain 100 points.

Additional remarks

If we do not add in Mr K's edges one by one in a depth-first manner, but instead recompute the pricing for all edges for each subset of Mr K's edges, an additional $\mathcal O(K)$ ~\mathcal O(K)~ time is invoked. This gives solutions of $\mathcal O(M \log M + 2^K NK)$ ~\mathcal O(M \log M + 2^K NK)~ and $\mathcal O(M \log M + 2^K K^2)$ ~\mathcal O(M \log M + 2^K K^2)~ depending on whether the tree compression is done. The former should still solve subtask 3, while the latter only solves subtask 4 to get 78 points.

There is another solution for subtasks 3 and 4 that does not involve adding in Mr G's edges one by one: For each subset of Mr G's edges, compute the MST with the entire subset fixed inside from scratch. Now use all $K$ ~K~ edges that are present in the initial MST but not present in this MST to update the edge weights of Mr G's edges to fulfill Condition 1. The runtime is $\mathcal O(M \log M + 2^K NK)$ ~\mathcal O(M \log M + 2^K NK)~ or $\mathcal O(M \log M + 2^K K^2)$ ~\mathcal O(M \log M + 2^K K^2)~ depending on whether the compression was done. This approach is more widely used, but more difficult to speed up to the model runtime to get full score. Nevertheless, it is possible to do so using specific data structures to quicken the updating of the edge weights. The only contestant to solve subtask 4 used this method.

Credits

Initial task statement by Raymond Kang Seng Ing.
Refinement of task statement by Assoc Prof Chang Ee-Chien.
Solution sketch by Raymond Kang Seng Ing and Shen Chuanqi.
Verification of proofs by Assoc Prof Chang Ee-Chien and Dr Sung Wing Kin.
Additional testing and verification by Harta Wijaya.

Comments

There are no comments at the moment.