Editorial for CCC '18 S5 - Maximum Strategic Savings


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

We assume that the reader has familiarity with minimum spanning trees and knows some algorithm to efficiently compute a minimum spanning tree, like Kruskal's.

In the problem statement, we have a grid of vertices. Every edge in this graph either goes between two vertices in the same column, or between two vertices in the same row. Furthermore, if an edge exists in row i connecting the vertices in columns a and b with weight w, for every other row in the grid, there is an edge connecting the vertices in columns a and b with weight w. Similarly, if an edge exists in column j connecting the vertices in rows p and q with weight z, for every other column in the grid, there is an edge connecting the vertices in rows p and q with weight z.

The problem asks us to maximize the sum of the weights of edges that we can delete from the graph such that it will still be connected. We claim that this is equivalent to computing the minimum spanning tree. If we compute a set of edges to remove such that the remaining edges leave a connected graph, this is equivalent to computing a set of edges to keep such that the chosen set comprises a connected graph and the remaining edges are to be removed. If we wish to maximize the sum of the weights of the edges that we want to remove, that is equivalent to minimizing the sum of the weights of the edges that we wish to keep, which is equivalent to computing the minimum spanning tree.

If we know how to compute the minimum spanning tree, we can compute the desired answer by computing the sum of the weights of all the edges, and subtract the weight of the minimum spanning tree.

In the first subtask, we are told that every edge in the graph has weight 1.

It is well-known that a tree with N vertices requires N-1 edges to be connected. We are guaranteed that the graph is originally connected, so we know that the weight of the minimum spanning tree is N-1 since every edge has weight 1. We can enumerate exactly how many edges are present in the graph, and therefore we can compute the desired answer without needing to process specifically which edges are connected.

In the second subtask, we are told that there are at most 200 pairs of columns which are connected and at most 200 pairs of rows which are connected.

We claim that this upper-bounds the number of vertices in the grid to 201^2 = 40\,401. Because each edge connects two vertices either in the same row or in the same column, this means that rows and columns are independent in terms of their connectivity. Therefore, if there are more than 201 rows or 201 columns, there will exist two rows or two columns that are orphaned from each other, which would imply the graph is not connected.

Since the graph has at most 201^2 = 40\,401 vertices, it has at most 2 \cdot 201 \cdot 200 = 80\,200 edges. Naively running an algorithm like Kruskal's, which runs in \mathcal O(E \log V), will be fast enough to compute the minimum spanning tree.

In the third subtask, we are explicitly told that the number of rows and columns is at most 200. However, the number of pairs of columns or pairs of rows that are explicitly connected goes up to 10^5.

With the larger bounds, running a minimum spanning tree algorithm verbatim should be too slow, as the number of edges can scale to around 40 million. However, as previously stated, rows and columns are independent. Imagine a hypothetical grid of just one row and M columns. In this case, we only need to connect M-1 pairs of columns. Similarly, imagine a hypothetical grid of just N rows with one column. In this case, we only need to connect N-1 pairs of rows. We can run Kruskal's on these smaller grids, and then using only the pairs that we used in Kruskal's, use the solution from the second subtask and pass in just those pairs.

To solve the problem in the general case, we need to extend the above observation. In particular, if we consider how Kruskal's works, since we're adding in edges in nondecreasing weight order, when we add in all edges of a given weight, we've effectively shrunk the grid by some number of rows and some number of columns. Therefore, we can do the following:

Keep track of how many rows and columns still need to be merged.
While there are at least two rows that need to be merged or two columns that need to be merged:
    Grab the next smallest edge in the graph.
    If this edge connects two columns which have not yet been merged:
        Multiply the edge weight by the number of unmerged rows and merge the columns
    Else, if this edge connects two rows which have not yet been merged:
        Multiply the edge weight by the number of unmerged columns and merge the rows
Return the sum of the multiplied edge weights

This only requires sorting the edges, which can be done in \mathcal O(P \log P + Q \log Q).


Comments


  • 1
    Plasmatic  commented on Dec. 11, 2021, 12:08 a.m.

    Another way to think about the solution to this problem is as if you're still preforming Naive Kruskal's like in the 1st subtask, but instead you're merging all edges of weight w at once (for every w in the input), as each line of input denotes another batch of edges to connect (and then you think of a more efficient way to represent the graph after each merge operation). You still end up with the same code involving 2 DSUs though.


  • 3
    asdfg9240  commented on Jan. 24, 2020, 4:43 a.m. edited

    i dont understand the general case solution

    edit: nvm i think i understand. for those who didnt understand like me:

    In this problem, an edge is duplicated along all columns/rows. that means when kruskals is taking an edge, we can speed it up by making it take all the duplicated edges (since they have the same lowest weight). to prevent cycles, we merge rows and columns


  • 2
    devansh289  commented on July 9, 2018, 1:02 a.m.

    Is this doable in python?


    • 4
      d  commented on July 12, 2018, 12:43 a.m.

      Yes.


  • 10
    bqi343  commented on Feb. 24, 2018, 10:19 p.m.

    Basically just USACO fencedin. :(


    • 1
      account_disabled  commented on April 15, 2018, 2:33 p.m.

      Although some aspects are similar here, the grid may not be perfect here. Consider if, in this case, there were flights from 1-3, 2-3, and 1-2. This would not be a perfect grid at all! Furthermore, considering the much larger numbers in fencedin, I would imagine that the problems have very different solutions indeed.