Editorial for COCI '21 Contest 3 #3 Akcija


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

The subsets from the problem have a very rich structure and it is possible to arrive at the solution from different angles. The scoring was generous, but the last subtask should be challenging. For most contestants, the most natural idea was some sort of dynamic programming, but we will present a solution which uses a different (perhaps more complex) idea, which can be generalized to other problems that ask for the k^\text{th} best answer.

The presented solution consists of several parts:

  1. How to check if a subset is obtainable?
  2. How to find the best obtainable subset?
  3. How to find candidates for the next best obtainable subset?
  4. How to choose between the candidates to arrive at the k best ones?

The proofs are left as an exercise for the reader.

How to check if a subset is obtainable?

Claim 1: If a subset is obtainable, a valid sequence of calls can be made by ordering the products by d_i.

Claim 2: In an empty array put +1 on each position i (i = 1, \dots, n), and -1 on each position d_i and make prefix sums over it. A set of products is obtainable if there are no negative numbers in such an array.

We can build a min. segment tree over the array from claim 2, which allows us to insert/erase products while maintaining the information whether the current set is obtainable. An update is simply \pm 1 for some suffix of the array.

How to find the best obtainable subset?

The following greedy algorithm works:
Sort the set in question so that w_1 < \dots < w_n.
We iterate over the products in order and try to add the current product if it maintains the attainability property (which we can check with the segment tree).

Claim 3: The greedy algorithm produces a set which is maximum in size, and out of all such sets it has the minimum cost, i.e. it is the best choice.

Proof outline: The greedy algorithm produces an obtainable subset which can't be enlarged (i.e. is maximal). Every two maximal subsets are actually of the same size - the maximum one. The order w_1 < \dots < w_n gives the minimum cost.

How to find candidates for the next best obtainable subset?

Let S be the best obtainable subset of the index set \{1, \dots, n\}. For each i \in S we want to know what is the best obtainable subset of \{1, \dots, n\} \setminus \{i\}. One of them will clearly be the second best, but as we'll see later, it's useful to know the cost for all of them.

Claim 4: The best obtainable subset of \{1, \dots, n\} \setminus \{i\} is made by removing i from S and adding some j \notin S.

We can find the desired j for some i in the following way:
In the segment tree, we store the array from claim 2 for set S, and we make an update of +1 for the suffix starting at d_i, which corresponds to removing i from S. A valid choice for j is now any index such that there are no zeroes in the suffix of the array starting from d_j. Therefore, d_j has to be to the right of the rightmost zero in the current array, which is actually the rightmost zero to the left of d_i (now we see that the update was unnecessary, and we could have made a segment tree query to find the rightmost zero left of d_i). In any case, the only condition for choosing j is that d_j \ge c, for some c that we know how to calculate. Out of all such j, we should choose the one with minimum w_j. The answer can be precomputed for all c by doing a sweep over the indices outside of S, decreasing by d_j.

How to choose between the candidates to arrive at the k best ones?

In a given moment, the current search spaces will be described by describing the status of each index:

  • This index must be in the set.
  • This index can't be in the set.
  • For this index, we have a choice whether it is in the set or not.

We'll have a priority queue to store all the different search spaces not yet explored. The search spaces will be disjoint and their union will cover all possibilities not yet visited. In the priority queue, they will be sorted by the cost of the best obtainable subset within that search space.

It was already mentioned how to find the best obtainable subset within a search space. The situation is a bit different since this time we have indices which must/can't be taken. But the idea can easily be modified just by not even taking into consideration the indices which we can't take, and the ones we must take we use for updating the segment tree before executing the greedy algorithm. It's easy to see that the mentioned claims are still true in this modified case.

We pop the minimum element from the priority queue to determine the next best obtainable subset. The popped search space then needs to be partitioned into smaller pieces which don't include the best obtainable subset. Let S be this best subset. The smaller subsets have the following form:

  • The first element from S can't be taken, for the rest we can choose.
  • The first element from S must be taken, the second must not, for the rest we can choose.
  • The first two elements from S must be taken, the third must not, for the rest we can choose.

Of course, the indices outside of S are left to be of the same status as they were before. According to what was mentioned so far, when deleting an index from S, it is enough to add a single new index. Therefore, when adding new search spaces to the priority queue, we won't keep track of the status of all indices, rather we'll keep track of just the indices which change their status.

We'll have a total of k popmin operations. For each of them, we'll have to find the best obtainable subset for some search space, and then partition it into smaller pieces (and for each of them determine its cost, to be able to add it to the priority queue). Using the segment tree, this can all be done in \mathcal O(n \log n) for each popmin operation. The priority queue will have at most nk elements. The total complexity is then \mathcal O(nk \log(nk)).

The presented idea is called fracturing search. The reason a lot of things are true for these obtainable subsets is because they have a matroid structure.


Comments

There are no comments at the moment.