Editorial for COCI '21 Contest 3 #3 Akcija
Submitting an official solution before solving the problem yourself is a bannable offence.
The subsets from the problem have a very rich structure and it is possible to arrive at the solution from different angles. The scoring was generous, but the last subtask should be challenging. For most contestants, the most natural idea was some sort of dynamic programming, but we will present a solution which uses a different (perhaps more complex) idea, which can be generalized to other problems that ask for the best answer.
The presented solution consists of several parts:
- How to check if a subset is obtainable?
- How to find the best obtainable subset?
- How to find candidates for the next best obtainable subset?
- How to choose between the candidates to arrive at the best ones?
The proofs are left as an exercise for the reader.
How to check if a subset is obtainable?
Claim 1: If a subset is obtainable, a valid sequence of calls can be made by ordering the products by .
Claim 2: In an empty array put on each position , and on each position and make prefix sums over it. A set of products is obtainable if there are no negative numbers in such an array.
We can build a min. segment tree over the array from claim 2, which allows us to insert/erase products while maintaining the information whether the current set is obtainable. An update is simply for some suffix of the array.
How to find the best obtainable subset?
The following greedy algorithm works:
Sort the set in question so that .
We iterate over the products in order and try to add the current product if it maintains the attainability property (which we can check with the segment tree).
Claim 3: The greedy algorithm produces a set which is maximum in size, and out of all such sets it has the minimum cost, i.e. it is the best choice.
Proof outline: The greedy algorithm produces an obtainable subset which can't be enlarged (i.e. is maximal). Every two maximal subsets are actually of the same size - the maximum one. The order gives the minimum cost.
How to find candidates for the next best obtainable subset?
Let be the best obtainable subset of the index set . For each we want to know what is the best obtainable subset of . One of them will clearly be the second best, but as we'll see later, it's useful to know the cost for all of them.
Claim 4: The best obtainable subset of is made by removing from and adding some .
We can find the desired for some in the following way:
In the segment tree, we store the array from claim 2 for set , and we make an update of for the suffix starting at , which corresponds to removing from . A valid choice for is now any index such that there are no zeroes in the suffix of the array starting from . Therefore, has to be to the right of the rightmost zero in the current array, which is actually the rightmost zero to the left of (now we see that the update was unnecessary, and we could have made a segment tree query to find the rightmost zero left of ). In any case, the only condition for choosing is that , for some that we know how to calculate. Out of all such , we should choose the one with minimum . The answer can be precomputed for all by doing a sweep over the indices outside of , decreasing by .
How to choose between the candidates to arrive at the best ones?
In a given moment, the current search spaces will be described by describing the status of each index:
- This index must be in the set.
- This index can't be in the set.
- For this index, we have a choice whether it is in the set or not.
We'll have a priority queue to store all the different search spaces not yet explored. The search spaces will be disjoint and their union will cover all possibilities not yet visited. In the priority queue, they will be sorted by the cost of the best obtainable subset within that search space.
It was already mentioned how to find the best obtainable subset within a search space. The situation is a bit different since this time we have indices which must/can't be taken. But the idea can easily be modified just by not even taking into consideration the indices which we can't take, and the ones we must take we use for updating the segment tree before executing the greedy algorithm. It's easy to see that the mentioned claims are still true in this modified case.
We pop the minimum element from the priority queue to determine the next best obtainable subset. The popped search space then needs to be partitioned into smaller pieces which don't include the best obtainable subset. Let be this best subset. The smaller subsets have the following form:
- The first element from can't be taken, for the rest we can choose.
- The first element from must be taken, the second must not, for the rest we can choose.
- The first two elements from must be taken, the third must not, for the rest we can choose.
- …
Of course, the indices outside of are left to be of the same status as they were before. According to what was mentioned so far, when deleting an index from , it is enough to add a single new index. Therefore, when adding new search spaces to the priority queue, we won't keep track of the status of all indices, rather we'll keep track of just the indices which change their status.
We'll have a total of popmin operations. For each of them, we'll have to find the best obtainable subset for some search space, and then partition it into smaller pieces (and for each of them determine its cost, to be able to add it to the priority queue). Using the segment tree, this can all be done in for each popmin operation. The priority queue will have at most elements. The total complexity is then .
The presented idea is called fracturing search. The reason a lot of things are true for these obtainable subsets is because they have a matroid structure.
Comments