Editorial for IOI '16 P6 - Aliens
Submitting an official solution before solving the problem yourself is a bannable offence.
Subtask 1. means that you can use a photo for each point of interest: the minimum photo containing point is the square containing points and . Constraints in this subtask were small enough to iterate over all cells in that square and mark them as photographed. After processing all photos, calculate and return the number of cells which have been marked at least once.
Subtask 2 required the participants to come up with a dynamic programming solution. Some important observations:
- If a photo covers two points and , then it also covers all points between them.
- Each photo's boundary must be equal to some .
Now we can treat this problem as a dynamic programming problem: cover points on a line using segments such that the sum of squares of their lengths is as small as possible. Start with preprocessing the input data: sort all points by and remove duplicates. Notice that each photo should cover some contiguous set of points.
- Let be the minimum cost to cover the first points with at most photos.
- for all .
- .
- contains the answer.
- states, calculating transitions from each state takes time.
- Overall running time: .
Subtask 3 dropped the restriction. We'll describe a similar DP solution for this subtask. It's possible to prove that a photo containing points and covers point if and only if segment is fully contained in segment (). So if we consider segments instead of points , the problem is reduced to the following: cover all segments with larger segments such that their total area (considering intersections) is minimized.
If segment is included in some other segment , then any photo that covers also covers , so we can safely remove . Removing all such segments can be done in time:
- First, sort all the segments in order of increasing left endpoint.
- In case of equality, sort them in order of decreasing right endpoint.
- Iterate over all segments in this order.
- If the current segment is included in the last non-removed segment, remove it. Otherwise keep it.
Now, since all left endpoints are increasing and no segment is included in the other, then for all : and . Observations and definition of are almost identical to the previous solution:
- for all
- (1)
- Last term in this formula accounts for the intersection of segments and ().
- contains the answer.
- Overall running time: .
Subtasks 4 and 5. Here you were required to come up with an optimization of the DP solution described above. Subtask 4 allowed solutions. One possible solution uses Knuth's optimization.
Define as the optimal in (1) and .
Lemma:
This allows us to prune the search space on each step, reducing the running time to . If you calculate in order of increasing and decreasing , then at the moment of calculating , values of and are already known, so you can only check .
It can be rather difficult to prove the correctness formally, but it's easy to be convinced this is true. A possible strategy for the competition would be to implement this solution without a formal proof, then test the hypothesis on smaller inputs using the solution for subtask 3. It is known that this optimization results in running time. Also, don't forget about 64 bit integers.
Subtask 5 required a different kind of optimization, running in or time. Implementing any of the two following optimizations was enough to pass all the tests from this subgroup.
Divide and Conquer optimization ()
Using the fact that and that all can be calculated from all , we can apply divide and conquer optimization.
Consider this recursive function Calculate
that calculates all for all and a given using known .
function Calculate(j, I_min, I_max, T_min, T_max)
if I_min > I_max then
return
I_mid ← (I_min + I_max) / 2
calculate f_{I_mid,j} naively, let T_opt be the optimal t ∈ [T_min, T_max]
Calculate(j, I_min, I_mid - 1, T_min, T_opt)
Calculate(j, I_mid + 1, I_max, T_opt, T_max)
The initial call to this function will be Calculate
for all from to . The time speedup comes up from the fact that all naive calculations of on each level of recursion take in total, because each recursive call splits the segment into 2 (almost) disjoint segments. The depth of recursion is , so the running time of each Calculate
call is . After calculating all layers in time, we get the answer.
Convex Hull Trick optimization ()
Another possible optimization is called Convex Hull Trick. Let's expand (1):
where , , .
We see that the formula can be expressed in terms of minimum of linear functions , evaluated at . Notice that as increases, decreases and increases. That allows us to maintain the lower envelope of these linear functions using a stack and query the minimum value at a given . Adding a line and querying a point can be implemented in amortized time, so the total running time is . This technique is often referred to as the Convex Hull Trick. We will also use it to get the 100 point solution for this problem.
Subtask 6. Let's look at as a function of and study the differences between two adjacent values. The following theorem states that these differences are non-increasing. We'll call such functions convex.
Theorem:
Let's assign some constant penalty for each photo. The new cost function is still convex, because .
Let's introduce another optimization problem without the restriction on the number of photos.
This equation for can also be expressed only in terms of previous values of ().
Using this formula, all can be computed in time using Convex Hull Trick optimization, if all and are sorted beforehand. The solution from subtask 5 can also be modified to find the minimum number of photos required to achieve the optimum, call it .
Since is convex, is also equal to the minimum such that which is equivalent to , so is monotone. Also if we set , optimum value of is achieved with the photos, and if we set , then the optimal solution only contains one photo. Combining everything above, we can use binary search to find such that and .
This means that all differences are equal, and is a linear function of on this interval. Since the desired value of is somewhere in this interval, it's possible to calculate it just by linear interpolation, because all slopes are equal.
This solution requires sorting the segments once and doing iterations of binary search to find , each iteration running in linear time. Total running time: .
Comments