Editorial for IOI '10 P3 - Quality of Living


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

This problem looks like many other grid tasks. Such problems have also appeared on some previous IOIs. Heavy range-search algorithms might seem to be useful, but actually a much simpler 100\% solution exists.

Let N = R \times C measure the size of a problem instance.

Subtask 1

Subtask 1 can be solved by the most obvious brute force algorithm that considers each rectangle (there are (R-H+1) \times (C-W+1) of these), quadratically sorts its contents ((H \times W)^2 steps), and directly picks out the median rank, and optimizes this. The worst case situation is obtained by H = \frac R 2 and W = \frac C 2. Therefore, this algorithm's time complexity is \mathcal O(N^3).

Subtask 2

Using any \mathcal O(N \log N) sort algorithm (these are well-known), the brute force algorithm can be improved to \mathcal O(N^2 \log N). This solves subtask 2.

Also, an \mathcal O(N) sort (bucket sort) is possible, to obtain a simple \mathcal O(N^2) algorithm, but this does not suffice to solve Subtask 3.

There are some obvious opportunities for improvement, such as exploiting the large overlap between certain rectangles when filling/emptying the array to be sorted. But these improvements do not affect the time complexity.

Subtask 3

\mathcal O(N^{1.5} \log N) algorithms are also possible and they solve Subtask 3. Here is one: say the vertical offset of the final rectangle is known [\mathcal O(\sqrt N) possibilities]. Then scan across the row, using some efficient data structure to keep track of the median (think of incremental/sliding window, such as a range tree plus binary search, or a pair of heaps for values less/greater than the median). [Each value is added/subtracted from the data structure exactly once, for an \mathcal O(N \log N) scan length.]

This is rather involved to code; see Subtask 5 for a simpler and better solution.

Subtask 4

This subtask accommodates possible \mathcal O(N \log^2 N) algorithms, although we have not encountered them.

Subtask 5

Here is an \mathcal O(N \log N) solution. Observe that the program's output can be verified by some algorithm which answers the question "Does any rectangle have median \le X?" This query can be answered in \mathcal O(N^2) time. A rectangle has median \le X if and only if it contains more values \le X than otherwise. Assign all cells in the grid a 'value' according to a 'threshold' function: -1 if greater than X, 0 if equal to X, 1 if less than X. Using the well-known cumulative data structure for queries on rectangular sums, try all possible rectangle locations and return "yes" if the 'values' inside any sum to \ge 0. We simply binary search values of X to find the minimum value for which the answer is "yes".

N.B. An \mathcal O(N \log N) algorithm suffices for 100\% score.

Linear Solution

Mihai Patrascu (ROM) found an \mathcal O(N) solution with the following reasoning.

Claim 1: Given a value X, one can identify in \mathcal O(HW) time which rectangles have a median better than X.

Proof: In linear time, build a partial-sums array: A[i][j] = #{elements in Q[0 \dots i][0 \dots j] better than X}. The number of elements better than X in any rectangle can now be found by combining 4 values of A. The median of an H by W rectangle is better than X if and only if the number of elements better than X is less than \frac{HW}{2}. QED

Claim 2: One can find the best median of K designated rectangles, in running time \mathcal O(K^2 \log(HW) + HW).

Proof: Compress everything to a K \times K grid and binary search for the best median. In each step of the binary search, I need to go over the entire K \times K grid, and a number of elements that are originally HW, but decreasing geometrically each time. QED

Claim 3: One can find the best median of all rectangles in \mathcal O(HW) time.

Proof: By the above, one can set K = \frac{\sqrt{HW}}{\log(HW)} and still get linear time. Sample K rectangles; apply Claim 2. Apply Claim 1 to filter everybody with worse medians. One is left with \frac{HW}{K} rectangles. Do the same again, one is left with \frac{HW}{K^2} \le \log(HW) rectangles. Now just apply Claim 2. QED


Comments

There are no comments at the moment.