Editorial for DMOPC '22 Contest 3 P6 - Compressibility

Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Author: 4fecta

Subtask 1

It suffices to compute the repeatability of each of the $\mathcal{O}(N^2)$ ~\mathcal{O}(N^2)~ substrings in $\mathcal{O}(N \log N)$ ~\mathcal{O}(N \log N)~ time. This can be done by iterating over all possible periods and checking whether each period is valid via hashing.

Time Complexity: $\mathcal{O}(N^3 \log N)$ ~\mathcal{O}(N^3 \log N)~

Subtask 2

There are many approaches here. One possible solution is to iterate over all possible starting points and periods, using binary lifting and hashing to compute the maximum repetition.

Time Complexity: $\mathcal{O}(N^2 \log N)$ ~\mathcal{O}(N^2 \log N)~

Subtask 3

We optimize the solution from the previous subtask by modifying the Main-Lorentz algorithm. We first generate all $\mathcal{O}(N \log N)$ ~\mathcal{O}(N \log N)~ batches of repetitions, and sort them in increasing order of length. For each batch of repetitions, we iterate through all possible starting points, using binary lifting and hashing to see how far the repeats go. However, doing this naively yet again yields an $\mathcal{O}(N^2 \log N)$ ~\mathcal{O}(N^2 \log N)~, since the number of repetitions can be up to $\mathcal{O}(N^2)$ ~\mathcal{O}(N^2)~. The key observation now is that if there is a repetition of length $l$ ~l~ starting at a certain position, the next longest repetition starting at the same position must have length around $2l$ ~2l~. With this observation, each starting position will be processed at most $\mathcal{O}(\log N)$ ~\mathcal{O}(\log N)~ times, yielding an $\mathcal{O}(N \log^2 N)$ ~\mathcal{O}(N \log^2 N)~ solution overall. Correct implementation of this algorithm requires some further insights that will be left as an exercise for the reader.

Time Complexity: $\mathcal{O}(N \log^2 N)$ ~\mathcal{O}(N \log^2 N)~

Alternate Solution

Note that every substring with a repeatability greater than $1$ ~1~ belongs to a unique run in the string. Using an algorithm that can enumerate over all of the runs, we can easily compute the compressibility using some elementary math.

Time Complexity: $\mathcal{O}(N)$ ~\mathcal{O}(N)~ or $\mathcal{O}(N \log N)$ ~\mathcal{O}(N \log N)~

Remarks: The alternate solution using run enumeration was discovered by zhouzixiang2004 just two days before the contest, far too late to consider replacing the problem. Although it almost completely trivializes the problem, we assumed the algorithm was obscure enough that it would not be too big of a deal. Unfortunately though, many of our top contestants this time were well aware of the algorithm, making the problem boring and turning it into a simple knowledge test. Better luck next time...

Comments

There are no comments at the moment.