Editorial for Baltic OI '06 P5 - RLE Compression
Submitting an official solution before solving the problem yourself is a bannable offence.
The optimal solution should work in .
First, we decode the input and store the sequence as a sequence of blocks. Each block represents a repetition of one character. Let the number of blocks be . It satisfies .
A straightforward dynamic solution leads to time and memory . For each and each we calculate a value — the length of the shortest code of first blocks such that at the end of the code the special character is set to .
This algorithm can be accelerated using the following observations. First, the code of a block of repetition of a character using the special character other than is not longer than the code of the block using the special character . Therefore, if then it is always a good choice to code the block using special character , thus , where is the length of the shortest code of block given . So all values of for differ from by the same number .
More attention is needed when calculating . We want to encode block in a such way that after encoding it, the special character will be set to . This can be done in two ways: with or without switching the special character. If we don't want to switch, then the length of the code will be equal to plus the length of the code of block using as the special character. If we want to switch the special character to somewhere, it is always worthwhile to do it at the end of block of . This costs additional characters. The rest of the code will contain character plus the smallest value from the set . To reconstruct later the code we need to save only how this particular value was calculated.
We don't have to store for all . We need only values for the current block.
There can be developed a fast data structure with the following operations running in constant time:
- getting the value for any ,
- calculation of from ,
- getting the smallest value with .
The key observation for doing it is that two values may differ at most by . This can be proved by a simple induction on the number of blocks.
The task requires from a contestant some short insight into the problem. The more difficult part of the solution should be careful implementation.
Comments