Editorial for DMOPC '19 Contest 1 P6 - Bob and Binary Strings


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Author: george_chen

For subtask 1, it suffices to iterate over all pairs of binary strings of length a_i and check if they are similar. Suppose the recursive function for checking similarity has complexity T(M) for comparing two strings of length M. If we directly implement the procedure described in the problem statement, we get

T(M) = M+4T(M/2)
T(M) = M^2

Time complexity: \mathcal{O}(N a_i^2 2^{2a_i})

For subtask 2, observe that if a_i is odd, then the only way for two binary strings to be similar is if they are equal. Since each binary string can only be paired with itself, this gives a total of 2^{a_i} pairs. Since a_i is quite large, we must use repeated squaring to calculate 2^{a_i} \bmod 10^9+7.

Time complexity: \mathcal{O}(N \log(a_i))

For subtask 3, we must form the crucial observation that the strings can be split into several equivalence classes. If some string A is similar to B, and string B is similar to C, then we can show that A is also similar to C. Therefore, all strings belonging to the same class are similar and strings belonging to different equivalence classes aren't similar. For each equivalence class, we can define the lexicographically smallest possible string that is reachable from every element in said class as the representative of the class. Now, it is a matter of assigning all 2^{a_i} strings to the corresponding class. This can be done by recursively minimizing the left and right halves and then minimizing the concatenation of the minimized left and right halves (i.e. if the left half is larger than the right half, swap them). Our recursive procedure now has complexity,

T(M) = 2T(M/2)
T(M) = M

Time complexity: \mathcal{O}(a_i 2^{a_i})

Approach 1:

For subtask 4, we will speed up the procedure described in subtask 3. Suppose a_i = 2k (if a_i is odd we can use the algorithm for subtask 2). Instead of generating the equivalence classes for a_i, we will do it instead for k. In the recursive procedure, notice that the classes for a_i will be generated by pairing up the classes for k on the left with the classes for k on the right. So we can consider a list, b, which is the sizes of the equivalence classes for k. Notice that when we multiply b_i by b_j where j \le i, we obtain the size of an equivalence class for a_i. Be careful that when j > i we must add this number to the resulting class when i and j are swapped instead of creating a new class. If we square the terms and modify this procedure slightly, we will get the sum of the squares of the equivalence classes for a_i without actually generating the terms.

Consider pairing two terms b_i and b_j where i \ne j. The equivalence class they form has size 2b_i b_j and the square of the size would be 4b_i^2 b_j^2. Since they will be multiplied twice by the procedure, we must multiply their product again by 2 in order to get the square of the resulting size to be correct.

Time complexity: \mathcal{O}(a_i 2^{a_i/2})

Approach 2:

We will generalize approach 1. Suppose a_i = 2^m k. We will first generate the classes for k. However, since k must be odd, there are just 2^k groups of size 1. By using the explicit procedure we will generate the terms for 2k, 2^2 k, \dots, 2^{m-1} k. Since the size of the list squares itself each iteration, the first m-1 iterations will use up negligible time compared to the last iteration. Once again, we can use the implicit generation method to get the answer for 2^m k without generating the actual terms.

Time complexity: \mathcal{O}(2^{a_i/2})

For subtask 5, we observe that there are many duplicated group sizes and that the order of the groups doesn't actually matter. Therefore, we will store the number of groups of size i instead of explicitly storing the groups. This works well if you observe that the sizes of the equivalence classes must be a power of 2. If all the terms going into a doubling operation are powers of two, then all resulting terms will also be powers of two since they must be a product of two input terms or two input terms multiplied by 2. With this form, note that the doubling operation we used in the previous subtask is similar to a convolution of this array. As with approach 2 from the previous subtask, if a_i = 2^m k we will start with the representation of the classes of k and double the representation until we get to a_i. The representation doubles in size with each iteration and performing one iteration takes \mathcal{O}(S^2) where S is the size of the representation at the current iteration. Once again, the first m-1 iterations contribute very little to the complexity which gives \mathcal{O}(a_i^2) per query.

Time complexity: \mathcal{O}(N a_i^2)

For subtask 6, we will try to find a way of directly generating the answer instead of storing the number of each term as this will quickly become unmanageable. Observe that the final form we seek is the sum of the squares of the sizes of the equivalence classes for a_i. Let S_d(i) be the sums of 2^i-th powers of the sizes of the groups for d so the sum of squares for d is just S_d(1). The sum of squares of the sizes for 2d is close to C_1 S_d(1)^2 but we overcount. How much do we overcount by? Notice that we are overcounting the terms that are being multiplied by themselves so we should subtract C_1-1 times this sum. Suppose we call the sum of fourth powers of the classes of d, S_d(2). Then we have S_{2d}(1) = C_1 S_d(1)^2-(C_1-1)S_d(2). The interesting thing is that this works exactly the same for higher powers so we have the general formula S_{2d}(i) = C_i S_d(i)^2-(C_i-1)S_d(i+1). The coefficients C_i can be found in the same way as approach 1 in subtask 4.

Therefore, our algorithm becomes as follows: suppose a_i = 2^m k. We start the procedure at k where we know that S_k(i) = 2^k since the sum of arbitrary powers of 2^k groups of 1s is always 2^k. We then repeat this procedure to generate the results for 2k, 2^2 k, \dots, 2^m k. Note that at iteration i, we cannot generate S_{2^ik}(m-i+1) since we don't have S_{2^ik}(m-i+2). But, if we start by generating S_k(i) for i \le m then we will have enough terms to generate S_{a_i}(1) which is the answer.

Time complexity: \mathcal{O}(N \log^2(a_i))


Comments

There are no comments at the moment.