Editorial for Baltic OI '19 P5 - Necklace - DMOJ: Modern Online Judge

Editorial for Baltic OI '19 P5 - Necklace

Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Let's represent the strings given to the girls by $S$ ~S~ and $T$ ~T~. A pair of matching necklaces can be found as a concatenation of strings $A$ ~A~ and $B$ ~B~ such that $AB$ ~AB~ is a substring of $S$ ~S~ and $BA$ ~BA~ is a substring of $T$ ~T~.

You might need to reverse $T$ ~T~ first. This converts the following case into the previous one.

2-approximation. As each necklace match consists of two substring matches, at least one of them has to be no shorter than half the length of the necklace. Let $L_{SS}(i, j)$ ~L_{SS}(i, j)~ be the length of the common suffix of $S[:i]$ ~S[:i]~ and $T[:j]$ ~T[:j]~. Depending on if $S[i] = T[j]$ ~S[i] = T[j]~, $L_{SS}(i+1, j+1)$ ~L_{SS}(i+1, j+1)~ is $L_{SS}(i, j)+1$ ~L_{SS}(i, j)+1~ or $0$ ~0~. To find the longest common substring, we try all possible $d = j-i$ ~d = j-i~ and for each loop over $k$ ~k~ in increasing order calculating $L_{SS}(k+1, k+1+d)$ ~L_{SS}(k+1, k+1+d)~ from $L_{SS}(k, k+d)$ ~L_{SS}(k, k+d)~. This takes $\mathcal O(N^2)$ ~\mathcal O(N^2)~ time, $\mathcal O(1)$ ~\mathcal O(1)~ extra memory.

$\mathcal O(N^4)$ ~\mathcal O(N^4)~ and $\mathcal O(N^3)$ ~\mathcal O(N^3)~. As we have seen, a necklace match can be decomposed into two substring matches by cutting the substrings that give the necklace match at some points. For each possible pair of cut points $(i, j)$ ~(i, j)~ (all pairs of indexes of $S$ ~S~ and $T$ ~T~), we'll find the longest necklace that has these cut points. To find it, we can maximize length of the halves of the necklace separately. Let $L_{SP}(i, j)$ ~L_{SP}(i, j)~ be the length of longest suffix of $S[:i]$ ~S[:i]~, that is a prefix of $T[j:]$ ~T[j:]~. Similarly let $L_{PS}(i, j)$ ~L_{PS}(i, j)~ be the length of longest prefix of $S[i:]$ ~S[i:]~ that is a suffix of $T[:j]$ ~T[:j]~. The longest necklace with cut points $(i, j)$ ~(i, j)~ has length $L_{SP}(i, j)+L_{PS}(i, j)$ ~L_{SP}(i, j)+L_{PS}(i, j)~. To find $L_{SP}(i, j)$ ~L_{SP}(i, j)~ we can check all lengths naively in $\mathcal O(N^2)$ ~\mathcal O(N^2)~, giving an $\mathcal O(N^4)$ ~\mathcal O(N^4)~ solution overall. Comparing equal length prefixes and suffixes with a rolling polynomial hash gives an $\mathcal O(N^3)$ ~\mathcal O(N^3)~ solution overall.

Full DP solution. To get a faster solution, we need to find $L_{SP}(i, j)$ ~L_{SP}(i, j)~ for many pairs of indexes at once. To do this, we will use $L_{SS}(i, j)$ ~L_{SS}(i, j)~. If $L_{SS}(i, j) = l$ ~L_{SS}(i, j) = l~ then $L_{SP}(i, j-l) \ge l$ ~L_{SP}(i, j-l) \ge l~, $L_{SP}(i, j-l+1) \ge l-1$ ~L_{SP}(i, j-l+1) \ge l-1~, etc. Passing the length from $L_{SS}(i, j)$ ~L_{SS}(i, j)~ to $L_{SP}(i, j-l), L_{SP}(i, j-l+1), \dots, L_{SP}(i, j-1)$ for all $(i,j)$ ~(i,j)~ is enough to calculate $L_{SP}$ ~L_{SP}~. Doing this naively would take $\mathcal O(N^3)$ ~\mathcal O(N^3)~ time. We can optimize it by doing $L_{SP}(i, j-L_{SS}(i, j)) = \max(L_{SP}(i, j-L_{SS}(i, j)), L_{SS}(i, j))$ for all $(i, j)$ ~(i, j)~ and then $L_{SP}(i, j) = \max(L_{SP}(i, j), L_{SP}(i, j-1)-1)$ for all $(i, j)$ ~(i, j)~. This gives an $\mathcal O(N^2)$ ~\mathcal O(N^2)~ solution. To improve the memory usage to $\mathcal O(N)$ ~\mathcal O(N)~ you need to analyze the DP transitions carefully.

Full randomized solution. Choose a pair of indexes randomly. Extend $(i, j)$ ~(i, j)~ to $([l_1, r_1), [l_2, r_2))$ ~([l_1, r_1), [l_2, r_2))~ describing the longest substring match that $(i, j)$ ~(i, j)~ is part of. This takes time proportional to the length of the substring match. If the longest common substring has length $l$ ~l~, then it takes on average $\frac{N^2}{l}$ ~\frac{N^2}{l}~ attempts to find it. So, this is a randomized $\mathcal O(N^2)$ ~\mathcal O(N^2)~ solution to finding the longest common substring.

To find necklaces, we'll generate substring matches this way. For a match of length $l$ ~l~, we'll try to extend it with strings of length up to $l$ ~l~ to get a necklace match. We can check all lengths naively in $\mathcal O(l^2)$ ~\mathcal O(l^2)~, giving an $\mathcal O(lN^2)$ ~\mathcal O(lN^2)~ solution. Using a rolling polynomial hash gives an $\mathcal O(N^2)$ ~\mathcal O(N^2)~ solution. The memory usage is $\mathcal O(N)$ ~\mathcal O(N)~. This solution is on average faster than the DP solution.

Credits

Task: Jakub Radoszewski (Poland)
Solutions and tests: Oliver Nisumaa, Andres Unt (Estonia)

Comments

There are no comments at the moment.