Editorial for COCI '13 Contest 1 #6 Slastičar
Submitting an official solution before solving the problem yourself is a bannable offence.
Let us first solve a simpler version of the problem, where the robot doesn't stop upon finding a matching serial number, but tries out all segments instead.
Let be the number of suffixes of string that have as a prefix. Let denote a substring of from position to position .
The number of comparisons that the robot will make for a word with length is then:
where is the total number of segments that the robot has started comparison with. With our simplification assumption, always equals , beaus the robot doesn't stop comparison upon finding a matching word.
A more efficient way to compute is needed. One possible method is building a suffix array of . A suffix array is a set of all suffixes of a string sorted lexicographically. Binary search can be applied to the suffix array in order to find the interval of suffixes beginning with the character . Let us denote this interval by . The value of then equals . Next, within this interval, we need to find the interval of suffixes with as the second letter, . is then . We repeat the procedure for all remaining characters in .
The complexity of this search is , which is fast enough since the sum of all query lengths will not exceed .
What about the full problem, where the robot stops upon finding a match?
For each word , let us denote the first position where it appears with . Let be the number of suffixes of string with the prefix starting at a position less than or equal to . We can now derive a new formula for the total number of comparisons:
where now equals since we stop comparing after that position.
In order to make computing the function easier, we will not respond to queries in the order that they appear in the input, but sort them in a suitable order instead and store the solutions in an array. After computing all the solutions, we can output them in the original order.
We will sort the queries by in ascending order. We will also utilize a structure supporting two operations: add to some position, find the sum of an interval. This structure will enable us to find the number of important prefixes in an interval. It can be implemented using a Fenwick tree or an interval (tournament) tree.
We will process queries as follows:
- suppose we are currently responding to a query first matching at
- in the structure, we set to corresponding positions for all suffixes up to ; notice that it is not necessary to do so for all suffixes for each query, but only for suffixes after the one we stopped at in the previous query, since the queries are sorted by
- as in the simpler version of the problem, we find the intervals ; however, we do not add to the solution, but instead the sum of the interval in the structure, which corresponds to the number of suffixes starting at position or less in the interval in the suffix array
- we store the result into the solution array at the appropriate position
Finally, we just need to output the solution array.
There is one more detail left unresolved: how do we compute the numbers ? Upon finding the last interval for some word , we know that all suffixes from that interval have as a prefix. The first position where appears is thus the starting position of the leftmost of those suffixes, that is, the suffix with the smallest index. We therefore need a structure for finding the minimum number in an interval. This can be done using an interval tree or a similar structure.
There are multiple algorithms for building a suffix array with varying complexities that can be found on the Internet. The official solution uses an algorithm with complexity . Also, both an interval tree and a Fenwick tree are implemented in the official solution. The complexity of a query in both structures is .
The total complexity of the algorithm is .
Comments