Editorial for JOI '16 Open P2 - Selling RNA Strands
Submitting an official solution before solving the problem yourself is a bannable offence.
Subtask 1 [10 points]
, , , , are small. We can solve this subtask by checking all RNA sequences for each order.
Subtask 2 [25 points]
Let's consider a simpler version of this problem:
There are RNA sequences and orders . For each , compute the number of such that the first characters of are .
We can store all of to a trie. Now we identify a string and the node of the trie which corresponds to . For any two strings , the following are equivalent:
- is a prefix of , that is, the first characters of are .
- Node is an ancestor of node .
Let be the number of occurrences of in . Then the problem is computing for each .
This can be done by precalculation with DFS, but here we use a different approach. Using the Euler-Tour technique, checking whether node is an ancestor of node can be reduced to checking whether a point (corresponding to ) is included in a range (corresponding to ). Now we get an algorithm for this problem:
- Store to a trie.
- Using the Euler-Tour technique, compute the range and the point for nodes in the trie. Note that we need not compute the values for nodes which don't correspond to any of .
- For each , count the number of such that the point corresponding to node is included in the range corresponding to node .
Let's go back to the original problem. Now we have the constraint by as well as by . We saw that handling can be done by a trie and the Euler-Tour technique. It can be easily checked that the same approach can be used for handling if we consider (where is the reverse string of ).
Finally, we get an algorithm as follows:
- Store to a trie.
- Using the Euler-Tour technique, compute the range and the point for nodes in the trie.
- Do the same for .
- For each , compute the answer by checking two ranges.
Subtask 3 [25 points]
Let , , be , , , respectively.
We saw that this problem becomes relatively easy if we can ignore or . Now consider the following algorithm:
- For each , extract only RNA sequences whose last characters are . Then compute the number of RNA sequences in the extracted ones whose first characters are .
The later part of the algorithm can be done by using a trie. The former part also can be done with a trie. This algorithm is not so efficient because it may be possible that all of the RNA sequences are extracted and thus stored to a trie for each . In this case, it works in . But we can add a (seemingly small) optimization:
- Extract RNA sequences only once for the same .
In fact, it significantly reduces the complexity to .
Here is a proof. For each , is extracted only if is a suffix of (that is, the last characters of are ). Let be the set of such 's. Each string of has different length, so . Also holds, so we have . Thus . It means that each is added to a trie times. Adding to a trie once can be done in . So adding strings to tries can be completed overall in . Now we have tries, so computing the answer can be done by accessing node of a trie. This can be done in . Also, checking all of itself takes time.
After all, it was proved that this algorithm works in .
Subtask 4 [40 points]
The approach to the solution for Subtask 2 is reducing the problem to the following:
There are points and rectangles (whose axes are parallel to the or axis) on a 2D plane. For each rectangle, compute the number of points which are included in it.
If we ignore irrelevant nodes of tries, the coordinates of the points and the rectangles will be integers between and .
Let be the number of such that and . As the coordinates of the points are not less than , we have . So we just need to compute for some queries efficiently. This can be done by the following algorithm:
- Sort the points and the queries by (for points) or (for queries). If a point and a query have the same value, make the point appear earlier in the sequence.
- Initialize an array with .
- Scan the sorted sequence from the beginning and do the following:
- For a point , set .
- For a query , answer .
In this algorithm, we need to do array operations efficiently. This can be done with a Binary Indexed Tree.
Comments