Editorial for IOI '17 P6 - Ancient Books


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Subtask 1

The first subtask, where there are at most 4 books in the library, can easily be solved by hand. There are only 1! + 2! + 3! + 4! = 33 possible inputs and so they can all be precomputed by hand simulating the process with pen and paper.

Alternatively, we do a brute force state exploration. We can take the triple (state of the shelf, Aryan's position, Aryan's current book) as the state in a breadth-first search. This state space is roughly of size (N+1)! \times N and so for N \le 4, we can still find the shortest path to the sorted state fast enough.

Medium subtask (start/end on the very left)

To compute the number of steps (= seconds) needed without actually simulating the entire sorting process, we need a couple of observations:

Balancing property Across every edge of the underlying path graph, Aryan will walk equally often from the left to the right as she walks from the right to the left. A similar balance also applies to the books. For every edge of the path, there are equally many books that need to be moved from the left to the right as there are from the right to the left.

Cycles of the permutation The array order specifies a permutation. We will denote that permutation by \pi from here on. We will see that the answer only depends on how \pi partitions the set of slots \{0, \dots, n-1\} into disjoint cycles. A book that is placed correctly from the beginning, we call trivial. Their corresponding trivial cycles (cycles of length one) can almost be ignored, as we will always just move past them.

Single cycle If \pi consists of a single cycle then the answer is easy to find: Aryan can grab the book at S, bring it to \pi(S), take the book from there to \pi(\pi(S)) and so on until she returns to S. [Here is a video illustrating this.] (DMOJ curator's note: The link doesn't seem to be in the pdf) As she brings one book on slot closer to its target position in every step, her walk surely is optimal. We can compute this number of steps by:

\displaystyle d(\pi) = \sum_{i=0}^{n-1} |i-\pi(i)|

which is exactly the sum of the distances between the initial and target position of every book.

Lower bound d(\pi) The sum of distances d(\pi) is a lower bound on the answer even if \pi consists of multiple cycles. We distinguish two kinds of steps for Aryan: A step is called essential if Aryan brings one book one step closer to its target position than this book ever was before. Otherwise, the step is called non-essential. Every way of sorting the books consists of exactly d(\pi) many essential steps. The number of non-essential steps needed depends on how the cycles of \pi overlap.

Two cycles Every cycle of \pi covers some interval I of the bookshelf which extends from the leftmost book to the rightmost book that is part of this cycle. We have I \subseteq [0, n-1], where we use [i, j] as a shorthand for \{i, i+1, \dots, j\}. Let \pi consist of exactly two non-trivial cycles C_1 and C_2 with their respective intervals I_1, I_2 and let S = 0 with S \in C_1. Then the answer only depends on whether the cycles overlap (I_1 \cap I_2 \ne \emptyset) or not. If they overlap, the answer is d(\pi), otherwise it is d(\pi)+2.

Why? If they overlap, Aryan can sort along C_1 until she encounters the first book belonging to C_2. She then leaves the book she was carrying at that slot to fully sort C_2 and return to the same slot. She can then pick up that book again and finish sorting C_1 without ever spending a non-essential step.

If the two cycles do not overlap (so C_1 is entirely to the left of C_2), Aryan can do something similar. She starts sorting C_1 until she encounters the rightmost slot belonging to C_1. She then takes the book from there and non-essentially walks with one step to the right to the leftmost slot of C_2. There, she sorts C_2 and picks up the same book again to non-essentially walk back to C_1. Finally, she finishes sorting C_1 and returns to S. This is optimal since the only two non-essential steps of Aryan's walk are spent across an edge that no book needs to cross but has to be crossed by Aryan eventually as there are non-trivial books on both sides. [Here is a video illustrating this.]

Multiple cycles The two cases from above generalize to the case where \pi consists of many cycles. Any two overlapping cycles can be interleaved without non-essential steps, and Aryan has to spend two non-essential steps across every edge that no book needs to cross, but where there are some non-trivial books or S on both sides of the edge. [Here is a video illustrating this.] More formally, let E' be the subset of the edges of the path with the following property:

\displaystyle \begin{align*}
e = (i, i+1) \in E' \iff& \text{no book has to cross }e\text{ and} \\
& \text{some book to the left of }e\text{ is non-trivial or at }S\text{ and} \\
& \text{some book to the right of }e\text{ is non-trivial or at }S \\
\iff& ((\not\exists j \in [0, i] : \pi(j) \ge i+1) \land (\not\exists j \in [i+1, n-1] : \pi(j) \le i))\text{ and} \\
& (\exists l \in [0, i]\text{ s.t. }(l \ne \pi(l) \lor l = S))\text{ and} \\
& (\exists r \in [i+1, n-1]\text{ s.t. }(r \ne \pi(r) \lor r = S))
\end{align*}

If S = 0, these are all the non-essential steps needed, so the answer is d(\pi)+2|E'|.

Implementation With these observations, it is easy to compute both d(\pi) and |E'|. If it is done in quadratic time (e.g., by just checking the above conditions for E' by looping over all indices for every edge), this will solve subtask 2. However, it is not hard to compute E' in linear time (e.g., in a scanline fashion from left to right), which will then score for the first three subtasks.

Harder subtasks (with S \ne 0)

If Aryan does start somewhere in the middle of the shelf, we might need some additional non-essential steps across edges not in E'. It is not obvious whether Aryan should first go to the left or to the right, as there might be non-trivial books on both sides.

We could try out both options and define the following subproblem:

How many non-essential steps do we need, if we already know how to connect all the cycles in the interval [l, r] with S \in [l, r]?

This gives rise to a dynamic programming formulation with a state of quadratic size (all intervals containing S).

For any given interval, we define the function extend(l, r) with [l', r'] = extend(l, r) being the largest part of the shelf that we can sort without spending any additional non-essential steps. So extend has to repeatedly add all cycles C whose interval I partially or fully overlap with [l, r] and then continue with [l, r] := [l, r] \cup I until no more cycles can extend the interval.

Once there is no other overlapping cycle (so [l, r] = extend(l, r)), we are either done or we know that we have to spend some non-essential steps. Let combine(l, r) be the function that computes the cost of connecting all cycles to the interval [l, r]. We can recursively compute it using combine(l, r) = 2+\min(combine(l-1, r), combine(l, r+1)). We need to take care of the border cases (when l-1 or r+1 are outside the shelf) and initialize it with combine(l', r') = 0 for [l', r'] being the smallest interval that contains all non-trivial books and S.

By implementing extend carefully, we can achieve an amortized constant time complexity across all calls, so that the dynamic program runs in quadratic time overall. Note that for S = 0, the set of states is only of linear size, so this solution also passes subtask 3.

To solve the problem in linear time, we note that we can decide whether to go left or right somewhat locally without exploring quadratically many states. If [l, r] is some extended interval ([l, r] = extend(l, r)), we look for two special cycles C_l and C_r. C_l is the first S-containing2 cycle that we encounter when walking from l to the left. Similarly, C_r is the first S-containing cycle when walking from r to the right.

2 A cycle C with interval I is S-containing if and only if S \in I.

Let c_l be the cost of reaching C_l from l (and define c_r). Note that c_l is not just twice the distance between l and the closest book of C_l as there might be some small cycles along the way that help us save some non-essential steps. But we can compute c_l quickly by solving the (S = 0)-problem between l and the first box of C_l.

Observe that if C_l does not exist, C_r does also not exist (as [l, r] is maximally extended, the book of C_l to right of S also has to be to the right of r and vice versa). Also note that once we reach either C_l or C_r, we also reach C_l \cup C_r and therefore get extend(C_l \cup C_r) without any further cost. This means that we can greedily decide for the cheaper of the two sides (of cost \min(c_l, c_r)) and then continue with the interval extend(C_l \cup C_r) regardless of whether we decided to go left or right.

Finally, one has to take care of the border region, everything outside of the outermost S-containing cycles (so once C_l and/or C_r no longer exist). But this is easy, as this is just another (S = 0)-case on each side.

We can answer all the extend(l, r) calls and compute all the c_l and c_r costs using only one overall sweep over the shelf (if we precompute the cycle of each shelf and the interval of each cycle, which we can also do in \mathcal O(n)). Therefore, we can determine the answer d(\pi)+combine(S, S) in linear time.

Almost correct solutions

One thing a contestant might overlook is that the answer can be of the order \Theta(n^2) and therefore does not necessarily fit into a 32-bit integer. This causes an overflow in subtasks 3 and 5.

Other submissions might consider only near-optimal sorting walks. Some might also visit all the trivial books at the ends of the paths, even though there is nothing to sort there. Others might just traverse the path once without carrying any book and then sort every cycle individually along the way without interleaving anything and hence spend 2n-2 many non-essential steps. Both of these would result in non-optimal answers.


Comments

There are no comments at the moment.