Editorial for COI '13 #1 CSS


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Given HTML document is comprised of a tree-like structure where the nodes are div elements. Div element A is a parent of div element B if B is directly contained in A. Each node in the tree is associated with classes which belong to a corresponding div element.

We can build such a structure in our program simply by using stack and parsing carefully.

A selector can now be viewed as a series of instructions for traversing the tree. For instance, the classifier .banana_.apple.orange > .orange corresponds to the following series of instructions:

  1. beginning: pick any node in the tree as a starting point for the search
  2. .banana: check if the current node's class is banana, if not, end search
  3. _: continue searching on any descendant of the current node
  4. .apple: check if the current node's class is .apple
  5. .orange: check if the current node's class is .orange
  6. >: continue searching on a direct child of the current node
  7. .orange: check if the current node's class is .orange
  8. end: end search, this node corresponds to the selector

Such a series of instructions is easy to implement with careful parsing.

Now the tree traversal can be implemented as a recursive function which has two parameters, node and pos. The parameter node describes the current position in the tree, and the parameter pos describes the current position in the series of instructions for traversing the tree.

The recursion's transitions are now performed as described above. Therefore, for > we move onto a direct descendant, and for .class we check whether this class exists in the current node.

The transition for _ is a bit more complicated. If we implemented it naively, by iterating over the whole list of descendants, it would be too slow. This is why the recursion will have an additional parameter jump that tells us whether we're in the process of moving onto a child node.

It is necessary to memoize the states of the recursion to ensure that we visit one state only once. The complexity is \mathcal O(M \cdot \text{number of states}). The number of states is equal to 2 \cdot \text{number of nodes} \cdot \text{number of instructions} which is approximately N \cdot L, where L is the maximum line length.

Complexity: \mathcal O(MNL)


Comments

There are no comments at the moment.