Editorial for IOI '96 P5 - Longest Prefix


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Let S be a sequence of letters and let P be a set of primitives. Denote by Suff(S, P) the set of sequences v such that the following two conditions hold:

  1. v is a prefix of a primitive in P.
  2. S = uv for some u. (For two sequences of letters u and v, we denote the concatenation of u and v by uv.)

It is obvious that S can be composed from primitives in P iff the empty sequence is in Suff(S, P). Moreover, S has an extension u on the right that makes Su a composition of primitives in P iff Suff(S, P) is non-empty. Therefore the following algorithm gives a solution to the task if DataFile contains the sequence to be examined.

Res:=0;
S:=empty; NoS:=1;
ReadLn(DataFile,X);
Slength:=1;
While (X<>'.') And (NoS>0) Do Begin
  Append X to the end of S;
  Q:=Suff(S,P);
  NoS:=number of elements of Q;
  If the empty sequence is element of Q Then Res:=Slength;
  ReadLn(DataFile,X);
  Inc(Slength);
End;
WriteOut(Res);

One obvious problem with this algorithm is that the data file is too large to read into the memory (unless you know how to use the machine's extra memory).

Fortunately, it is not necessary to read the whole sequence into memory. Let us observe that Suff(Sx, P) for a sequence S and a letter x can be computed from the set Suff(S, P). Indeed, the following algorithm satisfies the requirement that if Q = Suff(S, P) holds before executing Next(Q, X) then after the execution Q = Suff(SX, P) will hold.

Procedure Next(Q,X);
  Begin
    Q1:=empty;
    Forall u in Q Do Begin
      If ux is a prefix of some primitives in P Then
        Begin
          include ux in Q1;
          If ux is equal to a primitive in P
            Then include the empty sequence in Q1
        End;
    End;
    Q:=Q1;
  End;

In order to refine the algorithm Next, we have to answer the questions:

  • Is a sequence ux a prefix of some primitive in P?
  • Is a sequence u equal to a primitive in P?

Consider the following data structure for the set of primitives P.

Const
  MaxN=100; (* maximum possible number of primitives *)
  MaxL=20; (* maximum possible length of primitives *)
Var
  P:Array[1..MaxN,1..MaxL] Of Char; (* array of primitives *)
  L:Array[1..MaxN] Of Word; (* length of the primitives *)

Let us represent a sequence u which is a prefix of a primitive in P by the pair (i, j), such that the prefix of P[i] consisting of the first j letters of the primitive P[i] equals u, and i is the least such index for u. Note that the empty sequence is represented by the pair (1, 0).

We can preprocess the set of primitives to build a transition table T:

T[i, j, x] is 0 if there is no primitive in P with prefix P[i][1 \dots j] x, otherwise the least index k such that P[i][1 \dots j] x is a prefix of P[k]. (P[i][1 \dots j] denotes the sequence of letters consisting of the first j elements of the primitive P[i].)

In other words, if a sequence u is represented by the pair (i, j) then the sequence ux is a prefix of a primitive in P iff T[i, j, x] > 0, and in this case ux is a prefix of P[T[i, j, x]] and is represented by the pair (T[i, j, x], j+1). A procedure computes the transition table T and builds the array Full as well. Full[i, j] is true iff the sequence represented by (i, j) is equal to a primitive in P.

We can easily implement the algorithm Next(Q, X) using the arrays T and Full.


Comments

There are no comments at the moment.