ECOO '12 R1 P2 - Decoding DNA

View as PDF

Submit solution

Points: 5 (partial)
Time limit: 2.0s
Memory limit: 64M

Problem type

DNA is made up of two twisted strands that encode genes using long combinations of four bases: Adenine, Cytosine, Guanine and Thymine. The strands are complementary to one another, meaning that Adenine and Thymine are always opposite each other, and Cytosine and Guanine are always opposite each other, like this.

A double strand of DNA:

In order for the information in a gene to be used, it has to be transcribed into a strand of RNA. During this process, a portion of one strand of DNA is transcribed – this portion is known as the transcription unit. The start of the sequence to be transcribed is signaled by a sequence of bases known as a promoter, and the end is signaled by a sequence known as the terminator. For our purposes, the promoter is the sequence TATAAT, which begins 10 bases before the start of the transcription unit, and the terminator consists of two distinct, complementary, reversed sequences of at least length 6 that cause the RNA molecule to coil back on itself and pinch off the transcribed strand. If TATAAT appears twice on a strand, only the first occurrence counts as the promoter. An example is shown below.


In this example the promoter and terminator sequences are boldfaced, and the transcription unit is underlined. The resulting RNA will be complementary to the transcription unit, except that in RNA Uracil takes the place of Thymine. For this example, the result looks like this:


The input will contain five single strands of DNA, one on each line. Write a program to output the RNA sequence that results from the transcription process. The sequences should be numbered starting at 1, with a colon and a single space character following the number, as shown below.

Sample Input


Sample Output


Educational Computing Organization of Ontario - statements, test data and other materials can be found at


  • 0
    valegrete  commented on Jan. 3, 2023, 12:55 p.m. edit 4

    Really starting to get frustrated with problems explained so poorly that you have to depend on sample I/O unrepresentative of the test case possibilities.

    There can be multiple competing terminator sequence pairs in the strand:



    Matching whichever gives you the shortest transcription unit as in the example fails tests 2 and 3. The challenge is supposed to be implementing the algorithm, not failing test cases because the information to implement said algorithm is incomplete. Is there a maximum distance between the terminator pairs? Should it always be the longest pairs? What is the selection criteria when multiple sequences meet the problem description?

  • 1
    InfinityAtEnd  commented on July 23, 2022, 3:04 a.m.

    can someone help me here a bit... I'm quite sure i covered all the issues with the code and the out of the ordinary results but I still get failure to Test case #3 - Case 2 ....all the oders passed exept this one and I do't understand why...

    can anyone else give me a idea about what I'm missing of I fail to see ?!

  • 1
    Jzaragoza98  commented on April 15, 2022, 10:26 p.m.

    might be a dumb question, but in the sample input, we are treating each line of input as separate altogether? As in line 1 and line 2 are independent of each other

  • 0
    Rhumph  commented on March 29, 2022, 1:04 p.m.

    Can I impose on someone to give me a hint. I have passed the first test but failed the other two. I looked at the editorial and I’m pretty sure I’ve got the typical problems covered. Thanks.

    • 0
      Spitfire720  commented on March 29, 2022, 5:02 p.m.

      Test cases like TATAATATGCATCGATATCGAT seem to fail.

      • 1
        Rhumph  commented on April 5, 2022, 1:26 p.m.

        Thanks for your tip. Am I mistaken? It looks like that should be an empty set. If so, does it get omitted or included as an empty set…ie. say like 2:(blank).

  • 0
    neo_coder  commented on Jan. 20, 2022, 5:34 a.m.

    I still don't understand what's the condition for terminator sequence. In a given example why GTCATGCA is terminator? Could someone please explain?

    • 3
      Spitfire720  commented on Jan. 21, 2022, 12:04 p.m.

      If you look at the editorial, we can see how they find the terminator sequences.

      Our first sequence, GTCATGCA, first gets reversed to ACGTACTG.

      Then, find its complement, which would be TGCATGAC.

      This problem took me a good long time to understand too.

      • 0
        darek  commented on Jan. 30, 2022, 6:00 a.m.

        So in other words: GTCATGCA > CAGTACGT(complementary to GTCATGCA) > TGCATGAC(reversed CAGTACGT), I guess.

      • 0
        neo_coder  commented on Jan. 23, 2022, 8:47 a.m.

        Thanks for the explanation, it made sense now!