ECOO '12 R1 P2 - Decoding DNA

View as PDF

Submit solution


Points: 5 (partial)
Time limit: 2.0s
Memory limit: 64M

Problem type

DNA is made up of two twisted strands that encode genes using long combinations of four bases: Adenine, Cytosine, Guanine and Thymine. The strands are complementary to one another, meaning that Adenine and Thymine are always opposite each other, and Cytosine and Guanine are always opposite each other, like this.

A double strand of DNA:
ATCAAGGCCTATTCCGGGAAAGG
TAGTTCCGGATAAGGCCCTTTCC

In order for the information in a gene to be used, it has to be transcribed into a strand of RNA. During this process, a portion of one strand of DNA is transcribed – this portion is known as the transcription unit. The start of the sequence to be transcribed is signaled by a sequence of bases known as a promoter, and the end is signaled by a sequence known as the terminator. For our purposes, the promoter is the sequence TATAAT, which begins 10 bases before the start of the transcription unit, and the terminator consists of two distinct, complementary, reversed sequences of at least length 6 that cause the RNA molecule to coil back on itself and pinch off the transcribed strand. If TATAAT appears twice on a strand, only the first occurrence counts as the promoter. An example is shown below.

AGATTATATAATGATAGGATTTAGATTGACCCGTCATGCAAGTCCATGCATGACAGC

In this example the promoter and terminator sequences are boldfaced, and the transcription unit is underlined. The resulting RNA will be complementary to the transcription unit, except that in RNA Uracil takes the place of Thymine. For this example, the result looks like this:

CCUAAAUCUAACUGGG

The input will contain five single strands of DNA, one on each line. Write a program to output the RNA sequence that results from the transcription process. The sequences should be numbered starting at 1, with a colon and a single space character following the number, as shown below.

Sample Input

AGATTATATAATGATAGGATTTAGATTGACCCGTCATGCAAGTCCATGCATGACAGC
AGTCTTCAAGGGGATTCCCAGGTATATAATGCAGATCGCGACGAAATATCGGGCGGGATCCATACCGACCCAGCCGCCCGA
TATAATGGGGGAGAGACCGAGTGTTTAAGTCCCGAGGGATCGGGAGTGAGATTGAGGGAATTCCGGGAATCTCACT

Sample Output

1: CCUAAAUCUAACUGGG
2: UAGCGCUGCUUUAU
3: CUCUCUGGCUCACAAAUUC

Educational Computing Organization of Ontario - statements, test data and other materials can be found at ecoocs.org


Comments


  • 0
    code721  commented on Aug. 30, 2024, 1:29 p.m.

    The terminator sequences have to be length 6, not at least length 6


  • 2
    AlexOldest  commented on May 24, 2024, 7:11 p.m.
    1. The RNA sequence has a certain length. Is it possible that this length could be 0 or 1?
    2. The condition for a terminator: "...two different, complementary, inverted sequences of length at least 6..." is not satisfactory, since I found "two" different such terminators of length both 6 and 7. Both terminators bring "Incorrect answers" with the second or third sample. However, if we assume that the RNA length is >= 2, then a terminator of length 6 results in a match to all the 3 samples. I am not sure that this problem has unique solution.