In the season finale of the Canadian TV show Orphan Black, the main characters discover a message written into their DNA using binary coded ASCII (pronounced "ASKee"). ASCII is a system for numbering letters and other printable characters. Binary coded ASCII uses -bit binary integers instead of regular decimal integers. The table below shows the decimal and binary ASCII codes for uppercase letters and the space character.
Space | 00100000 | |
A | 01000001 | |
B | 01000010 | |
C | 01000011 | |
D | 01000100 | |
E | 01000101 | |
F | 01000110 | |
G | 01000111 | |
H | 01001000 |
I | 01001001 | |
J | 01001010 | |
K | 01001011 | |
L | 01001100 | |
M | 01001101 | |
N | 01001110 | |
O | 01001111 | |
P | 01010000 | |
Q | 01010001 |
R | 01010010 | |
S | 01010011 | |
T | 01010100 | |
U | 01010101 | |
V | 01010110 | |
W | 01010111 | |
X | 01011000 | |
Y | 01011001 | |
Z | 01011010 |
Binary integers are just like regular decimal integers except that the only digits allowed are 1
and 0
and instead of place values rising by s they rise by s. In decimal integers, the rightmost digit is the 's place, then 's then 's and so on. In binary the rightmost digit (bit) is the 's place, then the next is 's, then 's and so on.
DNA comes in a double strand made up of four bases (A
, C
, G
, and T
). A
pairs with T
and C
pairs with G
so that in a double strand, A
's and T
's are always across from each other and so are C
's and G
's, like this:
CAACACGTGGCGTGCCAGTGACCTTGGCAGGTTGCGTCGAAATCCC
GTTGTGCACCGCACGGTCACTGGAACCGTCCAACGCAGCTTTAGGG
To turn the above DNA into a binary string, just assign one of the base pairs (AT
or CG
) to be a 1
and the other to be a 0
. So the above DNA string might represent this binary number:
1001011011110111010101100111011001110110000111
Or it might represent this one:
0110100100001000101010011000100110001001111000
In this case, the correct one is the second one and the message starts at the fourth position with a few bits left over at the end:
011 01001000 01000101 01001100 01001100 01001111 000
H E L L O
The input will contain test cases. Each test case will consist of two lines representing a pair of chromosomes. Each pair contains a coded message as described above, with from to extra bits on the beginning and another to extra bits at the end (the chemists who read the DNA weren't exactly sure where the message was). The messages themselves consist entirely of the capital letters A
through Z
and space characters.
Sample Input
ACCAATGTAGATATCATACTCTCTTGCTATGTTCGTTACATGCCCAA
TGGTTACATCTATAGTATGAGAGAACGATACAAGCAATGTACGGGTT
ACACTTTTGGTGTTCGTATCAACCGCAGATCCGATGTTACACTTCATATTTGATCGTCACTATCTC
TGTGAAAACCACAAGCATAGTTGGCGTCTAGGCTACAATGTGAAGTATAAACTAGCAGTGATAGAG
CCAGACGAGTAGTCCGACAGAGCCTGTGAGAGGGGCGTCGGGGGTGCCGTAGTCCGGCTGAGCAAGCCAGGTTCACC
GGTCTGCTCATCAGGCTGTCTCGGACACTCTCCCCGCAGCCCCCACGGCATCAGGCCGACTCGTTCGGTCCAAGTGG
AACCCCACTAATCCTGAAATTCTGTGTAGTTGTGAACATGTCCTAGATCATATTTGTTGCGGTCAAGCCTAAA
TTGGGGTGATTAGGACTTTAAGACACATCAACACTTGTACAGGATCTAGTATAAACAACGCCAGTTCGGATTT
ACCAGTTATACTACATAAAAGATTACTTACTTTAATCTTTAGCTTGTTTTATGTTAGTTA
TGGTCAATATGATGTATTTTCTAATGAATGAAATTAGAAATCGAACAAAATACAATCAAT
ACGACGCAGTCGACCGCCGAGGCTTCCCACGCCCGTGGCTTTCGACGGCGGAGGAGCCCC
TGCTGCGTCAGCTGGCGGCTCCGAAGGGTGCGGGCACCGAAAGCTGCCGCCTCCTCGGGG
AACCTGGTGGTCCACCGGGCTCCACTCCCAGGCGGCTCGAGTAGGTGGCCGGAGGATGG
TTGGACCACCAGGTGGCCCGAGGTGAGGGTCCGCCGAGCTCATCCACCGGCCTCCTACC
ACAGTGGTACTGGTGCCCGGTCCATACGCAGCGCGCTGGAATACGAGGGCCCTGTGGGCA
TGTCACCATGACCACGGGCCAGGTATGCGTCGCGCGACCTTATGCTCCCGGGACACCCGT
CACTGGCTGGAGCCCGGTGAGCACGCTCCGCCGACACCTAGCAGGCGCCTCTCTGGCAACAC
GTGACCGACCTCGGGCCACTCGTGCGAGGCGGCTGTGGATCGTCCGCGGAGAGACCGTTGTG
CACCCACACACTCTCGACGCCGGACTCTACGCTCCGGGGACAGTAAGCACGCGGGTGATGGCC
GTGGGTGTGTGAGAGCTGCGGCCTGAGATGCGAGGCCCCTGTCATTCGTGCGCCCACTACCGG
Sample Output
HELLO
CLONE ME
KEEP CALM
CARRY ON
A B C D
E F G H
I J K L
M N O P
Q R S T
U V W X
Educational Computing Organization of Ontario - statements, test data and other materials can be found at ecoocs.org
Comments