CCC '20 S3 - Searching for Strings

View as PDF

Points: 10 (partial)
Time limit: 0.5s
Memory limit: 512M

Author:
Problem types
Allowed languages
Ada, Assembly, Awk, Brain****, C, C#, C++, COBOL, CommonLisp, D, Dart, F#, Forth, Fortran, Go, Groovy, Haskell, Intercal, Java, JS, Kotlin, Lisp, Lua, Nim, ObjC, OCaml, Octave, Pascal, Perl, PHP, Pike, Prolog, Python, Racket, Ruby, Rust, Scala, Scheme, Sed, Swift, TCL, Text, Turing, VB, Zig
Canadian Computing Competition: 2020 Stage 1, Senior #3

You're given a string , called the needle, and a string , called the haystack, both of which contain only lowercase letters a..z.

Write a program to count the number of distinct permutations of which appear as a substring of at least once. Note that can have anywhere between and distinct permutations in total – for example, the string aab has distinct permutations (aab, aba, and baa).

Input Specification

The first line contains , the needle string.

The second line contains , the haystack string.

For of the available marks, and .

For an additional of the available marks, and .

For an additional of the available marks, and .

Output Specification

Output consists of one integer, the number of distinct permutations of which appear as a substring of .

Sample Input

aab
abacabaa

Output for Sample Input

2

Explanation of Output for Sample Input

The permutations aba and baa each appear as substrings of (the former appears twice), while the permutation aab does not appear.

• commented on Jan. 24, 2021, 11:28 p.m.

I officially hate this problem, despite being able to solve it on the CCC.

• commented on May 23, 2020, 4:00 p.m. edit 2

EDIT: Nvm

Do you really need a tuple of hashes? I am getting WA on batch 4 case 30 and I don't know if it is because there are collisions.

• commented on May 13, 2020, 5:13 a.m. edited

I don't get why I am scoring IRs. Can someone help me? EDIT: nvm fixed

• commented on March 24, 2020, 7:32 p.m.

I had a pretty much identical solution on the CCC and it also resulted in a TLE. I think the reason may be that updating the string for every shift of the needle takes O(N) time, although I am not entirely sure though.

• commented on Oct. 20, 2020, 9:50 p.m. edited

• commented on March 24, 2020, 5:00 p.m. edit 7

EDIT: Resolved

Hey there! My code seems to almost get to the end of the last batch, running under 0.02sec for each of those cases, and then suddenly TLEs just a few cases before the end. Is this intended, or is there something I should know about?

I've spent countless frustrated hours on this already, so any help will be tremendously appreciated.

More Info: I've read the editorial and looked into Rabin-Karp. However, as I was unaware about this algorithm on the day of the contest, I was just wondering why what I did resulted in almost instant execution up to Test Case 113 and suddenly TLE on Test Case 114. Instead of using hashes I used a vector<int> to store the number of occurances of each letter within the test string and compared that to the needle string. Once a permutation was found the actual string was checked against a map<string, bool> to see if it has already been counted. To shift the test string I basically subtracted 1 from the count of the leftmost letter and added 1 to the count of the rightmost letter.

I know my algorithm isn't optimal, I just wanted to know if the contest was designed intentionally to run in 0.1sec on Test 113 and TLE people on Test 114, or if there is a size limitation by my algorithm (i.e. exceeding 32-bit integer?) that would cause this?

EDIT: On the Official CCC grader I'm getting RTE signal 6 on that same test case.

• commented on March 24, 2020, 11:08 p.m.

Hey there!

So a couple of things. I strongly advise you to join the dmoj slack so we can exchange messages instead of cluttering up the comments section. In addition, I do not know which submission you are referring to, but looking at the most recent one: https://dmoj.ca/submission/1984929, your current code runs in about time, judging from this for loop:

for(int i = 0; i < haystack.length() - nL + 1; i++){
if(currentPermutation == needlePermutation){
string test;
for(int k = i; k < i+nL; k++){
test.push_back(haystack[k]);
}
}
}


And the result is that this is too slow. please visit the #help section in the dmoj slack so we can help you there. Thank you.

• commented on March 25, 2020, 1:19 p.m. edit 3

Thank you so much! EDIT: Did I seriously use a forloop instead of substr to generate a substring during the contest? sigh.

EDIT 2: Well i guess the limit to my algorithm reveals itself now, as now I MLE instead of TLE :/

• commented on March 19, 2020, 11:11 p.m.

This comment is hidden due to too much negative feedback. Click here to view it.

• commented on March 23, 2020, 5:17 p.m. edited

CCC has become terrible in recent years, thanks to Dr. Troy Vasiga and his team. Lots of server issues and crashes on the main CCC site since at least 2016. Some of them were blamed on the public. This doesn't happen in other programming contests but they got away with it because they didn't provide proper data to back their claims. It is easier to blame your server crashes on the users, your team, or the wind blowing than learn a bit about load balancing to prevent overloading your servers.

This must have emboldened them to make the 2019 and 2020 Senior contests a disaster. Take for example Problem S3 in 2020 - Searching for Strings. It has efficient solutions that involve string algorithms. However, these algorithms are not in the Canadian high school curricula, let alone computer studies.

In fact, string algorithms are SPECIFICALLY EXCLUDED from the IOI syllabus and have been so for many years. See here, on page 13: https://people.ksp.sk/~misof/ioi-syllabus/ioi-syllabus.pdf

String algorithms and data structures: there is evidence that the inter-reducibility between KMP, Rabin-Karp hash-ing, suffix arrays/tree, suffix automaton, and Aho-Corasick makes them diffcult to separate.

Maybe there are other ways to solve it but they'd be extremely contorted. String algorithms provide a decisive advantage. Unsuspecting kids who attempt to circumvent or reinvent string algorithms will burn out their entire contest time and more. As page 4 of the IOI syllabus makes clear:

The tasks will be set with the goal that knowledge of Excluded topics should not help in obtaining simpler solutions / solutions worth more points.

The other problems in 2020 CCC Senior are equally out of whack. Either a strikingly similiar problem to one from 2019 Reprechage, bad data full of whitespace that leaves you wondering why your program crashes, waiting minutes for S3 solutions to judge, or endless constant optimization in S2. No fun graphs or trees which make up the majority of the IOI syllabus.

Since the organizers clearly didn't design this contest for a good IOI selection, nor to encourage the participants, they look like a bunch of one-trick-ponies who have ruined the contest on purpose. They get to look like geniuses to the corporate sponsors. You know, they get to show they are much smarter than the kids who couldn't invent on the spot all the string algorithms that some of the organizers learned in university.

Considering that CCC participants will soon compete with the CCC organizers for jobs and grants, botching the contest in 2016-2020 was a clever strategy for the organizers. They also got to make their favorites win by knocking down everyone else with unfair subject matter and server issues. Or maybe not?

We'll see if the team is willing to fix the 2020 CCC botch. They need to get someone more responsible to redo the contest and properly select the IOI team. This contest used to run smoothly before Troy Vasiga took over and wasted it year after year after year. CCC is a tremendous asset for Canadian education, it is important for thousands and thousands of students each year, and it deserves honest and competent leadership.

https://dmoj.ca/post/158-ccc-19-problems-posted#comment-9190

• commented on March 23, 2020, 9:54 p.m. edit 9

https://dmoj.ca/problem/ccc20s5#comment-11869

The CCC this year was far to computer science-y. Olympiad students like xiaowuc2 and wieung_bvg really need to stop setting problems. It's all just people who learned the same data structures and algorithms and put it on the CCC.

The failed CCC of 2020 is a clear demonstration of how during the months prior, as students across Canada were working hard in preparation for the competition, the contest organizers definitely were not. Despite the fast grader, which allowed sub-optimal solutions to achieve optimal scores, the website crashed several times and participants were often forced to wait in long queues in order to receive feedback for their submission. Supporters of this queue may attempt to justify the wait as an inevitable consequence of the scale of the competition; however, to this I point out the more popular USA Computing Olympiad. Not only does the USACO frequently serve over 6000 users, but it is also hosted over a 4 day period, allowing participants to view and discuss the problems and improve soon after they compete. Even after the frustratingly long CCC window of over 30 days was over, CCC organizers were slow to publicize the problems. Unlike the February 2020 USACO, which was hosted at a later date, CCC 2020 results are still unavailable to this day, and I could not locate an editorial or test data anywhere on the official website. I also couldn’t locate a single word of gratitude towards Mike "MikeMirzayanov" Mirzayanov for his great Polygon and Codeforces systems.

Incompetence is also shown in the CCC problems, whether it be in the unsorted input of S1, unclear constraints of S2, mundane debugging of S3, or mathematical thinking (!) of problems S4 and S5. It is easy to see the bias in these problem topics due to the problemsetters' shared knowledge perspective from their classes at olympiads.

After moving to Canada, an IOI gold medalist graduated early without participating in this excuse of a contest. But you don’t need to be 300iq to know that the CCC isn’t the “fun challenge for secondary school students” it claims to be. As a result, the Canadian Computer Community has devolved into a more or less homogeneous population of CS nerds who don't have a gf, leading to a decline in birth rates significantly impacting economic growth. Moreover, this group typically places their individual needs of "[getting] into cco because iT lOoKs GoOd oN [mY] rEsUmE" over the glorious needs of the party. If CCC organizer "Troy Vasiga" has any respect for the advancement of the human race, he will rename the CCC to the Союз Советских Социалистов and purge the all setters who keep putting algorithms in their problems.

Edit: On a more serious note, it does seem like insufficient preparation has gone into the recent years of CCC. The test data has been, on many occasions, too weak or simply wrong. The constraints should be set so that solutions with incorrect complexities do not pass, while solutions with correct complexities do not require too much constant optimization. The grader and website could be better made to not crash and show the verdict to a submission in a timely manner. For these reasons the score and ranking of many participants deviate from their usual performance on other contests such as DMOJ contests or USACO and are often a poor representation of their programming ability. This is a concern considering CCC scores are used to select the Canadian IOI team. It is also unfair towards students who have spent a lot of time practicing competitive programming and wish to apply for the University of Waterloo.

Some competitions with (usually, in my opinion) relatively good problems:

• DMOPC and similar DMOJ contests
• USACO/COCI/APIO/most other OIs
• commented on Jan. 6, 2021, 10:20 p.m. edited

(￣ω￣)

• commented on Oct. 29, 2020, 10:19 a.m.

This comment is hidden due to too much negative feedback. Click here to view it.

• commented on May 21, 2020, 4:53 p.m.

Hello, hello.

Well, I've not moved to Canada. I don't even have my study permit yet...

• commented on March 20, 2020, 11:58 a.m.

Not really.

• commented on March 20, 2020, 4:58 p.m.

This comment is hidden due to too much negative feedback. Click here to view it.

• commented on March 20, 2020, 10:48 p.m.

This comment is hidden due to too much negative feedback. Click here to view it.

• commented on March 20, 2020, 10:53 p.m.

It would be pretty ridiculous if they just tested the same things year after year. Each algorithm already seen had its "first time" in the CCC at some point, and it just so happens this was the year for rolling hash.

• commented on March 21, 2020, 10:59 a.m.

Good point, guess we have to prepare for new stuff in the future.

• commented on March 20, 2020, 8:44 p.m.

This comment is hidden due to too much negative feedback. Click here to view it.

• commented on March 20, 2020, 11:57 p.m.

This comment is hidden due to too much negative feedback. Click here to view it.

• commented on March 19, 2020, 9:54 p.m.

Every test case for my code finishes a bit below 0.1 seconds up until test case #2 of batch #4, where it immediately TLE's. Where could the problem be with my code? Does that test case normally take over 6 times as long as the others?

• commented on March 19, 2020, 11:02 p.m.

Take careful consideration of your time complexity and the constraints of the problem.

If you need further help, feel free to join the slack and ask in #help

• commented on March 19, 2020, 10:16 p.m. edited

I think that would be the case.

• commented on March 17, 2020, 4:34 p.m.

please change the sample input to be separated onto 2 lines

• commented on March 17, 2020, 5:00 p.m.

The issue has been fixed. I apologize for the confusion.

• commented on March 15, 2020, 8:14 p.m.

can someone tell what's wrong with my code? Are my hashing functions wrong?

• commented on March 15, 2020, 8:24 p.m.

The hasher() function is extermely weak, the best way to check if two strings are permutations of each other is to use a frequency array.

• commented on May 12, 2020, 12:35 p.m. edit 2

I added a frequency array and modified my rabin karp hashing function as well. And now I can pass the first 3 batches, but I am getting TLE on the 4th batch. It goes perfectly upto case #29 with time of 0.008 seconds, but gets tle on case#40.

I am using a sliding window to maintain a frequency array and i check it everytime i update it. If the frequency array matches I insert a hash into a set.

• commented on May 12, 2020, 2:30 p.m. edited

you are hashing the entire substring everytime, that is slow, there is a way to check substring hash in time

• commented on May 12, 2020, 6:01 p.m. edited

:O HOW? If i just compare the strings then I get MLE. So it's either MLE or TLE :/

• commented on May 12, 2020, 6:05 p.m.

Do you really need to recalculate the hash again for the ENTIRE new substring? Only one new character is introduced and the last character is deleted. Think about how you can use your previous hash to efficiently compute your new hash.

• commented on May 12, 2020, 1:21 a.m. edit 2

thanks a lot, idk why i never thought of a frequency array during the contest :(

• commented on March 13, 2020, 8:39 p.m.

Are the other CCC problems going to be posted?

• commented on March 14, 2020, 1:24 a.m.

We are waiting for test data for those problems.