String Finding

View as PDF

Submit solution

Points: 5
Time limit: 1.0s
Memory limit: 64M

Author:
Problem type

You have a string (indexed from 0) with no more than 10^6 lowercase characters. Find the first occurrence of a string T (1 \le |T| \le |S| \le 10^6), or print -1 if T is not a substring of S.

Input Specification

The first line will have the string S.
The second line will have the string T.

Output Specification

Print the index of the first occurrence of the string T in S, or -1 if it is not a substring of S.

Sample Input

higuyshowsitgoinghiguys
higuys

Sample Output

0

Comments


  • -1
    gavin_chen  commented on March 21, 2022, 7:33 p.m.

    why am i getting a value error using (type str).index(type str) but not using (type str).find(type str)?


    • -1
      Spitfire720  commented on March 21, 2022, 7:59 p.m.

      I assume you're using Python. In that case, the proper syntax is (type list).index(type str). The .find method uses type string.


      • -1
        gavin_chen  commented on March 25, 2022, 7:55 p.m.

        Thanks for that :)


  • 2
    aeternalis1  commented on July 7, 2017, 9:57 a.m. edited

    Is it possible to create a solution for test case # 15 using python 3? I can't seem to run it under the time limit.


    • 6
      Kirito  commented on July 7, 2017, 11:28 a.m. edited

      Python's builtin find uses Boyer-Moore, which has a worst-case runtime of \mathcal O(|S| \times |T|). Rabin-Karp and KMP are two algorithms which have a worst-case runtime of \mathcal O(|S| + |T|).


      • -3
        TechnobladeNeverDies  commented on Sept. 18, 2022, 9:39 a.m.

        This is incorrect; Boyer-Moore runs in linear time when the pattern does not occur in the text, which is not the case for Python.


        • -2
          Kirito  commented on Sept. 20, 2022, 11:22 a.m. edited

          To be pedantic: it's described as "a simplication of Boyer-Moore, incorporating ideas from Horspool and Sunday". The \mathcal O(|S||T|) worst-case bound still holds. Whether you want to consider this "using Boyer-Moore" or not is up to you.

          If you want to get more specific, the CPython implementation is described in their code as

          /* fast search/count implementation, based on a mix between boyer-moore and horspool, with a few more bells and whistles on the top. for some more background, see: https://web.archive.org/web/20201107074620/http://effbot.org/zone/stringlib.htm */

          /* note: fastsearch may access s[n], which isn't a problem when using Python's ordinary string types, but may cause problems if you're using this code in other contexts. also, the count mode returns -1 if there cannot possibly be a match in the target string, and 0 if it has actually checked for matches, but didn't find any. callers beware! */

          /* If the strings are long enough, use Crochemore and Perrin's Two-Way algorithm, which has worst-case O(n) runtime and best-case O(n/k). Also compute a table of shifts to achieve O(n/k) in more cases, and often (data dependent) deduce larger shifts than pure C&P can deduce. See stringlib_find_two_way_notes.txt in this folder for a detailed explanation. */

          Source: https://github.com/python/cpython/blob/main/Objects/stringlib/fastsearch.h

          A challenge for anyone who's sufficiently bored would be figuring out if it is possible to force CPython to switch to the Two-Way algorithm on the given test data.