Editorial for Facebook Hacker Cup '15 Round 2 P3 - Autocomplete Strikes Back


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

This problem can be approached in a few different ways, but the intended algorithm consists of inserting all N words into a trie (while marking which nodes in the trie represent the end of a word), and then breaking the problem down into subproblems with dynamic programming. Let F(i, k) be the minimum number of letters to type in order to text k different words which end at nodes in node i's subtree (or rather, subtrie). We can compute F by recursively traversing the trie, with the final answer being F(\text{root}, K).

Let's establish that D(i) = depth of node i. Note that D(i) is then also the length of the prefix (and possibly complete word) represented by node i.

To establish some simple base cases, F(i, 0) = 0 for any node i, and F(i, 1) = D(i) for any node i besides the root. This is because, if only 1 word will be used from node i's subtree, then the prefix represented by node i won't be a prefix of any other chosen word, meaning that it can be used to type this word.

For any internal node i and any positive k \le K, we can compute F(i, k) by considering various distributions of k words amongst node i and its children's subtries. Consider that node i has children c_1, c_2, \dots, c_m, and we use v_1, v_2, \dots, v_m words from each of those children's subtries respectively. If node i represents the end of one of the N given words, then F(i, k) = F(c_1, v_1) + \dots + F(c_m, v_m) + D(i), with v_1 + \dots + v_m = k-1. Otherwise, F(i, k) = F(c_1, v_1) + \dots + F(c_m, v_m) with v_1 + \dots + v_m = k. Either way, we can consider all of the possible combinations of v_{1 \dots m} with a classic instance of dynamic programming across the children. In this fashion, we can in fact compute the values F(i, 1 \dots K) all at once in \mathcal O(mK^2) time.

Every node is the child of at most one other node, meaning that the overall time complexity of this algorithm is \mathcal O(CK^2), where C is the number of nodes in the trie (which is at most the sum of the lengths of the words).


Comments

There are no comments at the moment.