Editorial for WC '15 Contest 1 J4 - Trip Budgeting


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

This question is the trickiest of the entire junior contest, and tests the competitor's ability to perform a brute-force search. Given N attractions, we would like to know which subset of attractions to select in order to minimize the cost incurred. Now, notice that if we are given a set of attractions, we can very easily follow the instructions to calculate how much that selection will cost (as well as the level of excitement it generates). After solving the last problem, it should seem like basic busywork for us to calculate the sum and max of a bunch of values. So the crux of the question is really "which attractions do we pick?" In such contests, it's always a good idea to keep it simple. A simple, inefficient idea that passes under the time limit is always better than an advanced, efficient idea that is hard to implement. This is why we should always first ask ourselves whether we can "brute-force" it (try all possibilities). That is, we want to generate every possible subset of attractions, compute the excitement level and cost of each one, and compare all the subsets with an excitement of at least E_{min} to find the lowest cost.
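To see how easy the evaluation of a single fixed choice is, here is a minimal sketch of that step in C++. The names and formulas are assumptions for illustration only: it supposes each attraction i has an excitement value e[i] and a cost c[i], and that a trip's totals are simply the sums over the chosen attractions. The actual formulas should come from the problem statement; the full programs later in this editorial flesh this fragment out.

```cpp
// Hypothetical sketch of the evaluation step only (not a complete program).
// Assumes each attraction i has an excitement e[i] and a cost c[i], and that
// a trip's excitement and cost are simple sums -- adapt to the statement.
int N, E_min;
int e[20], c[20];

// chosen[i] is true if we decided to attend attraction i.
void evaluate(const bool chosen[], long long &excitement, long long &cost) {
    excitement = cost = 0;
    for (int i = 0; i < N; i++)
        if (chosen[i]) {
            excitement += e[i];
            cost += c[i];
        }
}
```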

To determine whether an idea is feasible (in this case, fast enough to pass under the time limit), we will have to introduce the concept of asymptotic analysis, which is a method of describing how rapidly an algorithm slows down as its input grows in size. Big O notation is used to describe how quickly the running time grows as a function of the input size. These concepts are key to analyzing algorithms, and understanding them is critical for solving harder contest problems. There are many great articles on these topics that can be found online, so we will not explain them here.

In this case, we want to know how quickly our brute-force algorithm will slow down as the number of attractions increases. For every attraction, we have two choices: either attend the attraction or don't. Since we have 2 possible choices for each of the N attractions, the total number of ways to pick out attractions is 2^N. Since for every configuration we also have to calculate the excitement level as well as the cost by looping over the N attractions, we have to multiply the number of configurations by the number of "steps" we take per configuration to obtain the total number of "steps" that our program can take in the worst case. In big O notation, we say that our program runs in \mathcal O(2^N \times N). Actually plugging numbers into this for the worst case (the largest possible N, which is 20), we get 2^{20} \times 20 = 20\,971\,520 (roughly 20 million) steps. A good rule of thumb for programming contests is to have at most 60 million "steps" per second of time limit. Since we have 1 second of time limit, the 20 million steps of our brute-force algorithm should comfortably pass.

After confirming that our idea is efficient enough, the next step is to implement it. There are two approaches we will explore: recursion (backtracking) and bitmasks.


Most competitors decided to solve the problem using recursion. The idea is to enumerate every possible subset of the N attractions using recursive "backtracking", and once we've made a decision for all N attractions (reaching the base case), we calculate the excitement and cost and update our answer accordingly. Once again, you can look up articles online to better understand the concept. Many recursive approaches submitted during the contest applied certain forms of optimization, pruning out configurations before all N choices have been generated, but our analysis above shows that this is not necessary. As an aside, passing arrays as arguments to functions (especially throwing them around in recursive ones) is a common cause of confusion amongst programmers (passing by reference vs. by value works differently across languages, and programmers may not be familiar with these differences), so it's good to avoid this altogether and handle everything globally.

All we need is a global array of N boolean values, where each index stores whether we've decided to attend a particular attraction, and a single integer as the parameter of our recursive function (the current attraction we are making a decision for). At each step of the recursion, we mark the current attraction as "attended" in the global array, then let the next layer of recursion handle the next attraction. Then, in the same step, we mark the attraction as "not attended" and again let the next layer of recursion handle the next attraction. At the start of the function, we check whether all N attractions have been decided. If so, we calculate and update our answer before cutting off the recursion.
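A minimal sketch of this recursive approach follows. The input format (N and E_{min}, followed by an excitement and a cost per attraction) and the sum-based evaluation are assumptions made for illustration; adapt the reading and the evaluation to the actual problem statement.

```cpp
#include <iostream>
using namespace std;

// Hypothetical input format: N and E_min, then an excitement e[i] and a cost
// c[i] per attraction, with a trip's totals being simple sums -- adapt to the
// actual problem statement.
int N, E_min;
int e[20], c[20];
bool attend[20];      // attend[i]: have we decided to attend attraction i?
long long best = -1;  // cheapest valid cost found so far (-1 = none yet)

void solve(int i) {
    if (i == N) {     // base case: a decision has been made for all N attractions
        long long excitement = 0, cost = 0;
        for (int j = 0; j < N; j++)
            if (attend[j]) {
                excitement += e[j];
                cost += c[j];
            }
        if (excitement >= E_min && (best == -1 || cost < best))
            best = cost;
        return;
    }
    attend[i] = true;   // branch 1: attend attraction i
    solve(i + 1);
    attend[i] = false;  // branch 2: don't attend attraction i
    solve(i + 1);
}

int main() {
    cin >> N >> E_min;
    for (int i = 0; i < N; i++) cin >> e[i] >> c[i];
    solve(0);
    cout << best << '\n';
    return 0;
}
```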


Alternatively, we can make the brute force even simpler with a trick using "bitmasks" and binary numbers. Every integer has a base-2 (binary) representation, which is simply a string of 0's and 1's. With N binary digits, there are exactly 2^N distinct numbers that can be represented (the lowest being a string of all 0's in base-2 and the highest being a string of all 1's), and by the basic properties of binary numbers, every possible configuration of 0's and 1's using exactly N digits will appear. Our boolean array can therefore be stored in a single 32-bit integer, with 12 bits left to spare! If the i^\text{th} binary digit of a number is 1, we will say that we attend attraction i; otherwise, if it's 0, we say that we don't attend it. By simply looping a number (which we will call a "mask") upwards from 0 to 2^N-1, we will have enumerated all possible subsets of the N attractions. Now, we could convert each integer to a binary string using base-conversion algorithms and then access each digit, but this misses the beauty of bitmasks. Nearly all programming languages support the bitwise operations AND, OR, NOT, and shifts on integers, making access to individual bits really easy and efficient. Given our mask, which is really just an integer (call it M), the only operation we care about is querying the status of a bit (say at the i^\text{th} position of the binary representation). The operation we need is M & (1 << i) (or equivalently, (M >> i) & 1). If the result is non-zero, then we know the i^\text{th} bit is set (and so we consider the i^\text{th} attraction part of our current subset); otherwise, we consider the attraction not to have been selected. More information on bitwise operations can be found online.
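For instance, here is a tiny self-contained snippet decoding which attractions a given mask represents; the mask value 22 is an arbitrary example chosen for illustration.

```cpp
#include <iostream>
using namespace std;

int main() {
    int N = 5;
    int M = 22;                // arbitrary example mask: 22 is 10110 in binary
    for (int i = 0; i < N; i++)
        if (M & (1 << i))      // is the i-th bit of the mask set?
            cout << "attend attraction " << i << '\n';
        else
            cout << "skip attraction " << i << '\n';
    return 0;
}
```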

Now, why is this seemingly overcomplicated method advantageous compared to recursion? First, recursion is performed by your computer with an internal call stack, which can make your program less efficient (not asymptotically, as we've discussed above, but by a constant factor). Avoiding it and just sticking to loops may noticeably improve your program's performance. But more importantly, bitmasks often make solutions ridiculously short and sweet. The following is a solution using bitmasks to demonstrate this. Note that << is the left shift operator in C++, and (1 << N) results in a number equal to 2^N (so looping while the mask is less than 1 << N means we loop up to and including 2^N-1).
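Below is a minimal sketch of such a bitmask solution, under the same assumptions as the recursive sketch above: the input format and the sum-based evaluation are placeholders to be adapted to the actual problem statement.

```cpp
#include <iostream>
using namespace std;

int main() {
    // Hypothetical input format and sum-based evaluation, as in the recursive
    // sketch above -- adapt to the actual problem statement.
    int N, E_min;
    int e[20], c[20];
    cin >> N >> E_min;
    for (int i = 0; i < N; i++) cin >> e[i] >> c[i];

    long long best = -1;
    for (int mask = 0; mask < (1 << N); mask++) {  // every subset of the N attractions
        long long excitement = 0, cost = 0;
        for (int i = 0; i < N; i++)
            if (mask & (1 << i)) {                 // bit i set: attend attraction i
                excitement += e[i];
                cost += c[i];
            }
        if (excitement >= E_min && (best == -1 || cost < best))
            best = cost;
    }
    cout << best << '\n';
    return 0;
}
```

Note how the entire enumeration collapses into a single loop over the mask, with the inner loop doing the same evaluation work as the recursive version.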

