IOI18F Meetings Post 1


posted on Oct. 27, 2018, 2:00 a.m.

This is the first in a series of blog posts about the task 'meetings' from IOI 2018. The problem is also available in a contest that will take place in conjunction with these posts.

We chose this problem as an experiment, to see whether we can present a single problem in a way that covers a lot of useful knowledge for a contest-related topic. In this respect, meetings is a great task: it starts with dynamic programming, then moves into range queries and tree constructions, and finally finishes with geometric data structures.

Furthermore, the setting of meetings is as classical as a data structure problem can get: you are given a static array, and you answer queries about some statistic of its subintervals. This setting has been viewed as "completely understood" by most competitors aiming for the top 40 or higher at the IOI since around 2005. Nevertheless, when this problem was given, it had zero solvers. I've also heard several past contestants who placed in the top 3 at IOIs refer to this problem as "Godly", so that's another reason to discuss it in more detail.

In short, the ideas underlying meetings are extremely novel, especially given how well understood the topic is. Moreover, if you can mechanically and reliably solve meetings in 2 hours, you shouldn't have trouble with most data structure topics.

Problem Statement

Meetings is a problem with a backstory about N mountains that lie in a row. These mountains are numbered from 0 through N-1 in order, and the only way to travel from mountain i to mountain j, assuming i < j, is to visit mountains i+1 through j-1. Mountain i has height A_i. One person lives at the top of each mountain.

We say that the cost of traveling from mountain i to mountain j is equal to the maximum height of all the mountains from mountain i to mountain j, inclusive.

On each of Q days, a meeting will be held. On day j, the people on the mountains from L_j to R_j, inclusive, want to meet at the top of a specific mountain between mountain L_j and mountain R_j. They wish to select a mountain that minimizes the sum of the costs that each individual incurs. The sum of the costs incurred by all individuals is the cost of the meeting.

Problem Formalization

The problem here is to compute, for each of the days, the minimum possible cost of the meeting on that day.

We start by formalizing the problem. Imagine that we are given an array A of N integers. Define m(i, j) for i \le j to be the maximum of A[i], A[i+1], \dots, A[j-1], A[j], and define m(i, j) = m(j, i) if i > j. For Q pairs of integers (L_i, R_i), compute f(L_i, R_i) = \displaystyle\min_{L_i \le k \le R_i} \displaystyle\sum_{j=L_i}^{R_i} m(j, k).

Problem Examples

We'll use the examples in the provided problem statement to help make this more clear. Let A = [2, 4, 3, 5], let (L_0, R_0) = (0, 2), and let (L_1, R_1) = (1, 3).

We first analyze (L_0, R_0).

If we set k = 0, then we're computing m(0, 0) + m(1, 0) + m(2, 0). m(0, 0) = \max(2) = 2. m(1, 0) = \max(2, 4) = 4. m(2, 0) = \max(2, 4, 3) = 4, for a total sum of 10.

If we set k = 1, then we're computing m(0, 1) + m(1, 1) + m(2, 1). m(0, 1) = \max(2, 4) = 4. m(1, 1) = \max(4) = 4. m(2, 1) = \max(4, 3) = 4, for a total sum of 12.

If we set k = 2, then we're computing m(0, 2) + m(1, 2) + m(2, 2). m(0, 2) = \max(2, 4, 3) = 4. m(1, 2) = \max(4, 3) = 4. m(2, 2) = \max(3) = 3, for a total sum of 11.

Therefore, the optimal choice for k is 0, giving a sum of 10.

We now analyze (L_1, R_1).

If we set k = 1, then we're computing m(1, 1) + m(2, 1) + m(3, 1). m(1, 1) = \max(4) = 4. m(2, 1) = \max(4, 3) = 4. m(3, 1) = \max(4, 3, 5) = 5, for a total sum of 13.

If we set k = 2, then we're computing m(1, 2) + m(2, 2) + m(3, 2). m(1, 2) = \max(4, 3) = 4. m(2, 2) = \max(3) = 3. m(3, 2) = \max(3, 5) = 5, for a total sum of 12.

If we set k = 3, then we're computing m(1, 3) + m(2, 3) + m(3, 3). m(1, 3) = \max(4, 3, 5) = 5. m(2, 3) = \max(3, 5) = 5. m(3, 3) = \max(5) = 5, for a total sum of 15.

Therefore, the optimal choice for k is 2, giving a sum of 12.
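
As a sanity check, both examples can be reproduced with a direct transcription of the definitions (the function names mirror the notation above; this is an illustrative sketch, not the intended solution):

```python
def m(A, i, j):
    # maximum of A over the closed interval between i and j, in either order
    lo, hi = min(i, j), max(i, j)
    return max(A[lo:hi + 1])

def f(A, l, r):
    # try every meeting location k in [l, r] and keep the cheapest total cost
    return min(sum(m(A, j, k) for j in range(l, r + 1)) for k in range(l, r + 1))

A = [2, 4, 3, 5]
print(f(A, 0, 2))  # 10
print(f(A, 1, 3))  # 12
```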

Rephrasing the Problem

In the original problem, we were given Q pairs of integers (L_i, R_i) for which we were asked to compute \displaystyle\min_{L_i \le k \le R_i} \displaystyle\sum_{j=L_i}^{R_i} m(j, k). Imagine that we had to report the answers for all \binom{N+1}{2} pairs of integers. How quickly can we generate the answers for all of these pairs? Because there are \binom{N+1}{2} pairs of integers, any such approach would be \Omega(N^2), so the best that we can aim for is a \Theta(N^2) algorithm.

Initial Thoughts

If we wanted to maximize the given sum for a given interval (L_i, R_i), we would pick k such that A[k] is the maximum value in the subarray from A[L_i] to A[R_i]: every range from j to k contains A[k], so each term in the sum equals A[k], and the total is (R_i - L_i + 1) \cdot A[k], the maximum attainable sum. In our problem, though, we're trying to minimize the given sum. A natural thing to try would be to pick k such that A[k] is the minimum value in the subarray. If we look closely at the examples, we see that the optimal choice of k was indeed the index of the minimum value in the subarray. Does this choice always give us the minimum sum?

It turns out that the answer is no. Consider the following example: A = [2, 2, 2, 100, 1], with (L_0, R_0) = (0, 4). If we picked k=4 because A[4] is the smallest value in that subarray, the cost would be 100 + 100 + 100 + 100 + 1 = 401. However, if we picked k=2, then our cost would be 2 + 2 + 2 + 100 + 100 = 206, which turns out to be optimal in this scenario.
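
This counterexample is easy to check by brute force. The snippet below is a direct transcription of the definitions; the helper name `cost` is mine:

```python
def m(A, i, j):
    # maximum of A over the closed interval between i and j, in either order
    lo, hi = min(i, j), max(i, j)
    return max(A[lo:hi + 1])

def cost(A, l, r, k):
    # total cost if everyone in [l, r] meets at mountain k
    return sum(m(A, j, k) for j in range(l, r + 1))

A = [2, 2, 2, 100, 1]
print(cost(A, 0, 4, 4))  # 401: meeting at the minimum-height mountain
print(cost(A, 0, 4, 2))  # 206: the claimed optimum
print(min(cost(A, 0, 4, k) for k in range(5)))  # 206
```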

Starting with Brute Force

Without a clear way to tell, for a given pair of integers (L_i, R_i), which value of k minimizes the desired sum, we can start just by trying all of them. The most naive brute force approach might look something like this:

def m(A, i, j):
  if j < i:
    return m(A, j, i)
  maximum_value = A[i]
  for k in range(i + 1, j + 1):  # k from i+1 to j, inclusive
    maximum_value = max(maximum_value, A[k])
  return maximum_value

# assume A is the given array of integers, and the desired interval is [l, r]
def f(A, l, r):
  best_cost = float('inf')
  for k in range(l, r + 1):
    cost_with_meeting_at_k = 0
    for i in range(l, r + 1):
      cost_with_meeting_at_k += m(A, i, k)
    best_cost = min(best_cost, cost_with_meeting_at_k)
  return best_cost

What is the runtime of this algorithm? We can annotate the above code.

def m(A, i, j):
  # runs in O(N) time
  if j < i:
    return m(A, j, i)
  maximum_value = A[i]
  for k in range(i + 1, j + 1):  # O(N)
    maximum_value = max(maximum_value, A[k])
  return maximum_value

# assume A is the given array of integers, and the desired interval is [l, r]
def f(A, l, r):
  # runs in O(N^3) time
  best_cost = float('inf')
  for k in range(l, r + 1):  # O(N)
    cost_with_meeting_at_k = 0
    for i in range(l, r + 1):  # O(N)
      cost_with_meeting_at_k += m(A, i, k)  # O(N)
    best_cost = min(best_cost, cost_with_meeting_at_k)
  return best_cost

If we passed all possible inputs to this function, we would solve the problem in O(N^5) time, which is very far from our \Theta(N^2) target.

One way we can start optimizing this is by precomputing all of the possible outputs that m can give. With these results precomputed, the computation in the innermost for loop is O(1) instead of O(N), so f runs in O(N^2) time. Answering all \binom{N+1}{2} queries then takes O(N^4) time, and as long as the precomputation itself takes O(N^4) time or less, we can solve the problem in O(N^4) time.

Computing range maxima quickly

The simplest approach to precomputing all possible outputs for m is as follows:

def m(A, i, j):
  # runs in O(N) time
  if j < i:
    return m(A, j, i)
  maximum_value = A[i]
  for k in range(i + 1, j + 1):  # O(N)
    maximum_value = max(maximum_value, A[k])
  return maximum_value

def precompute_m(A):
  n = len(A)
  m_values = [[0] * n for _ in range(n)]
  for i in range(n):
    for j in range(i, n):
      m_values[i][j] = m(A, i, j)
  return m_values

This approach runs in O(N^3) time, which is a sufficient improvement for now. However, if we want to solve the original problem in O(N^2) time, then we also need to precompute all of these values in O(N^2) time.

Imagine the following hypothetical situation: A = [1, 3, 2, 4]. We note the following:

m(A, 0, 0) = max([1]) = 1
m(A, 0, 1) = max([1, 3]) = 3
m(A, 0, 2) = max([1, 3, 2]) = 3
m(A, 0, 3) = max([1, 3, 2, 4]) = 4

If we look carefully at the lists we're computing maxima for, we see that holding i constant and incrementing j merely adds new integers to the list. Adding a new integer cannot decrease the maximum of the list; in fact, the maximum increases only if the new integer is larger than all of the previous ones. This leads us to the following conclusion when j > i: m(i, j) = \max(m(i, j-1), A[j]). This means that, for a fixed value of i, we can compute m(i, j) for all i \le j \le N-1 in O(N) time. Since there are N distinct possibilities for i, we can compute all possible values of m in O(N^2) time. Sample code for this is as follows:

def precompute_m(A):
  n = len(A)
  m_values = [[0] * n for _ in range(n)]
  for i in range(n):
    m_values[i][i] = A[i]
    for j in range(i + 1, n):
      m_values[i][j] = max(m_values[i][j - 1], A[j])
  return m_values

If we also need to know where these maxima are located, we can track that information as well:

def precompute_m(A):
  n = len(A)
  m_values = [[0] * n for _ in range(n)]
  m_index = [[0] * n for _ in range(n)]
  for i in range(n):
    m_values[i][i] = A[i]
    m_index[i][i] = i
    for j in range(i + 1, n):
      if A[j] > m_values[i][j - 1]:
        m_values[i][j] = A[j]
        m_index[i][j] = j
      else:
        m_values[i][j] = m_values[i][j - 1]
        m_index[i][j] = m_index[i][j - 1]
  return m_values, m_index

Further Observations

We'll return to the mountain analogy for ease of explanation. We noticed earlier that if everyone met on the highest mountain, this would maximize the cost of the meeting. Outside of the case where there's only one mountain, this means that we should consider a mountain at a lower elevation, either to the left or to the right of the mountain with the highest elevation. For notational convenience, let the meeting comprise the individuals from mountains l to r, and let mountain k be the mountain with the highest elevation (choosing one arbitrarily if multiple mountains share the highest elevation).

First, suppose that the meeting is held to the right of mountain k. Everyone on mountains l through k will contribute a cost equal to the height of mountain k, since they must travel through it to meet the other individuals and it is the highest mountain. Since their costs are now fixed, we can ignore them and consider only the individuals on mountains k+1 to r. Note that this is a strictly smaller instance of the original problem: a solution for the individuals on mountains k+1 through r extends to a solution of the original instance for mountains l through r. By symmetric logic, if the meeting is to be held to the left of mountain k, then all the individuals on mountains k through r contribute a cost equal to the height of mountain k, and we are left with the smaller instance of organizing a meeting for the individuals on mountains l through k-1.

Example

Consider A = [9, 8, 10, 6, 7], and imagine that we are computing f(0, 4). A[2] = 10 is the maximum value in the array. If the optimal choice is to the right, then three individuals incur a cost of 10, and we solve the problem instance f(3, 4), and the choices are index 3, giving 6+7=13, or index 4, giving 7+7=14. The final cost in this situation is 3\cdot10 + 13 = 43. If the optimal choice is to the left, then three individuals incur a cost of 10, and we solve the problem instance f(0, 1), and the choices are index 0, giving 9+9=18, or index 1, giving 9+8=17. The final cost in this situation is 3 \cdot 10 + 17 = 47. The minimum cost of the meeting is claimed to be 43.

If we compute f(0, 4) manually for all possible choices of location, we get that location 0 has a cost of 48, location 1 has a cost of 47, location 2 has a cost of 50, location 3 has a cost of 43, and location 4 has a cost of 44. This demonstrates that the minimum cost is 43.

Pseudocode and Performance

We outline the following code for this approach:

def precompute_m(A):
  n = len(A)
  m_values = [[0] * n for _ in range(n)]
  m_index = [[0] * n for _ in range(n)]
  for i in range(n):
    m_values[i][i] = A[i]
    m_index[i][i] = i
    for j in range(i + 1, n):
      if A[j] > m_values[i][j - 1]:
        m_values[i][j] = A[j]
        m_index[i][j] = j
      else:
        m_values[i][j] = m_values[i][j - 1]
        m_index[i][j] = m_index[i][j - 1]
  return m_values, m_index

# m_index comes from precompute_m; memo caches already-answered intervals
def f(A, l, r, m_index, memo):
  if l > r:
    # base case: the interval is empty
    return 0
  if (l, r) in memo:
    return memo[(l, r)]
  k = m_index[l][r]
  # if the optimal choice is in [l, k-1], everyone on mountains k through r pays A[k]
  left_cost = f(A, l, k - 1, m_index, memo) + A[k] * (r - k + 1)
  # if the optimal choice is in [k+1, r], everyone on mountains l through k pays A[k]
  right_cost = f(A, k + 1, r, m_index, memo) + A[k] * (k - l + 1)
  memo[(l, r)] = min(left_cost, right_cost)
  return memo[(l, r)]

What's the complexity of this approach? This isn't immediately obvious - note for example that calling f(0, n-1) might end up calling f(1, n-1), which calls f(2, n-1), going all the way down to f(n-1, n-1), so a single call might take linear time to run. If we call it again though, the run returns immediately due to the memoization. Is it possible that answering the queries in a particular order could result in \omega(N^2) runtime?

It turns out that we can sidestep this concern by choosing the order ourselves, so that the performance is guaranteed to be \Theta(N^2). Imagine answering the queries in nondecreasing order of R_i - L_i: we would start by computing f(i, i) for all valid i, then f(i, i+1) for all valid i, and so on, finishing with f(0, n-1). Note that the intervals [l, k-1] and [k+1, r] are strictly smaller than the interval [l, r]. Therefore, if we process the queries in increasing order of size, then because we memoize all the results, every recursive call made by a given invocation of f immediately hits either a memoized value or the empty base case, and each query is guaranteed to be answered in constant time.
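
Putting these pieces together, here is a compact sketch of the whole \Theta(N^2) computation: precompute the range-maximum indices, then fill a table of answers in nondecreasing order of interval length. The `solve_all` name and the dictionary-keyed answer table are implementation choices of this sketch:

```python
def solve_all(A):
    n = len(A)
    # m_index[i][j] = index of a maximum element of A[i..j], for i <= j
    m_index = [[0] * n for _ in range(n)]
    for i in range(n):
        m_index[i][i] = i
        for j in range(i + 1, n):
            prev = m_index[i][j - 1]
            m_index[i][j] = j if A[j] > A[prev] else prev

    # ans[(l, r)] = minimum meeting cost for the interval [l, r]
    ans = {}
    for length in range(1, n + 1):          # nondecreasing interval length
        for l in range(0, n - length + 1):
            r = l + length - 1
            k = m_index[l][r]
            # meeting to the left of k: everyone on [k, r] pays A[k]
            left = (ans[(l, k - 1)] if l <= k - 1 else 0) + A[k] * (r - k + 1)
            # meeting to the right of k: everyone on [l, k] pays A[k]
            right = (ans[(k + 1, r)] if k + 1 <= r else 0) + A[k] * (k - l + 1)
            ans[(l, r)] = min(left, right)
    return ans

ans = solve_all([9, 8, 10, 6, 7])
print(ans[(0, 4)])  # 43, matching the worked example
```

Both recursive subintervals are strictly shorter than [l, r], so every lookup in the loop body hits an entry filled in an earlier iteration, and each of the \binom{N+1}{2} intervals is handled in constant time.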

Runtime Analysis

When considering if an algorithm will be fast enough to solve a problem, keep in mind that modern-day machines are capable of executing on the order of 100 million operations a second. As a result, one quick way of evaluating if an algorithm is good enough to solve a problem is to plug in the values directly into the bound and compare it to 100 million.

For example, when N is 5000, O(N^2) gives 25 million whereas O(N^3) gives 125 billion, so an O(N^3) algorithm is too slow but an O(N^2) algorithm will be fast enough.

