Parallel Binary Search Algorithm

Note: For discussion, please jump to the blog on Codeforces. This post is re-written and hence a bit more refined, but carries the same underlying idea.

Pre-requisites

Binary search
Range trees such as segment tree

Problem

We aim to solve this SPOJ problem: Meteors. The question simply states :

There are \(N\) member states and \(M\) sectors. Each sector is owned by a member state. There are \(Q\) updates, each of which denote the amount of meteor shower in a \([L, R]\) range of sectors on that day. The \(i^{th}\) member state wants to collect \(reqd_i\) meteors over all its sectors. For every member state, what is the minimum number of days it would have to wait to collect atleast the required amount of meteors?

Naive Solution

For each member state, we can apply all queries one by one and check if their requirement is satisfied. This will have a time complexity of \(O(N \cdot Q \cdot logM)\) by keeping a range tree for supporting range updates and queries. This will fail to run within the allocated time for the given problem. The implementation details are not necessary but as a hint – you can compress the space while working on a single member state to only focus on the relevant sectors for counting meteors.

Even slower solution

We are going to come up with a solution that is slower than the naive solution to build up some intuition on how to come up with the optimized solution.

You can observe that for a given member state \(X\), the collected meteors across it’s owned sectors increases with the days. Hence, we can binary search over the number of days and use the range tree to perform every update. This gives us a time complexity of \(O(N \cdot log Q \cdot Q \cdot logM)\). This will fail to run within the allocated time for the given problem.

Optimized Solution

Let’s observe one binary search step in our above algorithm:

We find \(mid\) of the binary search
We apply all queries \(q_i\) for \(i \le mid\)
We test if the member state’s requirement is satisfied, and depending on it change either \(low\) or \(high\) for our next step

Notice how step (2) is exactly the same for the first step of all \(N\) member states. It would be really helpful if we could do it exactly once for the first step across all member states instead of per member state.

Let’s try to change our naive binary search to implement some sort of caching. Instead of doing a complete binary search per member state, we will do a single binary search over the \(N\) member states.

For the first step, every member state will have \(low = 1\) and \(high = Q\) and hence the same mid point.
After this step, we will end up with two groups – depending on if the members will update their \(low\) or \(high\) value. The member states will have a range of either \([1, \frac{Q}{2})\) or a range \([\frac{Q}{2}, Q]\).
After one more step, we will have 4 ranges – \([1, \frac{Q}{4})\), \([\frac{Q}{4}, \frac{Q}{2})\), \([\frac{Q}{2}, \frac{3Q}{4})\), \([\frac{3Q}{4}, Q]\).
You can see that after \(logQ\) steps, every range will be just a single point – the answer for that member state.

A cool way to visualize this is to think of a binary search tree. Suppose we are doing standard binary search, and we reject the right interval — this can be thought of as moving left in the tree. Similarly, if we reject the left interval, we are moving right in the tree. So what Parallel Binary Search does is move one step down in N binary search trees simultaneously in one “sweep”, taking \(O(N \cdot X)\) time, where \(X\) is dependent on the problem and the data structures used in it. Since the height of each tree is \(logN\), the complexity is \(O(N \cdot X \cdot logN)\).

Analysis

Overall, the algorithm runs for \(logQ\) steps. At every step, we process all \(Q\) queries, and use a range tree which supports range updates and point queries for \(M\) sectors in \(logM\). This gives us a time complexity of \(O(logQ \cdot (Q + M) \cdot logM)\), which is way more efficient that our naive approach. Note that this is the amortized time complexity, since at every step we visit every query, every member state and every meteor sector (for point query) at most once.

Implementation Details

Since this topic is non trivial, I have added this section to discuss some implementation details to make things more concrete. We would need the following data structures in our implementation :

linked list for every member state, denoting the sectors he owns.
arrays \(L\) and \(R\) denoting range of binary search for each member state.
range tree to support range update and point query for \(Q\) queries.
array of linked list \(check\). At every index \(x\), we store the member state IDs whose \(mid\) for their current binary search range is \(x\).

The implementation is actually pretty straight forward once you get the idea. For every step in the simulation, we do the following :

Clear range tree, and update \(check\) array based on the \(mid\) values for all member states.
For each query
- Apply the query on the range tree
- Loop over \(check\) for the current query ID and update the binary search range for member states in the linked list.

Pseudocode

steps = log(len(queries))
for step in range(steps):
    T = range tree
    check = array of linked list
    for all member states i:
        if L[i] != R[i]:
            mid = (L[i] + R[i]) / 2
            check[mid].append(i)
    for q in queries:
        T.apply(q)
        for all member states m in check[q]:
            if m has requirements fulfilled:
                R[m] = q
            else:
                L[m] = q + 1

In this code, the \(apply()\) function applies the query update on the range tree. Also to check if the requirements are fulfilled, one needs to traverse over all the sectors owner by that member state and find out the sum. In case you still have doubts, go over to the next section and see my code for this problem.

Practice

Alternate solutions are possible – almost every problem for parallel binary search can also be solved with some persistent data structure. I picked up this trick from some comments on codeforces blog on Meteors problem, and wrote this due to the lack of a well written tutorial to grasp the concept. Happy coding!