7

I am looking for a data structure to store a set such that given two instances of size $O(n)$ which are known to have non-empty intersection, the minimum element of the intersection can be found in $O(\log n)$ time. Is this possible to achieve, either for worst-case or amortized complexity? Other requirements for the data structure: $O(\log n)$ deletion, $O(n \log n)$ initialization.

Here is an example application of such a data structure, to clarify the requirements. The input consists of $n$ subsets of $\{1, \dots, n\}$, all containing the number $n$. The output is an $n \times n$ matrix whose $(i, j)$ entry is the minimal element in the intersection of sets $i$ and $j$. With a basic approach one can solve this problem in $O(n^3)$ time. With a data structure satisfying the conditions above, one could solve it in $O(n^2 \log n)$ time.
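For concreteness, the basic cubic approach could look roughly like the following sketch. The function name `min_intersection_matrix` is made up for illustration, and Python's built-in `set` is assumed as the representation; each of the $n^2$ pairs costs $O(n)$ for the intersection, which gives the $O(n^3)$ total.

```python
def min_intersection_matrix(sets):
    """Naive baseline: intersect every pair of sets and take the minimum.

    Assumes every pair has a non-empty intersection (in the question,
    all sets contain n), otherwise min() raises ValueError.
    """
    result = []
    for a in sets:
        row = []
        for b in sets:
            # a & b takes time linear in the set sizes, so O(n) per pair
            row.append(min(a & b))
        result.append(row)
    return result


# Three subsets of {1, 2, 3}, all containing 3
example = [{1, 3}, {2, 3}, {1, 2, 3}]
matrix = min_intersection_matrix(example)
```

The hoped-for data structure would replace the `min(a & b)` step with an $O(\log n)$ query.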

pre-kidney

3 Answers

2

You can't. There is no such data structure. Assuming you have a separate instance per set, and each instance is initialized separately (using only information about the set it represents and not any information about any of the other sets), these running times are not achievable.

In particular, when you have two sets, finding the minimum common element takes $\Omega(n)$ time. Indeed, testing disjointness requires $\Omega(n)$ time, as explained here. Now, imagine starting with two sets $S_1,S_2$ over the universe $\{1,2,\dots,n-1\}$. Let $T_1=S_1 \cup \{n\}$ and $T_2 = S_2 \cup \{n\}$. Now $T_1,T_2$ are guaranteed to have a common element. So, if you had a good data structure for your problem, store $T_1$ in one instance of the data structure and $T_2$ in another. Then, if we had a way to find the minimum element of $T_1 \cap T_2$ in $o(n)$ time, this would give us a way to test disjointness of $S_1,S_2$ in $o(n)$ time (just test whether the minimum element is smaller than $n$) -- but we already know the latter is not possible. It follows that the former is not possible, either, i.e., any data structure for your problem must take $\Omega(n)$ time to find the minimum common element of two sets.
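The reduction can be written down as a short sketch. Here `min_common` stands for any procedure that returns the minimum common element of two sets known to intersect; it and the other names are hypothetical, chosen only for illustration. The point is that one call to `min_common` answers the disjointness question, so `min_common` cannot be faster than disjointness testing.

```python
def are_disjoint(s1, s2, n, min_common):
    """Decide whether s1 and s2 (subsets of {1, ..., n-1}) are disjoint.

    min_common is an assumed oracle for the minimum element of the
    intersection of two sets that are guaranteed to intersect.
    """
    t1 = s1 | {n}  # T_1 = S_1 union {n}
    t2 = s2 | {n}  # T_2 = S_2 union {n}
    # n is always common; a smaller answer means S_1 and S_2 share an element
    return min_common(t1, t2) == n


def naive_min_common(a, b):
    """Linear-time stand-in for the oracle, used only to run the sketch."""
    return min(a & b)
```

In the actual argument the sets would already be stored in the data structure, and adding $n$ to each does not change the asymptotic cost.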

This doesn't mean that your application can't be solved efficiently. There still could be a way to solve your application in $O(n^2 \log n)$ time; this result doesn't rule that out.

D.W.
-1

Here is an idea to solve the problem, given 2 sets:

You can store both sets in a single red-black tree. In addition, with every node in the tree we associate one bit that records whether its subtree contains an element belonging to both sets. For the sake of presentation, I call it the insertion bit. I assume the red-black tree orders the elements from left to right.

When inserting an element into the tree, the algorithm checks whether the element already exists in the tree (i.e., in the other set). If not, we insert the element as usual. Otherwise, traveling from the root to the node containing the element, the algorithm turns on the insertion bit of the nodes along that path. In the worst case this takes $O(\log n)$ time.

When deleting an element, the algorithm checks whether the element exists in the tree and whether its insertion bit is turned on. If the element does not exist in the tree, we return an error. If the element exists and the insertion bit is off, we delete the element as in the usual red-black tree algorithm. Otherwise, traveling from the root to the node containing the element, the algorithm turns off the insertion bit of the nodes along that path. Deletion takes $O(\log n)$ time.

Finally, the algorithm for finding the minimal element shared by both sets begins at the root. If the insertion bit of the root is turned off, then the sets are disjoint and the algorithm returns an error. Otherwise, the algorithm travels recursively to the left child if its insertion bit is turned on, and otherwise it travels to the right child. The algorithm stops at the element with the minimal value. The algorithm runs in $O(\log n)$ time.
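Here is a rough sketch of the descent, under some simplifying assumptions of mine: it uses a static balanced binary search tree built once from the sorted union of the two sets rather than a real red-black tree, it keeps a separate `in_both` flag per node next to the subtree bit, and it checks the current node before moving right. All names (`Node`, `build`, `tree_for`, `min_common`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    in_both: bool                   # this key belongs to both sets
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    bit: bool = False               # subtree holds some key in both sets


def build(keys, shared):
    """Build a balanced BST from sorted keys and fill in the subtree bits."""
    if not keys:
        return None
    mid = len(keys) // 2
    node = Node(keys[mid], keys[mid] in shared)
    node.left = build(keys[:mid], shared)
    node.right = build(keys[mid + 1:], shared)
    node.bit = (node.in_both
                or (node.left is not None and node.left.bit)
                or (node.right is not None and node.right.bit))
    return node


def tree_for(a, b):
    """Tree over the union of a and b, marking the keys common to both."""
    return build(sorted(a | b), a & b)


def min_common(root):
    """Follow the bits from the root down to the smallest shared key."""
    if root is None or not root.bit:
        raise ValueError("the sets are disjoint")
    node = root
    while True:
        if node.left is not None and node.left.bit:
            node = node.left        # a smaller shared key lies to the left
        elif node.in_both:
            return node.key         # nothing shared on the left: this is it
        else:
            node = node.right       # the bit guarantees a shared key here
```

For example, `min_common(tree_for({1, 4, 7, 9}, {2, 4, 9}))` walks one root-to-node path, so the query cost is proportional to the height of the tree.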

I am trying to think how to generalize this to a larger number of sets...

user3563894
-1

Initialize:
1) create a red-black tree containing all elements of list #1 - O(n log n) for the entire list.
2) iterate through all elements of list #2, and check if it exists in the red-black tree - O(n log n) for the entire list
3) if it exists in the red-black tree, insert that element from list #2 into your favorite min heap - O(n log n) for the entire list

To then find the minimum intersecting element, just look at the top of the heap, which takes O(1).
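A minimal sketch of these steps, with one substitution: a Python `set` (a hash set) stands in for the red-black tree, so the membership checks are expected O(1) rather than O(log n). The function names are made up for illustration.

```python
import heapq


def build_intersection_heap(list1, list2):
    """Collect the elements of list2 that also occur in list1 in a min-heap."""
    members = set(list1)            # stand-in for the red-black tree of step 1
    heap = []
    for x in list2:                 # step 2: scan list #2
        if x in members:            # step 3: keep only the common elements
            heapq.heappush(heap, x)
    return heap


def min_intersecting(heap):
    """The smallest common element sits at the top of the heap."""
    return heap[0]


heap = build_intersection_heap([5, 3, 9, 1], [9, 4, 3, 7])
smallest = min_intersecting(heap)
```

Note that the heap is built for one specific pair of lists, so the initialization cost is paid per pair.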

Churro