2

Given a self-balancing binary search tree of size $n$, I want to perform the following operations:

  1. InsertInOrderSequentialBatch: insert an ordered sequence of $k$ values (specialized $k \in \{2, 3, 4\}$ or generalized $k \in \mathbb{N}$) which are guaranteed to be consecutive in an in-order tree traversal immediately after insertion.

    • For example, insert $[310, 320, 330, 340]$ into a balanced tree containing $[100, 200, 500]$.

    • Future insertions might still be between the inserted nodes' values.

  2. DeleteRange: delete all $k$ nodes (likewise specialized or generalized) whose values lie between two given values in the BST.

For both operations, I want the tree to remain balanced.

With a Red-Black or AVL tree, I can achieve both operations through a sequence of $k$ insertions/deletions in $\mathcal{O}(k \log n)$. I wonder whether a data structure (e.g. an AVL/RB tree variant) could achieve $\mathcal{O}(\log n + k \log k)$ time for a real-world performance gain, for example: traverse the tree to the insertion point, insert a balanced tree of the $k$ values there, then perform one rebalancing pass.

My needs are less theoretical and more about wall-clock time and memory pressure for an implementation of Fortune's Algorithm, so an algorithm with high constant factors is unfortunately not useful. My insertion sequence is biased and frequently multimodal (modes not known in advance), with long streaks of insertions/deletions within modes. My tree size is anywhere from 100 to 3000 nodes.

Warty

2 Answers

2

This can be solved with split and join operations, both achievable in $\mathcal{O}(\log n)$ for Red-Black and AVL trees.

For Red-Black trees, this is doable either by leveraging finger trees or by extending a regular Red-Black tree, as in Ron Wein's "Efficient Implementation of Red-Black Trees with Split and Catenate Operations". That approach is implemented in CGAL; criticisms of it are mentioned in https://stackoverflow.com/questions/29029894/red-black-tree-split-concatenate-in-logn-time (Tarjan might have a better paper).

For AVL trees this is doable according to Ramzi Fadel and Kim Vagn Jakobsen's "Data structures and algorithms in a two-level memory".

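InsertInOrderSequentialBatch is expressible as a split followed by 2 joins: split the tree at the insertion point, then join the left part, the batch, and the right part. DeleteRange is 2 splits followed by 1 join: split at both bounds, discard the middle part, and join the two outer parts.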
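To make the composition concrete, here is a minimal Python sketch. It assumes the join-based AVL formulation (`join` as the primitive, `split` derived from it), which is not necessarily how the papers cited above do it. All names (`join`, `split`, `build`, `insert_batch`, `delete_range`) are hypothetical, and the code allocates new nodes rather than mutating in place, so it illustrates the logic rather than a memory-friendly implementation. It also assumes that no existing key lies inside the inserted batch's range.

```python
class Node:
    __slots__ = ("left", "key", "right", "height")
    def __init__(self, left, key, right):
        self.left, self.key, self.right = left, key, right
        self.height = 1 + max(height(left), height(right))

def height(t):
    return t.height if t else 0

def rot_left(t):
    r = t.right
    return Node(Node(t.left, t.key, r.left), r.key, r.right)

def rot_right(t):
    l = t.left
    return Node(l.left, l.key, Node(l.right, t.key, t.right))

def join_right(l, k, r):
    # l is taller than r by more than 1: descend l's right spine.
    if height(l.right) <= height(r) + 1:
        t = Node(l.right, k, r)
        if t.height <= height(l.left) + 1:
            return Node(l.left, l.key, t)
        return rot_left(Node(l.left, l.key, rot_right(t)))
    t = Node(l.left, l.key, join_right(l.right, k, r))
    return t if t.right.height <= height(l.left) + 1 else rot_left(t)

def join_left(l, k, r):
    # Mirror image of join_right: r is taller than l by more than 1.
    if height(r.left) <= height(l) + 1:
        t = Node(l, k, r.left)
        if t.height <= height(r.right) + 1:
            return Node(t, r.key, r.right)
        return rot_right(Node(rot_left(t), r.key, r.right))
    t = Node(join_left(l, k, r.left), r.key, r.right)
    return t if t.left.height <= height(r.right) + 1 else rot_right(t)

def join(l, k, r):
    # Requires: every key in l < k < every key in r.
    if height(l) > height(r) + 1:
        return join_right(l, k, r)
    if height(r) > height(l) + 1:
        return join_left(l, k, r)
    return Node(l, k, r)

def split(t, k):
    # Returns (keys < k, keys > k); k itself is dropped if present.
    if t is None:
        return None, None
    if k == t.key:
        return t.left, t.right
    if k < t.key:
        a, b = split(t.left, k)
        return a, join(b, t.key, t.right)
    a, b = split(t.right, k)
    return join(t.left, t.key, a), b

def split_last(t):
    # Removes the maximum key; returns (remaining tree, max key).
    if t.right is None:
        return t.left, t.key
    rest, k = split_last(t.right)
    return join(t.left, t.key, rest), k

def join2(l, r):
    # Join without a middle key.
    if l is None:
        return r
    rest, k = split_last(l)
    return join(rest, k, r)

def build(vals, lo=0, hi=None):
    # Balanced tree from the sorted slice vals[lo:hi] in linear time.
    if hi is None:
        hi = len(vals)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return Node(build(vals, lo, mid), vals[mid], build(vals, mid + 1, hi))

def insert_batch(t, vals):
    # One split, then two joins; vals is sorted and non-empty.
    l, r = split(t, vals[0])
    if len(vals) == 1:
        return join(l, vals[0], r)
    mid = build(vals, 1, len(vals) - 1)
    return join(join(l, vals[0], mid), vals[-1], r)

def delete_range(t, lo, hi):
    # Two splits, then one join; removes every key in [lo, hi].
    l, rest = split(t, lo)
    _, r = split(rest, hi)
    return join2(l, r)

def inorder(t):
    return inorder(t.left) + [t.key] + inorder(t.right) if t else []

def is_avl(t):
    if t is None:
        return True
    return (abs(height(t.left) - height(t.right)) <= 1
            and is_avl(t.left) and is_avl(t.right))
```

With the example from the question, `insert_batch(build([100, 200, 500]), [310, 320, 330, 340])` yields a tree whose in-order traversal is `[100, 200, 310, 320, 330, 340, 500]`, and `delete_range` on that result with bounds `310` and `340` restores `[100, 200, 500]`.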

Warty
0

Red-Black tree insertion requires amortized $\mathcal{O}(1)$ recolorings and rotations; if you already know where to insert, inserting and rebalancing is amortized $\mathcal{O}(1)$. Multi-insertion can therefore be achieved with a single $\mathcal{O}(\log n)$ search followed by $k$ amortized $\mathcal{O}(1)$ insert-successor-and-rebalance operations. This works because insert-successor-and-rebalance can be implemented such that the new node's black height is always $\le 2$, which bounds the search for the next insertion point to constant time.
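The search-once, insert-successor-$k$-times idea can be sketched as follows. This is a minimal Python illustration assuming a textbook red-black tree with parent pointers and a shared sentinel; the names `RBTree`, `predecessor_of`, `insert_after`, and `insert_batch` are hypothetical. It does not enforce the black-height bound described above, so the short descent inside `insert_after` is not guaranteed to be constant-time in this sketch.

```python
RED, BLACK = 0, 1

class Node:
    __slots__ = ("key", "color", "left", "right", "parent")
    def __init__(self, key, color, nil):
        self.key, self.color = key, color
        self.left = self.right = self.parent = nil

class RBTree:
    def __init__(self):
        self.nil = Node(None, BLACK, None)   # shared black sentinel
        self.nil.left = self.nil.right = self.nil.parent = self.nil
        self.root = self.nil

    def _rotate(self, x, left):
        # left=True is a left rotation around x, left=False a right rotation.
        y = x.right if left else x.left
        inner = y.left if left else y.right
        if left: x.right = inner
        else: x.left = inner
        if inner is not self.nil: inner.parent = x
        y.parent = x.parent
        if x.parent is self.nil: self.root = y
        elif x is x.parent.left: x.parent.left = y
        else: x.parent.right = y
        if left: y.left = x
        else: y.right = x
        x.parent = y

    def _fixup(self, z):
        # Standard bottom-up repair of a red-red violation at z.
        while z.parent.color == RED:
            g = z.parent.parent
            parent_is_left = z.parent is g.left
            uncle = g.right if parent_is_left else g.left
            if uncle.color == RED:           # recolor and move up
                z.parent.color = uncle.color = BLACK
                g.color = RED
                z = g
            else:
                inner = z.parent.right if parent_is_left else z.parent.left
                if z is inner:               # zig-zag: straighten first
                    z = z.parent
                    self._rotate(z, left=parent_is_left)
                z.parent.color = BLACK
                g.color = RED
                self._rotate(g, left=not parent_is_left)
        self.root.color = BLACK

    def predecessor_of(self, key):
        # The single O(log n) search: last node with a key below `key`.
        x, best = self.root, self.nil
        while x is not self.nil:
            if x.key < key: best, x = x, x.right
            else: x = x.left
        return best

    def insert_after(self, pred, key):
        # Insert key as the in-order successor of pred, without a root search.
        z = Node(key, RED, self.nil)
        if pred is self.nil:                 # key becomes the new minimum
            p, as_left = self.root, True
            while p is not self.nil and p.left is not self.nil: p = p.left
        elif pred.right is self.nil:
            p, as_left = pred, False
        else:                                # leftmost node of pred's right subtree
            p, as_left = pred.right, True
            while p.left is not self.nil: p = p.left
        z.parent = p
        if p is self.nil: self.root = z
        elif as_left: p.left = z
        else: p.right = z
        self._fixup(z)
        return z

    def insert_batch(self, values):
        # values must be sorted and consecutive in the final in-order sequence.
        pred = self.predecessor_of(values[0])
        for v in values:
            pred = self.insert_after(pred, v)

    def inorder(self):
        out, stack, x = [], [], self.root
        while stack or x is not self.nil:
            while x is not self.nil:
                stack.append(x); x = x.left
            x = stack.pop(); out.append(x.key); x = x.right
        return out
```

Each inserted node is reused as the predecessor handle for the next value, so only the first value in a batch pays for a search from the root.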

Alternatively, though it is more complex: if blackHeight(treeN) > blackHeight(treeK), insertion can be done in $\mathcal{O}(\log(n/k) + k)$ time. Treeify the $k$ values, insert the resulting treeK into treeN at a leaf, and "pull" it upward to maintain the black-height invariants. This is very similar to implementing a split-join.

In my own code, I use the $\mathcal{O}(\log n + k)$ approach described first whenever the tree built from the $k$ values has a smaller black height than the main tree. Otherwise, I suspect that past some threshold the split-join approach is going to be faster.

Warty