A data structure for optimistic concurrency control on ranges of bitwise-lexicographically-ordered keys.

Intended to replace FoundationDB's skip list.

Hardware for all benchmarks is a 2020 M1 Mac.

# FoundationDB's benchmark

## Skip list

```
New conflict set: 1.964 sec
                  0.637 Mtransactions/sec
                  2.546 Mkeys/sec
Detect only:      1.859 sec
                  0.672 Mtransactions/sec
                  2.690 Mkeys/sec
Skiplist only:    1.275 sec
                  0.980 Mtransactions/sec
                  3.921 Mkeys/sec
Performance counters:
            Build: 0.0496
              Add: 0.0539
           Detect: 1.86
           D.Sort: 0.412
        D.Combine: 0.0139
      D.CheckRead: 0.682
D.CheckIntraBatch: 0.00673
     D.MergeWrite: 0.593
   D.RemoveBefore: 0.148
```

## Radix tree (this implementation)

```
New conflict set: 1.289 sec
                  0.970 Mtransactions/sec
                  3.879 Mkeys/sec
Detect only:      1.199 sec
                  1.043 Mtransactions/sec
                  4.170 Mkeys/sec
Skiplist only:    0.542 sec
                  2.305 Mtransactions/sec
                  9.220 Mkeys/sec
Performance counters:
            Build: 0.0395
              Add: 0.0492
           Detect: 1.2
           D.Sort: 0.411
        D.Combine: 0.0135
      D.CheckRead: 0.22
D.CheckIntraBatch: 0.00652
     D.MergeWrite: 0.322
   D.RemoveBefore: 0.223
```

# Our benchmark

## Skip list

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|              249.47 |        4,008,533.72 |    0.5% |      0.01 | `point reads`
|              236.78 |        4,223,252.12 |    0.4% |      0.01 | `prefix reads`
|              432.76 |        2,310,737.74 |    0.1% |      0.01 | `range reads`
|              449.42 |        2,225,105.96 |    0.5% |      0.01 | `point writes`
|              438.47 |        2,280,674.39 |    0.3% |      0.01 | `prefix writes`
|              242.92 |        4,116,581.59 |    3.7% |      0.02 | `range writes`
|              553.31 |        1,807,319.60 |    0.7% |      0.01 | `monotonic increasing point writes`

## Radix tree (this implementation)

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               19.35 |       51,669,510.11 |    0.2% |      0.01 | `point reads`
|               56.64 |       17,655,057.00 |    0.2% |      0.01 | `prefix reads`
|              217.04 |        4,607,450.05 |    0.2% |      0.01 | `range reads`
|               26.31 |       38,012,015.29 |    0.0% |      0.01 | `point writes`
|               41.45 |       24,124,003.19 |    0.2% |      0.01 | `prefix writes`
|               48.33 |       20,691,082.14 |    0.0% |      0.01 | `range writes`
|               84.92 |       11,775,856.81 |    4.0% |      0.01 | `monotonic increasing point writes`

# "Real data" test

Point queries only, best of three runs. Gc ratio is the ratio of time spent doing garbage collection to time spent adding writes or doing garbage collection. Lower is better.

## Skip list

```
Check: 11.267 seconds, 331.812 MB/s, Add: 5.33323 seconds, 131.635 MB/s, Gc ratio: 45.4671%
```

## Radix tree

```
Check: 2.48508 seconds, 1504.38 MB/s, Add: 2.07295 seconds, 338.666 MB/s, Gc ratio: 42.2825%
```

## Hash table

(The hash table implementation doesn't work on range queries, and its purpose is to provide an idea of how fast point queries can be)

```
Check: 1.83931 seconds, 2032.56 MB/s, Add: 0.607861 seconds, 1154.93 MB/s, Gc ratio: 48.7142%
```
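
As a reading aid for the Gc ratio figures in the "Real data" results, here is a minimal sketch of the ratio as defined above: garbage-collection time divided by the sum of garbage-collection time and write-adding time. The function name `gc_ratio` and the timings are hypothetical and only illustrate the arithmetic. They are not taken from the benchmark harness, and the sketch does not assume how the harness measures either component.

```python
def gc_ratio(gc_seconds: float, add_seconds: float) -> float:
    """Fraction of (gc + add) time that was spent in garbage collection."""
    total = gc_seconds + add_seconds
    if total == 0.0:
        # No time recorded at all; report 0 rather than dividing by zero.
        return 0.0
    return gc_seconds / total


# Hypothetical timings: 1 s of garbage collection, 3 s of adding writes.
print(f"Gc ratio: {gc_ratio(1.0, 3.0):.2%}")  # Gc ratio: 25.00%
```

Under this definition a ratio near 50% means garbage collection costs about as much as adding the writes themselves.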