A data structure for optimistic concurrency control on ranges of bitwise-lexicographically-ordered keys.

Intended to replace FoundationDB's skip list.

Hardware for all benchmarks is a Mac M1 2020.

# FoundationDB's benchmark

## Skip list

```
New conflict set: 1.962 sec
                  0.637 Mtransactions/sec
                  2.549 Mkeys/sec
Detect only:      1.853 sec
                  0.674 Mtransactions/sec
                  2.698 Mkeys/sec
Skiplist only:    1.269 sec
                  0.985 Mtransactions/sec
                  3.940 Mkeys/sec
Performance counters:
    Build: 0.0526
    Add: 0.054
    Detect: 1.85
    D.Sort: 0.413
    D.Combine: 0.0148
    D.CheckRead: 0.678
    D.CheckIntraBatch: 0.00679
    D.MergeWrite: 0.591
    D.RemoveBefore: 0.147
```

## Radix tree (this implementation)

```
New conflict set: 1.342 sec
                  0.931 Mtransactions/sec
                  3.726 Mkeys/sec
Detect only:      1.227 sec
                  1.018 Mtransactions/sec
                  4.074 Mkeys/sec
Skiplist only:    0.576 sec
                  2.169 Mtransactions/sec
                  8.676 Mkeys/sec
Performance counters:
    Build: 0.0568
    Add: 0.0564
    Detect: 1.23
    D.Sort: 0.414
    D.Combine: 0.0162
    D.CheckRead: 0.228
    D.CheckIntraBatch: 0.00665
    D.MergeWrite: 0.348
    D.RemoveBefore: 0.211
```

# Our benchmark

## Skip list

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|              254.17 |        3,934,426.23 |    0.3% |      0.01 | `point reads`
|              275.58 |        3,628,769.92 |    2.1% |      0.01 | `prefix reads`
|              494.87 |        2,020,718.51 |    0.7% |      0.01 | `range reads`
|              495.91 |        2,016,512.61 |    1.7% |      0.01 | `point writes`
|              478.11 |        2,091,578.00 |    0.8% |      0.01 | `prefix writes`
|              307.08 |        3,256,480.40 |    1.9% |      0.04 | `range writes`
|              610.89 |        1,636,953.81 |    0.7% |      0.01 | `monotonic increasing point writes`

## Radix tree (this implementation)

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               22.39 |       44,667,832.17 |    0.2% |      0.01 | `point reads`
|               53.89 |       18,554,840.41 |    3.4% |      0.01 | `prefix reads`
|              217.71 |        4,593,220.24 |    0.2% |      0.01 | `range reads`
|               29.72 |       33,642,099.66 |    1.1% |      0.01 | `point writes`
|               44.86 |       22,294,037.02 |    0.8% |      0.01 | `prefix writes`
|               55.00 |       18,181,818.18 |    0.8% |      0.03 | `range writes`
|              109.21 |        9,156,917.53 |    2.9% |      0.01 | `monotonic increasing point writes`

# "Real data" test

Point queries only, best of three runs. Gc ratio is the ratio of time spent doing garbage collection to time spent adding writes or doing garbage collection. Lower is better.

## Skip list

```
Check: 11.7282 seconds, 318.763 MB/s, Add: 5.76499 seconds, 121.776 MB/s, Gc ratio: 48.0772%
```

## Radix tree

```
Check: 3.27475 seconds, 1141.62 MB/s, Add: 2.13594 seconds, 328.678 MB/s, Gc ratio: 50.1739%
```

## Hash table

(The hash table implementation doesn't work on range queries, and its purpose is to provide an idea of how fast point queries can be)

```
Check: 1.86291 seconds, 2006.81 MB/s, Add: 0.923653 seconds, 760.067 MB/s, Gc ratio: 54.2605%
```
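
The Gc ratio reported in the "Real data" results follows directly from its definition: garbage collection time divided by the sum of garbage collection time and time spent adding writes. The sketch below only illustrates that arithmetic. The function name `gc_ratio` and the timings are hypothetical and are not taken from the benchmark harness or its measurements.

```python
def gc_ratio(gc_seconds: float, add_seconds: float) -> float:
    """Fraction of (gc + add) time that was spent in garbage collection.

    Hypothetical helper: mirrors the definition given in the text, not the
    benchmark's actual code.
    """
    total = gc_seconds + add_seconds
    if total <= 0:
        raise ValueError("total time must be positive")
    return gc_seconds / total


# Hypothetical timings, chosen only to show the arithmetic.
example = gc_ratio(gc_seconds=1.0, add_seconds=3.0)
print(f"Gc ratio: {example:.2%}")  # prints "Gc ratio: 25.00%"
```

A ratio near 50%, as in the results above, means roughly as much time goes to garbage collection as to adding writes.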