A data structure for optimistic concurrency control on ranges of bitwise-lexicographically-ordered keys.

Intended to replace FoundationDB's skip list.

Hardware for all benchmarks is a Mac M1 2020.

# FoundationDB's benchmark

## Skip list

```
New conflict set: 1.927 sec
                  0.649 Mtransactions/sec
                  2.595 Mkeys/sec
Detect only:      1.838 sec
                  0.680 Mtransactions/sec
                  2.721 Mkeys/sec
Skiplist only:    1.256 sec
                  0.995 Mtransactions/sec
                  3.981 Mkeys/sec
Performance counters:
  Build: 0.0381
  Add: 0.0499
  Detect: 1.84
  D.Sort: 0.411
  D.Combine: 0.0141
  D.CheckRead: 0.667
  D.CheckIntraBatch: 0.00673
  D.MergeWrite: 0.589
  D.RemoveBefore: 0.146
```

## Radix tree (this implementation)

```
New conflict set: 1.318 sec
                  0.949 Mtransactions/sec
                  3.795 Mkeys/sec
Detect only:      1.202 sec
                  1.040 Mtransactions/sec
                  4.160 Mkeys/sec
Skiplist only:    0.542 sec
                  2.307 Mtransactions/sec
                  9.227 Mkeys/sec
Performance counters:
  Build: 0.0566
  Add: 0.058
  Detect: 1.2
  D.Sort: 0.411
  D.Combine: 0.0136
  D.CheckRead: 0.22
  D.CheckIntraBatch: 0.00659
  D.MergeWrite: 0.322
  D.RemoveBefore: 0.226
```

# Our benchmark

## Skip list

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|              257.12 |        3,889,241.18 |    0.2% |      0.01 | `point reads`
|              276.38 |        3,618,145.21 |    0.3% |      0.01 | `prefix reads`
|              494.19 |        2,023,531.84 |    0.2% |      0.01 | `range reads`
|              451.22 |        2,216,229.54 |    1.3% |      0.01 | `point writes`
|              435.80 |        2,294,622.46 |    0.3% |      0.01 | `prefix writes`
|              246.67 |        4,053,999.27 |    4.2% |      0.02 | `range writes`
|              555.46 |        1,800,304.91 |    0.9% |      0.01 | `monotonic increasing point writes`

## Radix tree (this implementation)

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               19.40 |       51,554,711.61 |    0.2% |      0.01 | `point reads`
|               57.10 |       17,514,573.13 |    0.4% |      0.01 | `prefix reads`
|              215.65 |        4,637,096.77 |    0.4% |      0.01 | `range reads`
|               27.52 |       36,340,784.38 |    0.2% |      0.01 | `point writes`
|               42.16 |       23,720,515.40 |    0.7% |      0.01 | `prefix writes`
|               48.33 |       20,691,082.14 |    2.7% |      0.01 | `range writes`
|               87.93 |       11,372,164.55 |    2.5% |      0.01 | `monotonic increasing point writes`

# "Real data" test

Point queries only, best of three runs. Gc ratio is the ratio of time spent doing garbage collection to time spent adding writes or doing garbage collection. Lower is better.

## Skip list

```
Check: 11.3839 seconds, 328.404 MB/s, Add: 5.32878 seconds, 131.745 MB/s, Gc ratio: 45.5903%
```

## Radix tree

```
Check: 2.55069 seconds, 1465.69 MB/s, Add: 2.08443 seconds, 336.801 MB/s, Gc ratio: 41.748%
```

## Hash table

(The hash table implementation doesn't work on range queries, and its purpose is to provide an idea of how fast point queries can be)

```
Check: 1.84205 seconds, 2029.54 MB/s, Add: 0.60281 seconds, 1164.61 MB/s, Gc ratio: 48.8159%
```
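
The Gc ratio reported in the "Real data" results is defined in prose only, so here is a minimal sketch of that definition as arithmetic. The function name `gc_ratio` and the timings in the usage lines are hypothetical and are not taken from the benchmark harness. The sketch assumes the two timers are disjoint, that is, time spent adding writes does not already include garbage-collection time, which the results above do not confirm.

```python
def gc_ratio(gc_seconds: float, add_seconds: float) -> float:
    """Fraction of (add + gc) time spent in garbage collection.

    Assumes gc_seconds and add_seconds are measured separately and do
    not overlap (an assumption, not something the benchmark states).
    """
    total = gc_seconds + add_seconds
    if total <= 0:
        raise ValueError("total time must be positive")
    return gc_seconds / total


# Hypothetical timings: 1 s of garbage collection and 3 s of adding writes.
ratio = gc_ratio(gc_seconds=1.0, add_seconds=3.0)
print(f"Gc ratio: {ratio:.2%}")
```

Under this definition a ratio near 50% means garbage collection costs about as much as inserting the writes themselves.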