A data structure for optimistic concurrency control on ranges of bitwise-lexicographically-ordered keys.
Intended as an alternative to FoundationDB's skip list.
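
To make the problem concrete, here is a deliberately naive sketch of what a conflict set for optimistic concurrency control has to answer: given a read of a key range at some read version, has any overlapping range been written at a newer version? Everything in it (`NaiveConflictSet`, `addWrite`, `conflicts`, `setOldestVersion`, `pointRange`) is a hypothetical name chosen for illustration. It is not this project's API, and it uses a linear scan over a vector rather than a radix tree or skip list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only; not this project's API or memory layout.
struct WriteRange {
  std::string begin; // inclusive
  std::string end;   // exclusive
  int64_t version;   // version at which the write committed
};

struct NaiveConflictSet {
  std::vector<WriteRange> writes;

  // Record that keys in [begin, end) were written at `version`.
  void addWrite(std::string begin, std::string end, int64_t version) {
    writes.push_back({std::move(begin), std::move(end), version});
  }

  // A read of [begin, end) at `readVersion` conflicts if any overlapping
  // write has a newer version. std::string compares bytes as unsigned
  // char, which matches bitwise-lexicographic key order.
  bool conflicts(const std::string &begin, const std::string &end,
                 int64_t readVersion) const {
    for (const WriteRange &w : writes) {
      bool overlaps = w.begin < end && begin < w.end;
      if (overlaps && w.version > readVersion) return true;
    }
    return false;
  }

  // Garbage collection: forget writes that no future read can conflict
  // with, assuming every future read uses readVersion >= oldestVersion.
  void setOldestVersion(int64_t oldestVersion) {
    writes.erase(std::remove_if(writes.begin(), writes.end(),
                                [&](const WriteRange &w) {
                                  return w.version <= oldestVersion;
                                }),
                 writes.end());
  }
};

// A point key k can be expressed as the range [k, k + '\0').
inline std::pair<std::string, std::string> pointRange(const std::string &k) {
  return {k, k + '\0'};
}

inline void exampleUsage() {
  NaiveConflictSet cs;
  cs.addWrite("b", "d", /*version=*/10);
  assert(cs.conflicts("c", "e", /*readVersion=*/5));   // overlaps a newer write
  assert(!cs.conflicts("c", "e", /*readVersion=*/10)); // read already saw the write
  assert(!cs.conflicts("d", "e", /*readVersion=*/5));  // end bound is exclusive
  auto p = pointRange("b");
  assert(cs.conflicts(p.first, p.second, /*readVersion=*/5));
  cs.setOldestVersion(10); // drops the only write
  assert(cs.writes.empty());
}
```

The scan is linear in the number of stored writes per check, which is why an ordered structure such as a skip list or radix tree is used in practice.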
All benchmarks were run on an AMD Ryzen 9 7900 with 2x32GB of 5600MT/s CL28-34-34-89 1.35V RAM.
```
$ clang++ --version
Ubuntu clang version 20.0.0 (++20241118082208+63b926af5ff4-1~exp1~20241118082226.549)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-20/bin
```
# Microbenchmark
## Skip list
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 163.82 | 6,104,214.81 | 0.0% | 3,014.03 | 823.99 | 3.658 | 504.59 | 0.0% | 1.96 | `point reads`
| 160.65 | 6,224,551.06 | 0.0% | 2,954.16 | 808.03 | 3.656 | 490.17 | 0.0% | 1.92 | `prefix reads`
| 243.07 | 4,114,000.62 | 0.0% | 3,592.41 | 1,224.77 | 2.933 | 629.31 | 0.0% | 2.90 | `range reads`
| 454.64 | 2,199,540.91 | 0.1% | 4,450.57 | 2,297.43 | 1.937 | 707.92 | 2.1% | 5.43 | `point writes`
| 451.41 | 2,215,265.93 | 0.0% | 4,410.22 | 2,281.37 | 1.933 | 694.74 | 2.1% | 5.39 | `prefix writes`
| 302.92 | 3,301,213.43 | 0.0% | 2,315.38 | 1,530.92 | 1.512 | 396.69 | 3.3% | 3.60 | `range writes`
| 470.36 | 2,126,014.73 | 1.0% | 6,999.33 | 2,457.12 | 2.849 | 1,251.74 | 1.3% | 0.06 | `monotonic increasing point writes`
| 129,198.14 | 7,740.05 | 0.5% | 807,446.00 | 655,645.00 | 1.232 | 144,584.67 | 0.0% | 0.01 | `worst case for radix tree`
| 46.34 | 21,578,562.58 | 0.7% | 902.00 | 232.26 | 3.884 | 132.00 | 0.0% | 0.01 | `create and destroy`
## Radix tree (this implementation)
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 13.00 | 76,916,274.88 | 0.1% | 247.13 | 65.16 | 3.793 | 32.64 | 0.8% | 0.16 | `point reads`
| 15.03 | 66,535,609.26 | 0.1% | 299.99 | 75.25 | 3.987 | 42.50 | 0.5% | 0.18 | `prefix reads`
| 35.65 | 28,054,245.19 | 0.0% | 782.70 | 178.41 | 4.387 | 106.65 | 0.2% | 0.43 | `range reads`
| 20.79 | 48,099,317.88 | 0.4% | 376.04 | 104.10 | 3.612 | 49.97 | 0.7% | 0.25 | `point writes`
| 45.17 | 22,137,997.58 | 0.0% | 666.07 | 227.27 | 2.931 | 101.33 | 0.3% | 0.54 | `prefix writes`
| 40.35 | 24,784,682.32 | 0.0% | 732.33 | 202.46 | 3.617 | 111.64 | 0.1% | 0.49 | `range writes`
| 79.04 | 12,651,349.21 | 2.4% | 1,462.61 | 399.37 | 3.662 | 279.17 | 0.1% | 0.01 | `monotonic increasing point writes`
| 313,894.00 | 3,185.79 | 0.3% | 4,043,060.00 | 1,585,794.00 | 2.550 | 714,828.00 | 0.1% | 0.01 | `worst case for radix tree`
| 109.01 | 9,173,544.09 | 0.5% | 2,046.00 | 547.40 | 3.738 | 329.00 | 0.0% | 0.01 | `create and destroy`
# "Real data" test
Point queries only. Gc ratio is the fraction of time spent on garbage collection out of the total time spent adding writes and doing garbage collection. Lower is better.
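
As a sketch of the Gc ratio definition above, assuming the two durations are measured separately (the function and parameter names are hypothetical, not taken from the benchmark source):

```cpp
#include <cassert>

// Hypothetical helper illustrating the definition of "Gc ratio".
// gcSeconds:  time spent doing garbage collection
// addSeconds: time spent adding writes (excluding garbage collection)
inline double gcRatio(double gcSeconds, double addSeconds) {
  assert(gcSeconds >= 0 && addSeconds >= 0 && gcSeconds + addSeconds > 0);
  return gcSeconds / (gcSeconds + addSeconds);
}
```

With made-up inputs, `gcRatio(1.0, 3.0)` returns `0.25`, which would be reported as 25%.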
## Skip list
```
Check: 4.65306 seconds, 362.382 MB/s, Add: 3.92107 seconds, 146.729 MB/s, Gc ratio: 33.6261%, Peak idle memory: 5.61007e+06
```
## Radix tree
```
Check: 1.02927 seconds, 1638.24 MB/s, Add: 1.33663 seconds, 430.436 MB/s, Gc ratio: 35.3609%, Peak idle memory: 2.32922e+06
```
## Hash table
The hash table implementation does not support range queries. It is included only to give an idea of how fast point queries can be.
```
Check: 0.856661 seconds, 1968.32 MB/s, Add: 0.709563 seconds, 810.831 MB/s, Gc ratio: 35.0388%, Peak idle memory: 0
```