A data structure for optimistic concurrency control on ranges of bitwise-lexicographically-ordered keys.

Intended to replace FoundationDB's skip list.

All benchmarks were run on a 2020 Apple M1 Mac.

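
To make the opening description concrete, here is a minimal brute-force sketch of what a conflict set for optimistic concurrency control has to answer: given a read of a key range at some version, was any overlapping range written at a later version? The class and method names (`NaiveConflictSet`, `add_write`, `check_read`, `garbage_collect`) are hypothetical and chosen for illustration; this is not the radix tree, the skip list, or either one's API. Python `bytes` compare byte by byte as unsigned values, which matches the key ordering described above.

```python
from dataclasses import dataclass, field


@dataclass
class NaiveConflictSet:
    """Brute-force reference model (hypothetical names, O(n) per check)."""

    # Each entry is (begin, end, version) for a write to the half-open range [begin, end).
    writes: list = field(default_factory=list)

    def add_write(self, begin: bytes, end: bytes, version: int) -> None:
        # Record that every key in [begin, end) was written at `version`.
        self.writes.append((begin, end, version))

    def check_read(self, begin: bytes, end: bytes, read_version: int) -> bool:
        # Return True if some write overlapping [begin, end) happened
        # after `read_version`, i.e. the read conflicts.
        for w_begin, w_end, w_version in self.writes:
            overlaps = w_begin < end and begin < w_end
            if overlaps and w_version > read_version:
                return True
        return False

    def garbage_collect(self, oldest_read_version: int) -> None:
        # Assumption: no future read uses a version below `oldest_read_version`,
        # so writes at or below that version can never cause a conflict.
        self.writes = [w for w in self.writes if w[2] > oldest_read_version]


def point(key: bytes) -> tuple:
    # A single key expressed as the half-open range [key, key + b"\x00").
    return key, key + b"\x00"


cs = NaiveConflictSet()
cs.add_write(*point(b"apple"), version=10)
print(cs.check_read(b"a", b"b", read_version=5))   # read is older than the write
print(cs.check_read(b"a", b"b", read_version=10))  # read already saw the write
print(cs.check_read(b"b", b"c", read_version=5))   # ranges do not overlap
```

A real implementation answers the same questions without scanning every write, which is where the choice between a skip list and a radix tree matters.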
# FoundationDB's benchmark

## Skip list

```
New conflict set: 1.957 sec
0.639 Mtransactions/sec
2.555 Mkeys/sec
Detect only: 1.845 sec
0.678 Mtransactions/sec
2.710 Mkeys/sec
Skiplist only: 1.263 sec
0.990 Mtransactions/sec
3.960 Mkeys/sec
Performance counters:
Build: 0.0546
Add: 0.0563
Detect: 1.84
D.Sort: 0.412
D.Combine: 0.0141
D.CheckRead: 0.671
D.CheckIntraBatch: 0.0068
D.MergeWrite: 0.592
D.RemoveBefore: 0.146
```

## Radix tree (this implementation)

```
New conflict set: 1.366 sec
0.915 Mtransactions/sec
3.660 Mkeys/sec
Detect only: 1.248 sec
1.002 Mtransactions/sec
4.007 Mkeys/sec
Skiplist only: 0.573 sec
2.182 Mtransactions/sec
8.730 Mkeys/sec
Performance counters:
Build: 0.0594
Add: 0.0572
Detect: 1.25
D.Sort: 0.418
D.Combine: 0.0149
D.CheckRead: 0.232
D.CheckIntraBatch: 0.0067
D.MergeWrite: 0.341
D.RemoveBefore: 0.232
```

# Our benchmark

## Skip list

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|              253.76 |        3,940,735.01 |    0.2% |      0.01 | `point reads`
|              270.83 |        3,692,307.69 |    0.2% |      0.01 | `prefix reads`
|              355.98 |        2,809,136.40 |    0.6% |      0.01 | `range reads`
|              455.77 |        2,194,104.53 |    0.3% |      0.01 | `point writes`
|              448.53 |        2,229,492.31 |    1.8% |      0.01 | `prefix writes`
|              248.34 |        4,026,737.54 |    1.4% |      0.02 | `range writes`
|              561.21 |        1,781,878.13 |    0.9% |      0.01 | `monotonic increasing point writes`
|          149,791.67 |            6,675.94 |    2.7% |      0.01 | `worst case for radix tree`

## Radix tree (this implementation)

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               19.52 |       51,239,417.90 |    0.2% |      0.01 | `point reads`
|               56.74 |       17,623,200.20 |    1.0% |      0.01 | `prefix reads`
|              111.36 |        8,979,743.73 |    0.6% |      0.01 | `range reads`
|               28.63 |       34,931,089.16 |    0.2% |      0.01 | `point writes`
|               41.82 |       23,913,916.86 |    0.2% |      0.01 | `prefix writes`
|               48.75 |       20,512,820.51 |    0.8% |      0.01 | `range writes`
|               93.72 |       10,670,548.15 |    3.2% |      0.01 | `monotonic increasing point writes`
|        2,467,542.00 |              405.26 |    0.4% |      0.03 | `worst case for radix tree`

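
The two tables in this section are easier to compare as per-benchmark ratios. The sketch below copies the `ns/op` columns reported for the skip list and the radix tree and divides one by the other; the variable and function names are illustrative only. A ratio above 1 means the radix tree took less time per operation, and a ratio below 1 (as in the `worst case for radix tree` row) means it took more.

```python
# ns/op values copied from the skip list and radix tree tables.
skip_list_ns = {
    "point reads": 253.76,
    "prefix reads": 270.83,
    "range reads": 355.98,
    "point writes": 455.77,
    "prefix writes": 448.53,
    "range writes": 248.34,
    "monotonic increasing point writes": 561.21,
    "worst case for radix tree": 149_791.67,
}
radix_tree_ns = {
    "point reads": 19.52,
    "prefix reads": 56.74,
    "range reads": 111.36,
    "point writes": 28.63,
    "prefix writes": 41.82,
    "range writes": 48.75,
    "monotonic increasing point writes": 93.72,
    "worst case for radix tree": 2_467_542.00,
}


def speedups(baseline: dict, candidate: dict) -> dict:
    # Time per operation of the baseline divided by that of the candidate.
    return {name: baseline[name] / candidate[name] for name in baseline}


for name, ratio in speedups(skip_list_ns, radix_tree_ns).items():
    print(f"{name}: {ratio:.2f}x")
```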
# "Real data" test

Point queries only, best of three runs. Gc ratio is the time spent on garbage collection divided by the total time spent adding writes and doing garbage collection. Lower is better.

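
As a small illustration of the Gc ratio definition above, the following sketch computes it from two timings. The function name and the example timings are hypothetical; the README does not say whether the reported `Add` time already includes garbage collection, so the reported figures are not plugged in here.

```python
def gc_ratio(gc_seconds: float, add_seconds: float) -> float:
    # Fraction of (garbage collection + adding writes) time spent on garbage collection.
    total = gc_seconds + add_seconds
    if total <= 0:
        raise ValueError("total time must be positive")
    return gc_seconds / total


# Hypothetical timings: 1 s of garbage collection, 3 s of adding writes.
print(f"Gc ratio: {gc_ratio(1.0, 3.0):.2%}")
```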
## Skip list

```
Check: 11.3385 seconds, 329.718 MB/s, Add: 5.35612 seconds, 131.072 MB/s, Gc ratio: 45.7173%
```

## Radix tree

```
Check: 2.48583 seconds, 1503.93 MB/s, Add: 2.12768 seconds, 329.954 MB/s, Gc ratio: 41.7943%
```

## Hash table

The hash table implementation does not support range queries. It is included only to give an idea of how fast point queries can be.

```
Check: 1.83386 seconds, 2038.6 MB/s, Add: 0.601411 seconds, 1167.32 MB/s, Gc ratio: 48.9776%
```