13 Commits

Author SHA1 Message Date
f85b92f8db Improve next{Physical,Logical} codegen
All checks were successful
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0
Tests / Release [gcc] total: 7949, passed: 7949
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0
Tests / Release [clang,aarch64] total: 5268, passed: 5268
Tests / Coverage total: 5315, passed: 5315
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.67% (3139/3214); branch coverage: 42.05% (18734/44548); complexity density: 0.00; lines of code: 3214. Quality gates summary: output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
2024-11-09 13:21:36 -08:00
3c44614311 Allocate from freelist with min/max capacity constraints 2024-11-08 21:35:13 -08:00
9c1ac3702e Move index closer to start of Node{3,16}
This should slightly improve cache hit rate
2024-11-08 20:54:00 -08:00
224d21648a Make server_bench workload point writes of tuple-encoded keys 2024-11-08 20:50:54 -08:00
33f9c89328 Try reduceLog: true
All checks were successful
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0
Tests / Release [gcc] total: 7949, passed: 7949
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0
Tests / Release [clang,aarch64] total: 5268, passed: 5268
Tests / Coverage total: 5315, passed: 5315
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.74% (3066/3137); branch coverage: 41.94% (18241/43494); complexity density: 0.00; lines of code: 3137. Quality gates summary: output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
2024-11-04 17:03:24 -08:00
12c2d5eb95 Try not logging ctest output to stdout
Some checks failed
Tests / Coverage total: 5315, passed: 5315
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.74% (3066/3137); branch coverage: 41.94% (18241/43494); complexity density: 0.00; lines of code: 3137. Quality gates summary: output truncated.
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0
Tests / Release [gcc] total: 7949, passed: 7949
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0
Tests / Release [clang,aarch64] total: 5268, passed: 5268
weaselab/conflict-set/pipeline/head There was a failure building this commit
2024-11-04 15:36:50 -08:00
db357e747d Add __mem{cpy,set}_chk to allowed symbol imports
Some checks reported errors
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0
Tests / Release [gcc] total: 7949, passed: 7949
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0
Tests / Release [clang,aarch64] total: 5268, passed: 5268
Tests / Coverage total: 5315, passed: 5315
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.74% (3066/3137); branch coverage: 41.94% (18241/43494); complexity density: 0.00; lines of code: 3137. Quality gates summary: output truncated.
weaselab/conflict-set/pipeline/head Something is wrong with the build of this commit
2024-11-04 15:08:25 -08:00
4494359ca2 Make insert_iterations for interleaved writes more closely match
Some checks failed
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0
Tests / Release [gcc] total: 7949, failed: 2, passed: 7947
Tests / Release [clang,aarch64] total: 5268, passed: 5268
Tests / Coverage total: 5315, passed: 5315
weaselab/conflict-set/pipeline/head There was a failure building this commit
2024-11-04 14:46:30 -08:00
f079d84bda Update README 2024-11-04 14:33:43 -08:00
724ec09248 Add to corpus 2024-11-04 14:30:12 -08:00
4eaad39294 Maintain capacity invariant strictly 2024-11-04 13:43:02 -08:00
891100e649 Add to corpus
All checks were successful
Tests / 64 bit versions total: 7901, passed: 7901
Tests / Debug total: 7899, passed: 7899
Tests / SIMD fallback total: 7901, passed: 7901
Tests / Release [clang] total: 7901, passed: 7901
Clang: total 0, new 0, outstanding 0, fixed 0
Tests / Release [gcc] total: 7901, passed: 7901
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0
Tests / Release [clang,aarch64] total: 5236, passed: 5236
Tests / Coverage total: 5283, passed: 5283
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.62% (3112/3188); branch coverage: 42.08% (18883/44869); complexity density: 0.00; lines of code: 3188. Quality gates summary: output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
2024-11-01 21:32:00 -07:00
22e55309be Update benchmarks 2024-11-01 17:50:03 -07:00
268 changed files with 332 additions and 229 deletions


@@ -210,7 +210,7 @@ enum Type : int8_t {
Type_Node256,
};
template <class T> struct BoundedFreeListAllocator;
template <class T> struct NodeAllocator;
struct TaggedNodePointer {
TaggedNodePointer() = default;
@@ -297,9 +297,9 @@ struct Node {
}
private:
template <class T> friend struct BoundedFreeListAllocator;
template <class T> friend struct NodeAllocator;
// These are publically readable, but should only be written by
// BoundedFreeListAllocator
// NodeAllocator
Type type;
int32_t partialKeyCapacity;
};
@@ -338,11 +338,12 @@ struct Node3 : Node {
constexpr static auto kMaxNodes = 3;
constexpr static auto kType = Type_Node3;
TaggedNodePointer children[kMaxNodes];
InternalVersionT childMaxVersion[kMaxNodes];
// Sorted
uint8_t index[kMaxNodes];
TaggedNodePointer children[kMaxNodes];
InternalVersionT childMaxVersion[kMaxNodes];
uint8_t *partialKey() {
assert(!releaseDeferred);
return (uint8_t *)(this + 1);
@@ -357,11 +358,12 @@ struct Node16 : Node {
constexpr static auto kType = Type_Node16;
constexpr static auto kMaxNodes = 16;
TaggedNodePointer children[kMaxNodes];
InternalVersionT childMaxVersion[kMaxNodes];
// Sorted
uint8_t index[kMaxNodes];
TaggedNodePointer children[kMaxNodes];
InternalVersionT childMaxVersion[kMaxNodes];
uint8_t *partialKey() {
assert(!releaseDeferred);
return (uint8_t *)(this + 1);
@@ -440,7 +442,10 @@ inline void Node3::copyChildrenAndKeyFrom(const Node0 &other) {
inline void Node3::copyChildrenAndKeyFrom(const Node3 &other) {
memcpy((char *)this + kNodeCopyBegin, (char *)&other + kNodeCopyBegin,
kNodeCopySize);
memcpy(children, other.children, sizeof(*this) - sizeof(Node));
memcpy(index, other.index, kMaxNodes);
memcpy(children, other.children, kMaxNodes * sizeof(children[0])); // NOLINT
memcpy(childMaxVersion, other.childMaxVersion,
kMaxNodes * sizeof(childMaxVersion[0]));
memcpy(partialKey(), &other + 1, partialKeyLen);
for (int i = 0; i < numChildren; ++i) {
assert(children[i]->parent == &other);
@@ -644,7 +649,7 @@ constexpr int kMinNodeSurplus = 104;
constexpr int kBytesPerKey = 112;
constexpr int kMinNodeSurplus = 80;
#endif
// Cound the entry itself as a child
// Count the entry itself as a child
constexpr int kMinChildrenNode0 = 1;
constexpr int kMinChildrenNode3 = 2;
constexpr int kMinChildrenNode16 = 4;
@@ -669,50 +674,40 @@ static_assert(kNode3Surplus >= kMinNodeSurplus);
static_assert(kBytesPerKey - sizeof(Node0) >= kMinNodeSurplus);
// setOldestVersion will additionally try to maintain this property:
// We'll additionally maintain this property:
// `(children + entryPresent) * length >= capacity`
//
// Which should give us the budget to pay for the key bytes. (children +
// entryPresent) is a lower bound on how many keys these bytes are a prefix of
constexpr int64_t kFreeListMaxMemory = 1 << 20;
constexpr int getMaxCapacity(int numChildren, int entryPresent,
int partialKeyLen) {
return (numChildren + entryPresent) * (partialKeyLen + 1);
}
template <class T> struct BoundedFreeListAllocator {
constexpr int getMaxCapacity(Node *self) {
return getMaxCapacity(self->numChildren, self->entryPresent,
self->partialKeyLen);
}
constexpr int64_t kMaxFreeListBytes = 1 << 20;
// Maintains a free list up to kMaxFreeListBytes. If the top element of the list
// doesn't meet the capacity constraints, it's freed and a new node is allocated
// with the minimum capacity. The hope is that "unfit" nodes don't get stuck in
// the free list.
//
// TODO valgrind annotations
template <class T> struct NodeAllocator {
static_assert(sizeof(T) >= sizeof(void *));
static_assert(std::derived_from<T, Node>);
static_assert(std::is_trivial_v<T>);
T *allocate_helper(int partialKeyCapacity) {
if (freeList != nullptr) {
T *n = (T *)freeList;
VALGRIND_MAKE_MEM_DEFINED(freeList, sizeof(freeList));
memcpy(&freeList, freeList, sizeof(freeList));
VALGRIND_MAKE_MEM_UNDEFINED(n, sizeof(T));
VALGRIND_MAKE_MEM_DEFINED(&n->partialKeyCapacity,
sizeof(n->partialKeyCapacity));
VALGRIND_MAKE_MEM_DEFINED(&n->type, sizeof(n->type));
assert(n->type == T::kType);
VALGRIND_MAKE_MEM_UNDEFINED(n + 1, n->partialKeyCapacity);
freeListBytes -= sizeof(T) + n->partialKeyCapacity;
if (n->partialKeyCapacity >= partialKeyCapacity) {
return n;
} else {
// The intent is to filter out too-small nodes in the freelist
removeNode(n);
safe_free(n, sizeof(T) + n->partialKeyCapacity);
}
}
auto *result = (T *)safe_malloc(sizeof(T) + partialKeyCapacity);
result->type = T::kType;
result->partialKeyCapacity = partialKeyCapacity;
addNode(result);
return result;
}
T *allocate(int partialKeyCapacity) {
T *result = allocate_helper(partialKeyCapacity);
T *allocate(int minCapacity, int maxCapacity) {
assert(minCapacity <= maxCapacity);
assert(freeListSize >= 0);
assert(freeListSize <= kMaxFreeListBytes);
T *result = allocate_helper(minCapacity, maxCapacity);
result->endOfRange = false;
result->releaseDeferred = false;
if constexpr (!std::is_same_v<T, Node0>) {
@@ -732,37 +727,93 @@ template <class T> struct BoundedFreeListAllocator {
}
void release(T *p) {
if (freeListBytes >= kFreeListMaxMemory) {
if (freeListSize + sizeof(T) + p->partialKeyCapacity > kMaxFreeListBytes) {
removeNode(p);
return safe_free(p, sizeof(T) + p->partialKeyCapacity);
}
memcpy((void *)p, &freeList, sizeof(freeList));
p->parent = freeList;
freeList = p;
freeListBytes += sizeof(T) + p->partialKeyCapacity;
VALGRIND_MAKE_MEM_NOACCESS(freeList, sizeof(T) + p->partialKeyCapacity);
freeListSize += sizeof(T) + p->partialKeyCapacity;
}
BoundedFreeListAllocator() = default;
void deferRelease(T *p, Node *forwardTo) {
p->releaseDeferred = true;
p->forwardTo = forwardTo;
if (freeListSize + sizeof(T) + p->partialKeyCapacity > kMaxFreeListBytes) {
p->parent = deferredListOverflow;
deferredListOverflow = p;
} else {
if (deferredList == nullptr) {
deferredListFront = p;
}
p->parent = deferredList;
deferredList = p;
freeListSize += sizeof(T) + p->partialKeyCapacity;
}
}
BoundedFreeListAllocator(const BoundedFreeListAllocator &) = delete;
BoundedFreeListAllocator &
operator=(const BoundedFreeListAllocator &) = delete;
BoundedFreeListAllocator(BoundedFreeListAllocator &&) = delete;
BoundedFreeListAllocator &operator=(BoundedFreeListAllocator &&) = delete;
void releaseDeferred() {
if (deferredList != nullptr) {
deferredListFront->parent = freeList;
freeList = std::exchange(deferredList, nullptr);
}
for (T *n = std::exchange(deferredListOverflow, nullptr); n != nullptr;) {
auto *tmp = n;
n = (T *)n->parent;
release(tmp);
}
}
~BoundedFreeListAllocator() {
for (void *iter = freeList; iter != nullptr;) {
VALGRIND_MAKE_MEM_DEFINED(iter, sizeof(Node));
auto *tmp = (T *)iter;
memcpy(&iter, iter, sizeof(void *));
removeNode((tmp));
NodeAllocator() = default;
NodeAllocator(const NodeAllocator &) = delete;
NodeAllocator &operator=(const NodeAllocator &) = delete;
NodeAllocator(NodeAllocator &&) = delete;
NodeAllocator &operator=(NodeAllocator &&) = delete;
~NodeAllocator() {
assert(deferredList == nullptr);
assert(deferredListOverflow == nullptr);
for (T *iter = freeList; iter != nullptr;) {
auto *tmp = iter;
iter = (T *)iter->parent;
removeNode(tmp);
safe_free(tmp, sizeof(T) + tmp->partialKeyCapacity);
}
}
private:
int64_t freeListBytes = 0;
void *freeList = nullptr;
int64_t freeListSize = 0;
T *freeList = nullptr;
T *deferredList = nullptr;
// Used to concatenate deferredList to freeList
T *deferredListFront;
T *deferredListOverflow = nullptr;
T *allocate_helper(int minCapacity, int maxCapacity) {
if (freeList != nullptr) {
freeListSize -= sizeof(T) + freeList->partialKeyCapacity;
assume(freeList->partialKeyCapacity >= 0);
assume(minCapacity >= 0);
assume(minCapacity <= maxCapacity);
if (freeList->partialKeyCapacity >= minCapacity &&
freeList->partialKeyCapacity <= maxCapacity) {
auto *result = freeList;
freeList = (T *)freeList->parent;
return result;
} else {
auto *p = freeList;
freeList = (T *)p->parent;
removeNode(p);
safe_free(p, sizeof(T) + p->partialKeyCapacity);
}
}
auto *result = (T *)safe_malloc(sizeof(T) + minCapacity);
result->type = T::kType;
result->partialKeyCapacity = minCapacity;
addNode(result);
return result;
}
};
uint8_t *Node::partialKey() {
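
Reviewer note: the hunk above changes allocation so that a free-list node is reused only when its capacity falls within a caller-supplied range. The following is a minimal standalone sketch of that policy, not the code from this change. `SketchNode`, `SketchAllocator`, `footprint`, and `kMaxBytes` are hypothetical names introduced for illustration; only `getMaxCapacity` mirrors a function shown in the diff. Valgrind annotations and the `addNode`/`removeNode` bookkeeping are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for the diff's node types: a header followed by
// `capacity` bytes of partial-key storage.
struct SketchNode {
  SketchNode *next; // free-list link (the diff reuses `parent` for this)
  int capacity;     // bytes available after the header
};

// Upper bound on capacity, mirroring getMaxCapacity in the diff.
constexpr int getMaxCapacity(int numChildren, int entryPresent,
                             int partialKeyLen) {
  return (numChildren + entryPresent) * (partialKeyLen + 1);
}

struct SketchAllocator {
  SketchNode *freeList = nullptr;
  int64_t freeListBytes = 0;
  static constexpr int64_t kMaxBytes = 1 << 20; // assumed budget

  static int64_t footprint(const SketchNode *n) {
    return int64_t(sizeof(SketchNode)) + n->capacity;
  }

  // Reuse the top of the free list only if its capacity lies in
  // [minCapacity, maxCapacity]. Otherwise free it and allocate a fresh node
  // with exactly minCapacity, so unfit nodes do not linger in the list.
  SketchNode *allocate(int minCapacity, int maxCapacity) {
    assert(0 <= minCapacity && minCapacity <= maxCapacity);
    if (freeList != nullptr) {
      SketchNode *top = freeList;
      freeList = top->next;
      freeListBytes -= footprint(top);
      if (top->capacity >= minCapacity && top->capacity <= maxCapacity) {
        return top;
      }
      std::free(top);
    }
    auto *n =
        static_cast<SketchNode *>(std::malloc(sizeof(SketchNode) + minCapacity));
    if (n == nullptr) {
      std::abort();
    }
    n->next = nullptr;
    n->capacity = minCapacity;
    return n;
  }

  // Keep the node for reuse unless that would exceed the byte budget.
  void release(SketchNode *n) {
    if (freeListBytes + footprint(n) > kMaxBytes) {
      std::free(n);
      return;
    }
    n->next = freeList;
    freeList = n;
    freeListBytes += footprint(n);
  }

  ~SketchAllocator() {
    while (freeList != nullptr) {
      SketchNode *n = freeList;
      freeList = n->next;
      std::free(n);
    }
  }
};
```

With this sketch, a released node of capacity 100 would be rejected by `allocate(4, getMaxCapacity(1, 1, 4))`, since the upper bound there is 10, and a fresh node of capacity 4 would be returned instead.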
@@ -827,18 +878,19 @@ struct WriteContext {
WriteContext() { memset(&accum, 0, sizeof(accum)); }
template <class T> T *allocate(int c) {
template <class T> T *allocate(int minCapacity, int maxCapacity) {
static_assert(!std::is_same_v<T, Node>);
++accum.nodes_allocated;
if constexpr (std::is_same_v<T, Node0>) {
return node0.allocate(c);
return node0.allocate(minCapacity, maxCapacity);
} else if constexpr (std::is_same_v<T, Node3>) {
return node3.allocate(c);
return node3.allocate(minCapacity, maxCapacity);
} else if constexpr (std::is_same_v<T, Node16>) {
return node16.allocate(c);
return node16.allocate(minCapacity, maxCapacity);
} else if constexpr (std::is_same_v<T, Node48>) {
return node48.allocate(c);
return node48.allocate(minCapacity, maxCapacity);
} else if constexpr (std::is_same_v<T, Node256>) {
return node256.allocate(c);
return node256.allocate(minCapacity, maxCapacity);
}
}
template <class T> void release(T *c) {
@@ -858,49 +910,37 @@ struct WriteContext {
}
// Place in a list to be released in the next call to releaseDeferred.
void deferRelease(Node *n, Node *forwardTo) {
n->releaseDeferred = true;
n->forwardTo = forwardTo;
n->parent = deferredList;
deferredList = n;
template <class T> void deferRelease(T *n, Node *forwardTo) {
static_assert(!std::is_same_v<T, Node>);
if constexpr (std::is_same_v<T, Node0>) {
return node0.deferRelease(n, forwardTo);
} else if constexpr (std::is_same_v<T, Node3>) {
return node3.deferRelease(n, forwardTo);
} else if constexpr (std::is_same_v<T, Node16>) {
return node16.deferRelease(n, forwardTo);
} else if constexpr (std::is_same_v<T, Node48>) {
return node48.deferRelease(n, forwardTo);
} else if constexpr (std::is_same_v<T, Node256>) {
return node256.deferRelease(n, forwardTo);
}
}
// Release all nodes passed to deferRelease since the last call to
// releaseDeferred.
void releaseDeferred() {
for (Node *n = std::exchange(deferredList, nullptr); n != nullptr;) {
auto *tmp = n;
n = n->parent;
switch (tmp->getType()) {
case Type_Node0:
release(static_cast<Node0 *>(tmp));
break;
case Type_Node3:
release(static_cast<Node3 *>(tmp));
break;
case Type_Node16:
release(static_cast<Node16 *>(tmp));
break;
case Type_Node48:
release(static_cast<Node48 *>(tmp));
break;
case Type_Node256:
release(static_cast<Node256 *>(tmp));
break;
default: // GCOVR_EXCL_LINE
__builtin_unreachable(); // GCOVR_EXCL_LINE
}
}
node0.releaseDeferred();
node3.releaseDeferred();
node16.releaseDeferred();
node48.releaseDeferred();
node256.releaseDeferred();
}
private:
Node *deferredList = nullptr;
BoundedFreeListAllocator<Node0> node0;
BoundedFreeListAllocator<Node3> node3;
BoundedFreeListAllocator<Node16> node16;
BoundedFreeListAllocator<Node48> node48;
BoundedFreeListAllocator<Node256> node256;
NodeAllocator<Node0> node0;
NodeAllocator<Node3> node3;
NodeAllocator<Node16> node16;
NodeAllocator<Node48> node48;
NodeAllocator<Node256> node256;
};
int getNodeIndex(Node3 *self, uint8_t index) {
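
Reviewer note: `releaseDeferred` in the new allocator appears to splice the whole deferred list onto the free list in constant time by remembering the first node that was deferred (`deferredListFront`). A minimal sketch of that splice follows, assuming a plain singly linked list. `ListNode` and `DeferredLists` are hypothetical names, and the overflow list and byte accounting from the diff are omitted.

```cpp
#include <cassert>
#include <utility>

// Hypothetical list node; the diff links nodes through `parent` instead.
struct ListNode {
  ListNode *next = nullptr;
};

struct DeferredLists {
  ListNode *freeList = nullptr;
  ListNode *deferred = nullptr;      // head: most recently deferred node
  ListNode *deferredFront = nullptr; // tail: first node deferred

  // Push onto the deferred list, remembering the tail on the first push.
  void defer(ListNode *n) {
    if (deferred == nullptr) {
      deferredFront = n;
    }
    n->next = deferred;
    deferred = n;
  }

  // Concatenate deferred + freeList in O(1): link the deferred tail to the
  // current free-list head, then make the deferred head the new head.
  void releaseDeferred() {
    if (deferred != nullptr) {
      deferredFront->next = freeList;
      freeList = std::exchange(deferred, nullptr);
    }
  }
};
```

Keeping the tail pointer avoids walking the deferred list on every call, which matters if many nodes are deferred between calls.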
@@ -1177,7 +1217,8 @@ void setMaxVersion(Node *n, InternalVersionT newMax) {
}
}
TaggedNodePointer &getInTree(Node *n, ConflictSet::Impl *);
// If impl is nullptr, then n->parent must not be nullptr
TaggedNodePointer &getInTree(Node *n, ConflictSet::Impl *impl);
TaggedNodePointer getChild(Node0 *, uint8_t) { return nullptr; }
TaggedNodePointer getChild(Node3 *self, uint8_t index) {
@@ -1430,9 +1471,14 @@ TaggedNodePointer getFirstChildExists(Node *self) {
// GCOVR_EXCL_STOP
}
// self must not be the root
void maybeDecreaseCapacity(Node *&self, WriteContext *writeContext,
ConflictSet::Impl *impl);
void consumePartialKeyFull(TaggedNodePointer &self, TrivialSpan &key,
InternalVersionT writeVersion,
WriteContext *writeContext) {
WriteContext *writeContext,
ConflictSet::Impl *impl) {
// Handle an existing partial key
int commonLen = std::min<int>(self->partialKeyLen, key.size());
int partialKeyIndex =
@@ -1444,7 +1490,8 @@ void consumePartialKeyFull(TaggedNodePointer &self, TrivialSpan &key,
InternalVersionT oldMaxVersion = exchangeMaxVersion(old, writeVersion);
// *self will have one child (old)
auto *newSelf = writeContext->allocate<Node3>(partialKeyIndex);
auto *newSelf = writeContext->allocate<Node3>(
partialKeyIndex, getMaxCapacity(1, 0, partialKeyIndex));
newSelf->parent = old->parent;
newSelf->parentsIndex = old->parentsIndex;
@@ -1466,9 +1513,8 @@ void consumePartialKeyFull(TaggedNodePointer &self, TrivialSpan &key,
old->partialKeyLen - (partialKeyIndex + 1));
old->partialKeyLen -= partialKeyIndex + 1;
// We would consider decreasing capacity here, but we can't invalidate
// old since it's not on the search path. setOldestVersion will clean it
// up.
// Maintain memory capacity invariant
maybeDecreaseCapacity(old, writeContext, impl);
}
key = key.subspan(partialKeyIndex, key.size() - partialKeyIndex);
}
@@ -1477,9 +1523,10 @@ void consumePartialKeyFull(TaggedNodePointer &self, TrivialSpan &key,
// `key` such that `self` is along the search path of `key`
inline __attribute__((always_inline)) void
consumePartialKey(TaggedNodePointer &self, TrivialSpan &key,
InternalVersionT writeVersion, WriteContext *writeContext) {
InternalVersionT writeVersion, WriteContext *writeContext,
ConflictSet::Impl *impl) {
if (self->partialKeyLen > 0) {
consumePartialKeyFull(self, key, writeVersion, writeContext);
consumePartialKeyFull(self, key, writeVersion, writeContext, impl);
}
}
@@ -1489,7 +1536,8 @@ consumePartialKey(TaggedNodePointer &self, TrivialSpan &key,
// `maxVersion` for result.
TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
InternalVersionT newMaxVersion,
WriteContext *writeContext) {
WriteContext *writeContext,
ConflictSet::Impl *impl) {
int index = key.front();
key = key.subspan(1, key.size() - 1);
@@ -1502,7 +1550,8 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
auto *self3 = static_cast<Node3 *>(self);
int i = getNodeIndex(self3, index);
if (i >= 0) {
consumePartialKey(self3->children[i], key, newMaxVersion, writeContext);
consumePartialKey(self3->children[i], key, newMaxVersion, writeContext,
impl);
self3->childMaxVersion[i] = newMaxVersion;
return self3->children[i];
}
@@ -1511,7 +1560,8 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
auto *self16 = static_cast<Node16 *>(self);
int i = getNodeIndex(self16, index);
if (i >= 0) {
consumePartialKey(self16->children[i], key, newMaxVersion, writeContext);
consumePartialKey(self16->children[i], key, newMaxVersion, writeContext,
impl);
self16->childMaxVersion[i] = newMaxVersion;
return self16->children[i];
}
@@ -1521,7 +1571,7 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
int secondIndex = self48->index[index];
if (secondIndex >= 0) {
consumePartialKey(self48->children[secondIndex], key, newMaxVersion,
writeContext);
writeContext, impl);
self48->childMaxVersion[secondIndex] = newMaxVersion;
self48->maxOfMax[secondIndex >> Node48::kMaxOfMaxShift] =
std::max(self48->maxOfMax[secondIndex >> Node48::kMaxOfMaxShift],
@@ -1532,7 +1582,7 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
case Type_Node256: {
auto *self256 = static_cast<Node256 *>(self);
if (auto &result = self256->children[index]; result != nullptr) {
consumePartialKey(result, key, newMaxVersion, writeContext);
consumePartialKey(result, key, newMaxVersion, writeContext, impl);
self256->childMaxVersion[index] = newMaxVersion;
self256->maxOfMax[index >> Node256::kMaxOfMaxShift] = std::max(
self256->maxOfMax[index >> Node256::kMaxOfMaxShift], newMaxVersion);
@@ -1543,9 +1593,10 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
__builtin_unreachable(); // GCOVR_EXCL_LINE
}
auto *newChild = writeContext->allocate<Node0>(key.size());
auto *newChild = writeContext->allocate<Node0>(
key.size(), getMaxCapacity(0, 1, key.size()));
newChild->numChildren = 0;
newChild->entryPresent = false;
newChild->entryPresent = false; // Will be set to true by the caller
newChild->partialKeyLen = key.size();
newChild->parentsIndex = index;
memcpy(newChild->partialKey(), key.data(), key.size());
@@ -1555,7 +1606,8 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
case Type_Node0: {
auto *self0 = static_cast<Node0 *>(self);
auto *newSelf = writeContext->allocate<Node3>(self->partialKeyLen);
auto *newSelf = writeContext->allocate<Node3>(
self->partialKeyLen, getMaxCapacity(1, 1, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self0);
writeContext->deferRelease(self0, newSelf);
self = newSelf;
@@ -1565,7 +1617,9 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
case Type_Node3: {
if (self->numChildren == Node3::kMaxNodes) {
auto *self3 = static_cast<Node3 *>(self);
auto *newSelf = writeContext->allocate<Node16>(self->partialKeyLen);
auto *newSelf = writeContext->allocate<Node16>(
self->partialKeyLen,
getMaxCapacity(4, self->entryPresent, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self3);
writeContext->deferRelease(self3, newSelf);
self = newSelf;
@@ -1594,7 +1648,9 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
case Type_Node16: {
if (self->numChildren == Node16::kMaxNodes) {
auto *self16 = static_cast<Node16 *>(self);
auto *newSelf = writeContext->allocate<Node48>(self->partialKeyLen);
auto *newSelf = writeContext->allocate<Node48>(
self->partialKeyLen,
getMaxCapacity(17, self->entryPresent, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self16);
writeContext->deferRelease(self16, newSelf);
self = newSelf;
@@ -1625,7 +1681,9 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
if (self->numChildren == 48) {
auto *self48 = static_cast<Node48 *>(self);
auto *newSelf = writeContext->allocate<Node256>(self->partialKeyLen);
auto *newSelf = writeContext->allocate<Node256>(
self->partialKeyLen,
getMaxCapacity(49, self->entryPresent, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self48);
writeContext->deferRelease(self48, newSelf);
self = newSelf;
@@ -1676,7 +1734,7 @@ Node *nextPhysical(Node *node) {
if (node == nullptr) {
return nullptr;
}
auto nextChild = getChildGeq(node, index + 1);
Node *nextChild = getChildGeq(node, index + 1);
if (nextChild != nullptr) {
return nextChild;
}
@@ -1695,7 +1753,7 @@ Node *nextLogical(Node *node) {
if (node == nullptr) {
return nullptr;
}
auto nextChild = getChildGeq(node, index + 1);
Node *nextChild = getChildGeq(node, index + 1);
if (nextChild != nullptr) {
node = nextChild;
goto downLeftSpine;
@@ -1707,76 +1765,48 @@ downLeftSpine:
return node;
}
// Invalidates `self`, replacing it with a node of at least capacity.
// Does not return nodes to freelists when kUseFreeList is false.
void freeAndMakeCapacityAtLeast(Node *&self, int capacity,
void freeAndMakeCapacityBetween(Node *&self, int minCapacity, int maxCapacity,
WriteContext *writeContext,
ConflictSet::Impl *impl,
const bool kUseFreeList) {
ConflictSet::Impl *impl) {
switch (self->getType()) {
case Type_Node0: {
auto *self0 = (Node0 *)self;
auto *newSelf = writeContext->allocate<Node0>(capacity);
auto *newSelf = writeContext->allocate<Node0>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self0);
getInTree(self, impl) = newSelf;
if (kUseFreeList) {
writeContext->deferRelease(self0, newSelf);
} else {
removeNode(self0);
safe_free(self0, self0->size());
}
writeContext->deferRelease(self0, newSelf);
self = newSelf;
} break;
case Type_Node3: {
auto *self3 = (Node3 *)self;
auto *newSelf = writeContext->allocate<Node3>(capacity);
auto *newSelf = writeContext->allocate<Node3>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self3);
getInTree(self, impl) = newSelf;
if (kUseFreeList) {
writeContext->deferRelease(self3, newSelf);
} else {
removeNode(self3);
safe_free(self3, self3->size());
}
writeContext->deferRelease(self3, newSelf);
self = newSelf;
} break;
case Type_Node16: {
auto *self16 = (Node16 *)self;
auto *newSelf = writeContext->allocate<Node16>(capacity);
auto *newSelf = writeContext->allocate<Node16>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self16);
getInTree(self, impl) = newSelf;
if (kUseFreeList) {
writeContext->deferRelease(self16, newSelf);
} else {
removeNode(self16);
safe_free(self16, self16->size());
}
writeContext->deferRelease(self16, newSelf);
self = newSelf;
} break;
case Type_Node48: {
auto *self48 = (Node48 *)self;
auto *newSelf = writeContext->allocate<Node48>(capacity);
auto *newSelf = writeContext->allocate<Node48>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self48);
getInTree(self, impl) = newSelf;
if (kUseFreeList) {
writeContext->deferRelease(self48, newSelf);
} else {
removeNode(self48);
safe_free(self48, self48->size());
}
writeContext->deferRelease(self48, newSelf);
self = newSelf;
} break;
case Type_Node256: {
auto *self256 = (Node256 *)self;
auto *newSelf = writeContext->allocate<Node256>(capacity);
auto *newSelf = writeContext->allocate<Node256>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self256);
getInTree(self, impl) = newSelf;
if (kUseFreeList) {
writeContext->deferRelease(self256, newSelf);
} else {
removeNode(self256);
safe_free(self256, self256->size());
}
writeContext->deferRelease(self256, newSelf);
self = newSelf;
} break;
default: // GCOVR_EXCL_LINE
@@ -1784,9 +1814,7 @@ void freeAndMakeCapacityAtLeast(Node *&self, int capacity,
}
}
// Fix larger-than-desired capacities. Does not return nodes to freelists,
// since that wouldn't actually reclaim the memory used for partial key
// capacity.
// Fix larger-than-desired capacities. self must not be the root
void maybeDecreaseCapacity(Node *&self, WriteContext *writeContext,
ConflictSet::Impl *impl) {
@@ -1800,7 +1828,8 @@ void maybeDecreaseCapacity(Node *&self, WriteContext *writeContext,
if (self->getCapacity() <= maxCapacity) {
return;
}
freeAndMakeCapacityAtLeast(self, maxCapacity, writeContext, impl, false);
freeAndMakeCapacityBetween(self, self->partialKeyLen, maxCapacity,
writeContext, impl);
}
#if defined(HAS_AVX) && !defined(__SANITIZE_THREAD__)
@@ -1870,13 +1899,16 @@ void rezero(Node *n, InternalVersionT z) {
#endif
void mergeWithChild(TaggedNodePointer &self, WriteContext *writeContext,
ConflictSet::Impl *impl, Node3 *self3) {
Node3 *self3, ConflictSet::Impl *impl) {
assert(!self3->entryPresent);
Node *child = self3->children[0];
int minCapacity = self3->partialKeyLen + 1 + child->partialKeyLen;
const int minCapacity = self3->partialKeyLen + 1 + child->partialKeyLen;
const int maxCapacity =
getMaxCapacity(child->numChildren, child->entryPresent, minCapacity);
if (minCapacity > child->getCapacity()) {
freeAndMakeCapacityAtLeast(child, minCapacity, writeContext, impl, true);
freeAndMakeCapacityBetween(child, minCapacity, maxCapacity, writeContext,
impl);
}
// Merge partial key with child
@@ -1915,20 +1947,23 @@ bool needsDownsize(Node *n) {
void downsize(Node3 *self, WriteContext *writeContext,
ConflictSet::Impl *impl) {
if (self->numChildren == 0) {
auto *newSelf = writeContext->allocate<Node0>(self->partialKeyLen);
auto *newSelf = writeContext->allocate<Node0>(
self->partialKeyLen, getMaxCapacity(0, 1, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self);
getInTree(self, impl) = newSelf;
writeContext->deferRelease(self, newSelf);
} else {
assert(self->numChildren == 1 && !self->entryPresent);
mergeWithChild(getInTree(self, impl), writeContext, impl, self);
mergeWithChild(getInTree(self, impl), writeContext, self, impl);
}
}
void downsize(Node16 *self, WriteContext *writeContext,
ConflictSet::Impl *impl) {
assert(self->numChildren + int(self->entryPresent) < kMinChildrenNode16);
auto *newSelf = writeContext->allocate<Node3>(self->partialKeyLen);
auto *newSelf = writeContext->allocate<Node3>(
self->partialKeyLen,
getMaxCapacity(kMinChildrenNode16 - 1, 0, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self);
getInTree(self, impl) = newSelf;
writeContext->deferRelease(self, newSelf);
@@ -1937,7 +1972,9 @@ void downsize(Node16 *self, WriteContext *writeContext,
void downsize(Node48 *self, WriteContext *writeContext,
ConflictSet::Impl *impl) {
assert(self->numChildren + int(self->entryPresent) < kMinChildrenNode48);
auto *newSelf = writeContext->allocate<Node16>(self->partialKeyLen);
auto *newSelf = writeContext->allocate<Node16>(
self->partialKeyLen,
getMaxCapacity(kMinChildrenNode48 - 1, 0, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self);
getInTree(self, impl) = newSelf;
writeContext->deferRelease(self, newSelf);
@@ -1947,7 +1984,9 @@ void downsize(Node256 *self, WriteContext *writeContext,
ConflictSet::Impl *impl) {
assert(self->numChildren + int(self->entryPresent) < kMinChildrenNode256);
auto *self256 = (Node256 *)self;
auto *newSelf = writeContext->allocate<Node48>(self->partialKeyLen);
auto *newSelf = writeContext->allocate<Node48>(
self->partialKeyLen,
getMaxCapacity(kMinChildrenNode256 - 1, 0, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self256);
getInTree(self, impl) = newSelf;
writeContext->deferRelease(self256, newSelf);
@@ -2001,6 +2040,10 @@ Node *erase(Node *self, WriteContext *writeContext, ConflictSet::Impl *impl,
if (needsDownsize(self)) {
downsize(self, writeContext, impl);
}
while (self->releaseDeferred) {
self = self->forwardTo;
}
maybeDecreaseCapacity(self, writeContext, impl);
if (result != nullptr) {
while (result->releaseDeferred) {
result = result->forwardTo;
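
Reviewer note: the loops added in `erase` follow `forwardTo` until they reach a node whose release has not been deferred. A minimal sketch of that forwarding scheme is below, assuming only the two fields the loop reads. `FwdNode` and `resolve` are hypothetical names, and `deferRelease` here is a simplified free function rather than the `WriteContext` member in the diff.

```cpp
#include <cassert>

// Hypothetical node carrying only the fields the forwarding loop reads.
struct FwdNode {
  bool releaseDeferred = false;
  FwdNode *forwardTo = nullptr;
};

// Mark `old` as replaced by `replacement`. `old` stays readable until the
// deferred nodes are actually released.
inline void deferRelease(FwdNode *old, FwdNode *replacement) {
  old->releaseDeferred = true;
  old->forwardTo = replacement;
}

// Follow forwarding pointers to the live node, as the loops in erase do.
inline FwdNode *resolve(FwdNode *n) {
  while (n->releaseDeferred) {
    n = n->forwardTo;
  }
  return n;
}
```

A pointer captured before a node was replaced, possibly more than once, can be passed through `resolve` to reach the current node before calling something like `maybeDecreaseCapacity`.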
@@ -2088,6 +2131,11 @@ Node *erase(Node *self, WriteContext *writeContext, ConflictSet::Impl *impl,
__builtin_unreachable(); // GCOVR_EXCL_LINE
}
while (parent->releaseDeferred) {
parent = parent->forwardTo;
}
maybeDecreaseCapacity(parent, writeContext, impl);
if (result != nullptr) {
while (result->releaseDeferred) {
result = result->forwardTo;
@@ -2791,13 +2839,12 @@ checkMaxBetweenExclusiveImpl<true>(Node256 *n, int begin, int end,
// of the result will have `maxVersion` set to `writeVersion` as a
// postcondition. Nodes along the search path may be invalidated. Callers must
// ensure that the max version of the self argument is updated.
[[nodiscard]] TaggedNodePointer *insert(TaggedNodePointer *self,
TrivialSpan key,
InternalVersionT writeVersion,
WriteContext *writeContext) {
[[nodiscard]] TaggedNodePointer *
insert(TaggedNodePointer *self, TrivialSpan key, InternalVersionT writeVersion,
WriteContext *writeContext, ConflictSet::Impl *impl) {
for (; key.size() != 0; ++writeContext->accum.insert_iterations) {
self = &getOrCreateChild(*self, key, writeVersion, writeContext);
self = &getOrCreateChild(*self, key, writeVersion, writeContext, impl);
}
return self;
}
@@ -2855,9 +2902,10 @@ void eraseTree(Node *root, WriteContext *writeContext) {
}
void addPointWrite(TaggedNodePointer &root, TrivialSpan key,
InternalVersionT writeVersion, WriteContext *writeContext) {
InternalVersionT writeVersion, WriteContext *writeContext,
ConflictSet::Impl *impl) {
++writeContext->accum.point_writes;
auto n = *insert(&root, key, writeVersion, writeContext);
auto n = *insert(&root, key, writeVersion, writeContext, impl);
if (!n->entryPresent) {
++writeContext->accum.entries_inserted;
auto *p = nextLogical(n);
@@ -2991,8 +3039,8 @@ AddedWriteRange addWriteRange(Node *beginRoot, TrivialSpan begin, Node *endRoot,
++writeContext->accum.range_writes;
Node *beginNode =
*insert(&getInTree(beginRoot, impl), begin, writeVersion, writeContext);
Node *beginNode = *insert(&getInTree(beginRoot, impl), begin, writeVersion,
writeContext, impl);
addKey(beginNode);
if (!beginNode->entryPresent) {
++writeContext->accum.entries_inserted;
@@ -3008,7 +3056,7 @@ AddedWriteRange addWriteRange(Node *beginRoot, TrivialSpan begin, Node *endRoot,
beginNode->entry.pointVersion = writeVersion;
Node *endNode =
*insert(&getInTree(endRoot, impl), end, writeVersion, writeContext);
*insert(&getInTree(endRoot, impl), end, writeVersion, writeContext, impl);
addKey(endNode);
if (!endNode->entryPresent) {
@@ -3054,10 +3102,10 @@ void addWriteRange(TaggedNodePointer &root, TrivialSpan begin, TrivialSpan end,
std::min(begin.size(), end.size()));
if (lcp == begin.size() && end.size() == begin.size() + 1 &&
end.back() == 0) {
return addPointWrite(root, begin, writeVersion, writeContext);
return addPointWrite(root, begin, writeVersion, writeContext, impl);
}
auto useAsRoot =
insert(&root, begin.subspan(0, lcp), writeVersion, writeContext);
insert(&root, begin.subspan(0, lcp), writeVersion, writeContext, impl);
auto [beginNode, endNode] = addWriteRange(
*useAsRoot, begin.subspan(lcp, begin.size() - lcp), *useAsRoot,
@@ -4143,12 +4191,13 @@ void pointIter(Job *job, Context *context) {
MUSTTAIL return complete(job, context);
}
++context->iterations;
if (!job->getChildAndIndex(child, job->remaining.front())) [[unlikely]] {
*job->result = {job->n, job->remaining};
MUSTTAIL return complete(job, context);
}
++context->iterations;
job->continuation = PointIterTable<NodeTTo>::table[job->child.getType()];
__builtin_prefetch(job->child);
MUSTTAIL return keepGoing(job, context);
@@ -4249,11 +4298,12 @@ void prefixIter(Job *job, Context *context) {
}
}
++context->iterations;
if (!job->getChildAndIndex(child, job->remaining.front())) [[unlikely]] {
goto noNodeOnSearchPath;
}
++context->iterations;
job->continuation = PrefixIterTable<NodeTTo>::table[job->child.getType()];
__builtin_prefetch(job->child);
MUSTTAIL return keepGoing(job, context);
@@ -4323,11 +4373,12 @@ void beginIter(Job *job, Context *context) {
goto gotoEndIter;
}
++context->iterations;
if (!job->getChildAndIndex(child, job->begin.front())) [[unlikely]] {
goto gotoEndIter;
}
++context->iterations;
job->continuation = BeginIterTable<NodeTTo>::table[job->child.getType()];
__builtin_prefetch(job->child);
MUSTTAIL return keepGoing(job, context);
@@ -4387,13 +4438,14 @@ void endIter(Job *job, Context *context) {
MUSTTAIL return complete(job, context);
}
++context->iterations;
if (!job->getChildAndIndex(child, job->end.front())) [[unlikely]] {
*job->result = {job->n, job->begin, job->endNode, job->end};
assert(job->endNode != nullptr);
MUSTTAIL return complete(job, context);
}
++context->iterations;
job->continuation = EndIterTable<NodeTTo>::table[job->child.getType()];
__builtin_prefetch(job->child);
MUSTTAIL return keepGoing(job, context);
@@ -5073,8 +5125,8 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
}
if (context.results[i].endInsertionPoint == nullptr) {
addPointWrite(getInTree(context.results[i].insertionPoint, this),
context.results[i].remaining, writeVersion,
&writeContext);
context.results[i].remaining, writeVersion, &writeContext,
this);
} else {
if (firstRangeWrite == nullptr) {
firstRangeWrite = context.results + i;
@@ -5151,7 +5203,7 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
&writeContext, this);
} else {
addPointWrite(root, begin, InternalVersionT(writeVersion),
&writeContext);
&writeContext, this);
}
}
}
@@ -5271,7 +5323,6 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
assert(n->entry.rangeVersion <= oldestVersion);
n = erase(n, &writeContext, this, /*logical*/ false);
} else {
maybeDecreaseCapacity(n, &writeContext, this);
n = nextPhysical(n);
}
}
@@ -5350,7 +5401,7 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
keyUpdates = 10;
// Insert ""
root = writeContext.allocate<Node0>(0);
root = writeContext.allocate<Node0>(0, 0);
root->numChildren = 0;
root->parent = nullptr;
root->entryPresent = false;
@@ -5923,7 +5974,16 @@ checkMaxVersion(Node *root, Node *node, InternalVersionT oldestVersion,
int(node->entryPresent), minNumChildren);
success = false;
}
// TODO check that the max capacity property eventually holds
const int maxCapacity =
(node->numChildren + int(node->entryPresent)) * (node->partialKeyLen + 1);
if (node->getCapacity() > maxCapacity) {
fprintf(stderr, "%s has capacity %d, which is more than the allowed %d\n",
getSearchPathPrintable(node).c_str(), node->getCapacity(),
maxCapacity);
success = false;
}
for (auto child = getChildGeq(node, 0); child != nullptr;
child = getChildGeq(node, child->parentsIndex + 1)) {
checkMemoryBoundInvariants(child, success);
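
Review note: the check added in the hunk above bounds a node's capacity by `(numChildren + entryPresent) * (partialKeyLen + 1)`. A minimal standalone sketch of that bound follows, assuming a hypothetical `NodeStats` struct in place of the real `Node` type and a plain `capacity` field in place of `node->getCapacity()`. It is not the definition of `getMaxCapacity`, which is not shown in this diff.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the fields the check reads. The real Node type
// in ConflictSet.cpp has more members; `capacity` stands in for
// node->getCapacity().
struct NodeStats {
  int numChildren;
  bool entryPresent;
  int partialKeyLen;
  int capacity;
};

// Upper bound from the new check:
// (numChildren + entryPresent) * (partialKeyLen + 1).
int maxCapacityBound(const NodeStats &n) {
  assert(n.numChildren >= 0 && n.partialKeyLen >= 0);
  return (n.numChildren + int(n.entryPresent)) * (n.partialKeyLen + 1);
}

// Returns true when the node satisfies the bound; otherwise reports the
// violation on stderr and returns false.
bool checkCapacity(const NodeStats &n) {
  const int bound = maxCapacityBound(n);
  if (n.capacity > bound) {
    fprintf(stderr, "capacity %d is more than the allowed %d\n", n.capacity,
            bound);
    return false;
  }
  return true;
}
```

Under this formula a node with no children and no entry has a bound of 0, so any non-zero capacity on such a node would fail the check.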

4
Jenkinsfile vendored

@@ -11,11 +11,11 @@ def CleanBuildAndTest(String cmakeArgs) {
catchError {
sh '''
cd build
ctest --no-compress-output --test-output-size-passed 100000 --test-output-size-failed 100000 -T Test -j `nproc` --timeout 90
ctest --no-compress-output --test-output-size-passed 100000 --test-output-size-failed 100000 -T Test -j `nproc` --timeout 90 > /dev/null
zstd Testing/*/Test.xml
'''
}
xunit tools: [CTest(pattern: 'build/Testing/*/Test.xml')], reduceLog: false, skipPublishingChecks: false
xunit tools: [CTest(pattern: 'build/Testing/*/Test.xml')], skipPublishingChecks: false
minio bucket: 'jenkins', credentialsId: 'jenkins-minio', excludes: '', host: 'minio.weaselab.dev', includes: 'build/Testing/*/Test.xml.zst', targetFolder: '${JOB_NAME}/${BUILD_NUMBER}/${STAGE_NAME}/'
}

@@ -2,7 +2,9 @@ A data structure for optimistic concurrency control on ranges of bitwise-lexicog
Intended as an alternative to FoundationDB's skip list.
Hardware for all benchmarks is an AMD Ryzen 9 7900 with (2x32GB) 5600MT/s CL28-34-34-89 1.35V RAM
Hardware for all benchmarks is an AMD Ryzen 9 7900 with (2x32GB) 5600MT/s CL28-34-34-89 1.35V RAM.
Compiler is `Ubuntu clang version 20.0.0 (++20241029082144+7544d3af0e28-1~exp1~20241029082307.506)`.
# Microbenchmark
@@ -10,29 +12,29 @@ Hardware for all benchmarks is an AMD Ryzen 9 7900 with (2x32GB) 5600MT/s CL28-3
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 172.03 | 5,812,791.77 | 0.4% | 3,130.62 | 879.00 | 3.562 | 509.23 | 0.0% | 0.01 | `point reads`
| 167.44 | 5,972,130.71 | 0.2% | 3,065.14 | 862.27 | 3.555 | 494.30 | 0.0% | 0.01 | `prefix reads`
| 238.77 | 4,188,130.84 | 0.9% | 3,589.93 | 1,259.30 | 2.851 | 637.12 | 0.0% | 0.01 | `range reads`
| 424.01 | 2,358,426.70 | 0.2% | 5,620.05 | 2,242.35 | 2.506 | 854.80 | 1.7% | 0.01 | `point writes`
| 418.45 | 2,389,780.56 | 0.4% | 5,525.07 | 2,211.05 | 2.499 | 831.71 | 1.7% | 0.01 | `prefix writes`
| 254.87 | 3,923,568.88 | 2.6% | 3,187.01 | 1,366.50 | 2.332 | 529.11 | 2.7% | 0.02 | `range writes`
| 675.96 | 1,479,374.50 | 3.3% | 7,735.41 | 3,468.60 | 2.230 | 1,386.02 | 1.8% | 0.01 | `monotonic increasing point writes`
| 137,986.20 | 7,247.10 | 0.6% | 789,752.33 | 699,462.00 | 1.129 | 144,824.14 | 0.0% | 0.01 | `worst case for radix tree`
| 21.63 | 46,231,564.03 | 1.0% | 448.00 | 107.14 | 4.181 | 84.00 | 0.0% | 0.01 | `create and destroy`
| 159.65 | 6,263,576.52 | 1.6% | 2,972.36 | 820.37 | 3.623 | 504.59 | 0.0% | 0.01 | `point reads`
| 156.32 | 6,397,320.65 | 0.7% | 2,913.62 | 806.87 | 3.611 | 490.19 | 0.0% | 0.01 | `prefix reads`
| 229.18 | 4,363,293.65 | 1.2% | 3,541.05 | 1,219.75 | 2.903 | 629.33 | 0.0% | 0.01 | `range reads`
| 363.37 | 2,752,026.30 | 0.3% | 5,273.63 | 1,951.54 | 2.702 | 851.66 | 1.7% | 0.01 | `point writes`
| 364.99 | 2,739,787.02 | 0.3% | 5,250.92 | 1,958.54 | 2.681 | 839.24 | 1.7% | 0.01 | `prefix writes`
| 242.26 | 4,127,796.58 | 2.9% | 3,117.33 | 1,304.41 | 2.390 | 541.07 | 2.8% | 0.02 | `range writes`
| 562.48 | 1,777,855.27 | 0.8% | 7,305.21 | 3,034.34 | 2.408 | 1,329.30 | 1.3% | 0.01 | `monotonic increasing point writes`
| 122,688.57 | 8,150.72 | 0.7% | 798,766.00 | 666,842.00 | 1.198 | 144,584.50 | 0.1% | 0.01 | `worst case for radix tree`
| 41.71 | 23,976,459.34 | 1.7% | 885.00 | 219.17 | 4.038 | 132.00 | 0.0% | 0.01 | `create and destroy`
## Radix tree (this implementation)
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 12.88 | 77,653,350.77 | 0.5% | 185.37 | 64.45 | 2.876 | 41.51 | 0.4% | 0.01 | `point reads`
| 14.67 | 68,179,354.49 | 0.1% | 271.44 | 73.40 | 3.698 | 53.70 | 0.3% | 0.01 | `prefix reads`
| 34.84 | 28,701,444.36 | 0.3% | 715.74 | 175.27 | 4.084 | 127.30 | 0.2% | 0.01 | `range reads`
| 17.12 | 58,422,988.28 | 0.2% | 314.30 | 86.11 | 3.650 | 39.82 | 0.4% | 0.01 | `point writes`
| 31.42 | 31,830,804.65 | 0.1% | 591.06 | 158.07 | 3.739 | 82.67 | 0.2% | 0.01 | `prefix writes`
| 37.37 | 26,759,432.70 | 2.2% | 681.98 | 188.95 | 3.609 | 96.10 | 0.1% | 0.01 | `range writes`
| 76.72 | 13,035,140.63 | 2.3% | 1,421.28 | 387.17 | 3.671 | 257.76 | 0.1% | 0.01 | `monotonic increasing point writes`
| 297,452.00 | 3,361.89 | 0.9% | 3,508,083.00 | 1,500,834.67 | 2.337 | 727,525.33 | 0.1% | 0.01 | `worst case for radix tree`
| 87.70 | 11,402,490.60 | 1.0% | 1,795.00 | 442.09 | 4.060 | 297.00 | 0.0% | 0.01 | `create and destroy`
| 12.63 | 79,186,868.18 | 1.4% | 241.61 | 64.76 | 3.731 | 31.64 | 0.8% | 0.01 | `point reads`
| 14.48 | 69,078,073.40 | 0.3% | 292.42 | 74.69 | 3.915 | 41.49 | 0.5% | 0.01 | `prefix reads`
| 34.37 | 29,094,694.11 | 0.2% | 759.53 | 179.77 | 4.225 | 100.38 | 0.2% | 0.01 | `range reads`
| 19.34 | 51,713,896.36 | 0.7% | 369.70 | 101.81 | 3.631 | 47.88 | 0.6% | 0.01 | `point writes`
| 39.16 | 25,538,968.61 | 0.2% | 653.16 | 206.77 | 3.159 | 89.62 | 0.8% | 0.01 | `prefix writes`
| 40.58 | 24,642,681.12 | 4.7% | 718.44 | 216.44 | 3.319 | 99.28 | 0.6% | 0.01 | `range writes`
| 78.77 | 12,694,520.69 | 3.8% | 1,395.55 | 421.73 | 3.309 | 249.81 | 0.1% | 0.01 | `monotonic increasing point writes`
| 287,760.50 | 3,475.11 | 0.5% | 3,929,266.50 | 1,550,225.50 | 2.535 | 639,064.00 | 0.0% | 0.01 | `worst case for radix tree`
| 104.76 | 9,545,250.65 | 3.1% | 2,000.00 | 552.82 | 3.618 | 342.00 | 0.0% | 0.01 | `create and destroy`
# "Real data" test
@@ -41,13 +43,13 @@ Point queries only, best of three runs. Gc ratio is the ratio of time spent doin
## skip list
```
Check: 4.47891 seconds, 364.05 MB/s, Add: 4.55599 seconds, 123.058 MB/s, Gc ratio: 37.1145%
Check: 4.39702 seconds, 370.83 MB/s, Add: 4.50025 seconds, 124.583 MB/s, Gc ratio: 29.1333%, Peak idle memory: 5.51852e+06
```
## radix tree
```
Check: 0.953012 seconds, 1710.94 MB/s, Add: 1.30025 seconds, 431.188 MB/s, Gc ratio: 43.9816%, Peak idle memory: 2.28375e+06
Check: 0.987757 seconds, 1650.76 MB/s, Add: 1.24815 seconds, 449.186 MB/s, Gc ratio: 41.4675%, Peak idle memory: 2.02872e+06
```
## hash table
@@ -55,5 +57,5 @@ Check: 0.953012 seconds, 1710.94 MB/s, Add: 1.30025 seconds, 431.188 MB/s, Gc ra
(The hash table implementation doesn't work on range queries, and its purpose is to provide an idea of how fast point queries can be)
```
Check: 0.804094 seconds, 2027.81 MB/s, Add: 0.652952 seconds, 858.645 MB/s, Gc ratio: 35.3885%
Check: 0.84256 seconds, 1935.23 MB/s, Add: 0.697204 seconds, 804.146 MB/s, Gc ratio: 35.4091%
```
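
Review note: the `ns/op` and `op/s` columns in the microbenchmark tables above are two views of the same measurement. A small sketch of the conversion, and of comparing two rows, is given here. The helper names `opsPerSecond` and `speedup` are hypothetical and are not part of the benchmark harness.

```cpp
#include <cassert>

// Convert a latency in nanoseconds per operation to operations per second.
double opsPerSecond(double nsPerOp) {
  assert(nsPerOp > 0);
  return 1e9 / nsPerOp;
}

// How many times faster the candidate is than the baseline, given the
// ns/op figure of each (lower ns/op is faster).
double speedup(double baselineNsPerOp, double candidateNsPerOp) {
  assert(candidateNsPerOp > 0);
  return baselineNsPerOp / candidateNsPerOp;
}
```

Because the tables round `ns/op` to two decimals, recomputing `op/s` from a printed `ns/op` value agrees with the printed `op/s` only approximately.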

@@ -1,3 +1,4 @@
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <cstdlib>
@@ -19,31 +20,69 @@
#include <vector>
#include "ConflictSet.h"
#include "Internal.h"
#include "third_party/nadeau.h"
std::atomic<int64_t> transactions;
constexpr int kWindowSize = 10000000;
int64_t safeUnaryMinus(int64_t x) {
return x == std::numeric_limits<int64_t>::min() ? x : -x;
}
constexpr int kNumPrefixes = 250000;
void tupleAppend(std::string &output, int64_t value) {
if (value == 0) {
output.push_back(0x14);
return;
}
uint32_t size = 8 - __builtin_clrsbll(value) / 8;
int typeCode = 0x14 + (value < 0 ? -1 : 1) * size;
output.push_back(typeCode);
if (value < 0) {
value = ~safeUnaryMinus(value);
}
uint64_t swap = __builtin_bswap64(value);
output.insert(output.end(), (uint8_t *)&swap + 8 - size,
(uint8_t *)&swap + 8);
}
std::string makeKey(int64_t num, int suffixLen) {
void tupleAppend(std::string &output, std::string_view value) {
output.push_back('\x02');
for (auto c : value) {
if (c == '\x00') {
output.push_back('\x00');
output.push_back('\xff');
} else {
output.push_back(c);
}
}
output.push_back('\x00');
}
template <class... Ts> std::string tupleKey(const Ts &...ts) {
std::string result;
result.resize(sizeof(int64_t) + suffixLen);
int64_t be = __builtin_bswap64(num);
memcpy(result.data(), &be, sizeof(int64_t));
memset(result.data() + sizeof(int64_t), 0, suffixLen);
(tupleAppend(result, ts), ...);
return result;
}
constexpr int kWindowSize = 300000;
void workload(weaselab::ConflictSet *cs) {
int64_t version = kWindowSize;
constexpr int kNumWrites = 16;
for (;; transactions.fetch_add(1, std::memory_order_relaxed)) {
std::vector<int64_t> keyIndices;
for (int i = 0; i < kNumWrites; ++i) {
keyIndices.push_back(rand() % 100'000'000);
}
std::sort(keyIndices.begin(), keyIndices.end());
std::vector<std::string> keys;
std::vector<weaselab::ConflictSet::WriteRange> writes;
constexpr std::string_view suffix = "this is a suffix";
for (int i = 0; i < kNumWrites; ++i) {
keys.push_back(makeKey(rand() % kNumPrefixes, rand() % 50));
keys.push_back(tupleKey(0x100, i, keyIndices[i],
suffix.substr(0, rand() % suffix.size()),
rand()));
// printf("%s\n", printable(keys.back()).c_str());
}
for (int i = 0; i < kNumWrites; ++i) {
writes.push_back({{(const uint8_t *)keys[i].data(), int(keys[i].size())},
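
Review note: the new `tupleAppend` overloads above build tuple-encoded keys. The sketch below restates them in standalone form so the byte layout can be checked in isolation. It assumes GCC or Clang (for `__builtin_clrsbll`) and C++17. Unlike the diff, it extracts payload bytes with shifts instead of `__builtin_bswap64` plus pointer arithmetic, and uses unsigned arithmetic in place of `safeUnaryMinus`. Both substitutions are mine and are meant to produce the same bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Integer element: type code 0x14 +/- payload size, then the payload in
// big-endian order.
void tupleAppend(std::string &out, int64_t value) {
  if (value == 0) {
    out.push_back(0x14);
    return;
  }
  // Payload size in bytes, derived from the number of redundant sign bits.
  int size = 8 - __builtin_clrsbll(value) / 8;
  assert(size >= 1 && size <= 8);
  out.push_back(char(0x14 + (value < 0 ? -size : size)));
  uint64_t bits = uint64_t(value);
  if (value < 0) {
    // Same result as ~(-value) in the diff; unsigned arithmetic means
    // INT64_MIN needs no special case.
    bits = ~(uint64_t(0) - bits);
  }
  for (int i = size - 1; i >= 0; --i) {
    out.push_back(char(uint8_t(bits >> (8 * i))));
  }
}

// Byte-string element: 0x02, the bytes with each 0x00 escaped as 0x00 0xff,
// then a 0x00 terminator.
void tupleAppend(std::string &out, std::string_view value) {
  out.push_back('\x02');
  for (char c : value) {
    out.push_back(c);
    if (c == '\x00') {
      out.push_back('\xff');
    }
  }
  out.push_back('\x00');
}

// Concatenate the encodings of all elements, in order.
template <class... Ts> std::string tupleKey(const Ts &...ts) {
  std::string result;
  (tupleAppend(result, ts), ...);
  return result;
}
```

For example, `tupleKey(1, "k")` produces the bytes `15 01 02 6b 00`. Because the size rule counts redundant sign bits, 255 is emitted with a two-byte payload (`16 00 ff`), which may differ from what a canonical tuple-layer encoder produces for that value.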

Binary file not shown.

Some files were not shown because too many files have changed in this diff.