13 Commits

Author SHA1 Message Date
f85b92f8db Improve next{Physical,Logical} codegen
All checks were successful
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [gcc] total: 7949, passed: 7949
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [clang,aarch64] total: 5268, passed: 5268
Tests / Coverage total: 5315, passed: 5315
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.67% (3139/3214). Branch coverage: 42.05% (18734/44548). Complexity density: 0.00. Lines of code: 3214. Quality gates summary: output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
2024-11-09 13:21:36 -08:00
3c44614311 Allocate from freelist with min/max capacity constraints 2024-11-08 21:35:13 -08:00
9c1ac3702e Move index closer to start of Node{3,16}
This should slightly improve cache hit rate
2024-11-08 20:54:00 -08:00
224d21648a Make server_bench workload point writes of tuple-encoded keys 2024-11-08 20:50:54 -08:00
33f9c89328 Try reduceLog: true
All checks were successful
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [gcc] total: 7949, passed: 7949
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [clang,aarch64] total: 5268, passed: 5268
Tests / Coverage total: 5315, passed: 5315
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.74% (3066/3137). Branch coverage: 41.94% (18241/43494). Complexity density: 0.00. Lines of code: 3137. Quality gates summary: output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
2024-11-04 17:03:24 -08:00
12c2d5eb95 Try not logging ctest output to stdout
Some checks failed
Tests / Coverage total: 5315, passed: 5315
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.74% (3066/3137). Branch coverage: 41.94% (18241/43494). Complexity density: 0.00. Lines of code: 3137. Quality gates summary: output truncated.
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [gcc] total: 7949, passed: 7949
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [clang,aarch64] total: 5268, passed: 5268
weaselab/conflict-set/pipeline/head There was a failure building this commit
2024-11-04 15:36:50 -08:00
db357e747d Add __mem{cpy,set}_chk to allowed symbol imports
Some checks reported errors
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [gcc] total: 7949, passed: 7949
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [clang,aarch64] total: 5268, passed: 5268
Tests / Coverage total: 5315, passed: 5315
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.74% (3066/3137). Branch coverage: 41.94% (18241/43494). Complexity density: 0.00. Lines of code: 3137. Quality gates summary: output truncated.
weaselab/conflict-set/pipeline/head Something is wrong with the build of this commit
2024-11-04 15:08:25 -08:00
4494359ca2 Make insert_iterations for interleaved writes more closely match
Some checks failed
Tests / 64 bit versions total: 7949, passed: 7949
Tests / Debug total: 7947, passed: 7947
Tests / SIMD fallback total: 7949, passed: 7949
Tests / Release [clang] total: 7949, passed: 7949
Clang: total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [gcc] total: 7949, failed: 2, passed: 7947
Tests / Release [clang,aarch64] total: 5268, passed: 5268
Tests / Coverage total: 5315, passed: 5315
weaselab/conflict-set/pipeline/head There was a failure building this commit
2024-11-04 14:46:30 -08:00
f079d84bda Update README 2024-11-04 14:33:43 -08:00
724ec09248 Add to corpus 2024-11-04 14:30:12 -08:00
4eaad39294 Maintain capacity invariant strictly 2024-11-04 13:43:02 -08:00
891100e649 Add to corpus
All checks were successful
Tests / 64 bit versions total: 7901, passed: 7901
Tests / Debug total: 7899, passed: 7899
Tests / SIMD fallback total: 7901, passed: 7901
Tests / Release [clang] total: 7901, passed: 7901
Clang: total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [gcc] total: 7901, passed: 7901
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0, trend :clap:
Tests / Release [clang,aarch64] total: 5236, passed: 5236
Tests / Coverage total: 5283, passed: 5283
Code Coverage: no changes detected that affect the code coverage. Line coverage: 97.62% (3112/3188). Branch coverage: 42.08% (18883/44869). Complexity density: 0.00. Lines of code: 3188. Quality gates summary: output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
2024-11-01 21:32:00 -07:00
22e55309be Update benchmarks 2024-11-01 17:50:03 -07:00
268 changed files with 332 additions and 229 deletions


@@ -210,7 +210,7 @@ enum Type : int8_t {
Type_Node256, Type_Node256,
}; };
template <class T> struct BoundedFreeListAllocator; template <class T> struct NodeAllocator;
struct TaggedNodePointer { struct TaggedNodePointer {
TaggedNodePointer() = default; TaggedNodePointer() = default;
@@ -297,9 +297,9 @@ struct Node {
} }
private: private:
template <class T> friend struct BoundedFreeListAllocator; template <class T> friend struct NodeAllocator;
// These are publically readable, but should only be written by // These are publically readable, but should only be written by
// BoundedFreeListAllocator // NodeAllocator
Type type; Type type;
int32_t partialKeyCapacity; int32_t partialKeyCapacity;
}; };
@@ -338,11 +338,12 @@ struct Node3 : Node {
constexpr static auto kMaxNodes = 3; constexpr static auto kMaxNodes = 3;
constexpr static auto kType = Type_Node3; constexpr static auto kType = Type_Node3;
TaggedNodePointer children[kMaxNodes];
InternalVersionT childMaxVersion[kMaxNodes];
// Sorted // Sorted
uint8_t index[kMaxNodes]; uint8_t index[kMaxNodes];
TaggedNodePointer children[kMaxNodes];
InternalVersionT childMaxVersion[kMaxNodes];
uint8_t *partialKey() { uint8_t *partialKey() {
assert(!releaseDeferred); assert(!releaseDeferred);
return (uint8_t *)(this + 1); return (uint8_t *)(this + 1);
@@ -357,11 +358,12 @@ struct Node16 : Node {
constexpr static auto kType = Type_Node16; constexpr static auto kType = Type_Node16;
constexpr static auto kMaxNodes = 16; constexpr static auto kMaxNodes = 16;
TaggedNodePointer children[kMaxNodes];
InternalVersionT childMaxVersion[kMaxNodes];
// Sorted // Sorted
uint8_t index[kMaxNodes]; uint8_t index[kMaxNodes];
TaggedNodePointer children[kMaxNodes];
InternalVersionT childMaxVersion[kMaxNodes];
uint8_t *partialKey() { uint8_t *partialKey() {
assert(!releaseDeferred); assert(!releaseDeferred);
return (uint8_t *)(this + 1); return (uint8_t *)(this + 1);
@@ -440,7 +442,10 @@ inline void Node3::copyChildrenAndKeyFrom(const Node0 &other) {
inline void Node3::copyChildrenAndKeyFrom(const Node3 &other) { inline void Node3::copyChildrenAndKeyFrom(const Node3 &other) {
memcpy((char *)this + kNodeCopyBegin, (char *)&other + kNodeCopyBegin, memcpy((char *)this + kNodeCopyBegin, (char *)&other + kNodeCopyBegin,
kNodeCopySize); kNodeCopySize);
memcpy(children, other.children, sizeof(*this) - sizeof(Node)); memcpy(index, other.index, kMaxNodes);
memcpy(children, other.children, kMaxNodes * sizeof(children[0])); // NOLINT
memcpy(childMaxVersion, other.childMaxVersion,
kMaxNodes * sizeof(childMaxVersion[0]));
memcpy(partialKey(), &other + 1, partialKeyLen); memcpy(partialKey(), &other + 1, partialKeyLen);
for (int i = 0; i < numChildren; ++i) { for (int i = 0; i < numChildren; ++i) {
assert(children[i]->parent == &other); assert(children[i]->parent == &other);
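Note on the hunks above: once `index` is declared before `children`, a single `memcpy` that starts at `children` no longer covers all three arrays, which appears to be why the copy was split into one call per array. The following is a minimal sketch of that idea, not the project's code. `DemoNode3`, `copyArrays`, and the `void *` / `uint32_t` element types are hypothetical stand-ins for `Node3`, `TaggedNodePointer`, and `InternalVersionT`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for Node3 (the real type derives from Node).
struct DemoNode3 {
  static constexpr int kMaxNodes = 3;
  uint8_t index[kMaxNodes];            // sorted key bytes, declared first
  void *children[kMaxNodes];           // stand-in for TaggedNodePointer
  uint32_t childMaxVersion[kMaxNodes]; // stand-in for InternalVersionT
};

// The lookup array now sits at the front of the object.
static_assert(offsetof(DemoNode3, index) < offsetof(DemoNode3, children),
              "index precedes children");

// Copy each array on its own, so the copy does not depend on member order
// or on padding between members.
inline void copyArrays(DemoNode3 &dst, const DemoNode3 &src) {
  std::memcpy(dst.index, src.index, sizeof(src.index));
  std::memcpy(dst.children, src.children, sizeof(src.children));
  std::memcpy(dst.childMaxVersion, src.childMaxVersion,
              sizeof(src.childMaxVersion));
}
```

Copying per array costs a few extra calls but stays correct if the members are reordered again.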
@@ -644,7 +649,7 @@ constexpr int kMinNodeSurplus = 104;
constexpr int kBytesPerKey = 112; constexpr int kBytesPerKey = 112;
constexpr int kMinNodeSurplus = 80; constexpr int kMinNodeSurplus = 80;
#endif #endif
// Cound the entry itself as a child // Count the entry itself as a child
constexpr int kMinChildrenNode0 = 1; constexpr int kMinChildrenNode0 = 1;
constexpr int kMinChildrenNode3 = 2; constexpr int kMinChildrenNode3 = 2;
constexpr int kMinChildrenNode16 = 4; constexpr int kMinChildrenNode16 = 4;
@@ -669,50 +674,40 @@ static_assert(kNode3Surplus >= kMinNodeSurplus);
static_assert(kBytesPerKey - sizeof(Node0) >= kMinNodeSurplus); static_assert(kBytesPerKey - sizeof(Node0) >= kMinNodeSurplus);
// setOldestVersion will additionally try to maintain this property: // We'll additionally maintain this property:
// `(children + entryPresent) * length >= capacity` // `(children + entryPresent) * length >= capacity`
// //
// Which should give us the budget to pay for the key bytes. (children + // Which should give us the budget to pay for the key bytes. (children +
// entryPresent) is a lower bound on how many keys these bytes are a prefix of // entryPresent) is a lower bound on how many keys these bytes are a prefix of
constexpr int64_t kFreeListMaxMemory = 1 << 20; constexpr int getMaxCapacity(int numChildren, int entryPresent,
int partialKeyLen) {
return (numChildren + entryPresent) * (partialKeyLen + 1);
}
template <class T> struct BoundedFreeListAllocator { constexpr int getMaxCapacity(Node *self) {
return getMaxCapacity(self->numChildren, self->entryPresent,
self->partialKeyLen);
}
constexpr int64_t kMaxFreeListBytes = 1 << 20;
// Maintains a free list up to kMaxFreeListBytes. If the top element of the list
// doesn't meet the capacity constraints, it's freed and a new node is allocated
// with the minimum capacity. The hope is that "unfit" nodes don't get stuck in
// the free list.
//
// TODO valgrind annotations
template <class T> struct NodeAllocator {
static_assert(sizeof(T) >= sizeof(void *));
static_assert(std::derived_from<T, Node>); static_assert(std::derived_from<T, Node>);
static_assert(std::is_trivial_v<T>); static_assert(std::is_trivial_v<T>);
T *allocate_helper(int partialKeyCapacity) { T *allocate(int minCapacity, int maxCapacity) {
if (freeList != nullptr) { assert(minCapacity <= maxCapacity);
T *n = (T *)freeList; assert(freeListSize >= 0);
VALGRIND_MAKE_MEM_DEFINED(freeList, sizeof(freeList)); assert(freeListSize <= kMaxFreeListBytes);
memcpy(&freeList, freeList, sizeof(freeList)); T *result = allocate_helper(minCapacity, maxCapacity);
VALGRIND_MAKE_MEM_UNDEFINED(n, sizeof(T));
VALGRIND_MAKE_MEM_DEFINED(&n->partialKeyCapacity,
sizeof(n->partialKeyCapacity));
VALGRIND_MAKE_MEM_DEFINED(&n->type, sizeof(n->type));
assert(n->type == T::kType);
VALGRIND_MAKE_MEM_UNDEFINED(n + 1, n->partialKeyCapacity);
freeListBytes -= sizeof(T) + n->partialKeyCapacity;
if (n->partialKeyCapacity >= partialKeyCapacity) {
return n;
} else {
// The intent is to filter out too-small nodes in the freelist
removeNode(n);
safe_free(n, sizeof(T) + n->partialKeyCapacity);
}
}
auto *result = (T *)safe_malloc(sizeof(T) + partialKeyCapacity);
result->type = T::kType;
result->partialKeyCapacity = partialKeyCapacity;
addNode(result);
return result;
}
T *allocate(int partialKeyCapacity) {
T *result = allocate_helper(partialKeyCapacity);
result->endOfRange = false; result->endOfRange = false;
result->releaseDeferred = false; result->releaseDeferred = false;
if constexpr (!std::is_same_v<T, Node0>) { if constexpr (!std::is_same_v<T, Node0>) {
@@ -732,37 +727,93 @@ template <class T> struct BoundedFreeListAllocator {
} }
void release(T *p) { void release(T *p) {
if (freeListBytes >= kFreeListMaxMemory) { if (freeListSize + sizeof(T) + p->partialKeyCapacity > kMaxFreeListBytes) {
removeNode(p); removeNode(p);
return safe_free(p, sizeof(T) + p->partialKeyCapacity); return safe_free(p, sizeof(T) + p->partialKeyCapacity);
} }
memcpy((void *)p, &freeList, sizeof(freeList)); p->parent = freeList;
freeList = p; freeList = p;
freeListBytes += sizeof(T) + p->partialKeyCapacity; freeListSize += sizeof(T) + p->partialKeyCapacity;
VALGRIND_MAKE_MEM_NOACCESS(freeList, sizeof(T) + p->partialKeyCapacity);
} }
BoundedFreeListAllocator() = default; void deferRelease(T *p, Node *forwardTo) {
p->releaseDeferred = true;
p->forwardTo = forwardTo;
if (freeListSize + sizeof(T) + p->partialKeyCapacity > kMaxFreeListBytes) {
p->parent = deferredListOverflow;
deferredListOverflow = p;
} else {
if (deferredList == nullptr) {
deferredListFront = p;
}
p->parent = deferredList;
deferredList = p;
freeListSize += sizeof(T) + p->partialKeyCapacity;
}
}
BoundedFreeListAllocator(const BoundedFreeListAllocator &) = delete; void releaseDeferred() {
BoundedFreeListAllocator & if (deferredList != nullptr) {
operator=(const BoundedFreeListAllocator &) = delete; deferredListFront->parent = freeList;
BoundedFreeListAllocator(BoundedFreeListAllocator &&) = delete; freeList = std::exchange(deferredList, nullptr);
BoundedFreeListAllocator &operator=(BoundedFreeListAllocator &&) = delete; }
for (T *n = std::exchange(deferredListOverflow, nullptr); n != nullptr;) {
auto *tmp = n;
n = (T *)n->parent;
release(tmp);
}
}
~BoundedFreeListAllocator() { NodeAllocator() = default;
for (void *iter = freeList; iter != nullptr;) {
VALGRIND_MAKE_MEM_DEFINED(iter, sizeof(Node)); NodeAllocator(const NodeAllocator &) = delete;
auto *tmp = (T *)iter; NodeAllocator &operator=(const NodeAllocator &) = delete;
memcpy(&iter, iter, sizeof(void *)); NodeAllocator(NodeAllocator &&) = delete;
removeNode((tmp)); NodeAllocator &operator=(NodeAllocator &&) = delete;
~NodeAllocator() {
assert(deferredList == nullptr);
assert(deferredListOverflow == nullptr);
for (T *iter = freeList; iter != nullptr;) {
auto *tmp = iter;
iter = (T *)iter->parent;
removeNode(tmp);
safe_free(tmp, sizeof(T) + tmp->partialKeyCapacity); safe_free(tmp, sizeof(T) + tmp->partialKeyCapacity);
} }
} }
private: private:
int64_t freeListBytes = 0; int64_t freeListSize = 0;
void *freeList = nullptr; T *freeList = nullptr;
T *deferredList = nullptr;
// Used to concatenate deferredList to freeList
T *deferredListFront;
T *deferredListOverflow = nullptr;
T *allocate_helper(int minCapacity, int maxCapacity) {
if (freeList != nullptr) {
freeListSize -= sizeof(T) + freeList->partialKeyCapacity;
assume(freeList->partialKeyCapacity >= 0);
assume(minCapacity >= 0);
assume(minCapacity <= maxCapacity);
if (freeList->partialKeyCapacity >= minCapacity &&
freeList->partialKeyCapacity <= maxCapacity) {
auto *result = freeList;
freeList = (T *)freeList->parent;
return result;
} else {
auto *p = freeList;
freeList = (T *)p->parent;
removeNode(p);
safe_free(p, sizeof(T) + p->partialKeyCapacity);
}
}
auto *result = (T *)safe_malloc(sizeof(T) + minCapacity);
result->type = T::kType;
result->partialKeyCapacity = minCapacity;
addNode(result);
return result;
}
}; };
uint8_t *Node::partialKey() { uint8_t *Node::partialKey() {
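Note on the `NodeAllocator` hunk above: the policy is to pop the top of the free list, reuse it only if its capacity lies in `[minCapacity, maxCapacity]`, and otherwise free it and allocate a block of exactly `minCapacity`. Below is a simplified sketch of that policy under stated assumptions, not the project's implementation. `DemoBlock`, `DemoFreeList`, `cachedBytes`, and `footprint` are hypothetical names, and plain `malloc`/`free` stand in for `safe_malloc`/`safe_free`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical header placed in front of each block's payload bytes.
struct DemoBlock {
  DemoBlock *next;
  int capacity;
};

class DemoFreeList {
public:
  static constexpr int64_t kMaxBytes = 1 << 20; // cap on cached bytes

  DemoBlock *allocate(int minCapacity, int maxCapacity) {
    assert(0 <= minCapacity && minCapacity <= maxCapacity);
    if (head_ != nullptr) {
      DemoBlock *top = head_;
      head_ = top->next;
      bytes_ -= footprint(top);
      if (top->capacity >= minCapacity && top->capacity <= maxCapacity) {
        return top; // fits the requested range, reuse it
      }
      std::free(top); // unfit: free it so it cannot get stuck in the list
    }
    auto *fresh = static_cast<DemoBlock *>(
        std::malloc(sizeof(DemoBlock) + minCapacity));
    if (fresh == nullptr) {
      std::abort();
    }
    fresh->next = nullptr;
    fresh->capacity = minCapacity;
    return fresh;
  }

  void release(DemoBlock *b) {
    if (bytes_ + footprint(b) > kMaxBytes) {
      std::free(b); // list is full, return memory to the system
      return;
    }
    b->next = head_;
    head_ = b;
    bytes_ += footprint(b);
  }

  int64_t cachedBytes() const { return bytes_; }

  ~DemoFreeList() {
    while (head_ != nullptr) {
      DemoBlock *n = head_;
      head_ = n->next;
      std::free(n);
    }
  }

private:
  static int64_t footprint(const DemoBlock *b) {
    return int64_t(sizeof(DemoBlock)) + b->capacity;
  }
  DemoBlock *head_ = nullptr;
  int64_t bytes_ = 0;
};
```

Only the top element is inspected, so allocation stays constant-time. The trade-off is that a fitting block deeper in the list is never found.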
@@ -827,18 +878,19 @@ struct WriteContext {
WriteContext() { memset(&accum, 0, sizeof(accum)); } WriteContext() { memset(&accum, 0, sizeof(accum)); }
template <class T> T *allocate(int c) { template <class T> T *allocate(int minCapacity, int maxCapacity) {
static_assert(!std::is_same_v<T, Node>);
++accum.nodes_allocated; ++accum.nodes_allocated;
if constexpr (std::is_same_v<T, Node0>) { if constexpr (std::is_same_v<T, Node0>) {
return node0.allocate(c); return node0.allocate(minCapacity, maxCapacity);
} else if constexpr (std::is_same_v<T, Node3>) { } else if constexpr (std::is_same_v<T, Node3>) {
return node3.allocate(c); return node3.allocate(minCapacity, maxCapacity);
} else if constexpr (std::is_same_v<T, Node16>) { } else if constexpr (std::is_same_v<T, Node16>) {
return node16.allocate(c); return node16.allocate(minCapacity, maxCapacity);
} else if constexpr (std::is_same_v<T, Node48>) { } else if constexpr (std::is_same_v<T, Node48>) {
return node48.allocate(c); return node48.allocate(minCapacity, maxCapacity);
} else if constexpr (std::is_same_v<T, Node256>) { } else if constexpr (std::is_same_v<T, Node256>) {
return node256.allocate(c); return node256.allocate(minCapacity, maxCapacity);
} }
} }
template <class T> void release(T *c) { template <class T> void release(T *c) {
@@ -858,49 +910,37 @@ struct WriteContext {
} }
// Place in a list to be released in the next call to releaseDeferred. // Place in a list to be released in the next call to releaseDeferred.
void deferRelease(Node *n, Node *forwardTo) { template <class T> void deferRelease(T *n, Node *forwardTo) {
n->releaseDeferred = true; static_assert(!std::is_same_v<T, Node>);
n->forwardTo = forwardTo; if constexpr (std::is_same_v<T, Node0>) {
n->parent = deferredList; return node0.deferRelease(n, forwardTo);
deferredList = n; } else if constexpr (std::is_same_v<T, Node3>) {
return node3.deferRelease(n, forwardTo);
} else if constexpr (std::is_same_v<T, Node16>) {
return node16.deferRelease(n, forwardTo);
} else if constexpr (std::is_same_v<T, Node48>) {
return node48.deferRelease(n, forwardTo);
} else if constexpr (std::is_same_v<T, Node256>) {
return node256.deferRelease(n, forwardTo);
}
} }
// Release all nodes passed to deferRelease since the last call to // Release all nodes passed to deferRelease since the last call to
// releaseDeferred. // releaseDeferred.
void releaseDeferred() { void releaseDeferred() {
for (Node *n = std::exchange(deferredList, nullptr); n != nullptr;) { node0.releaseDeferred();
auto *tmp = n; node3.releaseDeferred();
n = n->parent; node16.releaseDeferred();
switch (tmp->getType()) { node48.releaseDeferred();
case Type_Node0: node256.releaseDeferred();
release(static_cast<Node0 *>(tmp));
break;
case Type_Node3:
release(static_cast<Node3 *>(tmp));
break;
case Type_Node16:
release(static_cast<Node16 *>(tmp));
break;
case Type_Node48:
release(static_cast<Node48 *>(tmp));
break;
case Type_Node256:
release(static_cast<Node256 *>(tmp));
break;
default: // GCOVR_EXCL_LINE
__builtin_unreachable(); // GCOVR_EXCL_LINE
}
}
} }
private: private:
Node *deferredList = nullptr; NodeAllocator<Node0> node0;
NodeAllocator<Node3> node3;
BoundedFreeListAllocator<Node0> node0; NodeAllocator<Node16> node16;
BoundedFreeListAllocator<Node3> node3; NodeAllocator<Node48> node48;
BoundedFreeListAllocator<Node16> node16; NodeAllocator<Node256> node256;
BoundedFreeListAllocator<Node48> node48;
BoundedFreeListAllocator<Node256> node256;
}; };
int getNodeIndex(Node3 *self, uint8_t index) { int getNodeIndex(Node3 *self, uint8_t index) {
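Note on `deferRelease` / `releaseDeferred` above: remembering the first node pushed onto the deferred list (`deferredListFront`, which ends up as the list's tail) lets the whole deferred list be spliced onto the free list in constant time. A minimal sketch of that splice follows. `DemoNode` and `DemoDeferred` are hypothetical, and the byte accounting and overflow list from the diff are left out.

```cpp
#include <cassert>
#include <utility>

// Hypothetical node with an intrusive link (the diff reuses `parent`).
struct DemoNode {
  DemoNode *next = nullptr;
};

struct DemoDeferred {
  DemoNode *freeList = nullptr;
  DemoNode *deferredList = nullptr;
  // First node deferred since the last splice, i.e. the tail of deferredList.
  DemoNode *deferredListFront = nullptr;

  void deferRelease(DemoNode *p) {
    if (deferredList == nullptr) {
      deferredListFront = p; // remember the tail for the later splice
    }
    p->next = deferredList;
    deferredList = p;
  }

  void releaseDeferred() {
    if (deferredList != nullptr) {
      deferredListFront->next = freeList; // tail now points at old free list
      freeList = std::exchange(deferredList, nullptr);
    }
  }
};
```

Compared with the removed loop that released nodes one at a time and switched on the node type, the splice does a fixed amount of work per allocator.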
@@ -1177,7 +1217,8 @@ void setMaxVersion(Node *n, InternalVersionT newMax) {
} }
} }
TaggedNodePointer &getInTree(Node *n, ConflictSet::Impl *); // If impl is nullptr, then n->parent must not be nullptr
TaggedNodePointer &getInTree(Node *n, ConflictSet::Impl *impl);
TaggedNodePointer getChild(Node0 *, uint8_t) { return nullptr; } TaggedNodePointer getChild(Node0 *, uint8_t) { return nullptr; }
TaggedNodePointer getChild(Node3 *self, uint8_t index) { TaggedNodePointer getChild(Node3 *self, uint8_t index) {
@@ -1430,9 +1471,14 @@ TaggedNodePointer getFirstChildExists(Node *self) {
// GCOVR_EXCL_STOP // GCOVR_EXCL_STOP
} }
// self must not be the root
void maybeDecreaseCapacity(Node *&self, WriteContext *writeContext,
ConflictSet::Impl *impl);
void consumePartialKeyFull(TaggedNodePointer &self, TrivialSpan &key, void consumePartialKeyFull(TaggedNodePointer &self, TrivialSpan &key,
InternalVersionT writeVersion, InternalVersionT writeVersion,
WriteContext *writeContext) { WriteContext *writeContext,
ConflictSet::Impl *impl) {
// Handle an existing partial key // Handle an existing partial key
int commonLen = std::min<int>(self->partialKeyLen, key.size()); int commonLen = std::min<int>(self->partialKeyLen, key.size());
int partialKeyIndex = int partialKeyIndex =
@@ -1444,7 +1490,8 @@ void consumePartialKeyFull(TaggedNodePointer &self, TrivialSpan &key,
InternalVersionT oldMaxVersion = exchangeMaxVersion(old, writeVersion); InternalVersionT oldMaxVersion = exchangeMaxVersion(old, writeVersion);
// *self will have one child (old) // *self will have one child (old)
auto *newSelf = writeContext->allocate<Node3>(partialKeyIndex); auto *newSelf = writeContext->allocate<Node3>(
partialKeyIndex, getMaxCapacity(1, 0, partialKeyIndex));
newSelf->parent = old->parent; newSelf->parent = old->parent;
newSelf->parentsIndex = old->parentsIndex; newSelf->parentsIndex = old->parentsIndex;
@@ -1466,9 +1513,8 @@ void consumePartialKeyFull(TaggedNodePointer &self, TrivialSpan &key,
old->partialKeyLen - (partialKeyIndex + 1)); old->partialKeyLen - (partialKeyIndex + 1));
old->partialKeyLen -= partialKeyIndex + 1; old->partialKeyLen -= partialKeyIndex + 1;
// We would consider decreasing capacity here, but we can't invalidate // Maintain memory capacity invariant
// old since it's not on the search path. setOldestVersion will clean it maybeDecreaseCapacity(old, writeContext, impl);
// up.
} }
key = key.subspan(partialKeyIndex, key.size() - partialKeyIndex); key = key.subspan(partialKeyIndex, key.size() - partialKeyIndex);
} }
@@ -1477,9 +1523,10 @@ void consumePartialKeyFull(TaggedNodePointer &self, TrivialSpan &key,
// `key` such that `self` is along the search path of `key` // `key` such that `self` is along the search path of `key`
inline __attribute__((always_inline)) void inline __attribute__((always_inline)) void
consumePartialKey(TaggedNodePointer &self, TrivialSpan &key, consumePartialKey(TaggedNodePointer &self, TrivialSpan &key,
InternalVersionT writeVersion, WriteContext *writeContext) { InternalVersionT writeVersion, WriteContext *writeContext,
ConflictSet::Impl *impl) {
if (self->partialKeyLen > 0) { if (self->partialKeyLen > 0) {
consumePartialKeyFull(self, key, writeVersion, writeContext); consumePartialKeyFull(self, key, writeVersion, writeContext, impl);
} }
} }
@@ -1489,7 +1536,8 @@ consumePartialKey(TaggedNodePointer &self, TrivialSpan &key,
// `maxVersion` for result. // `maxVersion` for result.
TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key, TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
InternalVersionT newMaxVersion, InternalVersionT newMaxVersion,
WriteContext *writeContext) { WriteContext *writeContext,
ConflictSet::Impl *impl) {
int index = key.front(); int index = key.front();
key = key.subspan(1, key.size() - 1); key = key.subspan(1, key.size() - 1);
@@ -1502,7 +1550,8 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
auto *self3 = static_cast<Node3 *>(self); auto *self3 = static_cast<Node3 *>(self);
int i = getNodeIndex(self3, index); int i = getNodeIndex(self3, index);
if (i >= 0) { if (i >= 0) {
consumePartialKey(self3->children[i], key, newMaxVersion, writeContext); consumePartialKey(self3->children[i], key, newMaxVersion, writeContext,
impl);
self3->childMaxVersion[i] = newMaxVersion; self3->childMaxVersion[i] = newMaxVersion;
return self3->children[i]; return self3->children[i];
} }
@@ -1511,7 +1560,8 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
auto *self16 = static_cast<Node16 *>(self); auto *self16 = static_cast<Node16 *>(self);
int i = getNodeIndex(self16, index); int i = getNodeIndex(self16, index);
if (i >= 0) { if (i >= 0) {
consumePartialKey(self16->children[i], key, newMaxVersion, writeContext); consumePartialKey(self16->children[i], key, newMaxVersion, writeContext,
impl);
self16->childMaxVersion[i] = newMaxVersion; self16->childMaxVersion[i] = newMaxVersion;
return self16->children[i]; return self16->children[i];
} }
@@ -1521,7 +1571,7 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
int secondIndex = self48->index[index]; int secondIndex = self48->index[index];
if (secondIndex >= 0) { if (secondIndex >= 0) {
consumePartialKey(self48->children[secondIndex], key, newMaxVersion, consumePartialKey(self48->children[secondIndex], key, newMaxVersion,
writeContext); writeContext, impl);
self48->childMaxVersion[secondIndex] = newMaxVersion; self48->childMaxVersion[secondIndex] = newMaxVersion;
self48->maxOfMax[secondIndex >> Node48::kMaxOfMaxShift] = self48->maxOfMax[secondIndex >> Node48::kMaxOfMaxShift] =
std::max(self48->maxOfMax[secondIndex >> Node48::kMaxOfMaxShift], std::max(self48->maxOfMax[secondIndex >> Node48::kMaxOfMaxShift],
@@ -1532,7 +1582,7 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
case Type_Node256: { case Type_Node256: {
auto *self256 = static_cast<Node256 *>(self); auto *self256 = static_cast<Node256 *>(self);
if (auto &result = self256->children[index]; result != nullptr) { if (auto &result = self256->children[index]; result != nullptr) {
consumePartialKey(result, key, newMaxVersion, writeContext); consumePartialKey(result, key, newMaxVersion, writeContext, impl);
self256->childMaxVersion[index] = newMaxVersion; self256->childMaxVersion[index] = newMaxVersion;
self256->maxOfMax[index >> Node256::kMaxOfMaxShift] = std::max( self256->maxOfMax[index >> Node256::kMaxOfMaxShift] = std::max(
self256->maxOfMax[index >> Node256::kMaxOfMaxShift], newMaxVersion); self256->maxOfMax[index >> Node256::kMaxOfMaxShift], newMaxVersion);
@@ -1543,9 +1593,10 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
__builtin_unreachable(); // GCOVR_EXCL_LINE __builtin_unreachable(); // GCOVR_EXCL_LINE
} }
auto *newChild = writeContext->allocate<Node0>(key.size()); auto *newChild = writeContext->allocate<Node0>(
key.size(), getMaxCapacity(0, 1, key.size()));
newChild->numChildren = 0; newChild->numChildren = 0;
newChild->entryPresent = false; newChild->entryPresent = false; // Will be set to true by the caller
newChild->partialKeyLen = key.size(); newChild->partialKeyLen = key.size();
newChild->parentsIndex = index; newChild->parentsIndex = index;
memcpy(newChild->partialKey(), key.data(), key.size()); memcpy(newChild->partialKey(), key.data(), key.size());
@@ -1555,7 +1606,8 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
case Type_Node0: { case Type_Node0: {
auto *self0 = static_cast<Node0 *>(self); auto *self0 = static_cast<Node0 *>(self);
auto *newSelf = writeContext->allocate<Node3>(self->partialKeyLen); auto *newSelf = writeContext->allocate<Node3>(
self->partialKeyLen, getMaxCapacity(1, 1, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self0); newSelf->copyChildrenAndKeyFrom(*self0);
writeContext->deferRelease(self0, newSelf); writeContext->deferRelease(self0, newSelf);
self = newSelf; self = newSelf;
@@ -1565,7 +1617,9 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
case Type_Node3: { case Type_Node3: {
if (self->numChildren == Node3::kMaxNodes) { if (self->numChildren == Node3::kMaxNodes) {
auto *self3 = static_cast<Node3 *>(self); auto *self3 = static_cast<Node3 *>(self);
auto *newSelf = writeContext->allocate<Node16>(self->partialKeyLen); auto *newSelf = writeContext->allocate<Node16>(
self->partialKeyLen,
getMaxCapacity(4, self->entryPresent, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self3); newSelf->copyChildrenAndKeyFrom(*self3);
writeContext->deferRelease(self3, newSelf); writeContext->deferRelease(self3, newSelf);
self = newSelf; self = newSelf;
@@ -1594,7 +1648,9 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
case Type_Node16: { case Type_Node16: {
if (self->numChildren == Node16::kMaxNodes) { if (self->numChildren == Node16::kMaxNodes) {
auto *self16 = static_cast<Node16 *>(self); auto *self16 = static_cast<Node16 *>(self);
auto *newSelf = writeContext->allocate<Node48>(self->partialKeyLen); auto *newSelf = writeContext->allocate<Node48>(
self->partialKeyLen,
getMaxCapacity(17, self->entryPresent, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self16); newSelf->copyChildrenAndKeyFrom(*self16);
writeContext->deferRelease(self16, newSelf); writeContext->deferRelease(self16, newSelf);
self = newSelf; self = newSelf;
@@ -1625,7 +1681,9 @@ TaggedNodePointer &getOrCreateChild(TaggedNodePointer &self, TrivialSpan &key,
if (self->numChildren == 48) { if (self->numChildren == 48) {
auto *self48 = static_cast<Node48 *>(self); auto *self48 = static_cast<Node48 *>(self);
auto *newSelf = writeContext->allocate<Node256>(self->partialKeyLen); auto *newSelf = writeContext->allocate<Node256>(
self->partialKeyLen,
getMaxCapacity(49, self->entryPresent, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self48); newSelf->copyChildrenAndKeyFrom(*self48);
writeContext->deferRelease(self48, newSelf); writeContext->deferRelease(self48, newSelf);
self = newSelf; self = newSelf;
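Note on the `getMaxCapacity` call sites above: each allocation passes the partial key length as the minimum capacity, and an upper bound computed from the child count the node is about to have. The sketch below reproduces `getMaxCapacity` from the earlier hunk and evaluates it for a few illustrative inputs. `fitsBudget` is a hypothetical helper added only for illustration.

```cpp
#include <cassert>

// Reproduced from the diff above.
constexpr int getMaxCapacity(int numChildren, int entryPresent,
                             int partialKeyLen) {
  return (numChildren + entryPresent) * (partialKeyLen + 1);
}

// Illustrative values for a partial key of length 5.
static_assert(getMaxCapacity(0, 1, 5) == 6, "leaf holding one entry");
static_assert(getMaxCapacity(1, 0, 5) == 6, "one child, no entry");
static_assert(getMaxCapacity(1, 1, 5) == 12, "one child plus an entry");
static_assert(getMaxCapacity(4, 0, 5) == 24, "four children, no entry");
static_assert(getMaxCapacity(17, 1, 5) == 108, "17 children plus an entry");

// Hypothetical helper: does a node's capacity stay within the upper bound?
constexpr bool fitsBudget(int capacity, int numChildren, int entryPresent,
                          int partialKeyLen) {
  return capacity <= getMaxCapacity(numChildren, entryPresent, partialKeyLen);
}
```

The bound grows with the number of keys that share the prefix, which matches the comment's budget argument: `(children + entryPresent)` is a lower bound on how many keys the partial key bytes are a prefix of.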
@@ -1676,7 +1734,7 @@ Node *nextPhysical(Node *node) {
if (node == nullptr) { if (node == nullptr) {
return nullptr; return nullptr;
} }
auto nextChild = getChildGeq(node, index + 1); Node *nextChild = getChildGeq(node, index + 1);
if (nextChild != nullptr) { if (nextChild != nullptr) {
return nextChild; return nextChild;
} }
@@ -1695,7 +1753,7 @@ Node *nextLogical(Node *node) {
if (node == nullptr) { if (node == nullptr) {
return nullptr; return nullptr;
} }
auto nextChild = getChildGeq(node, index + 1); Node *nextChild = getChildGeq(node, index + 1);
if (nextChild != nullptr) { if (nextChild != nullptr) {
node = nextChild; node = nextChild;
goto downLeftSpine; goto downLeftSpine;
@@ -1707,76 +1765,48 @@ downLeftSpine:
return node; return node;
} }
// Invalidates `self`, replacing it with a node of at least capacity. void freeAndMakeCapacityBetween(Node *&self, int minCapacity, int maxCapacity,
// Does not return nodes to freelists when kUseFreeList is false.
void freeAndMakeCapacityAtLeast(Node *&self, int capacity,
WriteContext *writeContext, WriteContext *writeContext,
ConflictSet::Impl *impl, ConflictSet::Impl *impl) {
const bool kUseFreeList) {
switch (self->getType()) { switch (self->getType()) {
case Type_Node0: { case Type_Node0: {
auto *self0 = (Node0 *)self; auto *self0 = (Node0 *)self;
auto *newSelf = writeContext->allocate<Node0>(capacity); auto *newSelf = writeContext->allocate<Node0>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self0); newSelf->copyChildrenAndKeyFrom(*self0);
getInTree(self, impl) = newSelf; getInTree(self, impl) = newSelf;
if (kUseFreeList) { writeContext->deferRelease(self0, newSelf);
writeContext->deferRelease(self0, newSelf);
} else {
removeNode(self0);
safe_free(self0, self0->size());
}
self = newSelf; self = newSelf;
} break; } break;
case Type_Node3: { case Type_Node3: {
auto *self3 = (Node3 *)self; auto *self3 = (Node3 *)self;
auto *newSelf = writeContext->allocate<Node3>(capacity); auto *newSelf = writeContext->allocate<Node3>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self3); newSelf->copyChildrenAndKeyFrom(*self3);
getInTree(self, impl) = newSelf; getInTree(self, impl) = newSelf;
if (kUseFreeList) { writeContext->deferRelease(self3, newSelf);
writeContext->deferRelease(self3, newSelf);
} else {
removeNode(self3);
safe_free(self3, self3->size());
}
self = newSelf; self = newSelf;
} break; } break;
case Type_Node16: { case Type_Node16: {
auto *self16 = (Node16 *)self; auto *self16 = (Node16 *)self;
auto *newSelf = writeContext->allocate<Node16>(capacity); auto *newSelf = writeContext->allocate<Node16>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self16); newSelf->copyChildrenAndKeyFrom(*self16);
getInTree(self, impl) = newSelf; getInTree(self, impl) = newSelf;
if (kUseFreeList) { writeContext->deferRelease(self16, newSelf);
writeContext->deferRelease(self16, newSelf);
} else {
removeNode(self16);
safe_free(self16, self16->size());
}
self = newSelf; self = newSelf;
} break; } break;
case Type_Node48: { case Type_Node48: {
auto *self48 = (Node48 *)self; auto *self48 = (Node48 *)self;
auto *newSelf = writeContext->allocate<Node48>(capacity); auto *newSelf = writeContext->allocate<Node48>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self48); newSelf->copyChildrenAndKeyFrom(*self48);
getInTree(self, impl) = newSelf; getInTree(self, impl) = newSelf;
if (kUseFreeList) { writeContext->deferRelease(self48, newSelf);
writeContext->deferRelease(self48, newSelf);
} else {
removeNode(self48);
safe_free(self48, self48->size());
}
self = newSelf; self = newSelf;
} break; } break;
case Type_Node256: { case Type_Node256: {
auto *self256 = (Node256 *)self; auto *self256 = (Node256 *)self;
auto *newSelf = writeContext->allocate<Node256>(capacity); auto *newSelf = writeContext->allocate<Node256>(minCapacity, maxCapacity);
newSelf->copyChildrenAndKeyFrom(*self256); newSelf->copyChildrenAndKeyFrom(*self256);
getInTree(self, impl) = newSelf; getInTree(self, impl) = newSelf;
if (kUseFreeList) { writeContext->deferRelease(self256, newSelf);
writeContext->deferRelease(self256, newSelf);
} else {
removeNode(self256);
safe_free(self256, self256->size());
}
self = newSelf; self = newSelf;
} break; } break;
default: // GCOVR_EXCL_LINE default: // GCOVR_EXCL_LINE
@@ -1784,9 +1814,7 @@ void freeAndMakeCapacityAtLeast(Node *&self, int capacity,
} }
} }
// Fix larger-than-desired capacities. Does not return nodes to freelists, // Fix larger-than-desired capacities. self must not be the root
// since that wouldn't actually reclaim the memory used for partial key
// capacity.
void maybeDecreaseCapacity(Node *&self, WriteContext *writeContext, void maybeDecreaseCapacity(Node *&self, WriteContext *writeContext,
ConflictSet::Impl *impl) { ConflictSet::Impl *impl) {
@@ -1800,7 +1828,8 @@ void maybeDecreaseCapacity(Node *&self, WriteContext *writeContext,
if (self->getCapacity() <= maxCapacity) { if (self->getCapacity() <= maxCapacity) {
return; return;
} }
freeAndMakeCapacityAtLeast(self, maxCapacity, writeContext, impl, false); freeAndMakeCapacityBetween(self, self->partialKeyLen, maxCapacity,
writeContext, impl);
} }
#if defined(HAS_AVX) && !defined(__SANITIZE_THREAD__) #if defined(HAS_AVX) && !defined(__SANITIZE_THREAD__)
@@ -1870,13 +1899,16 @@ void rezero(Node *n, InternalVersionT z) {
#endif #endif
void mergeWithChild(TaggedNodePointer &self, WriteContext *writeContext, void mergeWithChild(TaggedNodePointer &self, WriteContext *writeContext,
ConflictSet::Impl *impl, Node3 *self3) { Node3 *self3, ConflictSet::Impl *impl) {
assert(!self3->entryPresent); assert(!self3->entryPresent);
Node *child = self3->children[0]; Node *child = self3->children[0];
int minCapacity = self3->partialKeyLen + 1 + child->partialKeyLen; const int minCapacity = self3->partialKeyLen + 1 + child->partialKeyLen;
const int maxCapacity =
getMaxCapacity(child->numChildren, child->entryPresent, minCapacity);
if (minCapacity > child->getCapacity()) { if (minCapacity > child->getCapacity()) {
freeAndMakeCapacityAtLeast(child, minCapacity, writeContext, impl, true); freeAndMakeCapacityBetween(child, minCapacity, maxCapacity, writeContext,
impl);
} }
// Merge partial key with child // Merge partial key with child
@@ -1915,20 +1947,23 @@ bool needsDownsize(Node *n) {
void downsize(Node3 *self, WriteContext *writeContext, void downsize(Node3 *self, WriteContext *writeContext,
ConflictSet::Impl *impl) { ConflictSet::Impl *impl) {
if (self->numChildren == 0) { if (self->numChildren == 0) {
auto *newSelf = writeContext->allocate<Node0>(self->partialKeyLen); auto *newSelf = writeContext->allocate<Node0>(
self->partialKeyLen, getMaxCapacity(0, 1, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self); newSelf->copyChildrenAndKeyFrom(*self);
getInTree(self, impl) = newSelf; getInTree(self, impl) = newSelf;
writeContext->deferRelease(self, newSelf); writeContext->deferRelease(self, newSelf);
} else { } else {
assert(self->numChildren == 1 && !self->entryPresent); assert(self->numChildren == 1 && !self->entryPresent);
mergeWithChild(getInTree(self, impl), writeContext, impl, self); mergeWithChild(getInTree(self, impl), writeContext, self, impl);
} }
} }
void downsize(Node16 *self, WriteContext *writeContext, void downsize(Node16 *self, WriteContext *writeContext,
ConflictSet::Impl *impl) { ConflictSet::Impl *impl) {
assert(self->numChildren + int(self->entryPresent) < kMinChildrenNode16); assert(self->numChildren + int(self->entryPresent) < kMinChildrenNode16);
auto *newSelf = writeContext->allocate<Node3>(self->partialKeyLen); auto *newSelf = writeContext->allocate<Node3>(
self->partialKeyLen,
getMaxCapacity(kMinChildrenNode16 - 1, 0, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self); newSelf->copyChildrenAndKeyFrom(*self);
getInTree(self, impl) = newSelf; getInTree(self, impl) = newSelf;
writeContext->deferRelease(self, newSelf); writeContext->deferRelease(self, newSelf);
@@ -1937,7 +1972,9 @@ void downsize(Node16 *self, WriteContext *writeContext,
void downsize(Node48 *self, WriteContext *writeContext, void downsize(Node48 *self, WriteContext *writeContext,
ConflictSet::Impl *impl) { ConflictSet::Impl *impl) {
assert(self->numChildren + int(self->entryPresent) < kMinChildrenNode48); assert(self->numChildren + int(self->entryPresent) < kMinChildrenNode48);
auto *newSelf = writeContext->allocate<Node16>(self->partialKeyLen); auto *newSelf = writeContext->allocate<Node16>(
self->partialKeyLen,
getMaxCapacity(kMinChildrenNode48 - 1, 0, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self); newSelf->copyChildrenAndKeyFrom(*self);
getInTree(self, impl) = newSelf; getInTree(self, impl) = newSelf;
writeContext->deferRelease(self, newSelf); writeContext->deferRelease(self, newSelf);
@@ -1947,7 +1984,9 @@ void downsize(Node256 *self, WriteContext *writeContext,
ConflictSet::Impl *impl) { ConflictSet::Impl *impl) {
assert(self->numChildren + int(self->entryPresent) < kMinChildrenNode256); assert(self->numChildren + int(self->entryPresent) < kMinChildrenNode256);
auto *self256 = (Node256 *)self; auto *self256 = (Node256 *)self;
auto *newSelf = writeContext->allocate<Node48>(self->partialKeyLen); auto *newSelf = writeContext->allocate<Node48>(
self->partialKeyLen,
getMaxCapacity(kMinChildrenNode256 - 1, 0, self->partialKeyLen));
newSelf->copyChildrenAndKeyFrom(*self256); newSelf->copyChildrenAndKeyFrom(*self256);
getInTree(self, impl) = newSelf; getInTree(self, impl) = newSelf;
writeContext->deferRelease(self256, newSelf); writeContext->deferRelease(self256, newSelf);
@@ -2001,6 +2040,10 @@ Node *erase(Node *self, WriteContext *writeContext, ConflictSet::Impl *impl,
if (needsDownsize(self)) { if (needsDownsize(self)) {
downsize(self, writeContext, impl); downsize(self, writeContext, impl);
} }
while (self->releaseDeferred) {
self = self->forwardTo;
}
maybeDecreaseCapacity(self, writeContext, impl);
if (result != nullptr) { if (result != nullptr) {
while (result->releaseDeferred) { while (result->releaseDeferred) {
result = result->forwardTo; result = result->forwardTo;
@@ -2088,6 +2131,11 @@ Node *erase(Node *self, WriteContext *writeContext, ConflictSet::Impl *impl,
__builtin_unreachable(); // GCOVR_EXCL_LINE __builtin_unreachable(); // GCOVR_EXCL_LINE
} }
while (parent->releaseDeferred) {
parent = parent->forwardTo;
}
maybeDecreaseCapacity(parent, writeContext, impl);
if (result != nullptr) { if (result != nullptr) {
while (result->releaseDeferred) { while (result->releaseDeferred) {
result = result->forwardTo; result = result->forwardTo;
@@ -2791,13 +2839,12 @@ checkMaxBetweenExclusiveImpl<true>(Node256 *n, int begin, int end,
// of the result will have `maxVersion` set to `writeVersion` as a // of the result will have `maxVersion` set to `writeVersion` as a
// postcondition. Nodes along the search path may be invalidated. Callers must // postcondition. Nodes along the search path may be invalidated. Callers must
// ensure that the max version of the self argument is updated. // ensure that the max version of the self argument is updated.
[[nodiscard]] TaggedNodePointer *insert(TaggedNodePointer *self, [[nodiscard]] TaggedNodePointer *
TrivialSpan key, insert(TaggedNodePointer *self, TrivialSpan key, InternalVersionT writeVersion,
InternalVersionT writeVersion, WriteContext *writeContext, ConflictSet::Impl *impl) {
WriteContext *writeContext) {
for (; key.size() != 0; ++writeContext->accum.insert_iterations) { for (; key.size() != 0; ++writeContext->accum.insert_iterations) {
self = &getOrCreateChild(*self, key, writeVersion, writeContext); self = &getOrCreateChild(*self, key, writeVersion, writeContext, impl);
} }
return self; return self;
} }
@@ -2855,9 +2902,10 @@ void eraseTree(Node *root, WriteContext *writeContext) {
} }
void addPointWrite(TaggedNodePointer &root, TrivialSpan key, void addPointWrite(TaggedNodePointer &root, TrivialSpan key,
InternalVersionT writeVersion, WriteContext *writeContext) { InternalVersionT writeVersion, WriteContext *writeContext,
ConflictSet::Impl *impl) {
++writeContext->accum.point_writes; ++writeContext->accum.point_writes;
auto n = *insert(&root, key, writeVersion, writeContext); auto n = *insert(&root, key, writeVersion, writeContext, impl);
if (!n->entryPresent) { if (!n->entryPresent) {
++writeContext->accum.entries_inserted; ++writeContext->accum.entries_inserted;
auto *p = nextLogical(n); auto *p = nextLogical(n);
@@ -2991,8 +3039,8 @@ AddedWriteRange addWriteRange(Node *beginRoot, TrivialSpan begin, Node *endRoot,
++writeContext->accum.range_writes; ++writeContext->accum.range_writes;
Node *beginNode = Node *beginNode = *insert(&getInTree(beginRoot, impl), begin, writeVersion,
*insert(&getInTree(beginRoot, impl), begin, writeVersion, writeContext); writeContext, impl);
addKey(beginNode); addKey(beginNode);
if (!beginNode->entryPresent) { if (!beginNode->entryPresent) {
++writeContext->accum.entries_inserted; ++writeContext->accum.entries_inserted;
@@ -3008,7 +3056,7 @@ AddedWriteRange addWriteRange(Node *beginRoot, TrivialSpan begin, Node *endRoot,
beginNode->entry.pointVersion = writeVersion; beginNode->entry.pointVersion = writeVersion;
Node *endNode = Node *endNode =
*insert(&getInTree(endRoot, impl), end, writeVersion, writeContext); *insert(&getInTree(endRoot, impl), end, writeVersion, writeContext, impl);
addKey(endNode); addKey(endNode);
if (!endNode->entryPresent) { if (!endNode->entryPresent) {
@@ -3054,10 +3102,10 @@ void addWriteRange(TaggedNodePointer &root, TrivialSpan begin, TrivialSpan end,
std::min(begin.size(), end.size())); std::min(begin.size(), end.size()));
if (lcp == begin.size() && end.size() == begin.size() + 1 && if (lcp == begin.size() && end.size() == begin.size() + 1 &&
end.back() == 0) { end.back() == 0) {
return addPointWrite(root, begin, writeVersion, writeContext); return addPointWrite(root, begin, writeVersion, writeContext, impl);
} }
auto useAsRoot = auto useAsRoot =
insert(&root, begin.subspan(0, lcp), writeVersion, writeContext); insert(&root, begin.subspan(0, lcp), writeVersion, writeContext, impl);
auto [beginNode, endNode] = addWriteRange( auto [beginNode, endNode] = addWriteRange(
*useAsRoot, begin.subspan(lcp, begin.size() - lcp), *useAsRoot, *useAsRoot, begin.subspan(lcp, begin.size() - lcp), *useAsRoot,
@@ -4143,12 +4191,13 @@ void pointIter(Job *job, Context *context) {
MUSTTAIL return complete(job, context); MUSTTAIL return complete(job, context);
} }
++context->iterations;
if (!job->getChildAndIndex(child, job->remaining.front())) [[unlikely]] { if (!job->getChildAndIndex(child, job->remaining.front())) [[unlikely]] {
*job->result = {job->n, job->remaining}; *job->result = {job->n, job->remaining};
MUSTTAIL return complete(job, context); MUSTTAIL return complete(job, context);
} }
++context->iterations;
job->continuation = PointIterTable<NodeTTo>::table[job->child.getType()]; job->continuation = PointIterTable<NodeTTo>::table[job->child.getType()];
__builtin_prefetch(job->child); __builtin_prefetch(job->child);
MUSTTAIL return keepGoing(job, context); MUSTTAIL return keepGoing(job, context);
@@ -4249,11 +4298,12 @@ void prefixIter(Job *job, Context *context) {
} }
} }
++context->iterations;
if (!job->getChildAndIndex(child, job->remaining.front())) [[unlikely]] { if (!job->getChildAndIndex(child, job->remaining.front())) [[unlikely]] {
goto noNodeOnSearchPath; goto noNodeOnSearchPath;
} }
++context->iterations;
job->continuation = PrefixIterTable<NodeTTo>::table[job->child.getType()]; job->continuation = PrefixIterTable<NodeTTo>::table[job->child.getType()];
__builtin_prefetch(job->child); __builtin_prefetch(job->child);
MUSTTAIL return keepGoing(job, context); MUSTTAIL return keepGoing(job, context);
@@ -4323,11 +4373,12 @@ void beginIter(Job *job, Context *context) {
goto gotoEndIter; goto gotoEndIter;
} }
++context->iterations;
if (!job->getChildAndIndex(child, job->begin.front())) [[unlikely]] { if (!job->getChildAndIndex(child, job->begin.front())) [[unlikely]] {
goto gotoEndIter; goto gotoEndIter;
} }
++context->iterations;
job->continuation = BeginIterTable<NodeTTo>::table[job->child.getType()]; job->continuation = BeginIterTable<NodeTTo>::table[job->child.getType()];
__builtin_prefetch(job->child); __builtin_prefetch(job->child);
MUSTTAIL return keepGoing(job, context); MUSTTAIL return keepGoing(job, context);
@@ -4387,13 +4438,14 @@ void endIter(Job *job, Context *context) {
MUSTTAIL return complete(job, context); MUSTTAIL return complete(job, context);
} }
++context->iterations;
if (!job->getChildAndIndex(child, job->end.front())) [[unlikely]] { if (!job->getChildAndIndex(child, job->end.front())) [[unlikely]] {
*job->result = {job->n, job->begin, job->endNode, job->end}; *job->result = {job->n, job->begin, job->endNode, job->end};
assert(job->endNode != nullptr); assert(job->endNode != nullptr);
MUSTTAIL return complete(job, context); MUSTTAIL return complete(job, context);
} }
++context->iterations;
job->continuation = EndIterTable<NodeTTo>::table[job->child.getType()]; job->continuation = EndIterTable<NodeTTo>::table[job->child.getType()];
__builtin_prefetch(job->child); __builtin_prefetch(job->child);
MUSTTAIL return keepGoing(job, context); MUSTTAIL return keepGoing(job, context);
@@ -5073,8 +5125,8 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
} }
if (context.results[i].endInsertionPoint == nullptr) { if (context.results[i].endInsertionPoint == nullptr) {
addPointWrite(getInTree(context.results[i].insertionPoint, this), addPointWrite(getInTree(context.results[i].insertionPoint, this),
context.results[i].remaining, writeVersion, context.results[i].remaining, writeVersion, &writeContext,
&writeContext); this);
} else { } else {
if (firstRangeWrite == nullptr) { if (firstRangeWrite == nullptr) {
firstRangeWrite = context.results + i; firstRangeWrite = context.results + i;
@@ -5151,7 +5203,7 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
&writeContext, this); &writeContext, this);
} else { } else {
addPointWrite(root, begin, InternalVersionT(writeVersion), addPointWrite(root, begin, InternalVersionT(writeVersion),
&writeContext); &writeContext, this);
} }
} }
} }
@@ -5271,7 +5323,6 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
assert(n->entry.rangeVersion <= oldestVersion); assert(n->entry.rangeVersion <= oldestVersion);
n = erase(n, &writeContext, this, /*logical*/ false); n = erase(n, &writeContext, this, /*logical*/ false);
} else { } else {
maybeDecreaseCapacity(n, &writeContext, this);
n = nextPhysical(n); n = nextPhysical(n);
} }
} }
@@ -5350,7 +5401,7 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
keyUpdates = 10; keyUpdates = 10;
// Insert "" // Insert ""
root = writeContext.allocate<Node0>(0); root = writeContext.allocate<Node0>(0, 0);
root->numChildren = 0; root->numChildren = 0;
root->parent = nullptr; root->parent = nullptr;
root->entryPresent = false; root->entryPresent = false;
@@ -5923,7 +5974,16 @@ checkMaxVersion(Node *root, Node *node, InternalVersionT oldestVersion,
int(node->entryPresent), minNumChildren); int(node->entryPresent), minNumChildren);
success = false; success = false;
} }
// TODO check that the max capacity property eventually holds
const int maxCapacity =
(node->numChildren + int(node->entryPresent)) * (node->partialKeyLen + 1);
if (node->getCapacity() > maxCapacity) {
fprintf(stderr, "%s has capacity %d, which is more than the allowed %d\n",
getSearchPathPrintable(node).c_str(), node->getCapacity(),
maxCapacity);
success = false;
}
for (auto child = getChildGeq(node, 0); child != nullptr; for (auto child = getChildGeq(node, 0); child != nullptr;
child = getChildGeq(node, child->parentsIndex + 1)) { child = getChildGeq(node, child->parentsIndex + 1)) {
checkMemoryBoundInvariants(child, success); checkMemoryBoundInvariants(child, success);

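The capacity changes above can be read as a window: a node's partial-key capacity must be at least what it needs and at most a bound derived from its occupancy, and a freelist entry is reused only when it falls inside that window. The following is a minimal sketch of that idea, not the code from this change. `ToyNode`, `ToyFreeList`, and `maxCapacityBound` are hypothetical names, and the bound simply mirrors the formula in the invariant check, `(numChildren + entryPresent) * (partialKeyLen + 1)`; whether `getMaxCapacity` computes exactly this is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a trie node; only the field needed here.
struct ToyNode {
  int capacity = 0; // bytes reserved for the partial key
};

// Upper bound mirroring the invariant check in the diff (assumed to match
// what getMaxCapacity returns).
inline int maxCapacityBound(int numChildren, bool entryPresent,
                            int partialKeyLen) {
  return (numChildren + int(entryPresent)) * (partialKeyLen + 1);
}

// Hypothetical freelist that only hands back nodes inside a capacity window.
class ToyFreeList {
public:
  // Reuse a free node whose capacity lies in [minCapacity, maxCapacity];
  // otherwise allocate a fresh node with exactly minCapacity.
  std::unique_ptr<ToyNode> allocate(int minCapacity, int maxCapacity) {
    assert(minCapacity <= maxCapacity);
    for (std::size_t i = 0; i < free_.size(); ++i) {
      const int c = free_[i]->capacity;
      if (c >= minCapacity && c <= maxCapacity) {
        auto node = std::move(free_[i]);
        free_.erase(free_.begin() + static_cast<std::ptrdiff_t>(i));
        return node;
      }
    }
    auto node = std::make_unique<ToyNode>();
    node->capacity = minCapacity;
    return node;
  }

  // Return a node to the freelist for later reuse.
  void release(std::unique_ptr<ToyNode> node) {
    free_.push_back(std::move(node));
  }

  std::size_t size() const { return free_.size(); }

private:
  std::vector<std::unique_ptr<ToyNode>> free_;
};
```

With an upper bound on every allocation, an oversized node sitting on the freelist cannot be handed to a caller that only needs a few bytes, which is presumably why `maybeDecreaseCapacity` can now release nodes through the freelist as well.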
4
Jenkinsfile vendored

@@ -11,11 +11,11 @@ def CleanBuildAndTest(String cmakeArgs) {
catchError { catchError {
sh ''' sh '''
cd build cd build
ctest --no-compress-output --test-output-size-passed 100000 --test-output-size-failed 100000 -T Test -j `nproc` --timeout 90 ctest --no-compress-output --test-output-size-passed 100000 --test-output-size-failed 100000 -T Test -j `nproc` --timeout 90 > /dev/null
zstd Testing/*/Test.xml zstd Testing/*/Test.xml
''' '''
} }
xunit tools: [CTest(pattern: 'build/Testing/*/Test.xml')], reduceLog: false, skipPublishingChecks: false xunit tools: [CTest(pattern: 'build/Testing/*/Test.xml')], skipPublishingChecks: false
minio bucket: 'jenkins', credentialsId: 'jenkins-minio', excludes: '', host: 'minio.weaselab.dev', includes: 'build/Testing/*/Test.xml.zst', targetFolder: '${JOB_NAME}/${BUILD_NUMBER}/${STAGE_NAME}/' minio bucket: 'jenkins', credentialsId: 'jenkins-minio', excludes: '', host: 'minio.weaselab.dev', includes: 'build/Testing/*/Test.xml.zst', targetFolder: '${JOB_NAME}/${BUILD_NUMBER}/${STAGE_NAME}/'
} }


@@ -2,7 +2,9 @@ A data structure for optimistic concurrency control on ranges of bitwise-lexicog
Intended as an alternative to FoundationDB's skip list. Intended as an alternative to FoundationDB's skip list.
Hardware for all benchmarks is an AMD Ryzen 9 7900 with (2x32GB) 5600MT/s CL28-34-34-89 1.35V RAM Hardware for all benchmarks is an AMD Ryzen 9 7900 with (2x32GB) 5600MT/s CL28-34-34-89 1.35V RAM.
Compiler is `Ubuntu clang version 20.0.0 (++20241029082144+7544d3af0e28-1~exp1~20241029082307.506)`.
# Microbenchmark # Microbenchmark
@@ -10,29 +12,29 @@ Hardware for all benchmarks is an AMD Ryzen 9 7900 with (2x32GB) 5600MT/s CL28-3
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark | ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:---------- |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 172.03 | 5,812,791.77 | 0.4% | 3,130.62 | 879.00 | 3.562 | 509.23 | 0.0% | 0.01 | `point reads` | 159.65 | 6,263,576.52 | 1.6% | 2,972.36 | 820.37 | 3.623 | 504.59 | 0.0% | 0.01 | `point reads`
| 167.44 | 5,972,130.71 | 0.2% | 3,065.14 | 862.27 | 3.555 | 494.30 | 0.0% | 0.01 | `prefix reads` | 156.32 | 6,397,320.65 | 0.7% | 2,913.62 | 806.87 | 3.611 | 490.19 | 0.0% | 0.01 | `prefix reads`
| 238.77 | 4,188,130.84 | 0.9% | 3,589.93 | 1,259.30 | 2.851 | 637.12 | 0.0% | 0.01 | `range reads` | 229.18 | 4,363,293.65 | 1.2% | 3,541.05 | 1,219.75 | 2.903 | 629.33 | 0.0% | 0.01 | `range reads`
| 424.01 | 2,358,426.70 | 0.2% | 5,620.05 | 2,242.35 | 2.506 | 854.80 | 1.7% | 0.01 | `point writes` | 363.37 | 2,752,026.30 | 0.3% | 5,273.63 | 1,951.54 | 2.702 | 851.66 | 1.7% | 0.01 | `point writes`
| 418.45 | 2,389,780.56 | 0.4% | 5,525.07 | 2,211.05 | 2.499 | 831.71 | 1.7% | 0.01 | `prefix writes` | 364.99 | 2,739,787.02 | 0.3% | 5,250.92 | 1,958.54 | 2.681 | 839.24 | 1.7% | 0.01 | `prefix writes`
| 254.87 | 3,923,568.88 | 2.6% | 3,187.01 | 1,366.50 | 2.332 | 529.11 | 2.7% | 0.02 | `range writes` | 242.26 | 4,127,796.58 | 2.9% | 3,117.33 | 1,304.41 | 2.390 | 541.07 | 2.8% | 0.02 | `range writes`
| 675.96 | 1,479,374.50 | 3.3% | 7,735.41 | 3,468.60 | 2.230 | 1,386.02 | 1.8% | 0.01 | `monotonic increasing point writes` | 562.48 | 1,777,855.27 | 0.8% | 7,305.21 | 3,034.34 | 2.408 | 1,329.30 | 1.3% | 0.01 | `monotonic increasing point writes`
| 137,986.20 | 7,247.10 | 0.6% | 789,752.33 | 699,462.00 | 1.129 | 144,824.14 | 0.0% | 0.01 | `worst case for radix tree` | 122,688.57 | 8,150.72 | 0.7% | 798,766.00 | 666,842.00 | 1.198 | 144,584.50 | 0.1% | 0.01 | `worst case for radix tree`
| 21.63 | 46,231,564.03 | 1.0% | 448.00 | 107.14 | 4.181 | 84.00 | 0.0% | 0.01 | `create and destroy` | 41.71 | 23,976,459.34 | 1.7% | 885.00 | 219.17 | 4.038 | 132.00 | 0.0% | 0.01 | `create and destroy`
## Radix tree (this implementation) ## Radix tree (this implementation)
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark | ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:---------- |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 12.88 | 77,653,350.77 | 0.5% | 185.37 | 64.45 | 2.876 | 41.51 | 0.4% | 0.01 | `point reads` | 12.63 | 79,186,868.18 | 1.4% | 241.61 | 64.76 | 3.731 | 31.64 | 0.8% | 0.01 | `point reads`
| 14.67 | 68,179,354.49 | 0.1% | 271.44 | 73.40 | 3.698 | 53.70 | 0.3% | 0.01 | `prefix reads` | 14.48 | 69,078,073.40 | 0.3% | 292.42 | 74.69 | 3.915 | 41.49 | 0.5% | 0.01 | `prefix reads`
| 34.84 | 28,701,444.36 | 0.3% | 715.74 | 175.27 | 4.084 | 127.30 | 0.2% | 0.01 | `range reads` | 34.37 | 29,094,694.11 | 0.2% | 759.53 | 179.77 | 4.225 | 100.38 | 0.2% | 0.01 | `range reads`
| 17.12 | 58,422,988.28 | 0.2% | 314.30 | 86.11 | 3.650 | 39.82 | 0.4% | 0.01 | `point writes` | 19.34 | 51,713,896.36 | 0.7% | 369.70 | 101.81 | 3.631 | 47.88 | 0.6% | 0.01 | `point writes`
| 31.42 | 31,830,804.65 | 0.1% | 591.06 | 158.07 | 3.739 | 82.67 | 0.2% | 0.01 | `prefix writes` | 39.16 | 25,538,968.61 | 0.2% | 653.16 | 206.77 | 3.159 | 89.62 | 0.8% | 0.01 | `prefix writes`
| 37.37 | 26,759,432.70 | 2.2% | 681.98 | 188.95 | 3.609 | 96.10 | 0.1% | 0.01 | `range writes` | 40.58 | 24,642,681.12 | 4.7% | 718.44 | 216.44 | 3.319 | 99.28 | 0.6% | 0.01 | `range writes`
| 76.72 | 13,035,140.63 | 2.3% | 1,421.28 | 387.17 | 3.671 | 257.76 | 0.1% | 0.01 | `monotonic increasing point writes` | 78.77 | 12,694,520.69 | 3.8% | 1,395.55 | 421.73 | 3.309 | 249.81 | 0.1% | 0.01 | `monotonic increasing point writes`
| 297,452.00 | 3,361.89 | 0.9% | 3,508,083.00 | 1,500,834.67 | 2.337 | 727,525.33 | 0.1% | 0.01 | `worst case for radix tree` | 287,760.50 | 3,475.11 | 0.5% | 3,929,266.50 | 1,550,225.50 | 2.535 | 639,064.00 | 0.0% | 0.01 | `worst case for radix tree`
| 87.70 | 11,402,490.60 | 1.0% | 1,795.00 | 442.09 | 4.060 | 297.00 | 0.0% | 0.01 | `create and destroy` | 104.76 | 9,545,250.65 | 3.1% | 2,000.00 | 552.82 | 3.618 | 342.00 | 0.0% | 0.01 | `create and destroy`
# "Real data" test # "Real data" test
@@ -41,13 +43,13 @@ Point queries only, best of three runs. Gc ratio is the ratio of time spent doin
## skip list ## skip list
``` ```
Check: 4.47891 seconds, 364.05 MB/s, Add: 4.55599 seconds, 123.058 MB/s, Gc ratio: 37.1145% Check: 4.39702 seconds, 370.83 MB/s, Add: 4.50025 seconds, 124.583 MB/s, Gc ratio: 29.1333%, Peak idle memory: 5.51852e+06
``` ```
## radix tree ## radix tree
``` ```
Check: 0.953012 seconds, 1710.94 MB/s, Add: 1.30025 seconds, 431.188 MB/s, Gc ratio: 43.9816%, Peak idle memory: 2.28375e+06 Check: 0.987757 seconds, 1650.76 MB/s, Add: 1.24815 seconds, 449.186 MB/s, Gc ratio: 41.4675%, Peak idle memory: 2.02872e+06
``` ```
## hash table ## hash table
@@ -55,5 +57,5 @@ Check: 0.953012 seconds, 1710.94 MB/s, Add: 1.30025 seconds, 431.188 MB/s, Gc ra
(The hash table implementation doesn't work on range queries, and its purpose is to provide an idea of how fast point queries can be) (The hash table implementation doesn't work on range queries, and its purpose is to provide an idea of how fast point queries can be)
``` ```
Check: 0.804094 seconds, 2027.81 MB/s, Add: 0.652952 seconds, 858.645 MB/s, Gc ratio: 35.3885% Check: 0.84256 seconds, 1935.23 MB/s, Add: 0.697204 seconds, 804.146 MB/s, Gc ratio: 35.4091%
``` ```


@@ -1,3 +1,4 @@
#include <algorithm>
#include <atomic> #include <atomic>
#include <cstdint> #include <cstdint>
#include <cstdlib> #include <cstdlib>
@@ -19,31 +20,69 @@
#include <vector> #include <vector>
#include "ConflictSet.h" #include "ConflictSet.h"
#include "Internal.h"
#include "third_party/nadeau.h" #include "third_party/nadeau.h"
std::atomic<int64_t> transactions; std::atomic<int64_t> transactions;
constexpr int kWindowSize = 10000000; int64_t safeUnaryMinus(int64_t x) {
return x == std::numeric_limits<int64_t>::min() ? x : -x;
}
constexpr int kNumPrefixes = 250000; void tupleAppend(std::string &output, int64_t value) {
if (value == 0) {
output.push_back(0x14);
return;
}
uint32_t size = 8 - __builtin_clrsbll(value) / 8;
int typeCode = 0x14 + (value < 0 ? -1 : 1) * size;
output.push_back(typeCode);
if (value < 0) {
value = ~safeUnaryMinus(value);
}
uint64_t swap = __builtin_bswap64(value);
output.insert(output.end(), (uint8_t *)&swap + 8 - size,
(uint8_t *)&swap + 8);
}
std::string makeKey(int64_t num, int suffixLen) { void tupleAppend(std::string &output, std::string_view value) {
output.push_back('\x02');
for (auto c : value) {
if (c == '\x00') {
output.push_back('\x00');
output.push_back('\xff');
} else {
output.push_back(c);
}
}
output.push_back('\x00');
}
template <class... Ts> std::string tupleKey(const Ts &...ts) {
std::string result; std::string result;
result.resize(sizeof(int64_t) + suffixLen); (tupleAppend(result, ts), ...);
int64_t be = __builtin_bswap64(num);
memcpy(result.data(), &be, sizeof(int64_t));
memset(result.data() + sizeof(int64_t), 0, suffixLen);
return result; return result;
} }
constexpr int kWindowSize = 300000;
void workload(weaselab::ConflictSet *cs) { void workload(weaselab::ConflictSet *cs) {
int64_t version = kWindowSize; int64_t version = kWindowSize;
constexpr int kNumWrites = 16; constexpr int kNumWrites = 16;
for (;; transactions.fetch_add(1, std::memory_order_relaxed)) { for (;; transactions.fetch_add(1, std::memory_order_relaxed)) {
std::vector<int64_t> keyIndices;
for (int i = 0; i < kNumWrites; ++i) {
keyIndices.push_back(rand() % 100'000'000);
}
std::sort(keyIndices.begin(), keyIndices.end());
std::vector<std::string> keys; std::vector<std::string> keys;
std::vector<weaselab::ConflictSet::WriteRange> writes; std::vector<weaselab::ConflictSet::WriteRange> writes;
constexpr std::string_view suffix = "this is a suffix";
for (int i = 0; i < kNumWrites; ++i) { for (int i = 0; i < kNumWrites; ++i) {
keys.push_back(makeKey(rand() % kNumPrefixes, rand() % 50)); keys.push_back(tupleKey(0x100, i, keyIndices[i],
suffix.substr(0, rand() % suffix.size()),
rand()));
// printf("%s\n", printable(keys.back()).c_str());
} }
for (int i = 0; i < kNumWrites; ++i) { for (int i = 0; i < kNumWrites; ++i) {
writes.push_back({{(const uint8_t *)keys[i].data(), int(keys[i].size())}, writes.push_back({{(const uint8_t *)keys[i].data(), int(keys[i].size())},

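The new benchmark keys use a tuple-style encoding: a type code per element, big-endian integers with a length-dependent type code, and byte strings with embedded zero bytes escaped. The sketch below restates that encoding in portable C++ without compiler builtins, assuming the FoundationDB tuple layer layout that the type codes `0x14` and `0x02` suggest. `appendInt` and `appendBytes` are hypothetical names, and the integer length is computed from the magnitude, so byte counts may differ from the builtin-based sizing in the diff for some values.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Byte-string element: type code 0x02, body with each 0x00 escaped as
// 0x00 0xff, then a terminating 0x00.
inline void appendBytes(std::string &out, std::string_view value) {
  out.push_back('\x02');
  for (char c : value) {
    out.push_back(c);
    if (c == '\x00') {
      out.push_back('\xff');
    }
  }
  out.push_back('\x00');
}

// Integer element: type code 0x14 plus or minus the byte count, followed by
// the big-endian payload. Negative values store the one's complement of
// their magnitude so that encoded keys sort in numeric order.
inline void appendInt(std::string &out, int64_t value) {
  if (value == 0) {
    out.push_back('\x14');
    return;
  }
  // Unsigned negation avoids overflow for the minimum int64_t value.
  const uint64_t magnitude =
      value < 0 ? ~uint64_t(value) + 1 : uint64_t(value);
  int size = 0;
  for (uint64_t m = magnitude; m != 0; m >>= 8) {
    ++size;
  }
  assert(size >= 1 && size <= 8);
  out.push_back(char(0x14 + (value < 0 ? -size : size)));
  const uint64_t payload = value < 0 ? ~magnitude : magnitude;
  for (int i = size - 1; i >= 0; --i) {
    out.push_back(char((payload >> (8 * i)) & 0xff));
  }
}
```

Calling `appendInt` and `appendBytes` in sequence on the same `std::string` builds a composite key, in the same way the variadic `tupleKey` helper in the diff folds `tupleAppend` over its arguments.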
Binary file not shown.

Some files were not shown because too many files have changed in this diff