13 Commits

Author SHA1 Message Date
62516825d1 Try specifying filter multiple times
Some checks failed
Tests / Clang total: 2096, failed: 520, passed: 1576
Tests / Debug total: 2094, failed: 520, passed: 1574
Tests / SIMD fallback total: 2096, passed: 2096
Tests / Release [gcc] total: 2096, passed: 2096
Tests / Release [gcc,aarch64] total: 1564, passed: 1564
Tests / Coverage total: 1574, passed: 1574
weaselab/conflict-set/pipeline/head There was a failure building this commit
2024-08-12 15:26:36 -07:00
3d592bd6a9 Move longestCommonPrefix to its own file
Some checks failed
Tests / Clang total: 2096, failed: 520, passed: 1576
Tests / Debug total: 2094, failed: 520, passed: 1574
Tests / SIMD fallback total: 2096, passed: 2096
Tests / Release [gcc] total: 2096, passed: 2096
Tests / Release [gcc,aarch64] total: 1564, passed: 1564
Tests / Coverage total: 1574, passed: 1574
weaselab/conflict-set/pipeline/head There was a failure building this commit
2024-08-12 15:10:05 -07:00
f5f5fb620b Run gc at 200%
150% pessimized the "real data" benchmark
2024-08-12 10:48:24 -07:00
e3d1b2e842 Add to corpus
All checks were successful
Tests / Clang total: 2096, passed: 2096
Clang |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Debug total: 2094, passed: 2094
Tests / SIMD fallback total: 2096, passed: 2096
Tests / Release [gcc] total: 2096, passed: 2096
GNU C Compiler (gcc) |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Release [gcc,aarch64] total: 1564, passed: 1564
Tests / Coverage total: 1574, passed: 1574
Code Coverage #### Project Overview No changes detected, that affect the code coverage. * Line Coverage: 99.34% (1803/1815) * Branch Coverage: 67.55% (1486/2200) * Complexity Density: 0.00 * Lines of Code: 1815 #### Quality Gates Summary Output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
After a bunch of fuzzing on "udon" (my new zen4 machine)
2024-08-09 19:48:22 -07:00
9f8800af16 Add more to corpus (from fuzzing on osx)
Some checks reported errors
Tests / Clang total: 1784, passed: 1784
Clang |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Debug total: 1782, passed: 1782
Tests / SIMD fallback total: 1784, passed: 1784
Tests / Release [gcc] total: 1784, passed: 1784
GNU C Compiler (gcc) |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Release [gcc,aarch64] total: 1330, passed: 1330
Tests / Coverage total: 1340, passed: 1340
Code Coverage #### Project Overview No changes detected, that affect the code coverage. * Line Coverage: 99.12% (1799/1815) * Branch Coverage: 67.50% (1485/2200) * Complexity Density: 0.00 * Lines of Code: 1815 #### Quality Gates Summary Output truncated.
weaselab/conflict-set/pipeline/head Something is wrong with the build of this commit
Maybe we should just accumulate the corpus instead of replacing it? That
should be easier on git right?
2024-08-09 18:06:54 -07:00
182c065c8e Insert common prefix in addWriteRange
Some checks failed
Tests / Clang total: 1420, passed: 1420
Clang |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Debug total: 1418, passed: 1418
Tests / SIMD fallback total: 1420, passed: 1420
Tests / Release [gcc] total: 1420, passed: 1420
GNU C Compiler (gcc) |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Release [gcc,aarch64] total: 1057, passed: 1057
Tests / Coverage total: 1067, passed: 1067
Code Coverage #### Project Overview No changes detected, that affect the code coverage. * Line Coverage: 99.12% (1799/1815) * Branch Coverage: 67.50% (1485/2200) * Complexity Density: 0.00 * Lines of Code: 1815 #### Quality Gates Summary Output truncated.
weaselab/conflict-set/pipeline/head There was a failure building this commit
This allows us to use our optimized implementation for setting max
version along the search path instead of a one-off loop
2024-08-09 14:57:04 -07:00
2dba0d5be3 Have insert return a pointer to the in-tree pointer 2024-08-09 13:58:31 -07:00
a1dfdf355c Use metrics to count change in entry count
This lets us run gc slower safely
2024-08-09 13:44:49 -07:00
15919cb1c4 Add range writes to server_bench 2024-08-09 13:43:24 -07:00
5ed9003a83 Bump version
All checks were successful
Tests / Clang total: 1420, passed: 1420
Clang |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Debug total: 1418, passed: 1418
Tests / SIMD fallback total: 1420, passed: 1420
Tests / Release [gcc] total: 1420, passed: 1420
GNU C Compiler (gcc) |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Release [gcc,aarch64] total: 1057, passed: 1057
Tests / Coverage total: 1067, passed: 1067
Code Coverage #### Project Overview No changes detected, that affect the code coverage. * Line Coverage: 99.34% (1817/1829) * Branch Coverage: 67.49% (1503/2227) * Complexity Density: 0.00 * Lines of Code: 1829 #### Quality Gates Summary Output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
2024-08-08 12:09:26 -07:00
84c6a2bfc2 Disable debug symbols and frame pointer for macos
All checks were successful
Tests / Clang total: 1420, passed: 1420
Clang |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Debug total: 1418, passed: 1418
Tests / SIMD fallback total: 1420, passed: 1420
Tests / Release [gcc] total: 1420, passed: 1420
GNU C Compiler (gcc) |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Release [gcc,aarch64] total: 1057, passed: 1057
Tests / Coverage total: 1067, passed: 1067
Code Coverage #### Project Overview No changes detected, that affect the code coverage. * Line Coverage: 99.34% (1817/1829) * Branch Coverage: 67.49% (1503/2227) * Complexity Density: 0.00 * Lines of Code: 1829 #### Quality Gates Summary Output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
This causes some versions of clang to crash
2024-08-08 11:51:13 -07:00
b5772a6aa0 Revert "Use homebrew clang for packaging for macos"
This reverts commit c20c08f112.
2024-08-08 11:50:13 -07:00
e6c39981b9 Bump version
All checks were successful
Tests / Clang total: 1420, passed: 1420
Clang |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Debug total: 1418, passed: 1418
Tests / SIMD fallback total: 1420, passed: 1420
Tests / Release [gcc] total: 1420, passed: 1420
GNU C Compiler (gcc) |Total|New|Outstanding|Fixed|Trend |:-:|:-:|:-:|:-:|:-: |0|0|0|0|:clap:
Tests / Release [gcc,aarch64] total: 1057, passed: 1057
Tests / Coverage total: 1067, passed: 1067
Code Coverage #### Project Overview No changes detected, that affect the code coverage. * Line Coverage: 99.34% (1817/1829) * Branch Coverage: 67.49% (1503/2227) * Complexity Density: 0.00 * Lines of Code: 1829 #### Quality Gates Summary Output truncated.
weaselab/conflict-set/pipeline/head This commit looks good
2024-08-08 11:26:49 -07:00
175 changed files with 229 additions and 225 deletions

View File

@@ -1,7 +1,7 @@
cmake_minimum_required(VERSION 3.18)
project(
conflict-set
VERSION 0.0.10
VERSION 0.0.12
DESCRIPTION
"A data structure for optimistic concurrency control on ranges of bitwise-lexicographically-ordered keys."
HOMEPAGE_URL "https://git.weaselab.dev/weaselab/conflict-set"
@@ -31,14 +31,12 @@ if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
"MinSizeRel" "RelWithDebInfo")
endif()
add_compile_options(
-fdata-sections
-ffunction-sections
-Wswitch-enum
-Werror=switch-enum
-fPIC
-g
-fno-omit-frame-pointer)
add_compile_options(-fdata-sections -ffunction-sections -Wswitch-enum
-Werror=switch-enum -fPIC)
if(NOT APPLE)
# This causes some versions of clang to crash on macos
add_compile_options(-g -fno-omit-frame-pointer)
endif()
set(full_relro_flags "-pie;LINKER:-z,relro,-z,now,-z,noexecstack")
cmake_push_check_state()

View File

@@ -16,12 +16,12 @@ limitations under the License.
#include "ConflictSet.h"
#include "Internal.h"
#include "LongestCommonPrefix.h"
#include <algorithm>
#include <atomic>
#include <bit>
#include <cassert>
#include <compare>
#include <cstddef>
#include <cstdint>
#include <cstring>
@@ -1687,167 +1687,6 @@ Node *nextSibling(Node *node) {
}
}
#if defined(HAS_AVX) || defined(HAS_ARM_NEON)
constexpr int kStride = 64;
#else
constexpr int kStride = 16;
#endif
constexpr int kUnrollFactor = 4;
bool compareStride(const uint8_t *ap, const uint8_t *bp) {
#if defined(HAS_ARM_NEON)
static_assert(kStride == 64);
uint8x16_t x[4]; // GCOVR_EXCL_LINE
for (int i = 0; i < 4; ++i) {
x[i] = vceqq_u8(vld1q_u8(ap + i * 16), vld1q_u8(bp + i * 16));
}
auto results = vreinterpretq_u16_u8(
vandq_u8(vandq_u8(x[0], x[1]), vandq_u8(x[2], x[3])));
bool eq = vget_lane_u64(vreinterpret_u64_u8(vshrn_n_u16(results, 4)), 0) ==
uint64_t(-1);
#elif defined(HAS_AVX)
static_assert(kStride == 64);
__m128i x[4]; // GCOVR_EXCL_LINE
for (int i = 0; i < 4; ++i) {
x[i] = _mm_cmpeq_epi8(_mm_loadu_si128((__m128i *)(ap + i * 16)),
_mm_loadu_si128((__m128i *)(bp + i * 16)));
}
auto eq =
_mm_movemask_epi8(_mm_and_si128(_mm_and_si128(x[0], x[1]),
_mm_and_si128(x[2], x[3]))) == 0xffff;
#else
// Hope it gets vectorized
auto eq = memcmp(ap, bp, kStride) == 0;
#endif
return eq;
}
// Precondition: ap[:kStride] != bp[:kStride]
int firstNeqStride(const uint8_t *ap, const uint8_t *bp) {
#if defined(HAS_AVX)
static_assert(kStride == 64);
uint64_t c[kStride / 16]; // GCOVR_EXCL_LINE
for (int i = 0; i < kStride; i += 16) {
const auto a = _mm_loadu_si128((__m128i *)(ap + i));
const auto b = _mm_loadu_si128((__m128i *)(bp + i));
const auto compared = _mm_cmpeq_epi8(a, b);
c[i / 16] = _mm_movemask_epi8(compared) & 0xffff;
}
return std::countr_zero(~(c[0] | c[1] << 16 | c[2] << 32 | c[3] << 48));
#elif defined(HAS_ARM_NEON)
static_assert(kStride == 64);
for (int i = 0; i < kStride; i += 16) {
// 0xff for each match
uint16x8_t results =
vreinterpretq_u16_u8(vceqq_u8(vld1q_u8(ap + i), vld1q_u8(bp + i)));
// 0xf for each mismatch
uint64_t bitfield =
~vget_lane_u64(vreinterpret_u64_u8(vshrn_n_u16(results, 4)), 0);
if (bitfield) {
return i + (std::countr_zero(bitfield) >> 2);
}
}
__builtin_unreachable(); // GCOVR_EXCL_LINE
#else
int i = 0;
for (; i < kStride - 1; ++i) {
if (*ap++ != *bp++) {
break;
}
}
return i;
#endif
}
// This gets covered in local development
// GCOVR_EXCL_START
#if defined(HAS_AVX) && !defined(__SANITIZE_THREAD__)
__attribute__((target("avx512f,avx512bw"))) int
longestCommonPrefix(const uint8_t *ap, const uint8_t *bp, int cl) {
int i = 0;
int end = cl & ~63;
while (i < end) {
const uint64_t eq =
_mm512_cmpeq_epi8_mask(_mm512_loadu_epi8(ap), _mm512_loadu_epi8(bp));
if (eq != uint64_t(-1)) {
return i + std::countr_one(eq);
}
i += 64;
ap += 64;
bp += 64;
}
if (i < cl) {
const uint64_t mask = (uint64_t(1) << (cl - i)) - 1;
const uint64_t eq = _mm512_cmpeq_epi8_mask(
_mm512_maskz_loadu_epi8(mask, ap), _mm512_maskz_loadu_epi8(mask, bp));
return i + std::countr_one(eq & mask);
}
assert(i == cl);
return i;
}
__attribute__((target("default")))
#endif
// GCOVR_EXCL_STOP
int longestCommonPrefix(const uint8_t *ap, const uint8_t *bp, int cl) {
assume(cl >= 0);
int i = 0;
int end;
// kStride * kUnrollCount at a time
end = cl & ~(kStride * kUnrollFactor - 1);
while (i < end) {
for (int j = 0; j < kUnrollFactor; ++j) {
if (!compareStride(ap, bp)) {
return i + firstNeqStride(ap, bp);
}
i += kStride;
ap += kStride;
bp += kStride;
}
}
// kStride at a time
end = cl & ~(kStride - 1);
while (i < end) {
if (!compareStride(ap, bp)) {
return i + firstNeqStride(ap, bp);
}
i += kStride;
ap += kStride;
bp += kStride;
}
// word at a time
end = cl & ~(sizeof(uint64_t) - 1);
while (i < end) {
uint64_t a; // GCOVR_EXCL_LINE
uint64_t b; // GCOVR_EXCL_LINE
memcpy(&a, ap, 8);
memcpy(&b, bp, 8);
const auto mismatched = a ^ b;
if (mismatched) {
return i + std::countr_zero(mismatched) / 8;
}
i += 8;
ap += 8;
bp += 8;
}
// byte at a time
while (i < cl) {
if (*ap != *bp) {
break;
}
++ap;
++bp;
++i;
}
return i;
}
// Logically this is the same as performing firstGeq and then checking against
// point or range version according to cmp, but this version short circuits as
// soon as it can prove that there's no conflict.
@@ -2922,25 +2761,22 @@ void consumePartialKey(Node *&self, std::span<const uint8_t> &key,
key = key.subspan(partialKeyIndex, key.size() - partialKeyIndex);
}
// Returns a pointer to the newly inserted node. Caller must set
// `entryPresent`, and `entry` fields. All nodes along the search path of the
// result will have `maxVersion` set to `writeVersion` as a postcondition. Nodes
// along the search path may be invalidated.
// Returns a pointer the pointer to the newly inserted node in the tree. Caller
// must set `entryPresent`, and `entry` fields. All nodes along the search path
// of the result will have `maxVersion` set to `writeVersion` as a
// postcondition. Nodes along the search path may be invalidated.
[[nodiscard]]
Node *insert(Node **self, std::span<const uint8_t> key,
InternalVersionT writeVersion, WriteContext *tls,
ConflictSet::Impl *impl) {
Node **insert(Node **self, std::span<const uint8_t> key,
InternalVersionT writeVersion, WriteContext *tls,
ConflictSet::Impl *impl) {
if ((*self)->partialKeyLen > 0) {
consumePartialKey(*self, key, writeVersion, tls);
}
assert(maxVersion(*self, impl) <= writeVersion);
setMaxVersion(*self, impl, writeVersion);
for (;; ++tls->accum.insert_iterations) {
if (key.size() == 0) {
return *self;
return self;
}
auto &child = getOrCreateChild(*self, key.front(), writeVersion, tls);
@@ -2953,7 +2789,7 @@ Node *insert(Node **self, std::span<const uint8_t> key,
child->parentsIndex = key.front();
setMaxVersion(child, impl, writeVersion);
memcpy(child->partialKey(), key.data() + 1, child->partialKeyLen);
return child;
return &child;
}
self = &child;
@@ -2996,7 +2832,7 @@ void addPointWrite(Node *&root, std::span<const uint8_t> key,
InternalVersionT writeVersion, WriteContext *tls,
ConflictSet::Impl *impl) {
++tls->accum.point_writes;
auto *n = insert(&root, key, writeVersion, tls, impl);
auto *n = *insert(&root, key, writeVersion, tls, impl);
if (!n->entryPresent) {
++tls->accum.entries_inserted;
auto *p = nextLogical(n);
@@ -3063,41 +2899,16 @@ void addWriteRange(Node *&root, std::span<const uint8_t> begin,
}
++tls->accum.range_writes;
const bool beginIsPrefix = lcp == int(begin.size());
auto remaining = begin.subspan(0, lcp);
auto *n = root;
Node **useAsRoot =
insert(&root, begin.subspan(0, lcp), writeVersion, tls, impl);
for (;;) {
if (int(remaining.size()) <= n->partialKeyLen) {
break;
}
int i = longestCommonPrefix(n->partialKey(), remaining.data(),
n->partialKeyLen);
if (i != n->partialKeyLen) {
break;
}
auto *child = getChild(n, remaining[n->partialKeyLen]);
if (child == nullptr) {
break;
}
assert(maxVersion(n, impl) <= writeVersion);
setMaxVersion(n, impl, writeVersion);
remaining = remaining.subspan(n->partialKeyLen + 1,
remaining.size() - (n->partialKeyLen + 1));
n = child;
}
Node **useAsRoot = &getInTree(n, impl);
int consumed = lcp - remaining.size();
int consumed = lcp;
begin = begin.subspan(consumed, begin.size() - consumed);
end = end.subspan(consumed, end.size() - consumed);
auto *beginNode = insert(useAsRoot, begin, writeVersion, tls, impl);
auto *beginNode = *insert(useAsRoot, begin, writeVersion, tls, impl);
const bool insertedBegin = !beginNode->entryPresent;
@@ -3114,7 +2925,7 @@ void addWriteRange(Node *&root, std::span<const uint8_t> begin,
assert(writeVersion >= beginNode->entry.pointVersion);
beginNode->entry.pointVersion = writeVersion;
auto *endNode = insert(useAsRoot, end, writeVersion, tls, impl);
auto *endNode = *insert(useAsRoot, end, writeVersion, tls, impl);
const bool insertedEnd = !endNode->entryPresent;
@@ -3132,7 +2943,7 @@ void addWriteRange(Node *&root, std::span<const uint8_t> begin,
if (beginIsPrefix && insertedEnd) {
// beginNode may have been invalidated when inserting end. TODO can we do
// better?
beginNode = insert(useAsRoot, begin, writeVersion, tls, impl);
beginNode = *insert(useAsRoot, begin, writeVersion, tls, impl);
assert(beginNode->entryPresent);
}
@@ -3246,6 +3057,8 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
InternalVersionT::zero = tls.zero = oldestVersion;
assert(writeVersion >= newestVersionFullPrecision);
assert(tls.accum.entries_erased == 0);
assert(tls.accum.entries_inserted == 0);
if (oldestExtantVersion < writeVersion - kMaxCorrectVersionWindow)
[[unlikely]] {
@@ -3273,15 +3086,19 @@ struct __attribute__((visibility("hidden"))) ConflictSet::Impl {
auto begin = std::span<const uint8_t>(w.begin.p, w.begin.len);
auto end = std::span<const uint8_t>(w.end.p, w.end.len);
if (w.end.len > 0) {
keyUpdates += 3;
addWriteRange(root, begin, end, InternalVersionT(writeVersion), &tls,
this);
} else {
keyUpdates += 2;
addPointWrite(root, begin, InternalVersionT(writeVersion), &tls, this);
}
}
// Run gc at least 200% the rate we're inserting entries
keyUpdates +=
std::max<int64_t>(tls.accum.entries_inserted - tls.accum.entries_erased,
0) *
2;
memory_bytes.set(totalBytes);
point_writes_total.add(tls.accum.point_writes);
range_writes_total.add(tls.accum.range_writes);

17
Jenkinsfile vendored
View File

@@ -117,15 +117,18 @@ pipeline {
}
}
steps {
script {
filter_args = "-f ConflictSet.cpp -f LongestCommonPrefix.h"
}
CleanBuildAndTest("-DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DCMAKE_C_FLAGS=--coverage -DCMAKE_CXX_FLAGS=--coverage -DCMAKE_BUILD_TYPE=Debug -DDISABLE_TSAN=ON")
sh '''
gcovr -f ConflictSet.cpp --cobertura > build/coverage.xml
'''
sh """
gcovr ${filter_args} --cobertura > build/coverage.xml
"""
recordCoverage qualityGates: [[criticality: 'NOTE', metric: 'MODULE']], tools: [[parser: 'COBERTURA', pattern: 'build/coverage.xml']]
sh '''
gcovr -f ConflictSet.cpp
gcovr -f ConflictSet.cpp --fail-under-line 100 > /dev/null
'''
sh """
gcovr ${filter_args}
gcovr ${filter_args} --fail-under-line 100 > /dev/null
"""
}
}
}

177
LongestCommonPrefix.h Normal file
View File

@@ -0,0 +1,177 @@
#pragma once
#include <assert.h>
#include <bit>
#include <stdint.h>
#include <string.h>
#ifdef HAS_AVX
#include <immintrin.h>
#elif HAS_ARM_NEON
#include <arm_neon.h>
#endif
#if defined(HAS_AVX) || defined(HAS_ARM_NEON)
constexpr int kStride = 64;
#else
constexpr int kStride = 16;
#endif
constexpr int kUnrollFactor = 4;
inline bool compareStride(const uint8_t *ap, const uint8_t *bp) {
#if defined(HAS_ARM_NEON)
static_assert(kStride == 64);
uint8x16_t x[4]; // GCOVR_EXCL_LINE
for (int i = 0; i < 4; ++i) {
x[i] = vceqq_u8(vld1q_u8(ap + i * 16), vld1q_u8(bp + i * 16));
}
auto results = vreinterpretq_u16_u8(
vandq_u8(vandq_u8(x[0], x[1]), vandq_u8(x[2], x[3])));
bool eq = vget_lane_u64(vreinterpret_u64_u8(vshrn_n_u16(results, 4)), 0) ==
uint64_t(-1);
#elif defined(HAS_AVX)
static_assert(kStride == 64);
__m128i x[4]; // GCOVR_EXCL_LINE
for (int i = 0; i < 4; ++i) {
x[i] = _mm_cmpeq_epi8(_mm_loadu_si128((__m128i *)(ap + i * 16)),
_mm_loadu_si128((__m128i *)(bp + i * 16)));
}
auto eq =
_mm_movemask_epi8(_mm_and_si128(_mm_and_si128(x[0], x[1]),
_mm_and_si128(x[2], x[3]))) == 0xffff;
#else
// Hope it gets vectorized
auto eq = memcmp(ap, bp, kStride) == 0;
#endif
return eq;
}
// Precondition: ap[:kStride] != bp[:kStride]
inline int firstNeqStride(const uint8_t *ap, const uint8_t *bp) {
#if defined(HAS_AVX)
static_assert(kStride == 64);
uint64_t c[kStride / 16]; // GCOVR_EXCL_LINE
for (int i = 0; i < kStride; i += 16) {
const auto a = _mm_loadu_si128((__m128i *)(ap + i));
const auto b = _mm_loadu_si128((__m128i *)(bp + i));
const auto compared = _mm_cmpeq_epi8(a, b);
c[i / 16] = _mm_movemask_epi8(compared) & 0xffff;
}
return std::countr_zero(~(c[0] | c[1] << 16 | c[2] << 32 | c[3] << 48));
#elif defined(HAS_ARM_NEON)
static_assert(kStride == 64);
for (int i = 0; i < kStride; i += 16) {
// 0xff for each match
uint16x8_t results =
vreinterpretq_u16_u8(vceqq_u8(vld1q_u8(ap + i), vld1q_u8(bp + i)));
// 0xf for each mismatch
uint64_t bitfield =
~vget_lane_u64(vreinterpret_u64_u8(vshrn_n_u16(results, 4)), 0);
if (bitfield) {
return i + (std::countr_zero(bitfield) >> 2);
}
}
__builtin_unreachable(); // GCOVR_EXCL_LINE
#else
int i = 0;
for (; i < kStride - 1; ++i) {
if (*ap++ != *bp++) {
break;
}
}
return i;
#endif
}
// This gets covered in local development
// GCOVR_EXCL_START
#if defined(HAS_AVX) && !defined(__SANITIZE_THREAD__)
__attribute__((target("avx512f,avx512bw"))) inline int
longestCommonPrefix(const uint8_t *ap, const uint8_t *bp, int cl) {
int i = 0;
int end = cl & ~63;
while (i < end) {
const uint64_t eq =
_mm512_cmpeq_epi8_mask(_mm512_loadu_epi8(ap), _mm512_loadu_epi8(bp));
if (eq != uint64_t(-1)) {
return i + std::countr_one(eq);
}
i += 64;
ap += 64;
bp += 64;
}
if (i < cl) {
const uint64_t mask = (uint64_t(1) << (cl - i)) - 1;
const uint64_t eq = _mm512_cmpeq_epi8_mask(
_mm512_maskz_loadu_epi8(mask, ap), _mm512_maskz_loadu_epi8(mask, bp));
return i + std::countr_one(eq & mask);
}
assert(i == cl);
return i;
}
__attribute__((target("default")))
#endif
// GCOVR_EXCL_STOP
inline int
longestCommonPrefix(const uint8_t *ap, const uint8_t *bp, int cl) {
if (!(cl >= 0)) {
__builtin_unreachable();
}
int i = 0;
int end;
// kStride * kUnrollCount at a time
end = cl & ~(kStride * kUnrollFactor - 1);
while (i < end) {
for (int j = 0; j < kUnrollFactor; ++j) {
if (!compareStride(ap, bp)) {
return i + firstNeqStride(ap, bp);
}
i += kStride;
ap += kStride;
bp += kStride;
}
}
// kStride at a time
end = cl & ~(kStride - 1);
while (i < end) {
if (!compareStride(ap, bp)) {
return i + firstNeqStride(ap, bp);
}
i += kStride;
ap += kStride;
bp += kStride;
}
// word at a time
end = cl & ~(sizeof(uint64_t) - 1);
while (i < end) {
uint64_t a; // GCOVR_EXCL_LINE
uint64_t b; // GCOVR_EXCL_LINE
memcpy(&a, ap, 8);
memcpy(&b, bp, 8);
const auto mismatched = a ^ b;
if (mismatched) {
return i + std::countr_zero(mismatched) / 8;
}
i += 8;
ap += 8;
bp += 8;
}
// byte at a time
while (i < cl) {
if (*ap != *bp) {
break;
}
++ap;
++bp;
++i;
}
return i;
}

View File

@@ -76,6 +76,14 @@ void workload(weaselab::ConflictSet *cs) {
} else {
w.begin.len = k.size();
cs->addWrites(&w, 1, version);
int64_t beginN = version - kWindowSize + rand() % kWindowSize;
auto b = numToKey(beginN);
auto e = numToKey(beginN + 1000);
w.begin.p = b.data();
w.begin.len = b.size();
w.end.p = e.data();
w.end.len = e.size();
cs->addWrites(&w, 1, version);
}
}
// GC

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -0,0 +1 @@
:<15><><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>:::::

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Some files were not shown because too many files have changed in this diff Show More