2 Commits

Author SHA1 Message Date
2eb461b8ea Fix build for llvm 18
All checks were successful
Tests / Clang total: 1130, passed: 1130
Clang: total 0, new 0, outstanding 0, fixed 0
Tests / SIMD fallback total: 1130, passed: 1130
Tests / Release [gcc] total: 1130, passed: 1130
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0
Tests / Release [gcc,aarch64] total: 844, passed: 844
Tests / Coverage total: 848, passed: 848
weaselab/conflict-set/pipeline/head This commit looks good
2024-06-11 11:38:55 -07:00
e2e92f4ef5 Address some feedback on paper
All checks were successful
Tests / Clang total: 1130, passed: 1130
Clang: total 0, new 0, outstanding 0, fixed 0
Tests / SIMD fallback total: 1130, passed: 1130
Tests / Release [gcc] total: 1130, passed: 1130
GNU C Compiler (gcc): total 0, new 0, outstanding 0, fixed 0
Tests / Release [gcc,aarch64] total: 844, passed: 844
Tests / Coverage total: 848, passed: 848
weaselab/conflict-set/pipeline/head This commit looks good
2024-05-06 14:30:49 -07:00
3 changed files with 95 additions and 11 deletions

@@ -2944,10 +2944,6 @@ Iterator firstGeq(Node *n, std::string_view key) {
 } // namespace
-namespace std {
-void __throw_length_error(const char *) { __builtin_unreachable(); }
-} // namespace std
 #if SHOW_MEMORY
 int64_t nodeBytes = 0;

@@ -255,10 +255,65 @@ template <class T> struct ArenaAlloc {
   void deallocate(T *, size_t) noexcept {}
 };
-template <class T> using Vector = std::vector<T, ArenaAlloc<T>>;
-template <class T> auto vector(Arena &arena) {
-  return Vector<T>(ArenaAlloc<T>(&arena));
-}
+template <class T> struct Vector {
+  static_assert(std::is_trivially_destructible_v<T>);
+  static_assert(std::is_trivially_copyable_v<T>);
+  explicit Vector(Arena *arena)
+      : arena(arena), t(nullptr), size_(0), capacity(0) {}
+  void append(std::span<const T> slice) {
+    if (size_ + int(slice.size()) > capacity) {
+      grow(std::max<int>(size_ + slice.size(), capacity * 2));
+    }
+    if (slice.size() > 0) {
+      memcpy(const_cast<std::remove_const_t<T> *>(t) + size_, slice.data(),
+             slice.size() * sizeof(T));
+    }
+    size_ += slice.size();
+  }
+  void push_back(const T &t) { append(std::span<const T>(&t, 1)); }
+  T *begin() { return t; }
+  T *end() { return t + size_; }
+  T *data() { return t; }
+  T &back() {
+    assert(size_ > 0);
+    return t[size_ - 1];
+  }
+  T &operator[](int i) {
+    assert(i >= 0 && i < size_);
+    return t[i];
+  }
+  void pop_back() {
+    assert(size_ > 0);
+    --size_;
+  }
+  int size() const { return size_; }
+  operator std::span<const T>() const { return std::span(t, size_); }
+private:
+  void grow(int newCapacity) {
+    capacity = newCapacity;
+    auto old = std::span<const T>(*this);
+    t = (T *)new (std::align_val_t(alignof(T)), *arena)
+        uint8_t[capacity * sizeof(T)];
+    size_ = 0;
+    append(old);
+  }
+  Arena *arena;
+  T *t;
+  int size_;
+  int capacity;
+};
+template <class T> auto vector(Arena &arena) { return Vector<T>(&arena); }
 template <class T, class C> using Set = std::set<T, C, ArenaAlloc<T>>;
 template <class T, class C = std::less<T>> auto set(Arena &arena) {
   return Set<T, C>(ArenaAlloc<T>(&arena));

@@ -33,8 +33,41 @@ This implementation is available at \url{https://git.weaselab.dev/weaselab/confl
 Let's begin by considering design options for \emph{lastCommit}.
 In order to manage half-open intervals we need an ordered data structure, so hash tables are out of consideration.
-For any ordered data structure we can implement \emph{lastCommit} using a representation where a logical key is mapped to the value of the last physical key less than or equal to the logical key.
-This is a standard technique used throughout FoundationDB.
+For any ordered data structure we can implement \emph{lastCommit} using a representation where a logical key range (figure \ref{fig:logicalrangemap}) is mapped so that the value of a key is the value of the last physical key (figure \ref{fig:physicalrangemap}) less than or equal to the key.
+This is a standard technique used throughout FoundationDB called a \emph{range map}.
+\begin{figure}
+\caption{Physical structure of range map}
+\label{fig:physicalrangemap}
+\centering
+\begin{tikzpicture}
+\draw[-latex] (-3.5,0) -- (3.5,0);
+\foreach \x [count=\xi from 0] in {\epsilon, a, b}
+{
+\draw[shift={(\xi * 2.333 - 3.5,0)},color=black] (0pt,3pt) -- (0pt,-3pt);
+\node[] at (\xi * 2.333 - 3.5,0.5) {$\x$};
+\node[anchor=west] at (\xi * 2.333 - 3.5,-0.5) {$\x \mapsto \xi$};
+};
+\end{tikzpicture}
+\end{figure}
+\begin{figure}
+\caption{Logical structure of range map}
+\label{fig:logicalrangemap}
+\centering
+\begin{tikzpicture}
+\draw[-latex] (-3.5,0) -- (3.5,0);
+\foreach \x [count=\xi from 0] in {\epsilon, a, b}
+{
+\draw[shift={(\xi * 2.333 - 3.5,0)},color=black] (0pt,3pt) -- (0pt,-3pt);
+\node[] at (\xi * 2.333 - 3.5,0.5) {$\x$};
+};
+\foreach \x [count=\xi from 0] in {{$[\epsilon, a) \mapsto \xi$}, {$[a, b) \mapsto \xi$}, {$[b, \infty) \mapsto \xi$}}
+{
+\node[anchor=west] at (\xi * 2.333 - 3.5,-0.5) {\x};
+};
+\end{tikzpicture}
+\end{figure}
 The problem with applying this to an off-the-shelf ordered data structure is that checking a read range is linear in the number of intersecting physical keys.
 Scanning through every recent point write intersecting a large range read would make conflict checking unacceptably slow for high-write-throughput workloads.
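
As a rough illustration of the range map representation described in this hunk, and of why a range check over an off-the-shelf ordered structure is linear, here is a minimal C++ sketch built on std::map. The RangeMap type and its members are hypothetical names chosen for this example and are not taken from the conflict-set source. It also assumes that a range check needs the maximum version stored over the range, which the text above does not state explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// Hypothetical illustration only; not the conflict-set API.
struct RangeMap {
  // Physical keys mapped to versions. The empty string is always present,
  // so every logical key has a physical key less than or equal to it.
  std::map<std::string, int64_t> physical{{"", 0}};

  // Value of a logical key: the value of the last physical key <= key.
  int64_t get(const std::string &key) const {
    auto it = physical.upper_bound(key); // first physical key > key
    return std::prev(it)->second;        // safe because "" is always present
  }

  // Maximum version over the half-open range [begin, end). The loop visits
  // every physical key inside the range, so the cost is linear in the
  // number of intersecting physical keys.
  int64_t maxOver(const std::string &begin, const std::string &end) const {
    assert(begin < end);
    int64_t result = get(begin);
    for (auto it = physical.upper_bound(begin);
         it != physical.end() && it->first < end; ++it) {
      result = std::max(result, it->second);
    }
    return result;
  }
};
```

With the physical keys from the figures ("" mapped to 0, "a" to 1, "b" to 2), get("aa") returns 1 because "a" is the last physical key less than or equal to "aa", and maxOver("a", "c") returns 2 after visiting both "a" and "b".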
@@ -204,7 +237,7 @@ Libfuzzer's minimized corpus achieves 98\% line coverage on its own.
 We regenerate the corpus on an ad hoc basis by running libfuzzer for a few CPU-hours, during which it tests millions of unique inputs.
 In addition to asserting correct externally-visible behavior, in each of these tests we assert that internal invariants hold between operations.
-We also use address sanitizer \cite{10.5555/2342821.2342849} to detect memory errors, undefined behavior sanitizer \cite{ubsan} to detect invocations of undefined behavior, and thread sanitizer \cite{10.1145/1791194.1791203} (while exercising concurrent access as allowed by the documented contract) to detect data-race-related undefined behavior.
+We also use address sanitizer \cite{10.5555/2342821.2342849} to detect memory errors, undefined behavior sanitizer \cite{ubsan} to detect invocations of undefined behavior, and thread sanitizer \cite{10.1145/1791194.1791203} (while exercising concurrent access as allowed by the contract documented in the C++ header file) to detect data-race-related undefined behavior.
 Each of these sanitizers is implemented using compiler instrumentation, which means that they are not testing the final binary artifact that will be run in production.
 Therefore we also run the test inputs linking directly to the final release artifact, both standalone and under Valgrind \cite{10.5555/1247360.1247362}.
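
The testing approach in this hunk (fuzz inputs, checks of externally visible behavior, and invariant checks between operations) can be sketched roughly as follows. LLVMFuzzerTestOneInput is libFuzzer's standard entry point. Everything else is an assumption made for illustration: SortedSet is a hypothetical stand-in for the structure under test, std::set plays the reference model, and the byte encoding of operations is invented here. None of it is taken from the conflict-set test suite.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <set>
#include <vector>

// Hypothetical structure under test: a set stored as a sorted vector.
struct SortedSet {
  std::vector<uint8_t> keys;
  void insert(uint8_t k) {
    auto it = std::lower_bound(keys.begin(), keys.end(), k);
    if (it == keys.end() || *it != k) keys.insert(it, k);
  }
  void erase(uint8_t k) {
    auto it = std::lower_bound(keys.begin(), keys.end(), k);
    if (it != keys.end() && *it == k) keys.erase(it);
  }
  bool contains(uint8_t k) const {
    return std::binary_search(keys.begin(), keys.end(), k);
  }
  // Internal invariant: keys are strictly increasing.
  bool checkInvariants() const {
    return std::adjacent_find(keys.begin(), keys.end(),
                              std::greater_equal<uint8_t>()) == keys.end();
  }
};

// libFuzzer entry point. Each pair of input bytes is (operation, key).
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  SortedSet actual;
  std::set<uint8_t> expected; // reference model
  for (size_t i = 0; i + 1 < size; i += 2) {
    uint8_t key = data[i + 1];
    switch (data[i] % 3) {
    case 0:
      actual.insert(key);
      expected.insert(key);
      break;
    case 1:
      actual.erase(key);
      expected.erase(key);
      break;
    default:
      // Externally visible behavior must match the reference model.
      assert(actual.contains(key) == (expected.count(key) != 0));
      break;
    }
    // Internal invariants must hold between operations.
    assert(actual.checkInvariants());
  }
  assert(std::equal(actual.keys.begin(), actual.keys.end(), expected.begin(),
                    expected.end()));
  return 0;
}
```

With Clang, a harness like this would typically be built with -fsanitize=fuzzer,address,undefined so that the sanitizers run alongside the fuzzer. The checks rely on assert, so they only fire when NDEBUG is not defined.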