12 KiB
12 KiB
WeaselDB C++ Style Guide
This document describes the C++ coding style used in the WeaselDB project. These conventions ensure consistency, readability, and maintainability across the codebase.
Table of Contents
- General Principles
- Naming Conventions
- File Organization
- Code Structure
- Memory Management
- Error Handling
- Documentation
- Testing
General Principles
Language Standard
- C++20 is the target standard
- Use modern C++ features: RAII, move semantics, constexpr, concepts where appropriate
- Prefer standard library containers and algorithms over custom implementations
Data Types
- Almost always signed - prefer
int,int64_t,size_tover unsigned types except for:- Bit manipulation operations
- Interfacing with APIs that require unsigned types
- Memory sizes where overflow is impossible (
size_t,uint32_tfor arena block sizes) - Where defined unsigned overflow behavior (wraparound) is intentional and desired
- Almost always auto - let the compiler deduce types except when:
- The type is not obvious from context (prefer explicit for clarity)
- Specific type requirements matter (numeric conversions, template parameters)
- Interface contracts need explicit types (public APIs, function signatures)
- Prefer uninitialized memory to default initialization when using before initializing would be an error
- Valgrind will catch uninitialized memory usage bugs
- Avoid hiding logic errors with unnecessary zero-initialization
- Default initialization can mask bugs and hurt performance
- Floating point is for metrics only - avoid
float/doublein core data structures and algorithms- Use for performance measurements, statistics, and monitoring data
- Never use for counts, sizes, or business logic
Performance Focus
- Performance-first design - optimize for the hot path
- Simple is fast - find exactly what's necessary, strip away everything else
- Complexity must be justified with benchmarks - measure performance impact before adding complexity
- Strive for 0% CPU usage when idle - avoid polling, busy waiting, or unnecessary background activity
- Use inline functions for performance-critical code (e.g.,
allocate_raw) - Zero-copy operations with
std::string_viewover string copying - Arena allocation for efficient memory management
Complexity Control
- Encapsulation is the main tool for controlling complexity
- Header files define the interface - they are the contract with users of your code
- Headers should be complete - include everything needed to use the interface effectively:
- Usage examples in comments
- Preconditions and postconditions
- Thread safety guarantees
- Performance characteristics
- Ownership and lifetime semantics
- Do not rely on undocumented interface properties - if it's not in the header, don't depend on it
Naming Conventions
Variables and Functions
- snake_case for all variables, functions, and member functions
size_t used_bytes() const;
void add_block(size_t size);
uint32_t initial_block_size_;
Classes and Structs
- PascalCase for class and struct names
- Prefer struct over class - public members at the top, avoid
class { public:pattern
struct ArenaAllocator {};
struct CommitRequest {};
struct JsonCommitRequestParser {};
Enums
- PascalCase for enum class names
- PascalCase for enum values (not SCREAMING_SNAKE_CASE)
enum class Type {
PointRead,
RangeRead
};
enum class ParseState {
Root,
PreconditionsArray,
OperationObject
};
Constants and Macros
- snake_case for constants
- Avoid macros when possible; prefer
constexprvariables
static const WeaselJsonCallbacks json_callbacks;
Member Variables
- Trailing underscore for private member variables
private:
uint32_t initial_block_size_;
Block *current_block_;
Template Parameters
- PascalCase for template type parameters
template <typename T, typename... Args>
template <typename U> struct rebind {};
File Organization
Header Files
- Use
#pragma onceinstead of include guards - Never
using namespace std- always use fully qualified names for clarity and safety - Include order:
- Corresponding header file (for .cpp files)
- Standard library headers (alphabetical)
- Third-party library headers
- Project headers
#pragma once
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string_view>
#include <simdutf.h>
#include <weaseljson/weaseljson.h>
#include "arena_allocator.hpp"
#include "commit_request.hpp"
// Never this:
// using namespace std;
// Always this:
std::vector<int> data;
std::unique_ptr<Parser> parser;
Source Files
- Include corresponding header first
- Follow same include order as headers
#include "json_commit_request_parser.hpp"
#include <cassert>
#include <charconv>
#include <cstring>
#include <simdutf.h>
#include "json_token_enum.hpp"
Code Structure
Class Design
- Move-only semantics for resource-owning classes
- Explicit constructors to prevent implicit conversions
- Delete copy operations when inappropriate
struct ArenaAllocator {
explicit ArenaAllocator(size_t initial_size = 1024);
// Copy construction is not allowed
ArenaAllocator(const ArenaAllocator &) = delete;
ArenaAllocator &operator=(const ArenaAllocator &) = delete;
// Move semantics
ArenaAllocator(ArenaAllocator &&other) noexcept;
ArenaAllocator &operator=(ArenaAllocator &&other) noexcept;
private:
uint32_t initial_block_size_;
Block *current_block_;
};
Function Design
- Const correctness - mark methods const when appropriate
- Parameter passing:
- Pass by value for types ≤ 16 bytes (int, pointers, string_view, small structs)
- Pass by const reference for types > 16 bytes (containers, large objects)
- Return by value for small types (≤ 16 bytes), string_view for zero-copy over strings
- noexcept specification for move operations and non-throwing functions
std::span<const Operation> operations() const { return operations_; }
void process_data(std::string_view data); // ≤ 16 bytes, pass by value
void process_request(const CommitRequest& req); // > 16 bytes, pass by reference
ArenaAllocator(ArenaAllocator &&other) noexcept;
Template Usage
- Template constraints using static_assert for better error messages
- SFINAE or concepts for template specialization
template <typename T> T *construct(Args &&...args) {
static_assert(
std::is_trivially_destructible_v<T>,
"ArenaAllocator::construct requires trivially destructible types.");
// ...
}
Control Flow
- Early returns to reduce nesting
- Range-based for loops when possible
if (size == 0) {
return nullptr;
}
for (auto &precondition : preconditions_) {
// ...
}
Memory Management
Ownership & Allocation
- Arena allocators for request-scoped memory with STL allocator adapters
- String views pointing to arena-allocated memory for zero-copy
- Prefer unique_ptr for exclusive ownership
- shared_ptr only if shared ownership is necessary - most objects have single owners
- Factory patterns for complex construction and ownership control
// Static factory methods for complex objects requiring specific initialization
auto server = Server::create(config, handler); // Ensures shared_ptr semantics
Block *block = Block::create(size, prev); // Custom allocation + setup
// Friend-based factories for access control
struct Connection {
// Public interface first
void appendMessage(std::string_view data);
bool writeBytes();
private:
Connection(/* args */); // Private constructor
friend struct Server; // Only Server can construct Connections
};
// Usage in Server
auto conn = std::unique_ptr<Connection>(new Connection(args));
Resource Management
- RAII everywhere - constructors acquire, destructors release
- Move semantics for efficient resource transfer
- Explicit cleanup methods where appropriate
~ArenaAllocator() {
while (current_block_) {
Block *prev = current_block_->prev;
std::free(current_block_);
current_block_ = prev;
}
}
Error Handling
Error Reporting
- Return codes for expected errors
- Exceptions only for exceptional circumstances
- fprintf + abort() for unrecoverable errors
enum class ParseResult { Success, InvalidJson, MissingField };
if (!memory) {
std::fprintf(stderr, "ArenaAllocator: Failed to allocate memory\n");
std::abort();
}
Assertions
- Use assert() for debug-time checks
- Static assertions for compile-time validation
assert(current_block_ && "realloc called with non-null ptr but no current block");
static_assert(std::is_trivially_destructible_v<T>, "Arena requires trivially destructible types");
Documentation
Doxygen Style
- / for class and public method documentation
- @brief for short descriptions
- @param and @return for function parameters
- @note for important implementation notes
- @warning for critical usage warnings
/**
* @brief Type-safe version of realloc_raw for arrays of type T.
* @param ptr Pointer to the existing allocation
* @param old_size Size in number of T objects
* @param new_size Desired new size in number of T objects
* @return Pointer to reallocated memory
* @note Prints error to stderr and calls std::abort() if allocation fails
*/
template <typename T>
T *realloc(T *ptr, uint32_t old_size, uint32_t new_size);
Code Comments
- Explain why, not what - code should be self-documenting
- Performance notes for optimization decisions
- Thread safety and ownership semantics
// Uses O(1) accumulated counters for fast retrieval
size_t total_allocated() const;
// Only Server can create connections - no public constructor
Connection(struct sockaddr_storage addr, int fd, int64_t id,
ConnectionHandler *handler, std::weak_ptr<Server> server);
Testing
Test Framework
- doctest for unit testing
- TEST_CASE and SUBCASE for test organization
- CHECK for assertions (non-terminating)
- REQUIRE for critical assertions (terminating)
Test Structure
- Descriptive test names explaining the scenario
- SUBCASE for related test variations
- Fresh instances for each test to avoid state contamination
TEST_CASE("ArenaAllocator basic allocation") {
ArenaAllocator arena;
SUBCASE("allocate zero bytes returns nullptr") {
void *ptr = arena.allocate_raw(0);
CHECK(ptr == nullptr);
}
SUBCASE("allocate single byte") {
void *ptr = arena.allocate_raw(1);
CHECK(ptr != nullptr);
CHECK(arena.used_bytes() >= 1);
}
}
Test Synchronization
- NEVER use timeouts or sleep-based synchronization
- Deterministic synchronization only:
- Blocking I/O operations
condition_variable.wait()(no timeout variant)std::latch,std::barrier, futures/promises- RAII guards and resource management
Build Integration
CMake Integration
- Generated code (gperf hash tables) in build directory
- Ninja as the preferred generator
- Export compile commands for tooling support
cmake .. -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
Code Generation
- gperf for perfect hash table generation
- Build-time generation of token lookup tables
- Include generated headers from build directory
Style Enforcement
Consistency
- Follow existing patterns in the codebase
- Use the same style for similar constructs
- Maintain consistency within each translation unit
Tools
- clang-format configuration (when available)
- Static analysis tools for code quality
- Address sanitizer for memory safety testing
This style guide reflects the existing codebase patterns and should be followed for all new code contributions to maintain consistency and readability.