diff --git a/design.md b/design.md index 64708ac..d5fb42b 100644 --- a/design.md +++ b/design.md @@ -9,6 +9,8 @@ 5. [Common Patterns](#common-patterns) 6. [Reference](#reference) +**See also:** [style.md](style.md) for comprehensive C++ coding standards and conventions. + --- ## Project Overview diff --git a/style.md b/style.md new file mode 100644 index 0000000..fc95fbe --- /dev/null +++ b/style.md @@ -0,0 +1,412 @@ +# WeaselDB C++ Style Guide + +This document describes the C++ coding style used in the WeaselDB project. These conventions ensure consistency, readability, and maintainability across the codebase. + +## Table of Contents + +1. [General Principles](#general-principles) +2. [Naming Conventions](#naming-conventions) +3. [File Organization](#file-organization) +4. [Code Structure](#code-structure) +5. [Memory Management](#memory-management) +6. [Error Handling](#error-handling) +7. [Documentation](#documentation) +8. [Testing](#testing) + +--- + +## General Principles + +### Language Standard +- **C++20** is the target standard +- Use modern C++ features: RAII, move semantics, constexpr, concepts where appropriate +- Prefer standard library containers and algorithms over custom implementations + +### Data Types +- **Almost always signed** - prefer `int`, `int64_t`, `size_t` over unsigned types except for: + - Bit manipulation operations + - Interfacing with APIs that require unsigned types + - Memory sizes where overflow is impossible (`size_t`, `uint32_t` for arena block sizes) + - Where defined unsigned overflow behavior (wraparound) is intentional and desired +- **Almost always auto** - let the compiler deduce types except when: + - The type is not obvious from context (prefer explicit for clarity) + - Specific type requirements matter (numeric conversions, template parameters) + - Interface contracts need explicit types (public APIs, function signatures) +- **Prefer uninitialized memory to default initialization** when using before initializing would be an error + - Valgrind will catch uninitialized memory usage bugs + - Avoid hiding logic errors with unnecessary zero-initialization + - Default initialization can mask bugs and hurt performance +- **Floating point is for metrics only** - avoid `float`/`double` in core data structures and algorithms + - Use for performance measurements, statistics, and monitoring data + - Never use for counts, sizes, or business logic + +### Performance Focus +- **Performance-first design** - optimize for the hot path +- **Simple is fast** - find exactly what's necessary, strip away everything else +- **Complexity must be justified with benchmarks** - measure performance impact before adding complexity +- **Strive for 0% CPU usage when idle** - avoid polling, busy waiting, or unnecessary background activity +- Use **inline functions** for performance-critical code (e.g., `allocate_raw`) +- **Zero-copy operations** with `std::string_view` over string copying +- **Arena allocation** for efficient memory management + +### Complexity Control +- **Encapsulation is the main tool for controlling complexity** +- **Header files define the interface** - they are the contract with users of your code +- **Headers should be complete** - include everything needed to use the interface effectively: + - Usage examples in comments + - Preconditions and postconditions + - Thread safety guarantees + - Performance characteristics + - Ownership and lifetime semantics +- **Do not rely on undocumented interface properties** - if it's not in the header, don't depend on it + +--- + +## Naming Conventions + +### Variables and Functions +- **snake_case** for all variables, functions, and member functions +```cpp +size_t used_bytes() const; +void add_block(size_t size); +uint32_t initial_block_size_; +``` + +### Classes and Structs +- **PascalCase** for class and struct names +- **Prefer struct over class** - public members at the top, avoid `class { public:` pattern +```cpp +struct ArenaAllocator {}; +struct CommitRequest {}; +struct JsonCommitRequestParser {}; +``` + +### Enums +- **PascalCase** for enum class names +- **PascalCase** for enum values (not SCREAMING_SNAKE_CASE) +```cpp +enum class Type { + PointRead, + RangeRead +}; + +enum class ParseState { + Root, + PreconditionsArray, + OperationObject +}; +``` + +### Constants and Macros +- **snake_case** for constants +- Avoid macros when possible; prefer `constexpr` variables +```cpp +static const WeaselJsonCallbacks json_callbacks; +``` + +### Member Variables +- **Trailing underscore** for private member variables +```cpp +private: + uint32_t initial_block_size_; + Block *current_block_; +``` + +### Template Parameters +- **PascalCase** for template type parameters +```cpp +template +template struct rebind {}; +``` + +--- + +## File Organization + +### Header Files +- Use **`#pragma once`** instead of include guards +- **Never `using namespace std`** - always use fully qualified names for clarity and safety +- **Include order:** + 1. Corresponding header file (for .cpp files) + 2. Standard library headers (alphabetical) + 3. Third-party library headers + 4. Project headers +```cpp +#pragma once + +#include +#include +#include +#include +#include + +#include +#include + +#include "arena_allocator.hpp" +#include "commit_request.hpp" + +// Never this: +// using namespace std; + +// Always this: +std::vector data; +std::unique_ptr parser; +``` + +### Source Files +- Include corresponding header first +- Follow same include order as headers +```cpp +#include "json_commit_request_parser.hpp" + +#include +#include +#include + +#include + +#include "json_token_enum.hpp" +``` + +--- + +## Code Structure + +### Class Design +- **Move-only semantics** for resource-owning classes +- **Explicit constructors** to prevent implicit conversions +- **Delete copy operations** when inappropriate +```cpp +struct ArenaAllocator { + explicit ArenaAllocator(size_t initial_size = 1024); + + // Copy construction is not allowed + ArenaAllocator(const ArenaAllocator &) = delete; + ArenaAllocator &operator=(const ArenaAllocator &) = delete; + + // Move semantics + ArenaAllocator(ArenaAllocator &&other) noexcept; + ArenaAllocator &operator=(ArenaAllocator &&other) noexcept; + +private: + uint32_t initial_block_size_; + Block *current_block_; +}; +``` + +### Function Design +- **Const correctness** - mark methods const when appropriate +- **Parameter passing:** + - Pass by value for types ≤ 16 bytes (int, pointers, string_view, small structs) + - Pass by const reference for types > 16 bytes (containers, large objects) +- **Return by value** for small types (≤ 16 bytes), **string_view** for zero-copy over strings +- **noexcept specification** for move operations and non-throwing functions +```cpp +std::span operations() const { return operations_; } +void process_data(std::string_view data); // ≤ 16 bytes, pass by value +void process_request(const CommitRequest& req); // > 16 bytes, pass by reference +ArenaAllocator(ArenaAllocator &&other) noexcept; +``` + +### Template Usage +- **Template constraints** using static_assert for better error messages +- **SFINAE** or concepts for template specialization +```cpp +template T *construct(Args &&...args) { + static_assert( + std::is_trivially_destructible_v, + "ArenaAllocator::construct requires trivially destructible types."); + // ... +} +``` + +### Control Flow +- **Early returns** to reduce nesting +- **Range-based for loops** when possible +```cpp +if (size == 0) { + return nullptr; +} + +for (auto &precondition : preconditions_) { + // ... +} +``` + +--- + +## Memory Management + +### Ownership & Allocation +- **Arena allocators** for request-scoped memory with **STL allocator adapters** +- **String views** pointing to arena-allocated memory for zero-copy +- **Prefer unique_ptr** for exclusive ownership +- **shared_ptr only if shared ownership is necessary** - most objects have single owners +- **Factory patterns** for complex construction and ownership control +```cpp +// Static factory methods for complex objects requiring specific initialization +auto server = Server::create(config, handler); // Ensures shared_ptr semantics +Block *block = Block::create(size, prev); // Custom allocation + setup + +// Friend-based factories for access control +struct Connection { + // Public interface first + void appendMessage(std::string_view data); + bool writeBytes(); + +private: + Connection(/* args */); // Private constructor + friend struct Server; // Only Server can construct Connections +}; + +// Usage in Server +auto conn = std::unique_ptr(new Connection(args)); +``` + +### Resource Management +- **RAII** everywhere - constructors acquire, destructors release +- **Move semantics** for efficient resource transfer +- **Explicit cleanup** methods where appropriate +```cpp +~ArenaAllocator() { + while (current_block_) { + Block *prev = current_block_->prev; + std::free(current_block_); + current_block_ = prev; + } +} +``` + +--- + +## Error Handling + +### Error Reporting +- **Return codes** for expected errors +- **Exceptions** only for exceptional circumstances +- **fprintf + abort()** for unrecoverable errors +```cpp +enum class ParseResult { Success, InvalidJson, MissingField }; + +if (!memory) { + std::fprintf(stderr, "ArenaAllocator: Failed to allocate memory\n"); + std::abort(); +} +``` + +### Assertions +- Use **assert()** for debug-time checks +- **Static assertions** for compile-time validation +```cpp +assert(current_block_ && "realloc called with non-null ptr but no current block"); +static_assert(std::is_trivially_destructible_v, "Arena requires trivially destructible types"); +``` + +--- + +## Documentation + +### Doxygen Style +- **/** for class and public method documentation +- **@brief** for short descriptions +- **@param** and **@return** for function parameters +- **@note** for important implementation notes +- **@warning** for critical usage warnings +```cpp +/** + * @brief Type-safe version of realloc_raw for arrays of type T. + * @param ptr Pointer to the existing allocation + * @param old_size Size in number of T objects + * @param new_size Desired new size in number of T objects + * @return Pointer to reallocated memory + * @note Prints error to stderr and calls std::abort() if allocation fails + */ +template +T *realloc(T *ptr, uint32_t old_size, uint32_t new_size); +``` + +### Code Comments +- **Explain why, not what** - code should be self-documenting +- **Performance notes** for optimization decisions +- **Thread safety** and ownership semantics +```cpp +// Uses O(1) accumulated counters for fast retrieval +size_t total_allocated() const; + +// Only Server can create connections - no public constructor +Connection(struct sockaddr_storage addr, int fd, int64_t id, + ConnectionHandler *handler, std::weak_ptr server); +``` + +--- + +## Testing + +### Test Framework +- **doctest** for unit testing +- **TEST_CASE** and **SUBCASE** for test organization +- **CHECK** for assertions (non-terminating) +- **REQUIRE** for critical assertions (terminating) + +### Test Structure +- **Descriptive test names** explaining the scenario +- **SUBCASE** for related test variations +- **Fresh instances** for each test to avoid state contamination +```cpp +TEST_CASE("ArenaAllocator basic allocation") { + ArenaAllocator arena; + + SUBCASE("allocate zero bytes returns nullptr") { + void *ptr = arena.allocate_raw(0); + CHECK(ptr == nullptr); + } + + SUBCASE("allocate single byte") { + void *ptr = arena.allocate_raw(1); + CHECK(ptr != nullptr); + CHECK(arena.used_bytes() >= 1); + } +} +``` + +### Test Synchronization +- **NEVER use timeouts** or sleep-based synchronization +- **Deterministic synchronization only:** + - Blocking I/O operations + - `condition_variable.wait()` (no timeout variant) + - `std::latch`, `std::barrier`, futures/promises + - RAII guards and resource management + +--- + +## Build Integration + +### CMake Integration +- **Generated code** (gperf hash tables) in build directory +- **Ninja** as the preferred generator +- **Export compile commands** for tooling support +```bash +cmake .. -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON +``` + +### Code Generation +- **gperf** for perfect hash table generation +- **Build-time generation** of token lookup tables +- **Include generated headers** from build directory + +--- + +## Style Enforcement + +### Consistency +- Follow existing patterns in the codebase +- Use the same style for similar constructs +- Maintain consistency within each translation unit + +### Tools +- **clang-format** configuration (when available) +- **Static analysis** tools for code quality +- **Address sanitizer** for memory safety testing + +This style guide reflects the existing codebase patterns and should be followed for all new code contributions to maintain consistency and readability. \ No newline at end of file