Add mdformat pre-commit hook
This commit is contained in:
112
style.md
112
style.md
@@ -5,28 +5,31 @@ This document describes the C++ coding style used in the WeaselDB project. These
|
||||
## Table of Contents
|
||||
|
||||
1. [General Principles](#general-principles)
|
||||
2. [Naming Conventions](#naming-conventions)
|
||||
3. [File Organization](#file-organization)
|
||||
4. [Code Structure](#code-structure)
|
||||
5. [Memory Management](#memory-management)
|
||||
6. [Error Handling](#error-handling)
|
||||
7. [Documentation](#documentation)
|
||||
8. [Testing](#testing)
|
||||
1. [Naming Conventions](#naming-conventions)
|
||||
1. [File Organization](#file-organization)
|
||||
1. [Code Structure](#code-structure)
|
||||
1. [Memory Management](#memory-management)
|
||||
1. [Error Handling](#error-handling)
|
||||
1. [Documentation](#documentation)
|
||||
1. [Testing](#testing)
|
||||
|
||||
---
|
||||
______________________________________________________________________
|
||||
|
||||
## General Principles
|
||||
|
||||
### Language Standard
|
||||
|
||||
- **C++20** is the target standard
|
||||
- Use modern C++ features: RAII, move semantics, constexpr, concepts where appropriate
|
||||
- Prefer standard library containers and algorithms over custom implementations
|
||||
|
||||
### C Library Functions and Headers
|
||||
|
||||
- **Always use std:: prefixed versions** of C library functions for consistency and clarity
|
||||
- **Use C++ style headers** (`<cstring>`, `<cstdlib>`, etc.) instead of C style headers (`<string.h>`, `<stdlib.h>`, etc.)
|
||||
- This applies to all standard libc functions: `std::abort()`, `std::fprintf()`, `std::free()`, `std::memcpy()`, `std::strlen()`, `std::strncpy()`, `std::memset()`, `std::signal()`, etc.
|
||||
- **Exception:** Functions with no std:: equivalent (e.g., `perror()`, `gai_strerror()`) and system-specific headers (e.g., `<unistd.h>`, `<fcntl.h>`)
|
||||
|
||||
```cpp
|
||||
// Preferred - C++ style
|
||||
#include <cstring>
|
||||
@@ -56,6 +59,7 @@ signal(SIGTERM, handler);
|
||||
```
|
||||
|
||||
### Data Types
|
||||
|
||||
- **Almost always signed** - prefer `int`, `int64_t`, `ssize_t` over unsigned types except for:
|
||||
- Bit manipulation operations
|
||||
- Interfacing with APIs that require unsigned types
|
||||
@@ -73,6 +77,7 @@ signal(SIGTERM, handler);
|
||||
- Never use for counts, sizes, or business logic
|
||||
|
||||
### Type Casting
|
||||
|
||||
- **Never use C-style casts** - they're unsafe and can hide bugs by performing dangerous conversions
|
||||
- **Use C++ cast operators** for explicit type conversions with clear intent and safety checks
|
||||
- **Avoid `reinterpret_cast`** - almost always indicates poor design; redesign APIs instead
|
||||
@@ -94,6 +99,7 @@ auto addr = reinterpret_cast<uintptr_t>(ptr); // Pointer to integer conv
|
||||
```
|
||||
|
||||
### Performance Focus
|
||||
|
||||
- **Performance-first design** - optimize for the hot path
|
||||
- **Simple is fast** - find exactly what's necessary, strip away everything else
|
||||
- **Complexity must be justified with benchmarks** - measure performance impact before adding complexity
|
||||
@@ -103,9 +109,11 @@ auto addr = reinterpret_cast<uintptr_t>(ptr); // Pointer to integer conv
|
||||
- **Arena allocation** for efficient memory management (~1ns vs ~20-270ns for malloc)
|
||||
|
||||
### String Formatting
|
||||
|
||||
- **Always use `format.hpp` functions** - formats directly into arena-allocated memory
|
||||
- **Use `static_format()` for performance-sensitive code** - faster but less flexible than `format()`
|
||||
- **Use `format()` function with arena allocator** for printf-style formatting
|
||||
|
||||
```cpp
|
||||
// Most performance-sensitive - compile-time optimized concatenation
|
||||
std::string_view response = static_format(arena,
|
||||
@@ -124,6 +132,7 @@ std::string_view response = format(arena,
|
||||
```
|
||||
|
||||
### Complexity Control
|
||||
|
||||
- **Encapsulation is the main tool for controlling complexity**
|
||||
- **Header files define the interface** - they are the contract with users of your code
|
||||
- **Headers should be complete** - include everything needed to use the interface effectively:
|
||||
@@ -134,13 +143,15 @@ std::string_view response = format(arena,
|
||||
- Ownership and lifetime semantics
|
||||
- **Do not rely on undocumented interface properties** - if it's not in the header, don't depend on it
|
||||
|
||||
---
|
||||
______________________________________________________________________
|
||||
|
||||
## Naming Conventions
|
||||
|
||||
### Variables and Functions
|
||||
|
||||
- **snake_case** for all variables, functions, and member functions
|
||||
- **Legacy camelCase exists** - the codebase currently contains mixed naming due to historical development. New code should use snake_case. Existing camelCase should be converted to snake_case during natural refactoring (not mass renaming).
|
||||
|
||||
```cpp
|
||||
int64_t used_bytes() const;
|
||||
void add_block(int64_t size);
|
||||
@@ -148,11 +159,13 @@ int32_t initial_block_size_;
|
||||
```
|
||||
|
||||
### Classes and Structs
|
||||
|
||||
- **PascalCase** for class/struct names
|
||||
- **Always use struct keyword** - eliminates debates about complexity and maintains consistency
|
||||
- **Public members first, private after** - puts the interface users care about at the top, implementation details below
|
||||
- **Full encapsulation still applies** - use `private:` sections to hide implementation details and maintain deep, capable structs
|
||||
- The struct keyword doesn't mean shallow design - it means interface-first organization for human readers
|
||||
|
||||
```cpp
|
||||
struct Arena {
|
||||
// Public interface first
|
||||
@@ -167,8 +180,10 @@ private:
|
||||
```
|
||||
|
||||
### Enums
|
||||
|
||||
- **PascalCase** for enum class names
|
||||
- **PascalCase** for enum values (not SCREAMING_SNAKE_CASE)
|
||||
|
||||
```cpp
|
||||
enum class Type {
|
||||
PointRead,
|
||||
@@ -183,14 +198,18 @@ enum class ParseState {
|
||||
```
|
||||
|
||||
### Constants and Macros
|
||||
|
||||
- **snake_case** for constants
|
||||
- Avoid macros when possible; prefer `constexpr` variables
|
||||
|
||||
```cpp
|
||||
static const WeaselJsonCallbacks json_callbacks;
|
||||
```
|
||||
|
||||
### Member Variables
|
||||
|
||||
- **Trailing underscore** for private member variables
|
||||
|
||||
```cpp
|
||||
private:
|
||||
int32_t initial_block_size_;
|
||||
@@ -198,24 +217,28 @@ private:
|
||||
```
|
||||
|
||||
### Template Parameters
|
||||
|
||||
- **PascalCase** for template type parameters
|
||||
|
||||
```cpp
|
||||
template <typename T, typename... Args>
|
||||
template <typename T> struct rebind { using type = T*; };
|
||||
```
|
||||
|
||||
---
|
||||
______________________________________________________________________
|
||||
|
||||
## File Organization
|
||||
|
||||
### Include Organization
|
||||
|
||||
- Use **`#pragma once`** instead of include guards
|
||||
- **Never `using namespace std`** - always use fully qualified names for clarity and safety
|
||||
- **Include order** (applies to both headers and source files):
|
||||
1. Corresponding header file (for .cpp files only)
|
||||
2. Standard library headers (alphabetical)
|
||||
3. Third-party library headers
|
||||
4. Project headers
|
||||
1. Standard library headers (alphabetical)
|
||||
1. Third-party library headers
|
||||
1. Project headers
|
||||
|
||||
```cpp
|
||||
#pragma once
|
||||
|
||||
@@ -239,14 +262,16 @@ std::vector<int> data;
|
||||
std::unique_ptr<Parser> parser;
|
||||
```
|
||||
|
||||
---
|
||||
______________________________________________________________________
|
||||
|
||||
## Code Structure
|
||||
|
||||
### Class Design
|
||||
|
||||
- **Move-only semantics** for resource-owning types
|
||||
- **Explicit constructors** to prevent implicit conversions
|
||||
- **Delete copy operations** when inappropriate
|
||||
|
||||
```cpp
|
||||
struct Arena {
|
||||
explicit Arena(int64_t initial_size = 1024);
|
||||
@@ -266,12 +291,14 @@ private:
|
||||
```
|
||||
|
||||
### Function Design
|
||||
|
||||
- **Const correctness** - mark methods const when appropriate
|
||||
- **Parameter passing:**
|
||||
- Pass by value for types ≤ 16 bytes (int, pointers, string_view, small structs)
|
||||
- Pass by const reference for types > 16 bytes (containers, large objects)
|
||||
- **Return by value** for small types (≤ 16 bytes), **string_view** to avoid copying strings
|
||||
- **noexcept specification** for move operations and non-throwing functions
|
||||
|
||||
```cpp
|
||||
std::span<const Operation> operations() const { return operations_; }
|
||||
void process_data(std::string_view request_data); // ≤ 16 bytes, pass by value
|
||||
@@ -280,10 +307,12 @@ Arena(Arena &&source) noexcept;
|
||||
```
|
||||
|
||||
### Template Usage
|
||||
|
||||
- **Template constraints** using static_assert for better error messages
|
||||
- **SFINAE** or concepts for template specialization
|
||||
|
||||
### Factory Patterns & Ownership
|
||||
|
||||
- **Static factory methods** for complex construction requiring shared ownership
|
||||
- **Friend-based factories** for access control when constructor should be private
|
||||
- **Ownership guidelines:**
|
||||
@@ -310,8 +339,10 @@ private:
|
||||
```
|
||||
|
||||
### Control Flow
|
||||
|
||||
- **Early returns** to reduce nesting
|
||||
- **Range-based for loops** when possible
|
||||
|
||||
```cpp
|
||||
if (size == 0) {
|
||||
return nullptr;
|
||||
@@ -323,9 +354,11 @@ for (auto &precondition : preconditions_) {
|
||||
```
|
||||
|
||||
### Atomic Operations
|
||||
|
||||
- **Never use assignment operators** with `std::atomic` - always use explicit `store()` and `load()`
|
||||
- **Always specify memory ordering** explicitly for atomic operations
|
||||
- **Use the least restrictive correct memory ordering** - choose the weakest ordering that maintains correctness
|
||||
|
||||
```cpp
|
||||
// Preferred - explicit store/load with precise memory ordering
|
||||
std::atomic<uint64_t> counter;
|
||||
@@ -343,14 +376,16 @@ counter = 42; // Implicit - memory ordering not explicit
|
||||
auto value = counter; // Implicit - memory ordering not explicit
|
||||
```
|
||||
|
||||
---
|
||||
______________________________________________________________________
|
||||
|
||||
## Memory Management
|
||||
|
||||
### Ownership & Allocation
|
||||
|
||||
- **Arena allocators** for request-scoped memory with **STL allocator adapters** (see Performance Focus section for characteristics)
|
||||
- **String views** pointing to arena-allocated memory to avoid unnecessary copying
|
||||
- **STL containers with arena allocators require default construction after arena reset** - `clear()` is not sufficient
|
||||
|
||||
```cpp
|
||||
// STL containers with arena allocators - correct reset pattern
|
||||
std::vector<Operation, ArenaStlAllocator<Operation>> operations(arena);
|
||||
@@ -360,9 +395,11 @@ arena.reset(); // Reset arena memory
|
||||
```
|
||||
|
||||
### Resource Management
|
||||
|
||||
- **RAII** everywhere - constructors acquire, destructors release
|
||||
- **Move semantics** for efficient resource transfer
|
||||
- **Explicit cleanup** methods where appropriate
|
||||
|
||||
```cpp
|
||||
~Arena() {
|
||||
while (current_block_) {
|
||||
@@ -373,16 +410,18 @@ arena.reset(); // Reset arena memory
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
______________________________________________________________________
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Error Classification & Response
|
||||
|
||||
- **Expected errors** (invalid input, timeouts): Return error codes for programmatic handling
|
||||
- **System failures** (malloc fail, socket fail): Abort immediately with error message
|
||||
- **Programming errors** (precondition violations, assertions): Abort immediately
|
||||
|
||||
### Error Contract Design
|
||||
|
||||
- **Error codes are the API contract** - use enums for programmatic decisions
|
||||
- **Error messages are human-readable only** - never parse message strings
|
||||
- **Consistent error boundaries** - each component defines what it can/cannot recover from
|
||||
@@ -405,6 +444,7 @@ assert(ptr != nullptr && "Precondition violated: pointer must be non-null");
|
||||
```
|
||||
|
||||
### Assertions
|
||||
|
||||
- **Programming error detection** using standard `assert()` macro
|
||||
- **Assertion behavior follows C++ standards:**
|
||||
- **Debug builds**: Assertions active (undefined `NDEBUG`)
|
||||
@@ -413,6 +453,7 @@ assert(ptr != nullptr && "Precondition violated: pointer must be non-null");
|
||||
- **Static assertions** for compile-time validation (always active)
|
||||
|
||||
**Usage guidelines:**
|
||||
|
||||
- Use for programming errors: null checks, precondition validation, invariants
|
||||
- Don't use for expected runtime errors: use return codes instead
|
||||
|
||||
@@ -468,26 +509,28 @@ if (result == -1 && errno != EINTR) {
|
||||
|
||||
Most system calls are not interruptible in practice. For these, it is not necessary to add a retry loop. This includes:
|
||||
|
||||
* `fcntl` (with `F_GETFL`, `F_SETFL`, `F_GETFD`, `F_SETFD` - note: `F_SETLKW` and `F_OFD_SETLKW` CAN return EINTR)
|
||||
* `epoll_ctl`
|
||||
* `socketpair`
|
||||
* `pipe`
|
||||
* `setsockopt`
|
||||
* `epoll_create1`
|
||||
* `close` (special case: guaranteed closed even on EINTR on Linux)
|
||||
- `fcntl` (with `F_GETFL`, `F_SETFL`, `F_GETFD`, `F_SETFD` - note: `F_SETLKW` and `F_OFD_SETLKW` CAN return EINTR)
|
||||
- `epoll_ctl`
|
||||
- `socketpair`
|
||||
- `pipe`
|
||||
- `setsockopt`
|
||||
- `epoll_create1`
|
||||
- `close` (special case: guaranteed closed even on EINTR on Linux)
|
||||
|
||||
When in doubt, consult the `man` page for the specific system call to see if it can return `EINTR`.
|
||||
|
||||
---
|
||||
______________________________________________________________________
|
||||
|
||||
## Documentation
|
||||
|
||||
### Doxygen Style
|
||||
|
||||
- **/** for struct and public method documentation
|
||||
- **@brief** for short descriptions
|
||||
- **@param** and **@return** for function parameters
|
||||
- **@note** for important implementation notes
|
||||
- **@warning** for critical usage warnings
|
||||
|
||||
```cpp
|
||||
/**
|
||||
* @brief Type-safe version of realloc_raw for arrays of type T.
|
||||
@@ -502,9 +545,11 @@ T *realloc(T *existing_ptr, int32_t current_size, int32_t requested_size);
|
||||
```
|
||||
|
||||
### Code Comments
|
||||
|
||||
- **Explain why, not what** - code should be self-documenting
|
||||
- **Performance notes** for optimization decisions
|
||||
- **Thread safety** and ownership semantics
|
||||
|
||||
```cpp
|
||||
// Uses O(1) accumulated counters for fast retrieval
|
||||
int64_t total_allocated() const;
|
||||
@@ -514,20 +559,23 @@ Connection(struct sockaddr_storage addr, int fd, int64_t id,
|
||||
ConnectionHandler *handler, std::weak_ptr<Server> server);
|
||||
```
|
||||
|
||||
---
|
||||
______________________________________________________________________
|
||||
|
||||
## Testing
|
||||
|
||||
### Test Framework
|
||||
|
||||
- **doctest** for unit testing
|
||||
- **TEST_CASE** and **SUBCASE** for test organization
|
||||
- **CHECK** for assertions (non-terminating)
|
||||
- **REQUIRE** for critical assertions (terminating)
|
||||
|
||||
### Test Structure
|
||||
|
||||
- **Descriptive test names** explaining the scenario
|
||||
- **SUBCASE** for related test variations
|
||||
- **Fresh instances** for each test to avoid state contamination
|
||||
|
||||
```cpp
|
||||
TEST_CASE("Arena basic allocation") {
|
||||
Arena arena;
|
||||
@@ -546,10 +594,12 @@ TEST_CASE("Arena basic allocation") {
|
||||
```
|
||||
|
||||
### Test Design Principles
|
||||
|
||||
- **Test the contract, not the implementation** - validate what the API promises to deliver, not implementation details
|
||||
- **Both integration and unit tests** - test components in isolation and working together
|
||||
- **Prefer fakes to mocks** - use real implementations for internal components, fake external dependencies
|
||||
- **Always enable assertions in tests** - use `-UNDEBUG` pattern to ensure assertions are checked (see Build Integration section)
|
||||
|
||||
```cpp
|
||||
// Good: Testing through public API
|
||||
TEST_CASE("Server accepts connections") {
|
||||
@@ -569,11 +619,13 @@ TEST_CASE("Server accepts connections") {
|
||||
### What NOT to Test
|
||||
|
||||
**Avoid testing language features and plumbing:**
|
||||
|
||||
- Don't test that virtual functions dispatch correctly
|
||||
- Don't test that standard library types work (unique_ptr, containers, etc.)
|
||||
- Don't test basic constructor/destructor calls
|
||||
|
||||
**Test business logic instead:**
|
||||
|
||||
- When does your code call hooks/callbacks and why?
|
||||
- What state transitions trigger behavior changes?
|
||||
- How does your code handle error conditions?
|
||||
@@ -582,6 +634,7 @@ TEST_CASE("Server accepts connections") {
|
||||
**Ask: "Am I testing the C++ compiler or my application logic?"**
|
||||
|
||||
### Test Synchronization (Authoritative Rules)
|
||||
|
||||
- **ABSOLUTELY NEVER use timeouts** (`sleep_for`, `wait_for`, etc.)
|
||||
- **Deterministic synchronization only:**
|
||||
- Blocking I/O (naturally waits for completion)
|
||||
@@ -592,6 +645,7 @@ TEST_CASE("Server accepts connections") {
|
||||
#### Threading Checklist for Tests/Benchmarks
|
||||
|
||||
**Common threading principles (all concurrent code):**
|
||||
|
||||
- **Count total threads** - Include main/benchmark thread in count
|
||||
- **Always assume concurrent execution needed** - Tests/benchmarks require real concurrency
|
||||
- **Add synchronization primitive** - `std::latch start_latch{N}` (most common), `std::barrier`, or similar where N = total concurrent threads
|
||||
@@ -599,18 +653,21 @@ TEST_CASE("Server accepts connections") {
|
||||
- **Main thread synchronizes before measurement/execution** - ensures all threads start simultaneously
|
||||
|
||||
**Test-specific:**
|
||||
|
||||
- **Perform many operations per thread creation** - amortize thread creation cost and increase chances of hitting race conditions
|
||||
- **Pattern: Create test that spawns threads and runs many operations, then run that test many times** - amortizes thread creation cost while providing fresh test instances
|
||||
- **Run 100-10000 operations per test, and 100-10000 test iterations** - maximizes chances of hitting race conditions
|
||||
- **Always run with ThreadSanitizer** - compile with `-fsanitize=thread`
|
||||
|
||||
**Benchmark-specific:**
|
||||
|
||||
- **NEVER create threads inside the benchmark measurement** - creates thread creation/destruction overhead, not contention
|
||||
- **Create background threads OUTSIDE the benchmark** that run continuously during measurement
|
||||
- **Use `std::atomic<bool> keep_running` to cleanly shut down background threads after benchmark**
|
||||
- **Measure only the foreground operation under real contention from background threads**
|
||||
|
||||
**Red flags to catch immediately:**
|
||||
|
||||
- ❌ Creating threads in a loop without `std::latch`
|
||||
- ❌ Background threads starting work immediately
|
||||
- ❌ Benchmark measuring before all threads synchronized
|
||||
@@ -636,11 +693,12 @@ for (int i = 0; i < 4; ++i) {
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
______________________________________________________________________
|
||||
|
||||
## Build Integration
|
||||
|
||||
### Build Configuration
|
||||
|
||||
```bash
|
||||
# Debug: assertions on, optimizations off
|
||||
cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
|
||||
@@ -650,6 +708,7 @@ cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
|
||||
```
|
||||
|
||||
**Test Target Pattern:**
|
||||
|
||||
- Production targets follow build type (assertions off in Release)
|
||||
- Test targets use `-UNDEBUG` to force assertions on in all builds
|
||||
- Ensures consistent test validation regardless of build type
|
||||
@@ -666,4 +725,5 @@ add_executable(example src/example.cpp src/main.cpp)
|
||||
```
|
||||
|
||||
### Code Generation
|
||||
|
||||
- Generated files go in build directory, not source
|
||||
|
||||
Reference in New Issue
Block a user