Update design.md

2025-08-17 14:28:30 -04:00
parent 67ddcd0fc8
commit 05ee8e05f8

@@ -21,11 +21,18 @@ Key features:
 - Move semantics for efficient transfers
 - Requires trivially destructible types only
-#### 2. **Commit Request Parser** (`src/commit_request.{hpp,cpp}`)
+#### 2. **Commit Request Data Model** (`src/commit_request.hpp`)
+- **Format-agnostic data structure** for representing transactional commits
+- **Arena-backed string storage** with efficient memory management
+- **Move-only semantics** for optimal performance
+- **Builder pattern** for constructing commit requests
+- **Zero-copy string views** pointing to arena-allocated memory
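The data model described in the list above can be illustrated with a minimal sketch. It is not the code in `src/commit_request.hpp`: the `Arena`, `Write`, `CommitRequest`, and `Builder` names, the `request_id` and `writes` fields, and the block size are all assumptions made for illustration. The sketch shows why arena-backed `std::string_view` members can survive a move: the views point into heap blocks owned by the arena, and moving the arena transfers ownership of those blocks without relocating them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical bump arena: owns heap blocks and never frees individual strings.
class Arena {
public:
    explicit Arena(std::size_t block_size = 4096) : block_size_(block_size) {}
    Arena(Arena&&) noexcept = default;
    Arena& operator=(Arena&&) noexcept = default;
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Copy `s` into arena memory and return a view of the stored copy.
    std::string_view store(std::string_view s) {
        if (s.empty()) return {};
        if (blocks_.empty() || used_ + s.size() > capacity_) {
            capacity_ = s.size() > block_size_ ? s.size() : block_size_;
            blocks_.push_back(std::make_unique<char[]>(capacity_));
            used_ = 0;
        }
        char* dst = blocks_.back().get() + used_;
        std::memcpy(dst, s.data(), s.size());
        used_ += s.size();
        return {dst, s.size()};
    }

private:
    std::size_t block_size_;
    std::size_t capacity_ = 0;  // size of the newest block
    std::size_t used_ = 0;      // bytes consumed in the newest block
    std::vector<std::unique_ptr<char[]>> blocks_;
};

// Hypothetical key/value write; both views point into the request's arena.
struct Write {
    std::string_view key;
    std::string_view value;
};

// Move-only request that owns the arena backing all of its string views.
class CommitRequest {
public:
    class Builder;

    CommitRequest(CommitRequest&&) noexcept = default;
    CommitRequest& operator=(CommitRequest&&) noexcept = default;
    CommitRequest(const CommitRequest&) = delete;
    CommitRequest& operator=(const CommitRequest&) = delete;

    std::string_view request_id() const { return request_id_; }
    const std::vector<Write>& writes() const { return writes_; }

private:
    CommitRequest() = default;  // only the builder creates requests

    Arena arena_;
    std::string_view request_id_;
    std::vector<Write> writes_;
};

class CommitRequest::Builder {
public:
    Builder& request_id(std::string_view id) {
        req_.request_id_ = req_.arena_.store(id);
        return *this;
    }
    Builder& write(std::string_view key, std::string_view value) {
        req_.writes_.push_back({req_.arena_.store(key), req_.arena_.store(value)});
        return *this;
    }
    // Hands the request over; the builder should not be reused afterwards.
    CommitRequest build() { return std::move(req_); }

private:
    CommitRequest req_;
};
```

In this sketch a caller would chain `request_id(...)` and `write(...)` on a `Builder` and then call `build()`. Because the arena's blocks live on the heap, the returned request can be moved again without invalidating any view.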
+#### 3. **JSON Commit Request Parser** (`src/json_commit_request_parser.{hpp,cpp}`)
 - **High-performance JSON parser** using `weaseljson` library
 - **Streaming parser support** for incremental parsing of network data
-- **Arena-based string storage** for zero-copy string handling
-- **Base64 decoding** for binary key/value data
+- **gperf-optimized token recognition** for fast JSON key parsing
+- **Base64 decoding** using SIMD-accelerated simdutf
 - **Comprehensive validation** of transaction structure
 Parser capabilities:
@@ -33,8 +40,15 @@ Parser capabilities:
 - Streaming parsing for network protocols
 - Parse state management with error recovery
 - Memory-efficient string views backed by arena storage
+- Perfect hash table lookup for JSON keys using gperf
-#### 3. **Configuration System** (`src/config.{hpp,cpp}`)
+#### 4. **Parser Interface** (`src/parser_interface.hpp`)
+- **Abstract base class** for commit request parsers
+- **Format-agnostic parsing interface** supporting multiple serialization formats
+- **Streaming and one-shot parsing modes**
+- **Standardized error handling** across parser implementations
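One way to picture the parser interface listed above is the sketch below. It is an assumption about the shape of such an interface, not the contents of `src/parser_interface.hpp`: `ParseStatus`, `CommitRequestParser`, and the `begin`/`feed`/`finish`/`error` method names are hypothetical, and `BraceParser` is a toy stand-in format that only checks brace balance so the example stays self-contained. The one-shot mode is expressed in terms of the streaming calls, so every format shares the same status reporting.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical status shared by all parser implementations.
enum class ParseStatus { Incomplete, Complete, Error };

// Hypothetical abstract base class: streaming calls are virtual,
// one-shot parsing is built on top of them.
class CommitRequestParser {
public:
    virtual ~CommitRequestParser() = default;

    virtual void begin() = 0;                               // reset parse state
    virtual ParseStatus feed(std::string_view chunk) = 0;   // consume one chunk
    virtual ParseStatus finish() = 0;                       // signal end of input
    virtual std::string_view error() const = 0;             // last error, if any

    // One-shot mode: the whole input is a single chunk.
    ParseStatus parse(std::string_view data) {
        begin();
        if (feed(data) == ParseStatus::Error) return ParseStatus::Error;
        return finish();
    }
};

// Toy implementation for illustration only: accepts input whose braces balance.
class BraceParser final : public CommitRequestParser {
public:
    void begin() override {
        depth_ = 0;
        seen_open_ = false;
        error_.clear();
    }

    ParseStatus feed(std::string_view chunk) override {
        if (!error_.empty()) return ParseStatus::Error;
        for (char c : chunk) {
            if (c == '{') {
                ++depth_;
                seen_open_ = true;
            } else if (c == '}') {
                if (depth_ == 0) {
                    error_ = "unmatched '}'";
                    return ParseStatus::Error;
                }
                --depth_;
            }
        }
        return balanced() ? ParseStatus::Complete : ParseStatus::Incomplete;
    }

    ParseStatus finish() override {
        if (!error_.empty()) return ParseStatus::Error;
        if (balanced()) return ParseStatus::Complete;
        error_ = "unexpected end of input";
        return ParseStatus::Error;
    }

    std::string_view error() const override { return error_; }

private:
    bool balanced() const { return seen_open_ && depth_ == 0; }

    int depth_ = 0;
    bool seen_open_ = false;
    std::string error_;
};
```

A network handler would call `begin()` once and `feed()` per received buffer, while tests and tools could call `parse()` on a complete payload. Both paths report failures through the same `ParseStatus` and `error()` pair.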
+#### 5. **Configuration System** (`src/config.{hpp,cpp}`)
 - **TOML-based configuration** using `toml11` library
 - **Structured configuration** with server, commit, and subscription sections
 - **Default fallback values** for all configuration options
@@ -45,6 +59,12 @@ Configuration domains:
 - **Commit**: request ID validation, retention policies
 - **Subscription**: buffer management, keepalive intervals
+#### 6. **JSON Token Optimization** (`src/json_tokens.gperf`, `src/json_token_enum.hpp`)
+- **Perfect hash table** generated by gperf for O(1) JSON key lookup
+- **Compile-time token enumeration** for type-safe key identification
+- **Minimal perfect hash** reduces memory overhead and improves cache locality
+- **Build-time code generation** ensures optimal performance
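The lookup strategy described above can be sketched by hand. This is not gperf output and not the contents of `src/json_tokens.gperf`: the key set (`type`, `key`, `value`, `begin`, `end`), the `JsonToken` enumerators, and the hash formula are assumptions chosen so the example fits in a few lines. The hash here is perfect but not minimal (5 keys in 11 slots), whereas gperf searches for a tighter function automatically. What carries over is the lookup shape: compute one slot, do one string comparison, and reject anything that does not match.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical token enumeration; Unknown is the value of an empty slot.
enum class JsonToken { Unknown, Type, Key, Value, Begin, End };

struct Entry {
    std::string_view name;
    JsonToken token;
};

// Hypothetical key set, fixed at build time.
constexpr std::array<Entry, 5> kKeys{{{"type", JsonToken::Type},
                                      {"key", JsonToken::Key},
                                      {"value", JsonToken::Value},
                                      {"begin", JsonToken::Begin},
                                      {"end", JsonToken::End}}};

constexpr std::size_t kSlots = 11;

// Hand-picked hash: first character plus length, reduced to a slot index.
// Precondition: s is not empty.
constexpr std::size_t slot_of(std::string_view s) {
    return (static_cast<unsigned char>(s[0]) + s.size()) % kSlots;
}

// True if every known key lands in its own slot.
constexpr bool is_perfect() {
    bool used[kSlots] = {};
    for (const Entry& e : kKeys) {
        const std::size_t s = slot_of(e.name);
        if (used[s]) return false;
        used[s] = true;
    }
    return true;
}
static_assert(is_perfect(), "hash must map every known key to a distinct slot");

// Build the slot table at compile time.
constexpr std::array<Entry, kSlots> build_table() {
    std::array<Entry, kSlots> table{};
    for (const Entry& e : kKeys) table[slot_of(e.name)] = e;
    return table;
}
constexpr auto kTable = build_table();

// One hash, one comparison: unknown keys fail the comparison and map to Unknown.
constexpr JsonToken lookup(std::string_view s) {
    if (s.empty()) return JsonToken::Unknown;
    const Entry& e = kTable[slot_of(s)];
    return e.name == s ? e.token : JsonToken::Unknown;
}
```

The `static_assert` plays the role of the generator's guarantee: if a key is added that collides with an existing one, the build fails instead of silently misclassifying tokens.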
 ### Data Model
 #### Transaction Structure
@@ -91,6 +111,8 @@ The system implements a RESTful API with three core endpoints:
 - **Incremental processing** suitable for network protocols
 - **Arena storage** eliminates string allocation overhead
 - **SIMD-accelerated base64 decoding** using simdutf for maximum performance
+- **Perfect hash table** provides O(1) JSON key lookup via gperf
+- **Zero hash collisions** for known JSON tokens eliminates branching
 ### Design Principles
@@ -113,7 +135,10 @@ Build targets:
 - `test_arena_allocator`: Arena allocator functionality tests
 - `test_commit_request`: JSON parsing and validation tests
 - `weaseldb`: Main application demonstrating configuration and parsing
-- Various benchmark executables for performance testing
+- `bench_arena_allocator`: Arena allocator performance benchmarks
+- `bench_commit_request`: JSON parsing performance benchmarks
+- `bench_parser_comparison`: Comparison benchmarks vs nlohmann::json and RapidJSON
+- `debug_arena`: Debug tool for arena allocator analysis
 ### Dependencies
@@ -122,6 +147,9 @@ Build targets:
 - **toml11**: TOML configuration file parsing
 - **doctest**: Lightweight testing framework
 - **nanobench**: Micro-benchmarking library
+- **gperf**: Perfect hash function generator for JSON token optimization
+- **nlohmann::json**: Reference JSON parser for benchmarking comparisons
+- **RapidJSON**: High-performance JSON parser for benchmarking comparisons
 ### Future Considerations