Followup updates for new epoll_instances design

2025-08-21 14:20:01 -04:00
parent 1cce8d9950
commit c00d5c576b
3 changed files with 24 additions and 9 deletions

View File

@@ -20,10 +20,13 @@ Controls server networking, threading, and request handling behavior.
 |-----------|------|---------|-------------|
 | `bind_address` | string | `"127.0.0.1"` | IP address to bind the server to |
 | `port` | integer | `8080` | Port number to listen on |
+| `unix_socket_path` | string | `""` (empty) | Unix domain socket path. If specified, takes precedence over TCP |
 | `max_request_size_bytes` | integer | `1048576` (1MB) | Maximum size for incoming requests. Requests exceeding this limit receive a `413 Content Too Large` response |
 | `io_threads` | integer | `1` | Number of I/O threads for handling connections and network events |
+| `epoll_instances` | integer | `2` | Number of epoll instances to reduce kernel contention (max: io_threads). Higher values reduce epoll_ctl contention but increase memory usage |
 | `event_batch_size` | integer | `32` | Number of events to process in each epoll batch |
 | `max_connections` | integer | `50000` | Maximum number of concurrent connections (0 = unlimited). Note: Due to race conditions between connection acceptance and cleanup, it's possible to trip this limit without actually having that many concurrent connections, especially under high connection churn. |
+| `read_buffer_size` | integer | `16384` (16KB) | Buffer size for reading from socket connections |

 ### Commit Configuration (`[commit]`)
@@ -50,12 +53,18 @@ Controls behavior of the `/v1/subscribe` endpoint and SSE streaming.
 # WeaselDB Configuration File

 [server]
+# Network configuration
 bind_address = "0.0.0.0"
 port = 8080
+# unix_socket_path = "weaseldb.sock" # Alternative to TCP
+
+# Performance tuning
 max_request_size_bytes = 2097152 # 2MB
 io_threads = 8
+epoll_instances = 3 # Reduce kernel contention (max: io_threads)
 event_batch_size = 64
 max_connections = 50000
+read_buffer_size = 32768 # 32KB

 [commit]
 min_request_id_length = 32
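
The new `epoll_instances` option documented above carries the constraint "max: io_threads". As a reading aid, here is a minimal C++17 sketch of how that cap could be enforced at load time; `ServerConfig` and `effective_epoll_instances` are illustrative names, not WeaselDB's actual types.

```cpp
// Illustrative sketch only: the real WeaselDB config types are not part of
// this diff, so ServerConfig and effective_epoll_instances are hypothetical.
#include <algorithm>
#include <iostream>

struct ServerConfig {
  int io_threads = 1;       // documented default
  int epoll_instances = 2;  // documented default; capped at io_threads
};

// Enforce the documented "max: io_threads" rule: with more epoll instances
// than I/O threads, some instances would never have a thread servicing them.
int effective_epoll_instances(const ServerConfig &cfg) {
  return std::clamp(cfg.epoll_instances, 1, std::max(cfg.io_threads, 1));
}

int main() {
  ServerConfig cfg{/*io_threads=*/8, /*epoll_instances=*/3};  // values from the example config above
  std::cout << "Using " << effective_epoll_instances(cfg) << " epoll instance(s) across "
            << cfg.io_threads << " I/O thread(s)\n";
}
```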

View File

@@ -19,7 +19,8 @@ WeaselDB is a high-performance write-side database component designed for system
 - **Ultra-fast arena allocation** (~1ns vs ~20-270ns for malloc)
 - **High-performance JSON parsing** with streaming support and SIMD optimization
-- **Multi-threaded networking** using epoll with unified I/O thread pool
+- **Multi-threaded networking** using multiple epoll instances with unified I/O thread pool
+- **Configurable epoll instances** to eliminate kernel-level contention
 - **Zero-copy design** throughout the pipeline
 - **Factory pattern safety** ensuring correct object lifecycle management
@@ -90,13 +91,15 @@ Ultra-fast memory allocator optimized for request/response patterns:
 #### **Networking Layer**

 **Server** (`src/server.{hpp,cpp}`):
-- **High-performance multi-threaded networking** using epoll with unified I/O thread pool
+- **High-performance multi-threaded networking** using multiple epoll instances with unified I/O thread pool
+- **Configurable epoll instances** to eliminate kernel-level epoll_ctl contention (default: 2, max: io_threads)
+- **Round-robin thread-to-epoll assignment** distributes I/O threads across epoll instances
+- **Connection distribution** keeps accepted connections on same epoll, returns via round-robin
 - **Factory pattern construction** via `Server::create()` ensures proper shared_ptr semantics
 - **Safe shutdown mechanism** with async-signal-safe shutdown() method
 - **Connection ownership management** with automatic cleanup on server destruction
 - **Pluggable protocol handlers** via ConnectionHandler interface
-- **Unified I/O architecture:** single thread pool handles both connection acceptance and I/O processing
-- **EPOLL_EXCLUSIVE** on listen socket prevents thundering herd across I/O threads
+- **EPOLL_EXCLUSIVE** on listen socket across all epoll instances prevents thundering herd

 **Connection** (`src/connection.{hpp,cpp}`):
 - **Efficient per-connection state management** with arena-based memory allocation
@@ -205,11 +208,12 @@ The system implements a RESTful API:
 ### Design Principles

 1. **Performance-first** - Every component optimized for high throughput
-2. **Memory efficiency** - Arena allocation eliminates fragmentation
-3. **Zero-copy** - Minimize data copying throughout pipeline
-4. **Streaming-ready** - Support incremental processing
-5. **Type safety** - Compile-time validation where possible
-6. **Resource management** - RAII and move semantics throughout
+2. **Scalable concurrency** - Multiple epoll instances eliminate kernel contention
+3. **Memory efficiency** - Arena allocation eliminates fragmentation
+4. **Zero-copy** - Minimize data copying throughout pipeline
+5. **Streaming-ready** - Support incremental processing
+6. **Type safety** - Compile-time validation where possible
+7. **Resource management** - RAII and move semantics throughout

 ### Future Integration Points
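
To make the networking bullets in this file concrete, here is a short sketch of the scheme they describe (not the actual `src/server.cpp` code; the function names and the `event_loop` callback are placeholders, and error handling is omitted): each of N epoll instances registers the listen socket with `EPOLLEXCLUSIVE` so only one waiter wakes per incoming connection, and I/O threads are assigned to instances round-robin.

```cpp
// Sketch under the assumptions stated above; not WeaselDB's actual Server code.
#include <sys/epoll.h>

#include <thread>
#include <vector>

std::vector<int> create_epoll_instances(int listen_fd, int num_instances) {
  std::vector<int> epfds;
  for (int i = 0; i < num_instances; ++i) {
    int epfd = epoll_create1(0);           // one kernel epoll object per instance
    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLEXCLUSIVE;  // wake a single waiter per incoming connection
    ev.data.fd = listen_fd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);  // error handling omitted
    epfds.push_back(epfd);
  }
  return epfds;
}

void start_io_threads(const std::vector<int> &epfds, int io_threads,
                      void (*event_loop)(int epfd)) {
  std::vector<std::thread> workers;
  for (int t = 0; t < io_threads; ++t) {
    // Round-robin thread-to-epoll assignment: thread t services instance t % N.
    workers.emplace_back(event_loop, epfds[t % epfds.size()]);
  }
  for (auto &w : workers) w.join();
}
```

With `EPOLLEXCLUSIVE` set on every instance, an incoming connection wakes roughly one waiting thread instead of all of them, which is the thundering-herd behavior the bullets above refer to.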

View File

@@ -101,6 +101,8 @@ int main(int argc, char *argv[]) {
std::cout << "Max request size: " << config->server.max_request_size_bytes std::cout << "Max request size: " << config->server.max_request_size_bytes
<< " bytes" << std::endl; << " bytes" << std::endl;
std::cout << "I/O threads: " << config->server.io_threads << std::endl; std::cout << "I/O threads: " << config->server.io_threads << std::endl;
std::cout << "Epoll instances: " << config->server.epoll_instances
<< std::endl;
std::cout << "Event batch size: " << config->server.event_batch_size std::cout << "Event batch size: " << config->server.event_batch_size
<< std::endl; << std::endl;
std::cout << "Max connections: " << config->server.max_connections std::cout << "Max connections: " << config->server.max_connections