# HTTP Server Design
## Overview
A high-performance HTTP server for WeaselDB, built on an epoll-based event loop with a configurable threading model and lock-free communication between components.
## Architecture
### Threading Model
#### Accept Threads
- **Configurable number** of dedicated accept threads
- Each thread runs **synchronous accept() loop** on listening socket
- Accept returns file descriptor used to construct connection object
- Connection posted to epoll with **EPOLLIN | EPOLLONESHOT** interest
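
The accept path above could look roughly like the following. This is a minimal sketch, not WeaselDB's actual code: the `Connection` struct and the names `register_connection` and `accept_loop` are hypothetical, and it assumes Linux (`accept4`, `epoll`).

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical connection object; the real type is not shown in this document.
struct Connection {
    int fd;
};

// Wraps `fd` in a Connection and registers it with one-shot read interest.
// Returns nullptr if registration fails. After a successful return the
// connection may already be owned by a network thread, so a real accept
// thread must not touch it again.
Connection* register_connection(int epoll_fd, int fd) {
    assert(fd >= 0);
    auto* conn = new Connection{fd};
    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.ptr = conn;
    if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        delete conn;
        return nullptr;
    }
    return conn;
}

// Synchronous accept loop run by each accept thread.
void accept_loop(int listen_fd, int epoll_fd) {
    for (;;) {
        int fd = accept4(listen_fd, nullptr, nullptr, SOCK_NONBLOCK | SOCK_CLOEXEC);
        if (fd == -1) {
            if (errno == EINTR || errno == ECONNABORTED) continue;
            return;  // listening socket closed or unrecoverable error
        }
        if (register_connection(epoll_fd, fd) == nullptr) close(fd);
    }
}
```

Storing the connection pointer in `ev.data.ptr` lets the network thread recover the connection directly from the event without a descriptor-to-object lookup.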
#### Network Threads
- **Configurable number** of network I/O threads
- Each thread calls **epoll_wait()** to receive connection ownership
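- **EPOLLONESHOT** disarms a descriptor after one event is delivered, so exactly one thread owns the connection until it is re-armed; this gives clean ownership transfer without a thundering herd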
- Handle connection state transitions and HTTP protocol processing
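
As an illustration of the ownership hand-off described above, here is a hedged sketch of the three primitives a network thread might use: take a connection from epoll, read what is available, and re-arm the descriptor. The `Connection` layout and the function names are assumptions made for this example.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <string>

// Hypothetical connection object with an input buffer.
struct Connection {
    int fd;
    std::string inbuf;
};

// Blocks for up to `timeout_ms` and returns the connection this thread now
// owns, or nullptr on timeout or error. Because registrations use
// EPOLLONESHOT, no other thread receives events for it until rearm().
Connection* take_ownership(int epoll_fd, int timeout_ms) {
    epoll_event ev{};
    int n;
    do {
        n = epoll_wait(epoll_fd, &ev, 1, timeout_ms);
    } while (n == -1 && errno == EINTR);
    return n == 1 ? static_cast<Connection*>(ev.data.ptr) : nullptr;
}

// Performs one read after a readiness event. Returns the number of bytes
// appended to inbuf, 0 on EOF, or -1 on error (including EAGAIN).
ssize_t read_some(Connection* conn) {
    char buf[4096];
    ssize_t n;
    do {
        n = read(conn->fd, buf, sizeof buf);
    } while (n == -1 && errno == EINTR);
    if (n > 0) conn->inbuf.append(buf, static_cast<std::size_t>(n));
    return n;
}

// Gives the connection back to epoll. `interest` is EPOLLIN to wait for more
// request bytes or EPOLLOUT to continue writing a response.
bool rearm(int epoll_fd, Connection* conn, std::uint32_t interest) {
    assert(conn != nullptr);
    epoll_event ev{};
    ev.events = interest | EPOLLONESHOT;
    ev.data.ptr = conn;
    return epoll_ctl(epoll_fd, EPOLL_CTL_MOD, conn->fd, &ev) == 0;
}
```

A network thread would loop over `take_ownership`, feed `inbuf` to the parser, and finish each turn with either `rearm` or a hand-off to a service pipeline.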
#### Service Threads
- **Configurable service pipelines** (e.g., commit and status can share pipeline)
- **Dequeue connections** from a shared, LMAX Disruptor-inspired ring buffer
- Process business logic and generate responses
- **Post completed connections** back to epoll for response writing
### Connection Lifecycle
1. **Accept**: Accept thread creates connection, posts to epoll with read interest
2. **Read**: Network thread reads data, pumps llhttp streaming parser
3. **Route**: Complete request path triggers handler association
4. **Process**: Handler either:
   - Processes in network thread (fast path)
   - Posts to service ring buffer (slow path)
5. **Respond**: Connection posted to epoll with write interest
6. **Write**: Network thread writes response data
7. **Keep-alive**: Connection posted back with read interest for next request
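
One way to picture the lifecycle is as a small state machine. The sketch below is only an illustration: the `Stage` enum and both functions are hypothetical names that mirror the numbered steps, not identifiers from the codebase.

```cpp
// Stages a connection moves through once it has been accepted.
enum class Stage { Read, Route, Process, Respond, Write, Closed };

// Returns the stage that follows `s`. After a completed write the connection
// either loops back to Read (keep-alive) or is closed.
constexpr Stage next_stage(Stage s, bool keep_alive) {
    switch (s) {
        case Stage::Read:    return Stage::Route;
        case Stage::Route:   return Stage::Process;
        case Stage::Process: return Stage::Respond;
        case Stage::Respond: return Stage::Write;
        case Stage::Write:   return keep_alive ? Stage::Read : Stage::Closed;
        case Stage::Closed:  return Stage::Closed;
    }
    return Stage::Closed;
}

// True for the stages that begin with an epoll wake-up: the connection sits
// in epoll with read interest before Read and with write interest before Write.
constexpr bool entered_via_epoll(Stage s) {
    return s == Stage::Read || s == Stage::Write;
}
```

Error paths (malformed requests, client disconnects) are deliberately left out; in practice they would be extra transitions into `Respond` or `Closed`.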
### HTTP Protocol Support
#### Request Processing
- **llhttp streaming parser** for incremental request parsing
- **Content-Type based parser selection** for request bodies (handlers themselves are chosen by request path)
- **HTTP/1.1 keep-alive support** for connection reuse
- **Malformed request handling** with proper HTTP error responses
#### Response Handling
- **Direct response writing** by network threads
- **Partial write support** with epoll re-registration
- **Connection reuse** after complete response transmission
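
Partial write support usually comes down to tracking how many bytes have been sent so far. The following is a sketch under assumptions: `PendingResponse`, `WriteResult`, and `write_some` are illustrative names, and the socket is assumed to be non-blocking.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>

enum class WriteResult { Done, WouldBlock, Error };

// Hypothetical per-connection response state.
struct PendingResponse {
    std::string bytes;     // serialized response
    std::size_t sent = 0;  // bytes already handed to the kernel
};

// Sends as much of the response as the socket accepts right now.
// Done:       everything was sent; the connection can be reused.
// WouldBlock: the send buffer is full; re-register with write interest.
// Error:      the peer is gone or the socket failed; close the connection.
WriteResult write_some(int fd, PendingResponse& r) {
    assert(r.sent <= r.bytes.size());
    while (r.sent < r.bytes.size()) {
        // MSG_NOSIGNAL avoids SIGPIPE if the client has disconnected.
        ssize_t n = send(fd, r.bytes.data() + r.sent, r.bytes.size() - r.sent,
                         MSG_NOSIGNAL);
        if (n > 0) {
            r.sent += static_cast<std::size_t>(n);
            continue;
        }
        if (n == -1 && errno == EINTR) continue;
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return WriteResult::WouldBlock;
        }
        return WriteResult::Error;
    }
    return WriteResult::Done;
}
```

On `WouldBlock` the network thread would re-arm the descriptor with `EPOLLOUT | EPOLLONESHOT`, and whichever thread receives the next event calls `write_some` again with the same `PendingResponse`.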
### Memory Management
#### Arena Allocation
- **Per-connection arena allocator** for request-scoped memory
- **Arena reset after each response** to prepare for next request
- **Zero-copy string views** pointing to arena-allocated memory
- **Efficient bulk deallocation** when connection closes
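
To make the arena idea concrete, here is a minimal bump-allocator sketch. It is not the WeaselDB arena: the class layout, the block-growth policy, and the choice to keep only the first block on `reset()` are all assumptions made for this example. It requires C++17 for `std::string_view`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <string_view>
#include <vector>

class Arena {
public:
    explicit Arena(std::size_t block_size = 4096) : block_size_(block_size) {}

    // Bump-allocates `size` bytes aligned to `align` (a power of two).
    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        if (!blocks_.empty()) {
            if (void* p = bump(blocks_.back(), size, align)) return p;
        }
        // size + align always leaves room for worst-case alignment padding.
        blocks_.emplace_back(std::max(block_size_, size + align));
        return bump(blocks_.back(), size, align);
    }

    // Copies `s` into the arena and returns a view of the copy.
    std::string_view copy(std::string_view s) {
        if (s.empty()) return {};
        char* dst = static_cast<char*>(allocate(s.size(), 1));
        std::memcpy(dst, s.data(), s.size());
        return std::string_view(dst, s.size());
    }

    // Invalidates all previous allocations. Keeps the first block so the next
    // request usually needs no call into the system allocator.
    void reset() {
        while (blocks_.size() > 1) blocks_.pop_back();
        if (!blocks_.empty()) blocks_.front().used = 0;
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    struct Block {
        explicit Block(std::size_t n) : data(new char[n]), size(n) {}
        std::unique_ptr<char[]> data;
        std::size_t size;
        std::size_t used = 0;
    };

    // Returns aligned storage from `b`, or nullptr if it does not fit.
    static void* bump(Block& b, std::size_t size, std::size_t align) {
        void* p = b.data.get() + b.used;
        std::size_t space = b.size - b.used;
        if (std::align(align, size, p, space) == nullptr) return nullptr;
        b.used = static_cast<std::size_t>(static_cast<char*>(p) - b.data.get()) + size;
        return p;
    }

    std::size_t block_size_;
    std::vector<Block> blocks_;
};
```

Any `std::string_view` obtained before `reset()` dangles afterwards, so views must not outlive the request they belong to.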
#### Ring Buffer Communication
- **Ring buffer** for high-performance inter-thread communication
- **Shared ring buffers** between related services (e.g., commit and status share pipeline)
- **Backpressure handling** via ring buffer capacity limits
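
The document does not specify the ring buffer's algorithm, so the sketch below substitutes a well-known bounded multi-producer/multi-consumer queue that uses one sequence number per slot (a design commonly attributed to Dmitry Vyukov). It is a stand-in for illustration, not the Disruptor-inspired implementation itself; `BoundedQueue` and its method names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>

// Bounded lock-free MPMC queue. T must be default-constructible and movable;
// for this server it would typically be a connection pointer.
template <typename T>
class BoundedQueue {
public:
    // `capacity` must be a power of two and at least 2.
    explicit BoundedQueue(std::size_t capacity)
        : cells_(new Cell[capacity]), mask_(capacity - 1) {
        assert(capacity >= 2 && (capacity & (capacity - 1)) == 0);
        for (std::size_t i = 0; i < capacity; ++i) {
            cells_[i].seq.store(i, std::memory_order_relaxed);
        }
    }

    // Returns false when the queue is full; the caller applies backpressure.
    bool try_push(T value) {
        std::size_t pos = enqueue_pos_.load(std::memory_order_relaxed);
        for (;;) {
            Cell& c = cells_[pos & mask_];
            std::size_t seq = c.seq.load(std::memory_order_acquire);
            auto diff = static_cast<std::intptr_t>(seq) - static_cast<std::intptr_t>(pos);
            if (diff == 0) {  // slot is free for this position
                if (enqueue_pos_.compare_exchange_weak(pos, pos + 1,
                                                       std::memory_order_relaxed)) {
                    c.value = std::move(value);
                    c.seq.store(pos + 1, std::memory_order_release);
                    return true;
                }
            } else if (diff < 0) {
                return false;  // slot still holds an unconsumed item: full
            } else {
                pos = enqueue_pos_.load(std::memory_order_relaxed);
            }
        }
    }

    // Returns false when the queue is empty.
    bool try_pop(T& out) {
        std::size_t pos = dequeue_pos_.load(std::memory_order_relaxed);
        for (;;) {
            Cell& c = cells_[pos & mask_];
            std::size_t seq = c.seq.load(std::memory_order_acquire);
            auto diff = static_cast<std::intptr_t>(seq) - static_cast<std::intptr_t>(pos + 1);
            if (diff == 0) {  // slot holds the item for this position
                if (dequeue_pos_.compare_exchange_weak(pos, pos + 1,
                                                       std::memory_order_relaxed)) {
                    out = std::move(c.value);
                    c.seq.store(pos + mask_ + 1, std::memory_order_release);
                    return true;
                }
            } else if (diff < 0) {
                return false;  // producer has not published yet: empty
            } else {
                pos = dequeue_pos_.load(std::memory_order_relaxed);
            }
        }
    }

private:
    struct Cell {
        std::atomic<std::size_t> seq;
        T value;
    };
    std::unique_ptr<Cell[]> cells_;
    std::size_t mask_;
    alignas(64) std::atomic<std::size_t> enqueue_pos_{0};
    alignas(64) std::atomic<std::size_t> dequeue_pos_{0};
};
```

A `false` return from `try_push` is the capacity limit in action: the network thread keeps ownership of the connection and can answer immediately with an error response instead of queueing more work.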
### Error Handling
#### Backpressure Response
- **Ring buffer full**: Network thread writes immediate error response
- **Partial write**: Connection re-posted to epoll with write interest
- **Graceful degradation** under high load conditions
#### Protocol Errors
- **Malformed HTTP**: Proper HTTP error responses (400 Bad Request, etc.)
- **Partial I/O**: Re-registration with epoll for completion
- **Client disconnection**: Clean connection state cleanup
## Integration Points
### Parser Selection
- **Content-Type header inspection** for parser routing
- **ParserInterface implementation** for format-agnostic parsing
- **CommitRequestParser extensibility** for future format support (protobuf, msgpack)
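
Header inspection for parser selection might be sketched as follows. The `BodyFormat` enum, the function names, and the assumption that `application/json` is the currently supported format are illustrative only; the document does not state which media types are accepted today. The code requires C++17.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical set of body formats. Protobuf or msgpack would become
// additional enumerators here.
enum class BodyFormat { Json, Unsupported };

constexpr bool is_ows(char c) { return c == ' ' || c == '\t'; }

constexpr char ascii_lower(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Strips parameters such as "; charset=utf-8" and surrounding whitespace
// without copying: the result is a view into the header value.
constexpr std::string_view media_type(std::string_view content_type) {
    std::string_view v = content_type.substr(0, content_type.find(';'));
    while (!v.empty() && is_ows(v.front())) v.remove_prefix(1);
    while (!v.empty() && is_ows(v.back())) v.remove_suffix(1);
    return v;
}

// ASCII case-insensitive comparison; media types are case-insensitive.
constexpr bool iequals(std::string_view a, std::string_view b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (ascii_lower(a[i]) != ascii_lower(b[i])) return false;
    }
    return true;
}

// Maps a Content-Type header value to the parser family to use.
constexpr BodyFormat select_format(std::string_view content_type) {
    if (iequals(media_type(content_type), "application/json")) {
        return BodyFormat::Json;
    }
    return BodyFormat::Unsupported;
}
```

An `Unsupported` result would typically be answered with `415 Unsupported Media Type` before any body parsing starts.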
### Configuration System
- **TOML configuration** using existing config system
- **Configurable parameters**:
  - Accept thread count
  - Network thread count
  - Ring buffer sizes
  - Connection timeouts
  - Keep-alive settings
### Request Routing
- **Path-based routing** to service handlers
- **Shared ring buffers** for related services with flexible pipeline configuration
- **Handler registration** for extensible endpoint support
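
Handler registration and path-based lookup could be modeled as below. Everything here is a hypothetical sketch: the `Router`, `Route`, and `Dispatch` names and the example paths are not taken from the codebase. The `pipeline` field names the ring buffer a request is posted to, which is how several endpoints can share one pipeline. It requires C++17.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// How a matched request is executed.
enum class Dispatch {
    Inline,    // fast path: handled in the network thread
    Pipeline,  // slow path: posted to a service ring buffer
};

struct Route {
    Dispatch how;
    std::string pipeline;  // pipeline name; empty for Inline routes
};

class Router {
public:
    // Registers or replaces the route for an exact path.
    void add(std::string path, Route route) {
        assert(!path.empty() && path.front() == '/');
        routes_.insert_or_assign(std::move(path), std::move(route));
    }

    // Looks up the route for a request target, ignoring any query string.
    // Returns nullptr when no handler is registered (404 Not Found).
    const Route* find(std::string_view target) const {
        std::string_view path = target.substr(0, target.find('?'));
        auto it = routes_.find(std::string(path));
        return it == routes_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, Route> routes_;
};
```

Registering two paths with the same pipeline name, for example a commit endpoint and a status endpoint both using `commit_status`, expresses the shared-ring-buffer arrangement without the router knowing anything about the services themselves.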
## Performance Characteristics
### Scalability
- **Configurable parallelism** for accept and network operations
- **Lock-free communication** eliminates contention bottlenecks
- **EPOLLONESHOT semantics** prevent thundering herd effects and manage connection ownership
### Efficiency
- **Zero-copy parsing** where possible, using string views into arena-allocated memory
- **Arena allocation** eliminates per-request malloc overhead
- **Connection reuse** reduces accept/close syscall overhead
- **Streaming parser** handles large requests incrementally
### Resource Management
- **Bounded memory usage** via arena reset and ring buffer limits
- **Graceful backpressure** prevents resource exhaustion
- **Clean connection lifecycle** with deterministic cleanup
## Implementation Phases
### Phase 1: Core Infrastructure
- Basic epoll event loop with accept/network threads
- Connection state management and lifecycle
- llhttp integration for request parsing
### Phase 2: Service Integration
- Ring buffer implementation for service communication
- Request routing and handler registration
- Arena-based memory management
### Phase 3: Protocol Features
- HTTP/1.1 keep-alive support
- Error handling and backpressure responses
- Configuration system integration
### Phase 4: Content Handling
- Content-Type based parser selection
- CommitRequest processing integration
- Response serialization and transmission
## Configuration Schema
```toml
[http]
accept_threads = 2
network_threads = 8
bind_address = "0.0.0.0"
port = 8080
keepalive_timeout_sec = 30
request_timeout_sec = 10
[http.pipelines.commit_status]
ring_buffer_size = 1024
threads = 4
```