# WeaselDB Configuration
WeaselDB uses a TOML configuration file to control server behavior and API limits. The file has three sections: `[server]`, `[commit]`, and `[subscription]`.
## Configuration File Location
By default, WeaselDB looks for `config.toml` in the current directory. To use a different file, pass its path as the first argument:
```bash
./weaseldb /path/to/custom/config.toml
```
## Configuration Sections
### Server Configuration (`[server]`)
Controls server networking, threading, and request handling behavior.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `bind_address` | string | `"127.0.0.1"` | IP address to bind the server to |
| `port` | integer | `8080` | Port number to listen on |
| `max_request_size_bytes` | integer | `1048576` (1 MiB) | Maximum size of an incoming request. Requests exceeding this limit receive a `413 Content Too Large` response |
| `accept_threads` | integer | `1` | Number of dedicated threads for accepting incoming connections |
| `network_threads` | integer | `1` | Number of threads for epoll-based network I/O processing |
| `event_batch_size` | integer | `32` | Number of events to process in each epoll batch |
| `max_connections` | integer | `1000` | Maximum number of concurrent connections (0 = unlimited) |
### Commit Configuration (`[commit]`)
Controls behavior of the `/v1/commit` endpoint and request ID management.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `min_request_id_length` | integer | `20` | Minimum length required for client-provided `request_id` fields to ensure sufficient entropy for collision avoidance |
| `request_id_retention_hours` | integer | `24` | How long to retain request IDs in memory for `/v1/status` queries. Longer retention reduces the chance of `log_truncated` responses |
| `request_id_retention_versions` | integer | `100000000` | Minimum number of versions to retain request IDs for, regardless of time. Provides additional protection against `log_truncated` responses |
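
One way to read the two retention settings together: a request ID becomes eligible for eviction only once it is older than `request_id_retention_hours` and also more than `request_id_retention_versions` versions behind the latest committed version. The C++ sketch below illustrates that reading. `RetentionPolicy`, `RequestIdEntry`, and `can_evict` are hypothetical names chosen for this example and are not taken from the WeaselDB source.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical mirror of the two [commit] retention settings (defaults from the table above).
struct RetentionPolicy {
    std::chrono::hours retention_hours{24};
    std::int64_t retention_versions = 100000000;
};

// Hypothetical bookkeeping for one stored request ID.
struct RequestIdEntry {
    std::chrono::steady_clock::time_point committed_at;
    std::int64_t version;
};

// Assumed semantics: an entry may be dropped only when BOTH thresholds have
// been passed, so the version count acts as a floor "regardless of time".
bool can_evict(const RequestIdEntry& entry,
               const RetentionPolicy& policy,
               std::chrono::steady_clock::time_point now,
               std::int64_t latest_version) {
    const bool old_enough = (now - entry.committed_at) > policy.retention_hours;
    const bool far_enough_behind =
        (latest_version - entry.version) > policy.retention_versions;
    return old_enough && far_enough_behind;
}
```

Under this reading, raising either setting can only keep request IDs around longer, which is why both reduce the chance of a `log_truncated` response.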
### Subscription Configuration (`[subscription]`)
Controls behavior of the `/v1/subscribe` endpoint and SSE streaming.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `max_buffer_size_bytes` | integer | `10485760` (10 MiB) | Maximum amount of unconsumed data to buffer for slow subscribers. Connections are closed if this limit is exceeded |
| `keepalive_interval_seconds` | integer | `30` | Interval between keepalive comments in the Server-Sent Events stream to prevent idle timeouts on network proxies |
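
The effect of `max_buffer_size_bytes` on a slow subscriber can be pictured with a small per-connection buffer. This is a minimal sketch under assumptions, not the server's actual implementation: `SubscriberBuffer` and its methods are hypothetical names, and the real server may account for bytes differently.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical outbound buffer for one /v1/subscribe connection.
class SubscriberBuffer {
public:
    explicit SubscriberBuffer(std::size_t max_buffer_size_bytes)
        : max_bytes_(max_buffer_size_bytes) {}

    // Queue one event. Returns false when the event would push the amount of
    // unconsumed data past the limit; the caller is expected to close the
    // connection in that case.
    bool enqueue(std::string event) {
        if (buffered_bytes_ + event.size() > max_bytes_) {
            return false;
        }
        buffered_bytes_ += event.size();
        pending_.push_back(std::move(event));
        return true;
    }

    // Call once the oldest queued event has been written to the socket.
    void mark_front_sent() {
        if (pending_.empty()) {
            return;
        }
        buffered_bytes_ -= pending_.front().size();
        pending_.pop_front();
    }

    std::size_t buffered_bytes() const { return buffered_bytes_; }

private:
    std::size_t max_bytes_;
    std::size_t buffered_bytes_ = 0;
    std::deque<std::string> pending_;
};
```

A subscriber that keeps reading drains the buffer through `mark_front_sent()` and never reaches the limit; only a consumer that falls behind by more than the configured number of bytes is disconnected.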
## Example Configuration
```toml
# WeaselDB Configuration File
[server]
bind_address = "0.0.0.0"
port = 8080
max_request_size_bytes = 2097152 # 2 MiB
accept_threads = 2
network_threads = 8
event_batch_size = 64
max_connections = 10000
[commit]
min_request_id_length = 32
request_id_retention_hours = 48
request_id_retention_versions = 50000
[subscription]
max_buffer_size_bytes = 52428800 # 50 MiB
keepalive_interval_seconds = 15
```
## Configuration Loading
WeaselDB parses its configuration with the `toml11` library. Loading behaves as follows:
- **File Loading**: Uses `ConfigParser::load_from_file(path)` to parse TOML files
- **Fallback Behavior**: If the config file doesn't exist or contains errors, WeaselDB uses default values and logs a warning
- **Optional Parameters**: All configuration parameters are optional; missing values use the defaults shown above
- **Type Safety**: Configuration values are parsed with type validation and meaningful error messages
- **Validation**: All parameters are validated against the bounds listed under Configuration Validation before use
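
The fallback behavior described above can be sketched as follows. The return type of `ConfigParser::load_from_file` is not specified in this document, so the sketch takes a stand-in `Loader` callback instead of calling it. `Config`, `Loader`, and `load_or_default` are hypothetical names used for illustration only, and `Config` carries just two of the real parameters.

```cpp
#include <functional>
#include <iostream>
#include <string>

// Hypothetical, trimmed-down configuration with struct defaults.
struct Config {
    std::string bind_address = "127.0.0.1";
    int port = 8080;
};

// Stand-in for the real parser: fills `out` and returns true on success,
// returns false if the file is missing or malformed.
using Loader = std::function<bool(const std::string& path, Config& out)>;

// Return the parsed configuration, or log a warning and fall back to defaults.
Config load_or_default(const std::string& path, const Loader& loader) {
    Config parsed;
    if (loader(path, parsed)) {
        return parsed;
    }
    std::cerr << "warning: could not load '" << path
              << "', using default configuration\n";
    // Discard anything a failed parse may have partially written.
    return Config{};
}
```

Returning a fresh `Config{}` on failure, rather than the partially filled object, keeps the fallback all-or-nothing: the server runs either with the file's values or with the documented defaults.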
## API Relationship
These configuration parameters directly affect server and API behavior:
**Server Performance:**
- **`accept_threads`**: Controls parallelism for accepting new connections. More threads can handle higher connection rates
- **`network_threads`**: Controls I/O processing parallelism. Should typically match CPU core count for optimal performance
- **`event_batch_size`**: Larger batches reduce syscall overhead but may increase latency under light load
- **`max_connections`**: Prevents resource exhaustion by limiting concurrent connections
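
As a starting point for `network_threads`, the core count can be queried with the standard library. The helper below is a hypothetical sketch (`suggested_network_threads` is not part of WeaselDB); it clamps the result to the 1 to 1000 range that validation accepts for this parameter.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper: map a detected core count to a network_threads value.
int suggested_network_threads(unsigned detected_cores) {
    // hardware_concurrency() may return 0 when the count cannot be determined.
    if (detected_cores == 0) {
        return 1;
    }
    // Stay within the 1..1000 range allowed for network_threads.
    return static_cast<int>(std::min(detected_cores, 1000u));
}

// Convenience overload that queries the current machine.
int suggested_network_threads() {
    return suggested_network_threads(std::thread::hardware_concurrency());
}
```

The value is only a suggestion to write into `config.toml`; the best setting depends on the workload and should be confirmed by measurement.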
**Request Handling:**
- **`max_request_size_bytes`**: Determines when `/v1/commit` returns `413 Content Too Large`
- **`min_request_id_length`**: Validates `request_id` fields in `/v1/commit` requests for sufficient entropy
**Request ID Management:**
- **`request_id_retention_*`**: Affects availability of data for `/v1/status` queries and likelihood of `log_truncated` responses
**Subscription Streaming:**
- **`max_buffer_size_bytes`**: Controls when `/v1/subscribe` connections are terminated due to slow consumption
- **`keepalive_interval_seconds`**: Frequency of keepalive comments in `/v1/subscribe` streams
## Configuration Validation
Every parameter is checked against the following bounds:
### Server Configuration Limits
- **`port`**: Must be between 1 and 65535
- **`max_request_size_bytes`**: Must be > 0 and ≤ 100MB
- **`accept_threads`**: Must be between 1 and 100
- **`network_threads`**: Must be between 1 and 1000
- **`event_batch_size`**: Must be between 1 and 10000
- **`max_connections`**: Must be between 0 and 100000 (0 = unlimited)
### Commit Configuration Limits
- **`min_request_id_length`**: Must be between 8 and 256 characters
- **`request_id_retention_hours`**: Must be between 1 and 8760 hours (1 year)
- **`request_id_retention_versions`**: Must be > 0
### Subscription Configuration Limits
- **`max_buffer_size_bytes`**: Must be > 0 and ≤ 1GB
- **`keepalive_interval_seconds`**: Must be between 1 and 3600 seconds (1 hour)
### Cross-Validation
- Warns if `max_request_size_bytes` > `max_buffer_size_bytes` (potential buffering issues)
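
The bounds above translate into a compact validation routine. The following is a minimal sketch and not the code in `ConfigParser::validate_config()`: `Config`, `ValidationResult`, and `validate` are hypothetical names, the struct is flattened for brevity, and "100MB" and "1GB" are assumed to mean 100 × 1024 × 1024 and 1024 × 1024 × 1024 bytes.

```cpp
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical flattened view of the settings; defaults copied from the tables above.
struct Config {
    std::int64_t port = 8080;
    std::int64_t max_request_size_bytes = 1048576;
    std::int64_t accept_threads = 1;
    std::int64_t network_threads = 1;
    std::int64_t event_batch_size = 32;
    std::int64_t max_connections = 1000;
    std::int64_t min_request_id_length = 20;
    std::int64_t request_id_retention_hours = 24;
    std::int64_t request_id_retention_versions = 100000000;
    std::int64_t max_buffer_size_bytes = 10485760;
    std::int64_t keepalive_interval_seconds = 30;
};

struct ValidationResult {
    std::vector<std::string> errors;    // out-of-range values
    std::vector<std::string> warnings;  // suspicious but allowed combinations
    bool ok() const { return errors.empty(); }
};

ValidationResult validate(const Config& c) {
    constexpr std::int64_t kMiB = 1024 * 1024;  // assumption: binary units
    constexpr std::int64_t kNoUpperBound = std::numeric_limits<std::int64_t>::max();
    ValidationResult r;
    // Record an error when `value` falls outside the inclusive range [lo, hi].
    auto check = [&r](const char* name, std::int64_t value,
                      std::int64_t lo, std::int64_t hi) {
        if (value < lo || value > hi) {
            r.errors.push_back(std::string(name) + " must be between " +
                               std::to_string(lo) + " and " + std::to_string(hi));
        }
    };
    check("port", c.port, 1, 65535);
    check("max_request_size_bytes", c.max_request_size_bytes, 1, 100 * kMiB);
    check("accept_threads", c.accept_threads, 1, 100);
    check("network_threads", c.network_threads, 1, 1000);
    check("event_batch_size", c.event_batch_size, 1, 10000);
    check("max_connections", c.max_connections, 0, 100000);  // 0 = unlimited
    check("min_request_id_length", c.min_request_id_length, 8, 256);
    check("request_id_retention_hours", c.request_id_retention_hours, 1, 8760);
    check("request_id_retention_versions", c.request_id_retention_versions, 1, kNoUpperBound);
    check("max_buffer_size_bytes", c.max_buffer_size_bytes, 1, 1024 * kMiB);
    check("keepalive_interval_seconds", c.keepalive_interval_seconds, 1, 3600);
    // Cross-validation produces a warning, not an error.
    if (c.max_request_size_bytes > c.max_buffer_size_bytes) {
        r.warnings.push_back("max_request_size_bytes exceeds max_buffer_size_bytes");
    }
    return r;
}
```

Keeping errors and warnings in separate lists mirrors the distinction above: out-of-range values are rejected, while the cross-validation check only produces a warning.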
## Configuration Management
### Code Integration
- **Configuration Structure**: Defined in `src/config.hpp` with structured types
- **Parser Implementation**: Located in `src/config.cpp` using template-based parsing
- **Default Values**: Embedded as struct defaults for compile-time initialization
- **Runtime Usage**: Configuration passed to server components during initialization
### Development Guidelines
- **New Parameters**: Add to appropriate struct in `src/config.hpp`
- **Validation**: Include bounds checking in `ConfigParser::validate_config()`
- **Documentation**: Update this file when adding new configuration options
- **Testing**: Verify both valid and invalid configuration scenarios