WeaselDB Configuration

WeaselDB uses a TOML configuration file to control server behavior and API limits. The configuration is organized into four sections ([server], [commit], [subscription], and [benchmark]), each covering a different aspect of the system.

Configuration File Location

By default, WeaselDB looks for config.toml in the current directory. You can specify an alternative path:

./weaseldb /path/to/custom/config.toml
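
The path selection described above can be sketched as follows. This is a minimal illustration, not the WeaselDB source: the helper name resolve_config_path is hypothetical, and it assumes the custom path is the first positional argument, as in the command shown.

```cpp
#include <string>

// Hypothetical helper (not taken from the WeaselDB sources): choose the
// configuration path from the command line, falling back to config.toml
// in the current directory when no argument is given.
std::string resolve_config_path(int argc, const char* const* argv) {
    // argv[0] is the program name; argv[1], if present, is the custom path.
    if (argc > 1 && argv[1] != nullptr) {
        return argv[1];
    }
    return "config.toml";
}
```

Calling it with only the program name yields config.toml; calling it with one extra argument yields that argument unchanged.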

Configuration Sections

Server Configuration ([server])

Controls server networking, threading, and request handling behavior.

| Parameter | Type | Default | Description |
|---|---|---|---|
| interfaces | array of objects | TCP on 127.0.0.1:8080 | Network interfaces to listen on. Each interface can be TCP or Unix socket |
| max_request_size_bytes | integer | 1048576 (1MB) | Maximum size for incoming requests. Requests exceeding this limit receive a 413 Content Too Large response |
| io_threads | integer | 1 | Number of I/O threads for handling connections and network events |
| epoll_instances | integer | io_threads | Number of epoll instances to reduce kernel contention (max: io_threads). Lower values allow multiple threads per epoll for better load balancing, higher values reduce contention |
| event_batch_size | integer | 32 | Number of events to process in each epoll batch |
| max_connections | integer | 50000 | Maximum number of concurrent connections (0 = unlimited). Note: Due to race conditions between connection acceptance and cleanup, it's possible to trip this limit without actually having that many concurrent connections, especially under high connection churn. |
| read_buffer_size | integer | 16384 (16KB) | Buffer size for reading from socket connections |
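
To make the io_threads / epoll_instances trade-off concrete, here is a small sketch of how threads might be spread across epoll instances. The round-robin policy and the function name assign_threads_to_epoll are assumptions for illustration; this document does not state how WeaselDB actually assigns threads.

```cpp
#include <cstddef>
#include <vector>

// Assumed policy (not confirmed by this document): thread i polls epoll
// instance i % epoll_instances.
// Precondition: 1 <= epoll_instances <= io_threads.
std::vector<std::size_t> assign_threads_to_epoll(std::size_t io_threads,
                                                 std::size_t epoll_instances) {
    std::vector<std::size_t> assignment(io_threads);
    for (std::size_t i = 0; i < io_threads; ++i) {
        assignment[i] = i % epoll_instances;
    }
    return assignment;
}
```

Under this policy, io_threads = 8 with epoll_instances = 2 puts four threads on each instance, while epoll_instances = 8 gives every thread its own instance.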

Commit Configuration ([commit])

Controls behavior of the /v1/commit endpoint and request ID management.

| Parameter | Type | Default | Description |
|---|---|---|---|
| min_request_id_length | integer | 20 | Minimum length required for client-provided request_id fields to ensure sufficient entropy for collision avoidance |
| request_id_retention_hours | integer | 24 | How long to retain request IDs in memory for /v1/status queries. Longer retention reduces the chance of log_truncated responses |
| request_id_retention_versions | integer | 100000000 | Minimum number of versions to retain request IDs for, regardless of time. Provides additional protection against log_truncated responses |
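
One way to read the two retention settings together is that a request ID may be dropped only once it is past both the time window and the version floor. The sketch below encodes that reading. It is an interpretation of the table, not the actual eviction code, and may_evict_request_id is a hypothetical name.

```cpp
#include <cstdint>

// Field names and defaults follow the [commit] table above.
struct CommitConfig {
    std::int64_t request_id_retention_hours = 24;
    std::int64_t request_id_retention_versions = 100000000;
};

// Hypothetical predicate: a request ID may be evicted only when it is older
// than the time window AND more versions behind than the version floor.
bool may_evict_request_id(const CommitConfig& cfg,
                          std::int64_t age_hours,
                          std::int64_t versions_behind) {
    const bool past_time_window = age_hours > cfg.request_id_retention_hours;
    const bool past_version_floor =
        versions_behind > cfg.request_id_retention_versions;
    return past_time_window && past_version_floor;
}
```

With the defaults, an ID that is 25 hours old but only 10 versions behind is still retained, because the version floor applies regardless of time.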

Subscription Configuration ([subscription])

Controls behavior of the /v1/subscribe endpoint and SSE streaming.

| Parameter | Type | Default | Description |
|---|---|---|---|
| max_buffer_size_bytes | integer | 10485760 (10MB) | Maximum amount of unconsumed data to buffer for slow subscribers. Connections are closed if this limit is exceeded |
| keepalive_interval_seconds | integer | 30 | Interval between keepalive comments in the Server-Sent Events stream to prevent idle timeouts on network proxies |
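
The two subscription settings can be illustrated with a short sketch. In the Server-Sent Events format, a line beginning with a colon is a comment that clients ignore, which is what makes it usable as a keepalive. The function names below and the text after the colon are assumptions, not WeaselDB's actual wire output.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Field names and defaults follow the [subscription] table above.
struct SubscriptionConfig {
    std::size_t max_buffer_size_bytes = 10485760;  // 10MB
    int keepalive_interval_seconds = 30;
};

// An SSE comment line: starts with ':' and is ignored by clients.
// The text after the colon is arbitrary (assumed here).
std::string keepalive_frame() {
    return ": keepalive\n\n";
}

// Hypothetical check: close the connection once the unconsumed backlog
// exceeds the configured limit.
bool should_close_slow_subscriber(const SubscriptionConfig& cfg,
                                  std::size_t buffered_bytes) {
    return buffered_bytes > cfg.max_buffer_size_bytes;
}

// Hypothetical check: send a keepalive once the stream has been idle for
// at least the configured interval.
bool keepalive_due(const SubscriptionConfig& cfg, std::chrono::seconds idle) {
    return idle >= std::chrono::seconds(cfg.keepalive_interval_seconds);
}
```

A backlog exactly at the limit is still allowed in this sketch; only exceeding it triggers a close, matching the wording of the table.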

Benchmark Configuration ([benchmark])

Controls benchmarking and health check behavior.

| Parameter | Type | Default | Description |
|---|---|---|---|
| ok_resolve_iterations | integer | 4000 | CPU-intensive loop iterations for /ok requests in resolve stage. 0 = health check only, 4000 = default benchmark load (~650ns, 1M req/s) |

Example Configuration

# WeaselDB Configuration File

[server]
# Network interfaces - can specify multiple TCP and/or Unix socket interfaces
interfaces = [
  { type = "tcp", address = "0.0.0.0", port = 8080 },
  # { type = "unix", path = "weaseldb.sock" },  # Alternative Unix socket
]

# Performance tuning
max_request_size_bytes = 2097152  # 2MB
io_threads = 8
epoll_instances = 8  # Reduce kernel contention (max: io_threads)
event_batch_size = 64
max_connections = 50000
read_buffer_size = 32768  # 32KB

[commit]
min_request_id_length = 32
request_id_retention_hours = 48
request_id_retention_versions = 50000

[subscription]
max_buffer_size_bytes = 52428800  # 50MB
keepalive_interval_seconds = 15

[benchmark]
ok_resolve_iterations = 10000  # Higher load for performance testing

Configuration Loading

WeaselDB parses its configuration with the toml11 library and handles missing or invalid input as follows:

  • File Loading: Uses ConfigParser::load_from_file(path) to parse TOML files
  • Fallback Behavior: If the config file doesn't exist or contains errors, WeaselDB uses default values and logs a warning
  • Optional Parameters: All configuration parameters are optional - missing values use the defaults shown above
  • Type Safety: Configuration values are parsed with type validation and meaningful error messages
  • Validation: All parameters are validated against reasonable bounds before use
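
The fallback and optional-parameter behavior above can be sketched without depending on toml11. In this illustration a std::map stands in for one parsed TOML table, and an empty std::optional stands in for a file that is missing or failed to parse. The names Table, get_or, and load_server_config are hypothetical and do not mirror the real ConfigParser interface.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Stand-in for a parsed [server] table (the real code uses toml11 values).
using Table = std::map<std::string, std::int64_t>;

// A subset of the [server] parameters, with defaults as struct initializers.
struct ServerConfig {
    std::int64_t io_threads = 1;
    std::int64_t event_batch_size = 32;
    std::int64_t max_connections = 50000;
};

// Return the value stored under key, or fallback when the key is absent.
std::int64_t get_or(const Table& table, const std::string& key,
                    std::int64_t fallback) {
    const auto it = table.find(key);
    return it == table.end() ? fallback : it->second;
}

// An empty optional models a missing or unparsable file: every field keeps
// its default. (The real server would also log a warning at this point.)
ServerConfig load_server_config(const std::optional<Table>& parsed) {
    ServerConfig cfg;
    if (!parsed) {
        return cfg;
    }
    cfg.io_threads = get_or(*parsed, "io_threads", cfg.io_threads);
    cfg.event_batch_size =
        get_or(*parsed, "event_batch_size", cfg.event_batch_size);
    cfg.max_connections =
        get_or(*parsed, "max_connections", cfg.max_connections);
    return cfg;
}
```

A table containing only io_threads = 8 produces a configuration with eight I/O threads and the default values for everything else.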

API Relationship

These configuration parameters directly affect server and API behavior:

Server Performance:

  • io_threads: Controls parallelism for both accepting new connections and I/O processing. Should typically match CPU core count for optimal performance
  • event_batch_size: Larger batches reduce syscall overhead but may increase latency under light load
  • max_connections: Prevents resource exhaustion by limiting concurrent connections

Request Handling:

  • max_request_size_bytes: Determines when /v1/commit returns 413 Content Too Large
  • min_request_id_length: Validates request_id fields in /v1/commit requests for sufficient entropy

Request ID Management:

  • request_id_retention_*: Affects availability of data for /v1/status queries and likelihood of log_truncated responses

Subscription Streaming:

  • max_buffer_size_bytes: Controls when /v1/subscribe connections are terminated due to slow consumption
  • keepalive_interval_seconds: Frequency of keepalive comments in /v1/subscribe streams

Configuration Validation

The configuration system checks parameters against the following bounds:

Server Configuration Limits

  • port: Must be between 1 and 65535
  • max_request_size_bytes: Must be > 0 and ≤ 100MB
  • io_threads: Must be between 1 and 1000
  • event_batch_size: Must be between 1 and 10000
  • max_connections: Must be between 0 and 100000 (0 = unlimited)

Commit Configuration Limits

  • min_request_id_length: Must be between 8 and 256 characters
  • request_id_retention_hours: Must be between 1 and 8760 hours (1 year)
  • request_id_retention_versions: Must be > 0

Subscription Configuration Limits

  • max_buffer_size_bytes: Must be > 0 and ≤ 1GB
  • keepalive_interval_seconds: Must be between 1 and 3600 seconds (1 hour)

Cross-Validation

  • Warns if max_request_size_bytes > max_buffer_size_bytes (potential buffering issues)
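
The limits listed in this section can be expressed as a single validation pass. The sketch below is illustrative only and is not the body of ConfigParser::validate_config(): the struct layout, the name validate_sketch, and the reading of "100MB" and "1GB" as binary units (100 × 1024 × 1024 and 1024³ bytes) are assumptions. The port check is omitted because interfaces are not modeled here.

```cpp
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Flattened view of the documented parameters, with documented defaults.
struct Config {
    std::int64_t max_request_size_bytes = 1048576;
    std::int64_t io_threads = 1;
    std::int64_t event_batch_size = 32;
    std::int64_t max_connections = 50000;
    std::int64_t min_request_id_length = 20;
    std::int64_t request_id_retention_hours = 24;
    std::int64_t request_id_retention_versions = 100000000;
    std::int64_t max_buffer_size_bytes = 10485760;
    std::int64_t keepalive_interval_seconds = 30;
};

struct ValidationResult {
    std::vector<std::string> errors;
    std::vector<std::string> warnings;
};

constexpr std::int64_t kMiB = 1024 * 1024;  // assumed binary megabyte

// Record an error when value lies outside the inclusive range [lo, hi].
void require_range(ValidationResult& out, const std::string& name,
                   std::int64_t value, std::int64_t lo, std::int64_t hi) {
    if (value < lo || value > hi) {
        out.errors.push_back(name + " must be between " + std::to_string(lo) +
                             " and " + std::to_string(hi));
    }
}

ValidationResult validate_sketch(const Config& c) {
    ValidationResult out;
    const std::int64_t no_upper = std::numeric_limits<std::int64_t>::max();
    require_range(out, "max_request_size_bytes", c.max_request_size_bytes, 1, 100 * kMiB);
    require_range(out, "io_threads", c.io_threads, 1, 1000);
    require_range(out, "event_batch_size", c.event_batch_size, 1, 10000);
    require_range(out, "max_connections", c.max_connections, 0, 100000);
    require_range(out, "min_request_id_length", c.min_request_id_length, 8, 256);
    require_range(out, "request_id_retention_hours", c.request_id_retention_hours, 1, 8760);
    require_range(out, "request_id_retention_versions", c.request_id_retention_versions, 1, no_upper);
    require_range(out, "max_buffer_size_bytes", c.max_buffer_size_bytes, 1, 1024 * kMiB);
    require_range(out, "keepalive_interval_seconds", c.keepalive_interval_seconds, 1, 3600);
    // Cross-validation produces a warning rather than an error.
    if (c.max_request_size_bytes > c.max_buffer_size_bytes) {
        out.warnings.push_back(
            "max_request_size_bytes exceeds max_buffer_size_bytes");
    }
    return out;
}
```

Separating errors from warnings mirrors the distinction drawn above: out-of-range values are rejected, while the cross-check between request size and buffer size only warns.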

Configuration Management

Code Integration

  • Configuration Structure: Defined in src/config.hpp with structured types
  • Parser Implementation: Located in src/config.cpp using template-based parsing
  • Default Values: Embedded as struct defaults for compile-time initialization
  • Runtime Usage: Configuration passed to server components during initialization

Development Guidelines

  • New Parameters: Add to appropriate struct in src/config.hpp
  • Validation: Include bounds checking in ConfigParser::validate_config()
  • Documentation: Update this file when adding new configuration options
  • Testing: Verify both valid and invalid configuration scenarios