diff --git a/persistence.md b/persistence.md new file mode 100644 index 0000000..b30f49a --- /dev/null +++ b/persistence.md @@ -0,0 +1,81 @@ +# Persistence Thread Design + +## Overview + +The persistence thread receives commit batches from the main processing pipeline (via ThreadPipeline interface) and uploads them to S3. The transport layer uses a single-threaded TCP client with connection pooling, leaving protocol details (HTTP, authentication, etc.) to higher layers. + +## Core Flow + +The persistence thread operates in a loop with integrated connection management and batching logic. Batches are triggered when any limit is reached: time (`batch_timeout_ms`) or size (`batch_size_threshold`). Flow control via `max_in_flight_requests` prevents new batches when at capacity. + +### Main Loop + +1. **Batch Acquisition** - Use `ThreadPipeline::acquire()` to get commit batches: + - If no in-flight requests: acquire blocking and process immediately + - If in-flight requests exist: collect commits until any trigger condition: + - **Time limit**: `batch_timeout_ms` elapsed since last batch sent + - **Size limit**: `batch_size_threshold` commits collected + - **Flow control**: at `max_in_flight_requests`, block until reply received + - Use epoll_wait with timeout for I/O event handling during collection + - If no commits available and no in-flight requests, switch to blocking acquire + +2. **Connection Management** - Acquire healthy connection from pool: + - Create new connections if pool below `target_pool_size` + - If no healthy connections available, block until one becomes available + - Maintain automatic pool replenishment + +3. **Data Transmission** - Send batch to S3: + - Write protocol data to connection + - Publish accepted transactions to subscriber system + - Track in-flight requests for flow control + +4. **Event Processing** - Drive I/O and monitor health: + - Handle epoll events for all in-flight connections + - Monitor connection health via heartbeats + - Process responses and detect failures + +5. **Response Handling** - Process successful persistence: + - Only acknowledge batch if all prior batches are durable (ordered acknowledgment) + - Release batch via StageGuard destructor (publishes to next pipeline stage) + - Publish durability events to subscriber system + - Return healthy connection to pool + +6. **Failure Handling** - Manage connection failures: + - Remove failed connection from pool + - Retry batch with exponential backoff (up to `max_retry_attempts`) + - Exponential backoff delays only affect the specific failing batch + - If retries exhausted, abort process or escalate error + - Initiate pool replenishment if below target + +### Connection Pool + +Maintains `target_pool_size` connections with automatic replenishment. Pool sizing should be 2x `max_in_flight_requests` to ensure availability during peak load and connection replacement scenarios. + +## Implementation Considerations + +**Batch Ordering**: Batches can be retried out-of-order for performance, but acknowledgment to the next pipeline stage must maintain ordering - a batch can only be acknowledged after all prior batches are durable. + +**Backpressure**: Retry delays for individual batches naturally create backpressure that eventually blocks the persistence thread when in-flight limits are reached. + +**Graceful Shutdown**: On shutdown signal, drain all in-flight batches to completion before terminating the persistence thread. + +## Configurable Parameters + +- **`batch_timeout_ms`** (default: 5ms) - Maximum time to wait collecting additional commits for batching +- **`batch_size_threshold`** - Threshold for triggering batch processing +- **`max_in_flight_requests`** - Maximum number of concurrent requests to persistence backend +- **`target_pool_size`** (recommended: 2x `max_in_flight_requests`) - Target number of connections to maintain in pool +- **`max_retry_attempts`** (default: 3) - Maximum retries for failed batches before aborting +- **`retry_base_delay_ms`** (default: 100ms) - Base delay for exponential backoff retries + +## Configuration Validation + +The following validation rules should be enforced: + +- **`batch_size_threshold`** > 0 (must process at least one commit per batch) +- **`max_in_flight_requests`** > 0 (must allow at least one concurrent request) +- **`target_pool_size`** >= `max_in_flight_requests` (pool must accommodate all in-flight requests) +- **`target_pool_size`** <= 2x `max_in_flight_requests` (recommended for optimal performance) +- **`batch_timeout_ms`** > 0 (timeout must be positive) +- **`max_retry_attempts`** >= 0 (zero disables retries) +- **`retry_base_delay_ms`** > 0 (delay must be positive if retries enabled) \ No newline at end of file