
MergeDB

A distributed, fault-tolerant key-value store.

Stack: Rust · gRPC · CRDTs · LSM Trees
Links: Source

Architecture

MergeDB handles state reconciliation across distributed nodes without centralized coordination. By leveraging Conflict-Free Replicated Data Types (CRDTs), it ensures eventual consistency even during network partitions.
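The CRDT guarantee can be illustrated with the simplest such type, a grow-only counter: every replica tracks per-node counts, and merging takes the element-wise maximum, so merges in any order, repeated any number of times, converge to the same state. This is a minimal std-only sketch for illustration, not MergeDB's actual types (which include PN-Counters and LWW-Registers):

```rust
use std::collections::HashMap;

// A grow-only counter (G-Counter): each node increments its own slot;
// merge takes the element-wise maximum, which is commutative,
// associative, and idempotent -- the properties MergeDB relies on
// for coordination-free convergence.
#[derive(Clone, Debug, PartialEq)]
struct GCounter {
    counts: HashMap<String, u64>,
}

impl GCounter {
    fn new() -> Self {
        GCounter { counts: HashMap::new() }
    }

    fn increment(&mut self, node_id: &str) {
        *self.counts.entry(node_id.to_string()).or_insert(0) += 1;
    }

    fn value(&self) -> u64 {
        self.counts.values().sum()
    }

    // Merging never loses increments: take the max per node slot.
    fn merge(&mut self, other: &GCounter) {
        for (node, &count) in &other.counts {
            let entry = self.counts.entry(node.clone()).or_insert(0);
            *entry = (*entry).max(count);
        }
    }
}

fn main() {
    let mut a = GCounter::new();
    let mut b = GCounter::new();
    a.increment("node-a");
    a.increment("node-a");
    b.increment("node-b");

    // Merge in both directions: replicas converge to the same state.
    let mut a_then_b = a.clone();
    a_then_b.merge(&b);
    let mut b_then_a = b.clone();
    b_then_a.merge(&a);
    assert_eq!(a_then_b, b_then_a); // merge is commutative

    // Re-applying the same payload changes nothing.
    let mut again = a_then_b.clone();
    again.merge(&b);
    assert_eq!(again, a_then_b); // merge is idempotent

    println!("converged value = {}", a_then_b.value()); // 3
}
```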

Implementation Details

Communication & Stack

Inter-node communication is built entirely on tonic, a high-performance gRPC framework, and backed by the tokio asynchronous runtime for fast, non-blocking I/O. This allows nodes to efficiently multiplex epidemic gossip payloads.
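The gossip RPC surface could be defined with a proto sketch along these lines; the service and message names mirror the gossip_changes handler shown below, but the exact field layout is an assumption, not MergeDB's actual schema:

```protobuf
syntax = "proto3";
package mergedb;

// Hypothetical sketch of the gossip service consumed by tonic.
service Gossip {
  rpc GossipChanges(GossipChangesRequest) returns (GossipChangesResponse);
}

message GossipChangesRequest {
  string key = 1;
  bytes crdt_payload = 2; // serialized PN-Counter, AW-Set, or LWW-Register
}

message GossipChangesResponse {
  bool success = 1;
}
```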

Client-facing interactions are handled via a lightweight axum API server, with system observability provided by structured tracing (WIP).

Gossip Protocol & Fan-out

To propagate updates across the cluster without flooding it, I implemented a custom gossip protocol based on epidemic dissemination. Epidemic algorithms mimic the spread of a contagious disease: when a node receives new data, it forwards that message to a randomly selected set of peers of limited size, known as the fan-out. MergeDB currently uses a fixed fan-out of $K=3$ to balance network load against reliable propagation.
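Peer selection for the fan-out can be sketched as a partial shuffle that picks K distinct targets. This is a std-only illustration with a hypothetical `select_gossip_targets` helper and a clock-seeded xorshift PRNG; a real implementation would more likely use the `rand` crate:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

const FAN_OUT: usize = 3; // K = 3, as used by MergeDB

// Pick up to FAN_OUT distinct peers via a partial Fisher-Yates shuffle.
// The xorshift64 PRNG seeded from the clock is a std-only stand-in
// for a proper random source.
fn select_gossip_targets(peers: &[String]) -> Vec<String> {
    let mut pool: Vec<String> = peers.to_vec();
    let mut state = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_nanos() as u64
        | 1; // ensure a nonzero seed

    let k = FAN_OUT.min(pool.len());
    for i in 0..k {
        // xorshift64 step
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        // Swap a random remaining peer into position i,
        // guaranteeing the selected targets are distinct.
        let j = i + (state as usize) % (pool.len() - i);
        pool.swap(i, j);
    }
    pool.truncate(k);
    pool
}

fn main() {
    let peers: Vec<String> = (1..=6).map(|n| format!("node-{n}")).collect();
    let targets = select_gossip_targets(&peers);
    assert_eq!(targets.len(), 3);
    println!("gossiping to {:?}", targets);
}
```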

Furthermore, the implementation uses strict state-change detection during the CRDT merge phase. A node only bumps the last_updated timestamp if a received payload actually changes its local state, which breaks the infinite feedback loops inherent in naïve gossip protocols:

async fn gossip_changes(&self, changes: tonic::Request<GossipChangesRequest>) -> Result<tonic::Response<GossipChangesResponse>, tonic::Status> {
    // into_inner() consumes the request, so extract everything from it up front
    let request = changes.into_inner();
    let key = request.key.clone();
    let remote_crdt = parse_crdt_payload(&request); // PN-Counters, AWSets, LWW-Registers

    self.store.entry(key.clone()).and_modify(|stored_value| {
        // 1. Snapshot state before merge
        let old_state = stored_value.data.clone();

        // 2. Apply CRDT merge (commutative, associative, idempotent)
        stored_value.data.merge(&mut remote_crdt.clone());

        // 3. Strict change detection to halt gossip loops
        if stored_value.data != old_state {
            println!("Merged NEW update for {}", key);
            stored_value.last_updated = SystemTime::now();
        } else {
            println!("Ignored redundant update for {}", key);
        }
    }).or_insert_with(|| StoredValue {
        data: remote_crdt,
        last_updated: SystemTime::now(),
    });

    Ok(tonic::Response::new(GossipChangesResponse { success: true }))
}

Performance Benchmarks

Continuous benchmarking is critical to ensure the merge() operations don't bottleneck the Tokio executors. MergeDB tracks serialization and reconciliation overhead using Criterion. The latest performance reports can be viewed here.
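As a rough illustration of what the benchmarks measure, here is a std-only micro-benchmark of a max-style merge over a fixed payload. `merge_u64_max` is a hypothetical stand-in for a CRDT merge (element-wise max, like G-Counter slots); the real suite uses Criterion for statistically sound measurements:

```rust
use std::time::Instant;

// Stand-in for a CRDT merge: element-wise maximum over counter slots.
fn merge_u64_max(local: &mut Vec<u64>, remote: &[u64]) {
    for (l, r) in local.iter_mut().zip(remote) {
        *l = (*l).max(*r);
    }
}

fn main() {
    let remote: Vec<u64> = (0..1024).collect();
    let mut local = vec![0u64; 1024];

    // Time repeated merges of the same remote payload into local state.
    let start = Instant::now();
    for _ in 0..10_000 {
        merge_u64_max(&mut local, &remote);
    }
    let elapsed = start.elapsed();

    // Max-merge is idempotent, so local has converged to the remote state.
    assert_eq!(local, remote);
    println!("10k merges of 1024 slots took {:?}", elapsed);
}
```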


Future Directions

  • Advanced Replication: Maturing the multi-node replication and dynamic peer discovery protocols.
  • Expanded Data Types: Implementing richer CRDT types (like Add-Wins Sets and Maps) beyond standard registers and PN-Counters.
  • Industry Benchmarks: Direct throughput and latency comparisons against Redis and DynamoDB.
  • Ecosystem: Developing native client libraries across multiple languages.
  • LSM Trees: Transitioning the storage engine from an in-memory DashMap to a disk-backed Log-Structured Merge (LSM) Tree. This will utilize custom merge operators to handle CRDT reconciliation during background compaction, allowing datasets to grow beyond RAM limits while maintaining durability.
  • BFT (Byzantine Fault Tolerance): Upgrading the trust model. Currently, CRDTs assume all nodes are honest. Implementing BFT mechanisms will prevent malicious or compromised nodes from inhibiting the proper function of MergeDB.