Design a key-value store

Question

Single server key-value store

Distributed key-value store

Data partition

Automatic scaling

Heterogeneity

Data replication

Consistency (Quorum consensus)

Inconsistency resolution: versioning

Handling failures

Failure detection

gossip protocol

Handling temporary failures

Handling permanent failure

Use Merkel Tree to identify which bucket (set of key-value pair) are not synchronised and synchronise those buckets only.

System architecture diagram

CleanShot 2024-10-16 at 16.44.23@2x.png

CleanShot 2024-10-16 at 16.45.24@2x.png

Write and read path

CleanShot 2024-10-16 at 16.46.28@2x.png

  1. The write request is persisted on a commit log file.
  2. Data is saved in the memory cache.
  3. When the memory cache is full or reaches a predefined threshold, data is flushed to SSTable [9] on disk. Note: A sorted-string table (SSTable) is a sorted list of <key, value> pairs. ”

CleanShot 2024-10-16 at 16.47.07@2x.png

  1. The system first checks if data is in memory. If not, go to step2
  2. If data is not in memory, the system checks the bloom filter.
  3. The bloom filter is used to figure out which SSTables might contain the key.
  4. SSTables return the result of the data set.
  5. The result of the data set is returned to the client.