
Let's build a Distributed K:V Store in Rust
Agenda

Rust 1.0 is being released tomorrow. While I'm not a contributor or even a power user, I've been toying with it long enough for my oldest code to be hopelessly out of date, and I'm knowledgeable enough to fumble through dangerous data structures. I've been interested in Rust for a few years, and have been closely monitoring it for about a year. It's very exciting to see the 1.0 release come to fruition.

I think Rust is a fascinating language, and I want to learn it better. For better or worse, I'm someone who needs a “real” project to learn a new subject. To that end, I'm embarking on a new learning project: a distributed key:value store written in Rust. This will be a moderately complex project that touches several components that have been on my to-do list.

Piccolo

The goal of Piccolo will be very simple – create a functioning (in some sense of the word) distributed key:value data store. Along the way, I intend to write articles about code, design decisions, mistakes and general thoughts on building a non-trivial system in Rust.

Components

  • A Mio event loop will spin on a dedicated network IO thread. Connections are hashed to dedicated SPSC queues, such that the same connection always goes to the same queue. Incoming data is placed in the queue assigned to its connection.
  • A dedicated worker thread will be pinned to each CPU core, draining its assigned SPSC work queue. Each worker will maintain a local hashmap of values (aka the “key:value” part of the project)
  • Workers can send data back to the Mio event loop via the notify queue
  • All node-to-node communication will be done over Cap'n Proto
  • An external “Cerializer” process will provide optional JSON<-->Cap'n Proto serialization so that REST clients can interact with the cluster
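
To make the connection-to-queue routing above a bit more concrete, here is a minimal sketch using only the standard library. It is not Piccolo's actual code: `ConnId`, `Message`, `queue_for`, `spawn_workers` and `dispatch` are hypothetical names, `std::sync::mpsc` channels stand in for real SPSC queues, and there is no Mio event loop or CPU pinning here. It only illustrates that hashing a connection id pins that connection to one worker.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical connection identifier; a real event loop would supply its own token.
type ConnId = u64;

/// Hypothetical unit of work handed from the IO thread to a worker.
struct Message {
    conn: ConnId,
    payload: Vec<u8>,
}

/// Hash a connection id to a queue index. The same id always yields the same index.
fn queue_for(conn: ConnId, workers: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    conn.hash(&mut hasher);
    (hasher.finish() % workers as u64) as usize
}

/// Spawn one worker per queue. Each worker drains its own queue and returns
/// the number of payload bytes it saw per connection once its queue is closed.
fn spawn_workers(workers: usize) -> (Vec<Sender<Message>>, Vec<JoinHandle<HashMap<ConnId, usize>>>) {
    let mut senders = Vec::with_capacity(workers);
    let mut handles = Vec::with_capacity(workers);
    for _ in 0..workers {
        // Assumption: an mpsc channel used by a single sender stands in for an SPSC queue.
        let (tx, rx) = channel::<Message>();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            let mut bytes_seen: HashMap<ConnId, usize> = HashMap::new();
            // recv() returns Err once the sender has been dropped, which ends the loop.
            while let Ok(msg) = rx.recv() {
                *bytes_seen.entry(msg.conn).or_insert(0) += msg.payload.len();
            }
            bytes_seen
        }));
    }
    (senders, handles)
}

/// Place incoming data on the queue owned by the connection's worker.
fn dispatch(senders: &[Sender<Message>], conn: ConnId, payload: Vec<u8>) -> usize {
    let idx = queue_for(conn, senders.len());
    senders[idx]
        .send(Message { conn, payload })
        .expect("worker thread hung up");
    idx
}
```

Because only the IO thread ever sends on a given channel, each worker can own its data without locks, which is the appeal of the one-queue-per-worker layout.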

Goals

  • Provide a functional key:value store that can insert, read, modify, delete values
  • Provide some basic form of clustering or leader election. Might just be static lists
  • Be moderately high-performance
  • Attempt to use only stable features (no guarantees here)
  • Document the process, design decisions and thought processes. I want this to be a good learning aid, even if it's just a url to point at and say “Don't do that.”
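
As a rough illustration of the first goal, the four operations could be modelled as commands applied to a worker's local hashmap. This is only a sketch under my own assumptions: the `Command`, `Reply` and `Store` names are hypothetical, and so are the semantics (for example, that an insert on an existing key is rejected rather than overwriting it).

```rust
use std::collections::HashMap;

/// Hypothetical request types for the four operations named in the goals.
#[derive(Debug, PartialEq)]
enum Command {
    Insert { key: String, value: Vec<u8> },
    Read { key: String },
    Modify { key: String, value: Vec<u8> },
    Delete { key: String },
}

/// Hypothetical responses a worker could send back to the event loop.
#[derive(Debug, PartialEq)]
enum Reply {
    Ok,
    Value(Vec<u8>),
    NotFound,
    AlreadyExists,
}

/// A worker-local store: just a hashmap, no persistence or replication.
#[derive(Default)]
struct Store {
    values: HashMap<String, Vec<u8>>,
}

impl Store {
    fn apply(&mut self, cmd: Command) -> Reply {
        match cmd {
            // Assumption: insert refuses to overwrite; modify is the update path.
            Command::Insert { key, value } => {
                if self.values.contains_key(&key) {
                    Reply::AlreadyExists
                } else {
                    self.values.insert(key, value);
                    Reply::Ok
                }
            }
            Command::Read { key } => match self.values.get(&key) {
                Some(value) => Reply::Value(value.clone()),
                None => Reply::NotFound,
            },
            Command::Modify { key, value } => match self.values.get_mut(&key) {
                Some(slot) => {
                    *slot = value;
                    Reply::Ok
                }
                None => Reply::NotFound,
            },
            Command::Delete { key } => match self.values.remove(&key) {
                Some(_) => Reply::Ok,
                None => Reply::NotFound,
            },
        }
    }
}
```

Keeping the operations as plain data means the same `Command` values could later be decoded from whatever wire format the nodes speak, without the store itself knowing about the network.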

Non-goals

  • Build a datastore fit for production.
  • Care about the CAP theorem. To Piccolo, Jepsen is your weird uncle that wears a kilt.

So yeah, this should be fun :)
