Derecho: Programming the Datacenter

Last updated on Dec 31, 2021

In-datacenter replication just got a lot faster.

Derecho is a new framework for building replicated, fault-tolerant distributed systems within a datacenter. At its core, Derecho provides a best-in-class consistent multicast abstraction, sending multi-target messages at blazing speed and in lock-step. Derecho’s object-oriented programming layer makes it easy to build any distributed application straight from a standard, single-machine approach; individual classes are automatically replicated in user-specified configurations, and a straightforward, type-safe RPC mechanism allows easy communication between replica groups.

Derecho realizes a major observation: that programming a distributed system with an eye towards protocol convergence (or strong eventual consistency) yields significant performance benefits by delaying consensus events for as long as possible. Derecho’s core protocols are written in a simple convergent programming language (watch this space), giving us constructive confidence that our protocols never diverge – which avoids much of the proof (and correctness) burden of traditional distributed system design.

This project is a major collaboration between several groups at Cornell University; the group is expanding every day, but has so far included Sagar Jha, Jonathan Behrens, Theo Gkountouvas, Mae Milano, Weijia Song, Edward Tremel, Sydney Zink, Kenneth P. Birman, Robbert Van Renesse, and the students of CS4999. For more detail, you should check the project website or read about it on Ken Birman’s blog.

Mae Milano

PhD, Programming Languages and Systems

Posts

Mae Milano

Last updated on Dec 31, 2021 3 min read

My Dissertation is Now Available!

My dissertation is now available here

Publications

Verifying a C Implementation of Derecho’s Coordination Mechanism Using VST and Coq

Derecho is a C++ framework for distributed programming leveraging high performance communication primitives such as RDMA. At its core is the shared state table (SST), a replicated data structure that supports efficient protocols for consensus and group membership. Our aim is to formalize the reasoning principles articulated by the designers, which focus on knowledge and monotonicity, as basis for highly assured high performance distributed applications. To this end we develop a high level model that exposes the SST principles in an application-friendly way. We use the model to specify and verify a re-implementation in C of the SST API. We validate the specifications by verifying simple applications that embody key parts of the Derecho protocols. The development is carried out using VST and Coq. This lays groundwork for verification of the full Derecho protocols and applications built on them.

Ramana Nagasamudram, Lennart Beringer, Ken Birman, Mae Milano, David A Naumann

NASA Formal Methods 16th International Symposium, 2024

PDF Project DOI

Monotonicity and Opportunistically-Batched Actions in Derecho

Our work centers on a programming style in which a system separates data movement from control-data exchange, streaming the former over hardware-implemented reliable channels, while using a new form of distributed shared memory to manage the latter. Protocol decisions and control actions are expressed as monotonic predicates over the control data guarding protocol actions. Provable invariants about the protocol are expressed as effectively-common knowledge, which can be derived from the monotonic predicates in effect during a particular membership epoch. The methodology enables a natural style of code that is easy to reason about, and it runs efficiently on modern hardware. We used this approach to create Derecho, an optimal Paxos-based data replication library that sets performance records, and we believe it is broadly applicable to the construction of reliable distributed systems on high-bandwidth networks.

Ken Birman, Sagar Jha, Mae Milano, Lorenzo Rosa, Weijia Song, Edward Tremel

International Symposium on Stabilizing, Safety, and Security of Distributed Systems, 2023

PDF Project Project DOI

Derecho: Fast State Machine Replication for Cloud Services

Cloud computing services often replicate data and may require ways to coordinate distributed actions. Here we present Derecho, a library for such tasks. The API provides interfaces for structuring applications into patterns of subgroups and shards, supports state machine replication within them, and includes mechanisms that assist in restart after failures. Running over 100Gbps RDMA, Derecho can send millions of events per second in each subgroup or shard and throughput peaks at 16GB/s, substantially outperforming prior solutions. Configured to run purely on TCP, Derecho is still substantially faster than comparable widely used, highly-tuned, standard tools. The key insight is that on modern hardware (including non-RDMA networks), data-intensive protocols should be built from non-blocking data-flow components.

Sagar Jha, Jonathan Behrens, Theo Gkountouvas, Mae Milano, Weijia Song, Edward Tremel, Sydney Zink, Kenneth P. Birman, Robbert Van Renesse

ACM Transactions on Computer Systems (TOCS), 2019

PDF Project DOI

Building Smart Memories and Cloud Services with Derecho

The coming generation of Internet-of-Things (IoT) applications will process massive amounts of incoming data while supporting data mining and online learning. In cases with demanding real-time requirements, such systems behave as smart memories: high-bandwidth services that capture sensor input, proceses it using machine-learning tools, replicate and store “interesting” data (discarding uninteresting content), update knowledge models, and trigger urgently-needed responses. Derecho is a high-throughput librry for building smart memories and similar services. At its core Derecho implements atomic multicast and state machine replication. Derecho’s replicated template defines a replicated type; the corresponding objects are associated with subgroups, which can be sharded into keyvalue structures. The persistent and volatile storage templates implement version vectors with optional NVM persistence. These support time-indexed access, offering lock-free snapshot isolation that blends temporal precision and causal consistency. Derecho automates application management, supporting multigroup structures and providing consistent knowledge of the current membership mapping. A query can access data from many shards or subgroups, and consistency is guaranteed without any form of distributed locking. Whereas many systems run consensus on the critical path, Derecho requires consensus only when updating membership. By leveraging an RDMA data plane and NVM storage, and adopting a novel receiver-side batching technique, Derecho can saturate a 12.5GB RDMA network, sending millions of events per second in each subgroup or shard. In a single subgroup with 2-16 members, throughput peaks at 16 GB/s for large (100MB or more) objects. When using version-vector storage, Derecho is limited by the speed of the SSD or RamDisk, showing no loss of performance as group sizes grow. While key-value subgroups would typically use 2 or 3-member shards, unsharded subgroups could be large. In tests with a 128-member group, Derecho’s multicast and Paxos protocols were just 2-3x slower than for a small group, depending on the traffic pattern. With network contention, slow members, or overlapping groups that generate concurrent traffic, Derecho’s protocols remain stable and adapt to the available bandwidth.

Sagar Jha, Jonathan Behrens, Theo Gkountouvas, Mae Milano, Weijia Song, Edward Tremel, Sydney Zink, Kenneth P. Birman, Robbert Van Renesse

2017

PDF Project

Talks

Programming Distributed Systems

Sep 21, 2023 1:40 PM — 2:20 PM

Papers We Love at Strangeloop 2023

Careful Language Design builds Better Systems

Project Project Project Video

Derecho Tutorial

Oct 27, 2019 2:00 PM — 6:00 PM

SOSP 2019

Learn about Derecho, and build a sample ML application with it!

PDF Code Project Slides

Derecho's Intelligent Object Store

Oct 27, 2019 10:00 AM — 11:30 AM

AI Systems 2019

An intelligent object store built in Derecho, optimized for AI at the IoT Edge

Code Project