The coming generation of Internet-of-Things (IoT) applications will process massive amounts of incoming data while supporting data mining and online learning. In cases with demanding real-time requirements, such systems behave as smart memories: high-bandwidth services that capture sensor input, proceses it using machine-learning tools, replicate and store “interesting” data (discarding uninteresting content), update knowledge models, and trigger urgently-needed responses. Derecho is a high-throughput librry for building smart memories and similar services. At its core Derecho implements atomic multicast and state machine replication. Derecho’s replicated template defines a replicated type; the corresponding objects are associated with subgroups, which can be sharded into keyvalue structures. The persistent and volatile storage templates implement version vectors with optional NVM persistence. These support time-indexed access, offering lock-free snapshot isolation that blends temporal precision and causal consistency. Derecho automates application management, supporting multigroup structures and providing consistent knowledge of the current membership mapping. A query can access data from many shards or subgroups, and consistency is guaranteed without any form of distributed locking. Whereas many systems run consensus on the critical path, Derecho requires consensus only when updating membership. By leveraging an RDMA data plane and NVM storage, and adopting a novel receiver-side batching technique, Derecho can saturate a 12.5GB RDMA network, sending millions of events per second in each subgroup or shard. In a single subgroup with 2-16 members, throughput peaks at 16 GB/s for large (100MB or more) objects. When using version-vector storage, Derecho is limited by the speed of the SSD or RamDisk, showing no loss of performance as group sizes grow. While key-value subgroups would typically use 2 or 3-member shards, unsharded subgroups could be large. In tests with a 128-member group, Derecho’s multicast and Paxos protocols were just 2-3x slower than for a small group, depending on the traffic pattern. With network contention, slow members, or overlapping groups that generate concurrent traffic, Derecho’s protocols remain stable and adapt to the available bandwidth.
Sagar Jha, Jonathan Behrens, Theo Gkountouvas, Mae Milano, Weijia Song, Edward Tremel, Sydney Zink, Kenneth P. Birman, Robbert Van Renesse
2017