A Datalog Framework for Conflict-Free Replicated Data Types

2026-05-29Distributed, Parallel, and Cluster Computing

Distributed, Parallel, and Cluster ComputingDatabasesLogic in Computer ScienceProgramming Languages
AI summary

The authors present a new way to describe and analyze how distributed data structures called CRDTs work when many users edit data at the same time without conflicts. They use a logical programming language called Datalog to clearly model how these data structures handle multiple changes happening together. This approach helps them test and compare different CRDT implementations automatically. They show how their method works by looking at an example where people collaboratively edit a graph, checking that it works correctly and scales well as more edits and users are involved.

conflict-free replicated data types (CRDTs)distributed systemslocal-first collaborationconcurrent updatesDatalogdeclarative programmingproperty-based testingdata replicationcollaborative editingscalability
Authors
Elena Yanakieva, Annette Bieniusa, Stefania Dumbrava
Abstract
Distributed applications increasingly support local-first collaboration over shared data, allowing multiple users to perform updates concurrently without global coordination. Such collaboration requires careful design to capture the intended semantics of the concurrent interactions. We introduce a declarative framework for specifying and reasoning about the semantics of conflict-free replicated data types (CRDTs) and CRDT-based applications in Datalog. The framework models CRDT semantics as executable logic programs over operation contexts, making concurrency explicit and compositional, and thus amenable to automated analysis. As one application, we use property-based testing to compare implementations. To the best of our knowledge, this is the first work to systematically use Datalog as a foundation for prototyping and analyzing complex CRDTs and their compositions. We evaluate our methodology using a collaborative graph data editing case study and report experimentation results assessing correctness validation and scalability with an increasing number of operations and replicas.