SyncFree: Large-scale computation without synchronisation

Tyler Crain
Topics recommended for the 2016-2017 Work Programme: 
  1. Conflict-Free Consistency for low-cost cloud computing: Traditional strong-consistency approaches cannot meet the performance, reliability and scalability needs of extreme-scale distributed applications, such as social networks or multiplayer games, with their huge quantities of frequently-changing geo-distributed shared data. Cheaper, faster, more relaxed consistency levels, in which concurrent updates are allowed, are more appropriate, but have so far been reserved for highly-specialised programmers. Our SyncFree project [1] is developing tools (conflict-free replicated data types, programming tools, and platforms) that provably ensure eventual consistency, while still being easy to program with. This is an important topic, which should be continued and expanded (e.g. via standardisation) to make this approach available to the industry at large. This will lower the cost and the entry barriers to large-scale cloud computing, opening new opportunities for European industry and SMEs.
  2. Interplay between conflict-free consistency and strong consistency: Despite the scalability advantages of weak consistency, an application occasionally needs to issue a strong update, for instance to ensure some global invariant. The correct interplay between weak consistency and occasional strongly-consistent operations is essential to the design of dependable and highly-scalable applications. This topic has been little studied so far, and raises both theoretical and practical issues.
  3. Scalable security in conflict-free distributed systems: Existing approaches to security typically assume that a policy change is immediately visible, i.e., they require strong consistency. In today’s decentralised cloud architectures, this is a brittle and non-scalable assumption. How to ensure a satisfactory level of security under eventual consistency is an important topic of study.
Project's major results:

Current approaches for ensuring data consistency in modern clouds, made up of loosely coupled, widely-distributed and heterogeneous localized datacentres, require huge investment and highly-specialized expertise, available only to a few large monopolies. SyncFree has already started to address this challenge thanks to a simple yet principled approach called Conflict-Free Replicated Data Types (CRDTs) [9]. CRDTs avoid the complexities of ad-hoc approaches, while maintaining the scalability advantage. The insight is that, by following a few simple mathematical principles, for example commutativity, distributed updates can occur without synchronisation, while still guaranteeing eventual consistency. What’s more, CRDTs ease development by encapsulating the replication and concurrency properties of common shared objects, such as sets, maps, sequences, or graphs. These data types are easily combined to form robust, scalable, powerful applications. The project partners have already demonstrated a number of CRDTs and deployed them in example applications. Together, the industrial and academic partners of SyncFree are currently formalising specifications for highly-scalable innovative applications, building programming and deployment platforms, mathematically proving that they are correct, and preparing for extreme-scale experiments on real-world crowd-sourced applications. A natural follow-up of this work will be to develop and propose standards for these data types, including libraries of open-source data structures. These standards and libraries are intended for European industry looking to benefit from the cloud, providing scalable solutions for quickly developing cloud-based applications.
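
The commutativity principle mentioned above can be illustrated with a grow-only counter, one of the simplest CRDTs. The following is only a minimal sketch under stated assumptions, not code from the SyncFree platform: the class name GCounter, its methods, and the replica identifiers are hypothetical and chosen for illustration. Each replica increments only its own slot, and merging takes the per-replica maximum, so merges can be applied in any order, and any number of times, without coordination.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Illustrative grow-only counter (hypothetical, not the SyncFree API)."""
    replica_id: str
    counts: dict = field(default_factory=dict)  # replica id -> local count

    def increment(self, amount: int = 1) -> None:
        # A replica only ever updates its own slot, so no locking is needed.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Per-slot maximum is commutative, associative and idempotent,
        # so replicas converge regardless of message order or duplication.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())


# Two replicas update concurrently, then exchange state in either order.
a = GCounter("a")
b = GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
b.merge(a)  # a duplicate merge has no further effect
```

After the exchange both replicas hold the same state and report the same value, even though neither waited for the other before applying its update.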

Potential exploitation strategy: 

The SyncFree project is advancing both the theory and practice of large-scale application architectures, and especially of CRDTs and related mechanisms. With several SyncFree partners coming from European-based enterprises that already have large user bases and feel the need for increased scalability in their applications, the project will include an extreme-scale crowd-sourced experiment, pushing the scalability needs of real-world applications. Furthermore, an open-source cloud storage platform [2], including a library of CRDTs in addition to strongly consistent abstractions, to be used in future scalable distributed applications, will be made available, leaving a lasting and beneficial impact far beyond the end of the project. Using these open-source libraries, organizations will be able to create highly scalable programs more easily, meeting the strict consistency requirements present in today’s highly connected services while improving user experience through low latency and fault tolerance. These advantages will help extend the reach of the cloud into mainstream connected applications and services and provide the platform for the creation of new and innovative cloud-based businesses.

An update since the last Concertation meeting (March 2014): 

The collaborators within the SyncFree project have been making significant progress on designing abstractions for extreme-scale applications. This work will be essential for defining future cloud standards, as these abstractions define the system on which future cloud services and applications will be developed and run. Selected results include designs for a scalable dictionary data structure [3], methods for efficiently storing and exchanging consistent data types across multiple datacentres [4], methods for providing scalable invariants [7], abstractions for increasing the scalability of online collaborative editing programs [4], and methods for verifying the correctness of scalable applications [8]. To verify these and other results that are being explored during the project, a state-of-the-art, industrially realistic cloud storage platform is being developed with the support of Basho Technologies, a world leader in scalable cloud storage technology. This platform is provided as open-source code on GitHub [2].