LEADS - Large-Scale Elastic Architecture for Data as a Service
Only the biggest information technology players, e.g., Google or Amazon, have access to the infrastructure needed to store and process the data available on the Web. Small and medium enterprises (SMEs) have no choice but to rely on such companies, with their dedicated data centers, to provide them with the resources needed to store and mine public data.
The LEADS project aims to develop a set of solutions that change this state of the Cloud business. To that end, LEADS provides a novel Data-as-a-Service framework that makes storing, processing, and querying public data available to businesses and organizations of almost any size in an elastic, low-cost, and energy-efficient manner.
Open Source software available on
LEADS fosters the development of data-driven companies. To that end, the LEADS platform provides a long chain of promising value-added services.
- The framework allows SMEs to aggregate their storage and processing capabilities.
- SMEs can collaboratively crawl the web, resulting in an index similar to those of commercial and non-commercial search engines.
- On top of this index, either on their own infrastructure or using the aggregated resources available in LEADS, SMEs can process public data and extract meaningful information.
- Large-scale analyses of the collected data can run online, concurrently with data gathering and in a stream-oriented manner, as well as offline, using classical Big Data processing platforms (e.g., MapReduce).
- The LEADS platform stores both the results of the collaborative effort to crawl public data and the outputs of private computations.
- SMEs may re-sell their private results to others using the LEADS Data-as-a-Service framework as a marketplace.
- Companies and organizations may also use this marketplace to buy business-relevant data from, and sell it to, other entities.
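
The difference between the online (stream-oriented) and offline (map-reduce style) analyses mentioned in the list above can be illustrated with a minimal sketch. It is not the LEADS API: the sample pages, the function names and the `StreamingWordCount` class are all hypothetical, and plain Python stands in for a real processing platform. Both modes compute the same word counts over a few crawled pages. The offline version processes the complete collection, while the online version updates its result as each page arrives.

```python
from collections import Counter
from functools import reduce

# Hypothetical crawled pages (illustrative data, not taken from LEADS).
PAGES = [
    "open data for small companies",
    "small companies mine public data",
    "public data as a service",
]


def map_page(page):
    """Map step: turn one page into its own word counts."""
    return Counter(page.split())


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts."""
    return left + right


def batch_word_count(pages):
    """Offline analysis: process the complete collection at once."""
    return reduce(reduce_counts, map(map_page, pages), Counter())


class StreamingWordCount:
    """Online analysis: keep a running result while pages are gathered."""

    def __init__(self):
        self.counts = Counter()

    def on_page(self, page):
        # Update the running counts with the newly arrived page.
        self.counts.update(page.split())
        return self.counts


offline = batch_word_count(PAGES)

online = StreamingWordCount()
for page in PAGES:  # pages arrive one by one, as a crawler would deliver them
    online.on_page(page)

print(offline == online.counts)
```

In this sketch the online result is usable after every page, whereas the offline result only exists once the whole collection has been processed. Once all pages have been seen, the two results are identical.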