Experiments

Any open source project has an experimental side as part of its DNA. Infinispan goes a step further and experiments with academic and research projects.

Infinispan Ensemble

Infinispan Ensemble exposes Infinispan APIs and federates a collection of data grid clusters. The set of data grids is seen by the client as a single data grid for which Infinispan Ensemble provides the following user-defined guarantees:

geographical guarantees: a certain set of data is guaranteed to be duplicated across two geographically separated clusters.
dependability: a given set of data is guaranteed to be in at least n clusters.

A user can access the data through the set of Infinispan APIs, including the key/value APIs and the M/R API. This project opens the door to low-latency, local or near-local operations while still guaranteeing business-defined dependability or legal guarantees (territoriality constraints, for example).
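To make the two guarantees listed above concrete, here is a minimal sketch in Java. It is not the Infinispan Ensemble API: the class name, the method names and the idea of modelling replicas as a map from cluster name to region are all assumptions made for illustration. It only checks whether a given replica placement satisfies each guarantee.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration only; this is NOT the Infinispan Ensemble API.
// A placement is modelled as a map: cluster name -> region hosting that cluster.
public class EnsemblePlacementSketch {

    /** Dependability: the data is held by at least n distinct clusters. */
    public static boolean satisfiesDependability(Map<String, String> placement, int n) {
        return placement.size() >= n;
    }

    /** Geographical guarantee: the data is held in at least two distinct regions. */
    public static boolean satisfiesGeography(Map<String, String> placement) {
        Set<String> regions = new HashSet<>(placement.values());
        return regions.size() >= 2;
    }

    public static void main(String[] args) {
        // Assumed example placement: three clusters spread over two regions.
        Map<String, String> placement = new LinkedHashMap<>();
        placement.put("cluster-a", "eu-west");
        placement.put("cluster-b", "eu-west");
        placement.put("cluster-c", "us-east");

        System.out.println("at least 3 clusters: " + satisfiesDependability(placement, 3));
        System.out.println("two separate regions: " + satisfiesGeography(placement));
    }
}
```

A real federation layer would also have to choose the placement and keep it valid as clusters join or fail; the sketch covers only the validation step.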

This project is a contribution of the University of Neuchatel through its participation in the LEADS project.

You can find more information on the contribution’s GitHub repository.

Infinispan Atomic Factory

Distributed systems aggregate large numbers of heterogeneous components that are subject to failures and asynchrony. To tame this capricious nature, system designers resort to non-blocking techniques such as state machine replication. This approach provides consistent, non-blocking operations on a shared object replicated across a quorum of machines. Atomic Object Factory is an implementation of the state machine replication paradigm over Infinispan. Using the factory is as simple as using the synchronized keyword in Java: you call it with a Serializable class, and it provides the dependability, consistency and liveness guarantees of the instantiated object across multiple Infinispan servers. The factory is universal in the sense that it can instantiate an object of any (serializable) class atop an Infinispan cache, transparently making the object replicated and durable while ensuring strong consistency despite concurrent access.
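The core idea of state machine replication can be sketched in a few lines of Java. This is an in-process illustration under strong assumptions, not the Atomic Object Factory API: the class ReplicatedObjectSketch and its methods are hypothetical, the "replicas" are plain local copies, and a synchronized method stands in for the agreement protocol that orders commands across real servers (so, unlike the real technique, this sketch blocks and tolerates no failures). What it does show is the invariant: deterministic commands applied in the same order leave every replica in the same state.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Illustration only; this is NOT the Atomic Object Factory API.
public class ReplicatedObjectSketch<T extends Serializable> {

    // Each element plays the role of the object's copy on one server.
    private final List<T> replicas = new ArrayList<>();

    public ReplicatedObjectSketch(Supplier<T> constructor, int replicaCount) {
        for (int i = 0; i < replicaCount; i++) {
            replicas.add(constructor.get());
        }
    }

    /**
     * Applies one deterministic command to every replica.
     * The synchronized keyword gives a single total order of commands,
     * standing in for the ordering protocol a real deployment would need.
     */
    public synchronized void apply(Consumer<T> command) {
        for (T replica : replicas) {
            command.accept(replica);
        }
    }

    /** Reads the state of one replica; all replicas hold the same state. */
    public synchronized T read(int replicaIndex) {
        return replicas.get(replicaIndex);
    }

    public static void main(String[] args) {
        // ArrayList is Serializable, so it can serve as the shared object.
        ReplicatedObjectSketch<ArrayList<String>> shared =
                new ReplicatedObjectSketch<>(ArrayList::new, 3);

        shared.apply(list -> list.add("first"));
        shared.apply(list -> list.add("second"));

        System.out.println(shared.read(0).equals(shared.read(2)));
    }
}
```

Commands must be deterministic: a command that consulted a clock or a random source would let the replicas diverge even under a shared order.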

This project opens Infinispan to new use cases that are traditionally served by projects like Apache ZooKeeper.

This project is a contribution of the University of Neuchatel through its participation in the LEADS project.

You can find more information on the contribution’s GitHub repository.

Using Apache Gora APIs with Infinispan

Apache Gora is an abstraction API for persisting big data and executing MapReduce operations on it.

Through its LEADS project contribution, the University of Neuchatel has contributed an Apache Gora backend that persists data in Infinispan.
This is particularly useful for projects that rely on Apache Gora to persist their data and execute MapReduce jobs. The main example is Apache Nutch, a very popular and efficient web crawler. You can now store Nutch data in an Infinispan-backed data grid and have the M/R jobs executed on each node.

Where to go next?

A nice way to see Infinispan Gora in action is to follow Unicrawl's tutorial. This tutorial runs a small Infinispan cluster in which Apache Nutch's data is stored through Apache Gora, using the Avro serialization for Infinispan. Check the tutorial.

You can find the code and more information on the contribution's GitHub repository.

Serializing data via Apache Avro

Apache Avro is an efficient and compact serialization format.

Through its LEADS project contribution, the University of Neuchatel has contributed an Avro backend to store, retrieve and query Avro-defined types via the HotRod protocol.
This is particularly useful if you want to store data in this portable format and share it with Avro-consuming applications.

Where to go next?

A nice way to see the Apache Avro serialization in action is to follow Unicrawl's tutorial. This tutorial runs a small Infinispan cluster in which Apache Nutch's data is stored through Apache Gora, using the Avro serialization for Infinispan. Check the tutorial.

You can find the code and more information on the contribution's GitHub repository.