Thursday, 14 April 2011

A new data grid JSR

Following up on my previous response to Antonio Goncalves' blog post, I have submitted a JSR to the JCP on a data grid standard, titled "Java Data Grids".  It has yet to be assigned a number by the JCP, but I thought I’d talk about it a little here anyway.

Here is the description of the JSR that I have submitted:

This specification proposes to provide an API for accessing, storing, and managing data in a distributed data grid. The primary API will build upon and extend the JSR-107 (JCACHE) API. In addition to its genericized Map-like API to access a Cache, JSR-107 defines SPIs for spooling in-memory data to persistent storage, an API for obtaining a named Cache from a CacheManager, and an API to register event listeners. Above and beyond JSR-107, this JSR will define characteristics and expectations for eviction, replication and distribution, and transactions (via the JTA specification). Further, it will define an asynchronous, non-blocking API as an alternative to JSR-107’s primary API, as non-blocking access to data becomes a concern when an implementation needs to perform remote calls, as in the case of a data grid. This specification builds upon JSR-107, which is not yet complete. We intend to work with the JSR-107 EG to ensure that their schedule is compatible with the schedule for this JSR. If JSR-107 is unable to complete, we propose merging the last available draft into this specification.
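
To make the building blocks in that description a little more concrete, here is a minimal, purely local sketch of a Map-like cache, a manager that hands out named caches, and an event listener. Every name in it (`SimpleCache`, `LocalCache`, `SimpleCacheManager`, `addListener`) is hypothetical and chosen for illustration; it is not the JSR-107 API, and it ignores persistence, eviction and distribution entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Small demo: obtain a named cache, register a listener, store and read a value.
class CacheDemo {
    public static void main(String[] args) {
        SimpleCacheManager manager = new SimpleCacheManager();
        SimpleCache<String, Integer> cache = manager.getCache("orders");

        List<String> events = new ArrayList<>();
        cache.addListener((key, value) -> events.add(key + "=" + value));

        cache.put("a", 1);
        System.out.println(cache.get("a")); // prints 1
        System.out.println(events);         // prints [a=1]
    }
}

// Hypothetical Map-like cache contract (NOT the JSR-107 interface).
interface SimpleCache<K, V> {
    V get(K key);
    V put(K key, V value);                      // returns the previous value, or null
    V remove(K key);
    void addListener(BiConsumer<K, V> onPut);   // called after every put
}

// In-memory implementation backed by a ConcurrentHashMap.
final class LocalCache<K, V> implements SimpleCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final List<BiConsumer<K, V>> listeners = new CopyOnWriteArrayList<>();

    public V get(K key) { return store.get(key); }

    public V put(K key, V value) {
        V previous = store.put(key, value);
        for (BiConsumer<K, V> listener : listeners) {
            listener.accept(key, value);
        }
        return previous;
    }

    public V remove(K key) { return store.remove(key); }

    public void addListener(BiConsumer<K, V> onPut) { listeners.add(onPut); }
}

// Hypothetical manager: the same name always yields the same cache instance.
final class SimpleCacheManager {
    private final Map<String, SimpleCache<?, ?>> caches = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    <K, V> SimpleCache<K, V> getCache(String name) {
        return (SimpleCache<K, V>) caches.computeIfAbsent(
                name, n -> new LocalCache<Object, Object>());
    }
}
```

The unchecked cast in `getCache` is a shortcut for the sketch; a real API would need a type-safe way to describe a cache's key and value types.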

Data grids are gaining prominence and importance in enterprise Java, particularly as cloud-style deployments gain popularity:

  • Characteristics such as high availability, along with the removal of single points of failure, become increasingly important, since cloud infrastructure is inherently unreliable and can be re-provisioned with minimal notice; applications deployed on the cloud need to be resilient to this.

  • Further, one of the major benefits of cloud-style deployments is elasticity: the ability to scale out (and back in) quickly and easily.  Again, data grids have a role to play here.

  • Finally, with scalable middleware comes additional stress on the data tier (traditionally an RDBMS), as middleware nodes scale out to cope with load.  Data grids - used as a distributed cache - can help with mitigating database bottlenecks.

With one of Java EE 7’s stated goals being "cloud-friendliness", the above are powerful arguments for the inclusion of a distributed data grid standard in Java EE 7.

What about JSR-107?  JSR-107 - the temporary caching API proposed in 2001 - certainly has a role to play in Java EE too.  Temporary caches are an important part of enterprise middleware, yet a standard has been sadly missing from the Java EE umbrella specification for far too long.  Spring, having identified the same need, has a temporary caching abstraction in its current development versions.  Several non-Java frameworks define temporary caching APIs too (Ruby on Rails, Django for Python, .NET).  There is no denying that JSR-107 is necessary, and necessary as a part of Java EE.

But JSR-107 isn’t a data grid.  It falls short as a standard for data grids, specifically because it doesn’t take into account the characteristics of data distribution and replication, and doesn’t define a contract that implementations would have to adhere to when moving data around a cluster.  These things are crucial for a data grid and, if they are not baked into a specification, portability will be hindered and the standard itself rendered useless and impotent.

Further, with remote capabilities in mind, a data grid should also expose a non-blocking API, since network calls can be a limiting factor.  It should be possible to invoke methods that involve remote calls asynchronously.  None of this is relevant to a temporary caching API like JSR-107.
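
As an illustration of what such a non-blocking API might look like, here is a minimal sketch. The names `AsyncCache`, `SimulatedRemoteCache`, `getAsync` and `putAsync` are hypothetical and not taken from any specification; the "remote" store is just a local map served by a single background thread, and the sketch assumes a Java version that provides `CompletableFuture` (Java 8 or later).

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Small demo: chain a write and a read without blocking between them.
class AsyncCacheDemo {
    public static void main(String[] args) {
        try (SimulatedRemoteCache<String, String> cache = new SimulatedRemoteCache<>()) {
            CompletableFuture<String> result = cache
                    .putAsync("user:1", "alice")
                    .thenCompose(previous -> cache.getAsync("user:1"));

            // The caller only blocks here, when it actually needs the value.
            System.out.println(result.join()); // prints alice
        }
    }
}

// Hypothetical non-blocking counterpart to a Map-like cache API.
interface AsyncCache<K, V> {
    CompletableFuture<V> getAsync(K key);
    CompletableFuture<V> putAsync(K key, V value); // completes with the previous value, or null
}

// Stand-in for a remote node: each call runs on a background thread,
// so the caller gets a future back immediately instead of waiting.
final class SimulatedRemoteCache<K, V> implements AsyncCache<K, V>, AutoCloseable {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    // A single thread keeps operations in submission order for this sketch.
    private final ExecutorService network = Executors.newSingleThreadExecutor();

    public CompletableFuture<V> getAsync(K key) {
        return CompletableFuture.supplyAsync(() -> store.get(key), network);
    }

    public CompletableFuture<V> putAsync(K key, V value) {
        return CompletableFuture.supplyAsync(() -> store.put(key, value), network);
    }

    @Override
    public void close() {
        network.shutdown();
    }
}
```

Returning a future from each call lets the application overlap several remote operations, or carry on with other work, and pay the network latency only at the point where a result is actually needed.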

So with all that in mind, I’d love to hear your thoughts on the data grid JSR.  In addition to Red Hat, the JSR is currently backed by a major Java EE and data grid vendor which cannot be named at this stage, along with independent JCP members with relevant interest and background.

Cheers,
Manik

Posted by Manik Surtani on 2011-04-14
Tags: jcp data grids jsr 107 standards
