Friday, 20 October 2017

Cache Store Batch Operations

Infinispan 9.1.x introduces batch write and delete operations for cache stores. The introduction of batching should greatly improve performance when utilising write-behind cache stores, using putAll operations and committing transactions in non-transactional stores.

CacheWriter Interface Additions

The CacheWriter interface has been extended so that it exposes two additional methods: deleteBatch and writeBatch. For the sake of backwards compatibility a default implementation of these methods is provided, however if your cache store is able to utilise batching we strongly recommend you create your own implementations. The additional methods and docs are show below:

Updated Stores

Currently the JDBC, JPA, RocksDB and Remote stores have all been modified to take advantage of these latest changes.

Configuration Changes

As each store implementations has different batching capabilities, it was necessary to introduce a max-batch-size attribute to the AbstractStoreConfiguration. This attribute defines the maximum number of entries that should be included in a single batch operation to the store. If a value less than one is provided, then the underlying store implementation should not place a upper limit on the number of entries in a batch.

Deprecated Attributes

Both TableManipulationConfiguration#batchSize and JpaStoreConfiguration#batchSize have been deprecated, as they serve the same purpose as AbstractStoreConfiguration#maxBatchSize.

Store Benchmark

To measure the impact of batch writes on Cache.putAll, we created a simple benchmark to compare the performance of Infinispan 9.1.1.Final (with batching) and 9.0.3.Final (without). The benchmark consisted of 20 threads inserting 100000 cache entries as fast as possible into a cache via putAll; with each putAll operation containing 20 cache entries and the max-batch-size of each store being set to 20. The table below shows the average time taken for each store type after the benchmark was executed three times.

Store Type	9.0.3.Final	9.1.1-Final	Latency Decrease
JdbcStringBasedStore	29368ms	2597ms	91.12%
JPAStore	30798ms	16640ms	45.97%
RocksDBStore	1164ms	209ms	82.04%

Store Type

9.0.3.Final

9.1.1-Final

Latency Decrease

JdbcStringBasedStore

29368ms

2597ms

91.12%

JPAStore

30798ms

16640ms

45.97%

RocksDBStore

1164ms

209ms

82.04%

The benchmark results above clearly show that performance is increased dramatically when utilising batch updates at the store level.

Conclusions

Infinispan 9.1.x introduces batching capabilities to the CacheWriter interface in order to improve performance. If you currently utilise a custom cache store, we strongly recommend that you provide your own implementation of the delete and write batch methods.

If you have any feedback on the CacheWriter changes, or would like to request some new features/optimisations, let us know via the forum, issue tracker or the #infinispan channel onhttp://webchat.freenode.net/?channels=%23infinispan[ Freenode].

Posted by Ryan Emerson on 2017-10-20
Tags: jdbc rocksdb jpa leveldb cache store

Thursday, 08 December 2016

Meet Ickle!

As you’ve already learned from an earlier post this week, Infinispan 9 is on its final approach to landing and is bringing a new query language. Hurray! But wait, was there something wrong with the old one(s)? Not wrong really … I’ll explain.

Infinispan is a data grid of several query languages. Historically, it has offered search support early in its existence by integrating with Hibernate Search which provides a powerful Java-based DSL enabling you to build Lucene queries and run them on top of your Java domain model living in the data grid. Usage of this integration is confined to embedded mode, but that still succeeds in making Java users happy.

While the Hibernate Search combination is neat and very appealing to Java users it completely leaves non-JVM languages accessing Infinispan via remote protocols out in the cold.

Enter Remote Query. Infinispan 6.0 starts to address the need of searching the grid remotely via Hot Rod. The internals are still built on top of Lucene and Hibernate Search bedrock but these technologies are now hidden behind a new query API, the QueryBuilder, an internal DSL resembling JPA criteria query. The QueryBuilder has implementations for both embedded mode and Hot Rod. This new API provides all relational operators you can think of, but no full-text search initially, we planned to add that later.

Creating a new internal DSL was fun. However, having a long term strategy for evolving it while keeping complete backward compatibility and also doing so uniformly across implementations in multiple languages proved to be a difficult challenge. So while we were contemplating adding new full-text operators to this DSL we decided on making a long leap forward and adopt a more flexible alternative by having our own string based query language instead, another DSL really, albeit an external one this time.

So after the long ado, let me introduce Ickle, Infinispan’s new query language, conspicuously resembling JP-QL.

Ickle:

is a light and small subset of JP-QL, hence the lovely name
queries Java classes and supports Protocol Buffers too
queries can target a single entity type
queries can filter on properties of embedded objects too, including collections
supports projections, aggregations, sorting, named parameters
supports indexed and non-indexed execution
supports complex boolean expressions
does not support computations in expressions (eg. user.age > sqrt(user.shoeSize + 3) is not allowed but user.age >= 18 is fine)
does not support joins
but, navigations along embedded entities are implicit joins and are allowed
joining on embedded collections is allowed
other join types not supported
subqueries are not supported
besides the normal relational operators it offers full-text operators, similar to Lucene’s query parser
is now supported across various Infinispan APIs, wherever a Query produced by the QueryBuilder is accepted (even for continuous queries or in event filters for listeners!)

That is to say we squeezed JP-QL to the bare minimum and added full-text predicates that closely follow the syntax of Lucene’s query parser.

If you are familiar with JPA/JP-QL then the following example will speak for itself:

select accountId, sum(amount) from com.acme.Transaction
    where amount < 20.0
    group by accountId
    having sum(amount) > 1000.0
    order by accountId

The same query can be written using the QueryBuilder:

Query query = queryFactory.from(Transaction.class)
.select(Expression.property("accountId"), Expression.sum("amount"))
.having("amount").lt(20.0)
.groupBy("accountId")
.having(Expression.sum("amount")).gt(1000.0)
.orderBy("accountId").build();

Both examples look nice but I hope you will agree the first one is better.

Ickle supports several new predicates for full-text matching that the QueryBuilder is missing. These predicates use the : operator that you are probably familiar from Lucene’s own query language. This example demonstrates a simple full-text term query:

select transactionId, amount, description from com.acme.Transaction
where amount > 10 and description : "coffee"

As you can see, relational predicates and full-text predicates can be combined with boolean operators at will.

The only important thing to remark here is relational predicates are applicable to non-analyzed fields while full-text predicates can be applied to analyzed field only. How does indexing work, what is analysis and how do I turn it on/off for my fields? That’s the topic of a future post, so please be patient or start readinghttps://docs.jboss.org/hibernate/search/5.6/reference/en-US/html_single/#_analysis[ here].

Besides term queries we support several more:

Term description : "coffee"
Fuzzy description : "cofee"~2
Range amount : [40 to 90}`
Phrase description : "hello world"
Proximity description : "canceling fee"~3
Wildcard description : "te?t"
Regexp description : /[mb]oat/
Boosting description : "beer"^3 and description :"books"

You can read all about them starting from here.

But is Ickle really new? Not really. The name is new, the full-text features are new, but a JP-QL-ish query string was always internally present in the Query objects produced by the QueryBuilder since the beginning of Remote Query. That language was never exposed and specified until now. It evolved significantly over time and now it is ready for you to use it. The QueryBuilder / criteria-like API is still there as a convenience but it might go out of favor over time. It will be limited to non-full-text functionality only. As Ickle grows we’ll probably not be able to include some of the additions in the QueryBuilder in a backward compatible manner. If growing will cause too much pain we might consider deprecating it in favor of Ickle or if there is serious demand for it we might continue to evolve the QueryBuilder in a non compatible manner.

Being a string based query language, Ickle is very convenient for our REST endpoint, the CLI, and the administration console allowing you to quickly inspect the contents of the grid. You’ll be able to use it there pretty soon. We’ll also continue to expand Ickle with more advanced full-text features like spatial queries and faceting, but that’s a subject for another major version. Until then, why not grab the current 9.0 Beta1 and test drive the new query language yourself? We’d love to hear your feedback on the forum, on our issue tracker or on IRC on the #infinispan channel on Freenode.

Happy coding!

Posted by Unknown on 2016-12-08
Tags: JP-QL Hibernate-Search jpa lucene full-text indexing language query DSL

Thursday, 01 October 2015

Hibernate Second Level Cache improvements

Infinispan has been implementing Hibernate Second Level Cache for a long time, replacing the previous JBoss Cache implementation with very similar logic. The main aim of the implementation has always been to have very fast reads, keeping the overhead of cache during reads on minimum. This was achieved using local reads in invalidation-mode cache and Infinispan’s putForExternalRead operation, where the request to cache never blocks.

Recently we’ve looked on the implementation again to see whether we can speed it up even more. For a long time you could use only transactional caches to keep the cache in sync with database. However transactions come at some cost so we thought about a way to get around it. And we have found it, through custom interceptors we have managed to do two-phase updates to the cache and now the non-transactional caches are the default configuration. So, if you’re using Hibernate with your own configuration, don’t forget to update that when migrating to Hibernate ORM 5!

With transactions gone, our task was not over. So far entity/collection caching has been implemented for invalidation mode caches, but it’s tempting to consider replication mode, too. For replicated caches, we got rid of a special cache for pending puts (this local cache detects out-of-date reads, keeping the entity cache consistent). Instead, we used different technique where a logical removal from the cache is substituted by replace with a token called tombstone, and updates pre-invalidate the cache in a similar way. This change opened the possibility for non-transactional replicated and distributed caches (transactional mode is not supported). We were pleased to see the results of some benchmark where the high hit ratio in replicated caches has dramatically speeded up all operations.

There is one downside of the current implementation - in replication mode, you should not use eviction, as eviction cannot tell regular entity (which can be evicted) from the tombstone. If tombstone was evicted, there’s a risk of inconsistent reads. So when using replicated caches, you should rely on expiration to keep your cache slender. We hope that eventually we’ll remove this limitation.

All modes described above give us cache without any stale reads. That comes at a cost - each modification (insert, update or removal) requires 2 accesses to the cache (though, sometimes the second access can be asynchronous). Some applications do not require such strict consistency - and that’s where nonstrict-read-write comes to the scene. Here we guarantee that the cache will provide the same result as DB after the modifying transaction commits - between DB commit and transaction commit a stale value can be provided. If you use asynchronous cache, this may be delayed even more but unless the operation fails (e.g. due to locking timeout) the cache will eventually get into a state consistent with DB. This allows us to limit modifications to single cache access per modification.

Note that nonstrict-read-write mode is supported only for versioned entities/collections (that way we can find out which entity is actually newer). Also, you cannot use eviction in nonstrict-read-write mode, for the same reason as in tombstone-based modes. Invalidation cache mode is not supported neither.

If you’ll try out the most recent Hibernate ORM, you’ll find out that Infinispan 7.2.x is used there. This is because ORM 5.0.0.Final was released before Infinispan 8.0.0.Final went out and we can’t change the major version of dependency in micro-release. However, we try to keep Infinispan 8.0.x binary compatible (in parts used by Hibernate), and therefore you can just replace the dependencies on classpath and use the most recent Infinispan, if you prefer to do so.

To sum things up, here is the table of supported configurations:

Concurrency strategy

Cache transactions

Cache mode

Implementation

Eviction

transactional

invalidation

pending puts

yes

read-write

non-transactional

replicated/distributed

tombstones

nonstrict-read-write

versioned entries

There’s also the read-only mode - this can be used instead of both transactional or read-write modes, but at this point it does not offer any further performance gains, since we have to make sure that you don’t see a removed value. Actually, it also does not matter whether you specify transactional or read-write mode; the proper strategy will be picked according to your cache configuration (transactional vs. non-transactional).

We hope that you’ll try these new modes and many consistency fixes included along (you should use Hibernate ORM 5.0.2.Final or later), and tell us about your experience.

Happy caching!

Posted by Unknown on 2015-10-01
Tags: jpa hibernate second level cache provider

Thursday, 30 May 2013

Infinispan 5.3.0.CR1 is out!

Besides a handful of fixes, this release contains two very important contributions:

a mongoDB cache store which allows using Infinispan as a cache on top of a mongodb instance. Courtesy of Guillaume Scheibel
a JPA based cache store that allows an easy setup for Infinispan as a cache in front of a database. Courtesy of Ray Tsang

Please stay tuned for blogs detailing these features.

For a complete list of features included in this release refer to the release notes.

Visit our downloads section to find the latest release and if you have any questions please check our forums, our mailing lists or ping us directly on IRC.

Cheers,

Mircea

Posted by Mircea Markus on 2013-05-30
Tags: cachestore jpa mongodb loader release candidate

Friday, 17 June 2011

So you want JPA-like access to Infinispan?

Back in the early days of Infinispan (since our first public announcement, in fact) we always had it in mind to expose a JPA-like layer to Infinispan. Initially this was as a replacement to the fine-grained replication that JBoss Cache's POJO Cache variant offered, but it grew beyond just a technique to do fine-grained replication on complex object graphs. The fact that it offered a familiar data storage API to Java developers was big. Huge, in fact.

So we realised JPA-on-Infinispan was firmly on the roadmap. The original plan was to implement the entire set of JPA APIs from scratch, but this was a daunting and Herculean task. After much discussion with core Hibernate architects and Infinispan contributors Emmanuel Bernard and Sanne Grinovero, we came to a decision that rather than implementing all this from scratch, it served both Infinispan and the community better to fork Hibernate’s core ORM engine, and replace the relational database mappings with key/value store mappings. And we get to reuse the mature codebase of Hibernate’s session and transaction management, object graph dehydration code, proxies, etc.

And Hibernate OGM (Object-Grid Mapping) was born. After initial experiments and even a large-scale public demo at the JBoss World 2011 Keynote, Emmanuel has officially blogged about the launch of Hibernate OGM. Very exciting times, Infinispan now has a JPA-like layer. :-)

To reiterate a key point from Emmanuel’s blog, Hibernate OGM is still in its infancy. It needs community participation to help it grow up and mature. This is where the Infinispan community should step in; consider Hibernate OGM as Infinispan’s JPA-like layer and get involved. For more details, please read Emmanuel’s announcement.

Enjoy! Manik

Posted by Manik Surtani on 2011-06-17
Tags: hibernate ogm jpa hibernate API

Tuesday, 07 June 2011

Faster Infinispan-based Second Level Cache coming to a JPA app near you!

Starting with Hibernate 4.0.0.Beta1, Infinispan second level cache interacts with the transaction manager as a JTA Synchronization instead of an XA resource. Based on some testing we’ve done in JBoss AS 7, this has resulted in a huge performance increase thanks to the optimisations the transaction manager can apply to Synchronizations, which work very well when Infinispan is used as a cache rather than as a authoritative data store.

From an Infinispan configuration perspective, nothing needs changing. The Infinispan provider still uses the same base configuration by default. Transactional configuration happens within the cache provider itself and it’s here where the Infinispan is configured with Hibernate’s transaction manager and where Infinispan is configured to participate as a JTA synchronization. This is the default configuration, so from an user perspective, there’s nothing you have to do to take advantage of this new change.

However, you can always switch back to previous behaviour where Infinispan interacted as an XA resource via a dedicated Hibernate property called hibernate.cache.infinispan.use_synchronization

_ _

By default this property is set to true. If you set it false, Infinispan will interact with the transaction manager as an XA resource.

For more detailed information, check the "http://community.jboss.org/docs/DOC-14105[Using Infinispan As JPA Hibernate Second Level Cache Provider]" wiki.

Cheers,

Galder

Posted by Galder Zamarreño on 2011-06-07
Tags: transactions jpa synchronization hibernate second level cache provider