Monday, 12 March 2012
Doug Lea and the folks on the concurrency-interest group have been hard at work on an update of JSR 166 (concurrency utilities) for Java 8, called JSR 166e. The update includes some pretty impressive changes in a building block we’ve all come to rely on pretty heavily: the ConcurrentHashMap.
One of the big drawbacks of the ConcurrentHashMap, ever since it was introduced in Java 5, has been memory footprint. It is kinda bulky, especially when compared to a regular HashMap - about 1.6 KB in memory versus about 100 bytes! Yes, those figures are for empty maps.
In Java 8, one of the improvements in the ConcurrentHashMap has been memory footprint - now closer to a regular HashMap. In addition to that, the new Java 8 CHM performs better under concurrent load when compared to its original form. See this discussion and comments in the proposed ConcurrentHashMapV8 sources for more details.
So, Infinispan makes pretty heavy use of ConcurrentHashMaps internally. One change in the recently released Infinispan 5.1.2.Final is that these internal CHMs are now built using a factory, and we’ve included a backport of the Java 8 CHM in Infinispan. By default, Infinispan uses the JDK’s provided CHM, but if you wish to force Infinispan to use the backported Java 8 CHM, all you need to do is include the following JVM parameter when you start Infinispan:
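The factory indirection described above can be sketched roughly as follows; the class name, method names and the system property here are made up for illustration and are not Infinispan’s actual ones:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of a map factory switched by a system property;
// Infinispan's real factory class and flag name differ.
public class ConcurrentMapFactory {

    public static <K, V> ConcurrentMap<K, V> makeConcurrentMap() {
        // If the (made-up) flag is set, a backported implementation
        // could be returned here instead of the JDK's CHM.
        if (Boolean.getBoolean("example.use.backported.chm")) {
            return makeBackportedMap();
        }
        return new ConcurrentHashMap<K, V>();
    }

    private static <K, V> ConcurrentMap<K, V> makeBackportedMap() {
        // Stand-in: the real code would instantiate the bundled
        // ConcurrentHashMapV8 backport.
        return new ConcurrentHashMap<K, V>();
    }

    public static void main(String[] args) {
        Map<String, Integer> m = makeConcurrentMap();
        m.put("answer", 42);
        System.out.println(m.get("answer"));
    }
}
```

Centralising map creation this way means the implementation can be swapped process-wide without touching any call sites.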
We’d love to hear what you have to say about this, in terms of memory footprint, garbage collection and overall performance. Please use the Infinispan user forums to provide your feedback.
Tags: event performance community garbage collection concurrency
Thursday, 22 December 2011
One of the things I’ve done recently was to benchmark how quickly Infinispan starts up. Specifically looking at LOCAL mode (where you don’t have the delays of opening sockets and discovery protocols you see in clustered mode), I wrote up a very simple test to start up 2000 caches in a loop, using the same cache manager.
This is a pretty valid use case, since when Infinispan is used as a non-clustered 2nd-level cache in Hibernate, a separate cache instance is created per entity type, and in the past this has become somewhat of a bottleneck.
In this test, I compared Infinispan 5.0.1.Final, 5.1.0.CR1 and 5.1.0.CR2. 5.1.0 is significantly quicker, but I used this test (and subsequent profiling) to commit a couple of interesting changes in 5.1.0.CR2, which has improved things even more - both in terms of CPU performance as well as memory footprint.
Essentially, 5.1.0.CR1 made use of Jandex to perform annotation scanning of internal components at build time, avoiding expensive reflection calls to determine component dependencies and lifecycle at runtime. 5.1.0.CR2 takes this concept a step further: now we don’t just cache annotation lookups at build time, but entire dependency graphs. Determining and ordering lifecycle methods is done at build time too, again making startup significantly quicker while offering a much tighter memory footprint.
Enough talk. Here is the test used, and here are the performance numbers from my laptop, a 2010 MacBook Pro with an i5 CPU.
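For a rough idea of the measurement methodology, here is a minimal, self-contained sketch of such a loop. Plain HashMaps stand in for caches so it runs without the Infinispan jars (the real test created 2000 caches from a single cache manager), and the timing and memory bookkeeping are illustrative only:

```java
import java.util.HashMap;
import java.util.Map;

public class StartupBenchmarkSketch {
    public static void main(String[] args) {
        // The real benchmark created 2000 Infinispan caches from a single
        // cache manager; plain HashMaps stand in here so the sketch runs
        // standalone.
        Runtime rt = Runtime.getRuntime();
        System.gc();
        long memBefore = rt.totalMemory() - rt.freeMemory();
        long start = System.nanoTime();

        Map<String, Map<String, String>> caches = new HashMap<>();
        for (int i = 0; i < 2000; i++) {
            caches.put("cache-" + i, new HashMap<String, String>());
        }

        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.gc();
        long memAfter = rt.totalMemory() - rt.freeMemory();
        System.out.println("Created " + caches.size() + " caches in "
                + elapsedMs + " ms, ~"
                + Math.max(0, memAfter - memBefore) / 1024 + " KB retained");
    }
}
```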
Multiverse:InfinispanStartupBenchmark manik [master]$ ./bench.sh
---- Starting benchmark ---
Please standby …
Using Infinispan 5.0.1.FINAL (JMX enabled? false) Created 2000 caches in 10.9 seconds and consumed 172.32 Mb of memory.
Using Infinispan 5.0.1.FINAL (JMX enabled? true) Created 2000 caches in 56.18 seconds and consumed 315.21 Mb of memory.
Using Infinispan 5.1.0.CR1 (JMX enabled? false) Created 2000 caches in 7.13 seconds and consumed 157.5 Mb of memory.
Using Infinispan 5.1.0.CR1 (JMX enabled? true) Created 2000 caches in 34.9 seconds and consumed 243.33 Mb of memory.
Using Infinispan 5.1.0.CR2 (JMX enabled? false) Created 2000 caches in 3.18 seconds and consumed 142.2 Mb of memory.
Using Infinispan 5.1.0.CR2 (JMX enabled? true) Created 2000 caches in 17.62 seconds and consumed 176.13 Mb of memory.
A whopping 3.5 times faster, and significantly more memory-efficient especially when enabling JMX reporting. :-)
Tags: benchmarks cpu memory performance
Wednesday, 21 December 2011
Infinispan 'Brahma' 5.1.0.CR2 is out now with a load of fixes and a few internal changes, such as the move to a StAX-based XML parser (as opposed to relying on JAXB) which did not make it in for CR1. The new parser is a lot faster, has less overhead, and does not require any changes from a user perspective.
We’ve also worked on improving startup time by indexing annotation metadata at build time and reading it at runtime. From an Infinispan user perspective, there have been some changes to how Infinispan is extended, in particular related to custom command implementations, where we now use the JDK’s ServiceLoader to load them.
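As a generic illustration of the ServiceLoader mechanism (the interface below is hypothetical - Infinispan’s actual extension SPI for custom commands has a different name and shape):

```java
import java.util.ServiceLoader;

public class CustomCommandLoader {
    // Hypothetical extension point, for illustration only.
    public interface CustomCommandFactory {
        String commandName();
    }

    public static int discoverFactories() {
        int count = 0;
        // ServiceLoader looks for provider class names listed, one per line,
        // in META-INF/services/<fully.qualified.InterfaceName> on the classpath.
        for (CustomCommandFactory f : ServiceLoader.load(CustomCommandFactory.class)) {
            System.out.println("Discovered: " + f.commandName());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // With no provider files on the classpath, nothing is discovered.
        System.out.println(discoverFactories());
    }
}
```

The nice property of this approach is that a module jar only has to ship a services file; no central registry or configuration change is needed to plug in new commands.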
Cheers, Merry Christmas and a Happy New Year to all the Infinispan community! :) Galder
Tags: custom commands performance
Wednesday, 23 November 2011
Don’t be scared by the name: use1PcForAutoCommitTransactions is a new feature (5.1.CR1) that does quite a cool thing: it increases your transactions’ performance. Let me explain. Before Infinispan 5.1 you could access the cache both transactionally and non-transactionally. Naturally, non-transactional access is faster and offers fewer consistency guarantees. But we don’t support mixed access in Infinispan 5.1, so what’s to be done when you need the speed of non-transactional access and are ready to trade some consistency guarantees for it? Well, here is where use1PcForAutoCommitTransactions helps you. What it does is force an induced (autoCommit=true) transaction to commit in a single phase - so only 1 RPC instead of 2 RPCs, as in the case of a full two-phase commit (2PC).
You might end up with inconsistent data if multiple transactions modify the same key concurrently. But if you know that’s not the case, or you can live with it, then use1PcForAutoCommitTransactions will help your performance considerably.
Let’s say you do a simple put operation outside the scope of a transaction:
Now let’s see how this would behave if the cache has several different transaction configurations:
The put will happen in two RPCs/steps: a prepare message is sent around and then a commit.
The put happens in one RPC as the prepare and the commit are merged. Better performance.
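For illustration, a configuration along these lines enables the single-phase behaviour; the element layout is assumed from the 5.1 schema, so treat it as a sketch rather than a verbatim snippet:

```xml
<!-- Sketch, assuming the 5.1 schema: commit induced (auto-commit)
     transactions in a single phase -->
<transaction transactionMode="TRANSACTIONAL"
             use1PcForAutoCommitTransactions="true"/>
```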
Tags: transactions performance
Monday, 03 October 2011
If you’ve ever used Infinispan in a transactional way, you might be very interested in this article, as it describes some very significant improvements in version 5.1 "Brahma" (released with 5.1.Beta1):
starting with this release, an Infinispan cache can be accessed either transactionally or non-transactionally. The mixed access mode is no longer supported (backward compatibility is still maintained, see below). There are several reasons for going down this path, but one of the most important results of this decision is cleaner semantics for how concurrency is managed between multiple requestors for the same cache entry.
starting with 5.1, the supported transaction models are optimistic and pessimistic. The optimistic model improves on the existing default transaction model by completely deferring lock acquisition to transaction prepare time. That reduces the duration for which locks are held, increases throughput, and also avoids deadlocks. With the pessimistic model, cluster-wide locks are acquired on each write and only released after the transaction completes (see below).
It’s up to you as a user to decide whether you want to define a cache as transactional or not. By default, Infinispan caches are non-transactional. A cache can be made transactional by changing the transactionMode attribute:
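For illustration, assuming the 5.1 XML schema, the declarative form looks roughly like this:

```xml
<!-- Sketch, assuming the 5.1 schema -->
<transaction transactionMode="TRANSACTIONAL"/>
```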
transactionMode can only take two values: TRANSACTIONAL and NON_TRANSACTIONAL. The same thing can also be achieved programmatically:
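A sketch of the programmatic equivalent, assuming the 5.1 fluent builder API (names from memory - verify against the actual release):

```java
// Sketch only; requires the Infinispan 5.1 jars on the classpath.
Configuration config = new ConfigurationBuilder()
   .transaction().transactionMode(TransactionMode.TRANSACTIONAL)
   .build();
```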
Important: for transactional caches it is required to configure a TransactionManagerLookup.
The autoCommit attribute was added in order to ensure backward compatibility. If a cache is transactional and autoCommit is enabled (it defaults to true), then any call performed outside of a transaction’s scope is transparently wrapped within a transaction. In other words, Infinispan adds the logic for starting a transaction before the call and committing it after the call. So if your code accesses a cache both transactionally and non-transactionally, all you have to do when migrating to Infinispan 5.1 is mark the cache as transactional and enable autoCommit (that’s actually enabled by default, so just don’t disable it :) The autoCommit feature can be managed through configuration:
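For illustration, assuming the 5.1 schema, the attribute sits on the transaction element:

```xml
<!-- Sketch, assuming the 5.1 schema; autoCommit defaults to true.
     Setting it to false makes non-transactional calls fail. -->
<transaction transactionMode="TRANSACTIONAL" autoCommit="true"/>
```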
With optimistic transactions, locks are acquired at transaction prepare time and are only held up to the point the transaction commits (or rolls back). This is different from the 5.0 default locking model, where local locks are acquired on writes and cluster locks are acquired at prepare time. Optimistic transactions can be enabled in the configuration file:
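Something along these lines, assuming the 5.1 schema (a sketch, not a verbatim snippet):

```xml
<!-- Sketch, assuming the 5.1 schema: defer lock acquisition to prepare time -->
<transaction transactionMode="TRANSACTIONAL" lockingMode="OPTIMISTIC"/>
```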
By default, a transactional cache is optimistic.
From a lock acquisition perspective, pessimistic transactions obtain locks on keys at the time the key is written. E.g.
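As a non-runnable fragment for illustration (cache and tm are assumed to be an Infinispan Cache and its associated TransactionManager):

```java
tm.begin();
cache.put(k1, v1); // k1 is locked cluster-wide from this point on
cache.remove(k2);  // k2 is locked as well
tm.commit();       // locks on k1 and k2 are released here
```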
When cache.put(k1,v1) returns, k1 is locked and no other transaction running anywhere in the cluster can write to it. Reading k1 is still possible. The lock on k1 is released when the transaction completes (commits or rolls back).
Pessimistic transactions can be enabled in the configuration file:
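Again assuming the 5.1 schema, the sketch mirrors the optimistic case:

```xml
<!-- Sketch, assuming the 5.1 schema: acquire cluster-wide locks on each write -->
<transaction transactionMode="TRANSACTIONAL" lockingMode="PESSIMISTIC"/>
```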
From a use case perspective, optimistic transactions should be used when there is not a lot of contention between multiple transactions running at the same time, because optimistic transactions roll back if data has changed between the time it was read and the time it was committed (writeSkewCheck). On the other hand, pessimistic transactions might be a better fit when there is high contention on keys and transaction rollbacks are less desirable. Pessimistic transactions are more costly by nature: each write operation potentially involves an RPC for lock acquisition.
This major transaction rework has opened the way for several other transaction related improvements:
Single node locking model is a major step forward in avoiding deadlocks and increasing throughput by acquiring locks on only a single node in the cluster, regardless of the number of redundant copies (numOwners) on which data is replicated
Lock acquisition reordering is a deadlock avoidance technique that will be used for optimistic transactions
Incremental locking is another technique for minimising deadlocks.
Stay tuned! Mircea
Tags: transactions locking deadlock detection performance