Data Container Changes Part 1
Infinispan 9.0 Beta 1 introduces some big changes to the Infinispan data container. This is the first of two blog posts detailing those changes.
This post will cover the changes to eviction which utilizes a new provider, Caffeine. As you may already know Infinispan has supported our own implementations of LRU (Least Recently Used) and LIRS (Low Inter-reference Receny Set) algorithms for our bounded caches.
Our implementations of eviction were even rewritten for Infinispan 8, but we found we still had some issues or limitations with them, especially LIRS. Our old implementation had some problems with keeping the correct number of entries. The new implementation while not having that issue had others, such as being considerably more complex. And while it implemented the entire LIRS specification, it could have memory usage issues. This led us to looking at alternatives and Caffeine seemed like a logical fit as well as being well maintained and the author, Ben Manes, is quite responsive.
Caffeine doesn’t utilize LRU or LIRS for its eviction algorithm and instead implements TinyLFU with an admission window. This has the benefit of the high hit ratio like LIRS, while also requiring low memory overhead like LRU. Caffeine also provides custom weighting for objects, which allow us to reuse the code that was developed for MEMORY based eviction as well.
The only thing that Caffeine doesn’t support is our idea of a custom Equivalence. Thus Infinispan now wraps byte instances to ensure equals and hashCode methods work properly. This also gives us a good opportunity to reevaluate the dataContainer configuration element.
The data container configuration has thus been deprecated and is now replaced by a new configuration element named memory. Also since we are adding a new element the eviction configuration could also be consolidated into memory, and thus eviction is also deprecated. And last but not least the storeAsBinary configuration element has also been integrated into the new memory configuration element. Now we have 1 configuration element instead of 3, can’t beat that!
The new memory configuration will start out pretty simple and new elements can be added as there is a need. The memory element will be composed of a single sub element that can be of three different choices. For this post we will go over two of the sub elements: OBJECT and BINARY.
Object storage stores the actual objects as provided from the user in the Java Heap. This is the default storage method when no memory configuration is provided. This method will provide the best performance when using operations that operate upon the entire data set, such as distributed streams, indexing and local reads etc.
Unfortunately OBJECT storage only allows for COUNT based eviction as we cannot properly estimate user object types properly. This could be improved in a feature version if there is enough interest. Note that you can technically configured MEMORY eviction type with the OBJECT storage type with declarative configuration, but it will throw an exception when you build the configuration. Therefore OBJECT only has a single element named size to determine the amount of entries that can be stored in the cache.
An example of how Object storage can be configured:
Binary storage stores the object in its serialized form in a byte array. This has an interesting side effect of objects are always stored as a deep copy. This can be useful if you want to modify an object after retrieving it without affecting the underlying cache stored object. Since objects have to be deserialized when performing operations on them some things such as distributed streams and local gets will be a little bit slower.
A nice benefit of storing entries as BINARY is that we can estimate the total on heap size of the object. Thus BINARY supports both COUNT and MEMORY based eviction types.
An example of how Binary storage can be configured:
Caffeine should bring us a great solution, while also reducing a lot of maintenance ourselves. The new memory configuration also provides a simpler solution by removing two other configuration elements.