Thursday, 26 September 2013
Embedded and remote queries in Infinispan 6.0.0.Beta1
If you’re following Infinispan’s mailing lists you’ve probably caught a glimpse of the new developments in the Query land: a new DSL, remote querying via Hot Rod client, a new marshaller based on Google’s Protobuf. Time to unveil these properly!
==== The new Query DSL
Starting with version 6.0, Infinispan offers a new (experimental) way of running queries against your cached entities, based on a simple filtering DSL. The DSL aims to simplify the way you write queries and to be agnostic of the underlying query mechanism(s), so that alternative query engines besides Lucene can be provided in the future while you keep using the same query language/API. The previous Hibernate Search & Lucene based approach is still in place and will continue to be supported; in fact, the new DSL is currently implemented right on top of it. The future will surely bring index-less searching based on map-reduce and possibly other cool new search technologies.
Running DSL-based queries in embedded mode is almost identical to running the existing Lucene-based queries. All you need to do is have infinispan-query-dsl.jar and infinispan-query.jar on your classpath (besides Infinispan and its dependencies), enable indexing for your caches, annotate your POJO cache values and you’re ready.
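import org.infinispan.Cache;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;
...
// Enable indexing in the default cache configuration
ConfigurationBuilder cfg = new ConfigurationBuilder();
cfg.indexing().enable();

DefaultCacheManager cacheManager = new DefaultCacheManager(cfg.build());
Cache cache = cacheManager.getCache();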
Alternatively, indexing (and everything else) can also be configured via XML configuration, as already described in the user guide, so we’ll not delve into details here.
Your Hibernate Search annotated entity might look like this.
import org.hibernate.search.annotations.*;
...
@Indexed
public class User {

    @Field(store = Store.YES, analyze = Analyze.NO)
    private String name;

    @Field(store = Store.YES, analyze = Analyze.NO, indexNullAs = Field.DEFAULT_NULL_TOKEN)
    private String surname;

    @IndexedEmbedded(indexNullAs = Field.DEFAULT_NULL_TOKEN)
    private List<Address> addresses;

    // .. the rest omitted for brevity
}
Running a DSL-based query involves obtaining a https://github.com/infinispan/infinispan/blob/6.0.0.Beta1/query-dsl/src/main/java/org/infinispan/query/dsl/QueryFactory.java[QueryFactory] from the (cache scoped) SearchManager and then constructing the query as follows:
import org.infinispan.query.Search;
import org.infinispan.query.dsl.QueryFactory;
import org.infinispan.query.dsl.Query;
...
QueryFactory qf = Search.getSearchManager(cache).getQueryFactory();

Query q = qf.from(User.class)
    .having("name").eq("John")
    .toBuilder().build();

List<User> list = q.list();
assertEquals(1, list.size());
assertEquals("John", list.get(0).getName());
assertEquals("Doe", list.get(0).getSurname());
That’s it! I’m sure this raised your curiosity as to what the DSL is actually capable of, so you might want to look at the list of supported filter operators in https://github.com/infinispan/infinispan/blob/6.0.0.Beta1/query-dsl/src/main/java/org/infinispan/query/dsl/FilterConditionEndContext.java[FilterConditionEndContext]. Combining multiple conditions with boolean operators, including sub-conditions, is also possible:
Query q = qf.from(User.class)
    .having("name").eq("John")
    .and().having("surname").eq("Doe")
    .and().not(qf.having("addresses.street").like("%Tanzania%")
        .or().having("addresses.postCode").in("TZ13", "TZ22"))
    .toBuilder().build();
The DSL is pretty nifty right now and will surely be expanded in the future based on your feedback. It also supports result pagination, sorting, projections and embedded objects, all demonstrated in QueryDslConditionsTest, which I encourage you to look at until the proper user guide is published. Still, this is not a relational database, so keep in mind that all queries are written in the scope of the single targeted entity (and its embedded entities). There are no joins (yet), no correlated subqueries, no grouping or aggregations.
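To make the boolean composition concrete, here is a rough sketch of the combined filter above written as plain Java predicates over an in-memory list. It is not how Infinispan evaluates the query (that currently goes through the Lucene index). The `User` and `Address` stand-ins, the `like` helper, the SQL-style reading of `%` and `_`, and the "any embedded address matches" rule are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class InMemoryFilterSketch {

    // Hypothetical stand-ins for the cached entity and its embedded objects.
    static final class Address {
        final String street;
        final String postCode;

        Address(String street, String postCode) {
            this.street = street;
            this.postCode = postCode;
        }
    }

    static final class User {
        final String name;
        final String surname;
        final List<Address> addresses;

        User(String name, String surname, Address... addresses) {
            this.name = name;
            this.surname = surname;
            this.addresses = Arrays.asList(addresses);
        }
    }

    // Assumed LIKE semantics: '%' matches any run of characters, '_' exactly one.
    static boolean like(String value, String pattern) {
        if (value == null) {
            return false;
        }
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }

    // name = "John" AND surname = "Doe" AND NOT (street LIKE ... OR postCode IN ...)
    static Predicate<User> condition() {
        List<String> postCodes = Arrays.asList("TZ13", "TZ22");
        Predicate<User> nameIsJohn = u -> "John".equals(u.name);
        Predicate<User> surnameIsDoe = u -> "Doe".equals(u.surname);
        // Assumption: a condition on an embedded collection holds if any element matches.
        Predicate<User> streetMatches =
                u -> u.addresses.stream().anyMatch(a -> like(a.street, "%Tanzania%"));
        Predicate<User> postCodeListed =
                u -> u.addresses.stream().anyMatch(a -> postCodes.contains(a.postCode));
        return nameIsJohn.and(surnameIsDoe).and(streetMatches.or(postCodeListed).negate());
    }

    // Applies the condition to every candidate, much like an index-less scan would.
    static List<User> filter(List<User> users) {
        return users.stream().filter(condition()).collect(Collectors.toList());
    }
}
```

Calling `InMemoryFilterSketch.filter(users)` keeps only the users named John Doe who have no address on a street containing "Tanzania" and no address with post code TZ13 or TZ22.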
Moving further, probably the most exciting thing about the new DSL is using it remotely via the Hot Rod client. But to make this leap we first had to adopt a common format for storing our cache entries and marshalling them over the wire, one that would also be cross-language and robust enough to support evolving object schemas. But probably most of all, this format had to have a schema rather than just being an opaque blob, otherwise indexing and searching would be meaningless. Enter Protocol Buffers.
==== The Protobuf marshaller
Configuring the RemoteCacheManager of the Java Hot Rod client to use it is straightforward:
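import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
import org.infinispan.client.hotrod.marshall.ProtoStreamMarshaller;
...
ConfigurationBuilder clientBuilder = new ConfigurationBuilder();
clientBuilder.addServer()
    .host("127.0.0.1").port(11234)
    .marshaller(new ProtoStreamMarshaller());

// Build the manager that the following snippets refer to as remoteCacheManager
RemoteCacheManager remoteCacheManager = new RemoteCacheManager(clientBuilder.build());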
Now you’ll be able to store your User instances in the remote cache and get them back, encoded in protobuf format, provided that:
1. a Protobuf type was declared for your entity in a .proto file, which was then compiled into a .protobin binary descriptor
2. the binary descriptor was registered with your RemoteCacheManager's ProtoStreamMarshaller instance like this:
ProtoStreamMarshaller.getSerializationContext(remoteCacheManager)
    .registerProtofile("my-test-schema.protobin");
3. a per-entity marshaller was registered:
ProtoStreamMarshaller.getSerializationContext(remoteCacheManager)
    .registerMarshaller(User.class, new UserMarshaller());
Steps 2 and 3 are closely tied to the way the ProtoStream library works, which is pretty straightforward but cannot be detailed here. Having a look at our UserMarshaller sample should clear this up.
Keeping your objects stored in protobuf format has the benefit of being able to consume them with compatible clients written in other languages. And if that does not sound enticing enough, the fact that they can now be easily indexed should be more appealing.
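To see why a schema-based format is both language-neutral and indexable, it helps to look at what ends up on the wire. The following is a hand-rolled sketch of the Protobuf wire encoding for a user, not the ProtoStream API. The field numbers (1 for name, 2 for surname, 3 for age) are hypothetical, since the actual .proto file is not shown here. Each field is written as a tag, `(fieldNumber << 3) | wireType`, followed by its value, so any reader that knows the schema can locate individual fields.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ProtobufWireSketch {

    static final int WIRETYPE_VARINT = 0;
    static final int WIRETYPE_LENGTH_DELIMITED = 2;

    // Base-128 varint: 7 bits per byte, least significant group first,
    // high bit set on every byte except the last. Non-negative values only.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // A tag combines the field number from the schema with the wire type.
    static void writeTag(ByteArrayOutputStream out, int fieldNumber, int wireType) {
        writeVarint(out, (fieldNumber << 3) | wireType);
    }

    // Strings are length-delimited: tag, byte length, then the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeTag(out, fieldNumber, WIRETYPE_LENGTH_DELIMITED);
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Small non-negative integers are written as a tag followed by a varint.
    static void writeInt32(ByteArrayOutputStream out, int fieldNumber, int value) {
        writeTag(out, fieldNumber, WIRETYPE_VARINT);
        writeVarint(out, value);
    }

    // Hypothetical schema: name = 1, surname = 2, age = 3.
    static byte[] encodeUser(String name, String surname, int age) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, name);
        writeString(out, 2, surname);
        writeInt32(out, 3, age);
        return out.toByteArray();
    }
}
```

With these assumed field numbers, `encodeUser("John", "Doe", 33)` yields the 13 bytes `0A 04 4A 6F 68 6E 12 03 44 6F 65 18 21`. A server that knows the schema can pick out field 1 as the name and index it without ever loading a Java `User` class.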
==== Remote querying via the Hot Rod client
Given a RemoteCacheManager configured as previously described, the next steps to enable remote querying over its caches are:
- add the DSL jar to the client’s classpath, infinispan-remote-query-server.jar to the server’s classpath and infinispan-remote-query-client.jar to both
- enable indexing in your cache configuration, same as for embedded mode
- register your protobuf binary descriptor by invoking the 'registerProtofile' method of the server’s ProtobufMetadataManager MBean (one instance per EmbeddedCacheManager)
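The last step can be done from any JMX client. Below is a minimal sketch of invoking an MBean operation by name using the standard `javax.management` API. To keep it runnable without a server it registers a hypothetical stand-in MBean (`FakeMetadataManager`) in the local platform MBeanServer. The ObjectName used here and the single `byte[]` parameter of 'registerProtofile' are assumptions, so check the real MBean's name and signature in a JMX console before relying on them.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

class RegisterProtofileSketch {

    // Hypothetical stand-in for the server-side MBean, so the sketch runs
    // without a server. The byte[] parameter is an assumption.
    public interface FakeMetadataManagerMBean {
        void registerProtofile(byte[] descriptorFile);
    }

    public static final class FakeMetadataManager implements FakeMetadataManagerMBean {
        volatile int lastDescriptorLength = -1;

        @Override
        public void registerProtofile(byte[] descriptorFile) {
            lastDescriptorLength = descriptorFile.length;
        }
    }

    // Invokes the operation by name. The same call works against a remote
    // MBeanServerConnection obtained from a JMX connector.
    static void registerProtofile(MBeanServerConnection connection,
                                  ObjectName mbean,
                                  byte[] descriptor) throws Exception {
        connection.invoke(mbean, "registerProtofile",
                new Object[] { descriptor },
                new String[] { byte[].class.getName() });
    }

    // Local demonstration: registers the stand-in, invokes it over JMX and
    // returns the length of the descriptor it received.
    static int demo(byte[] descriptor) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        FakeMetadataManager fake = new FakeMetadataManager();
        try {
            // Made-up ObjectName, used only for this local demonstration.
            ObjectName name = new ObjectName("sketch.example:type=FakeMetadataManager");
            server.registerMBean(fake, name);
            try {
                registerProtofile(server, name, descriptor);
            } finally {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return fake.lastDescriptorLength;
    }
}
```

In a real setup the descriptor bytes would be read from the .protobin file and the connection would point at the server's JMX endpoint instead of the local platform MBeanServer.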
All data placed in the cache from now on is indexed without the need to annotate your entities for Hibernate Search. In fact, these classes are only meaningful to the Java client and do not even exist on the server.
Running the queries over the Hot Rod client is now very similar to embedded mode. The DSL is in fact the same. The only part that is slightly different is how you obtain the QueryFactory:
import org.infinispan.client.hotrod.Search;
import org.infinispan.query.dsl.QueryFactory;
import org.infinispan.query.dsl.Query;
...
remoteCache.put(2, new User("John", "Doe", 33));

QueryFactory qf = Search.getQueryFactory(remoteCache);

Query query = qf.from(User.class)
    .having("name").eq("John")
    .toBuilder().build();

List<User> list = query.list();
assertEquals(1, list.size());
assertEquals("John", list.get(0).getName());
assertEquals("Doe", list.get(0).getSurname());
Voila! The end of our journey for today! Stay tuned, keep an eye on Infinispan Query and please share your comments with us.
Tags: protostream hotrod lucene Protobuf remote query hibernate search embedded query Infinispan Query DSL
Wednesday, 23 September 2009
Infinispan Query breaks into 4.0.0.CR1
Hello all,
Querying is an important feature for Infinispan, so we’ve decided to include a technology preview of this for 4.0.0.CR1 and 4.0.0.GA, even though it is only really scheduled for Infinispan 4.1.0.
Browse to this wiki page to see how the new API works for querying, along with usage examples.
Origins
Some of the API has come from JBoss Cache Searchable but has been enhanced and runs slicker. A lot more work is being done under the hood, which makes things easier for users. For example, QueryFactory.getBasicQuery() just needs two Strings and builds a basic Lucene Query instance, as opposed to forcing the user to create a Lucene query manually. Creating one manually is still possible, however, should a user want a more complex query.
The indexing for Lucene is now done through interceptors as opposed to listeners, and hence more tightly integrated into Infinispan’s core.
You can also choose how indexes are maintained. If indexes are shared (perhaps stored on a network mounted drive), then you only want nodes to index changes made locally. On the other hand, if each node maintains its own indexes (either in-memory or on a local filesystem), then you want each node to index every change, regardless of where it was made. This behaviour is controlled by a system property, -Dinfinispan.query.indexLocalOnly=true. However, this system property is temporary and will be replaced with a proper configuration property once the feature is out of technology preview.
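The rule described above boils down to a small decision. The sketch below only illustrates that logic and is not Infinispan's interceptor code. The class and method names are hypothetical; only the property name comes from the text.

```java
class IndexLocalOnlySketch {

    // Property name as given above, passed as -Dinfinispan.query.indexLocalOnly=true
    static final String PROPERTY = "infinispan.query.indexLocalOnly";

    // Shared indexes (indexLocalOnly = true): index only changes made on this node.
    // Per-node indexes (indexLocalOnly = false): index every change, wherever it was made.
    static boolean shouldIndex(boolean indexLocalOnly, boolean changeOriginatedLocally) {
        return !indexLocalOnly || changeOriginatedLocally;
    }

    // Reads the flag from the system property. Boolean.getBoolean returns true
    // only if the property is set to "true" (ignoring case).
    static boolean shouldIndex(boolean changeOriginatedLocally) {
        return shouldIndex(Boolean.getBoolean(PROPERTY), changeOriginatedLocally);
    }
}
```

With shared indexes, a change replicated from another node is skipped because the originating node has already written it to the shared index. With per-node indexes, nothing is skipped.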
What’s coming up?
Future releases of Hibernate Search and Infinispan will bring improvements that change the way querying works. The QueryHelper class, as documented in the wiki, is temporary and will eventually be removed, as you will no longer need to provide the class definitions of the types you wish to index upfront. We will be able to detect this on the fly (see HSEARCH-397).
There will be a better system for running distributed queries. And the system properties will disappear in favour of proper configuration attributes.
And also, GSoC student Lukasz Moren’s work involving an Infinispan-based Lucene Directory implementation will allow indexes to be shared cluster-wide by using Infinispan itself to distribute these indexes. All very clever stuff.
Thanks for reading!
Navin.
Tags: jboss cache lucene hibernate hibernate search index query