Thursday, 08 December 2016

Meet Ickle!

As you’ve already learned from an earlier post this week, Infinispan 9 is on its final approach to landing and is bringing a new query language. Hurray! But wait, was there something wrong with the old one(s)? Not wrong really …​  I’ll explain.

Infinispan is a data grid of several query languages. Historically, it has offered search support early in its existence by integrating with Hibernate Search which provides a powerful Java-based DSL enabling you to build Lucene queries and run them on top of your Java domain model living in the data grid. Usage of this integration is confined to embedded mode, but that still succeeds in making Java users happy.

While the Hibernate Search combination is neat and very appealing to Java users it completely leaves non-JVM languages accessing Infinispan via remote protocols out in the cold.

Enter Remote Query. Infinispan 6.0 starts to address the need of searching the grid remotely via Hot Rod. The internals are still built on top of Lucene and Hibernate Search bedrock but these technologies are now hidden behind a new query API, the QueryBuilder, an internal DSL resembling JPA criteria query. The QueryBuilder has implementations for both embedded mode and Hot Rod. This new API provides all relational operators you can think of, but no full-text search initially, we planned to add that later.

Creating a new internal DSL was fun. However, having a long term strategy for evolving it while keeping complete backward compatibility and also doing so uniformly across implementations in multiple languages proved to be a difficult challenge. So while we were contemplating adding new full-text operators to this DSL we decided on making a long leap forward and adopt a more flexible alternative by having our own string based query language instead, another DSL really, albeit an external one this time.

So after the long ado, let me introduce Ickle, Infinispan’s new query language, conspicuously resembling JP-QL.

Ickle:

  • is a light and small subset of JP-QL, hence the lovely name

  • queries Java classes and supports Protocol Buffers too

  • queries can target a single entity type

  • queries can filter on properties of embedded objects too, including collections

  • supports projections, aggregations, sorting, named parameters

  • supports indexed and non-indexed execution

  • supports complex boolean expressions

  • does not support computations in expressions (eg. user.age > sqrt(user.shoeSize + 3) is not allowed but user.age >= 18 is fine)

  • does not support joins

  • but, navigations along embedded entities are implicit joins and are allowed

  • joining on embedded collections is allowed

  • other join types not supported

  • subqueries are not supported

  • besides the normal relational operators it offers full-text operators, similar to Lucene’s  query parser

  • is now supported across various Infinispan APIs, wherever a Query produced by the QueryBuilder is accepted (even for continuous queries or in event filters for listeners!)

That is to say we squeezed JP-QL to the bare minimum and added full-text predicates that closely follow the syntax of Lucene’s query parser.

If you are familiar with JPA/JP-QL then the following example will speak for itself:

select accountId, sum(amount) from com.acme.Transaction
    where amount < 20.0
    group by accountId
    having sum(amount) > 1000.0
    order by accountId

The same query can be written using the QueryBuilder:

Query query = queryFactory.from(Transaction.class)
.select(Expression.property("accountId"), Expression.sum("amount"))
.having("amount").lt(20.0)
.groupBy("accountId")
.having(Expression.sum("amount")).gt(1000.0)
.orderBy("accountId").build();

Both examples look nice but I hope you will agree the first one is better.

Ickle supports several new predicates for full-text matching that the QueryBuilder is missing. These predicates use the : operator that you are probably familiar from Lucene’s own query language.  This example demonstrates a simple full-text term query:

select transactionId, amount, description from com.acme.Transaction
where amount > 10 and description : "coffee"

As you can see, relational predicates and full-text predicates can be combined with boolean operators at will.

The only important thing to remark here is relational predicates are applicable to non-analyzed fields while full-text predicates can be applied to analyzed field only. How does indexing work, what is analysis and how do I turn it on/off for my fields? That’s the topic of a future post, so please be patient or start readinghttps://docs.jboss.org/hibernate/search/5.6/reference/en-US/html_single/#_analysis[ here].

Besides term queries we support several more:

  • Term                     description : "coffee"

  • Fuzzy                    description : "cofee"~2

  • Range                    amount : [40 to 90}`

  • Phrase                   description : "hello world"

  • Proximity                description : "canceling fee"~3

  • Wildcard                 description : "te?t"

  • Regexp                  description : /[mb]oat/

  • Boosting                 description : "beer"^3 and description :"books"

You can read all about them starting from here.

But is Ickle really new? Not really. The name is new, the full-text features are new, but a JP-QL-ish query string was always internally present in the Query objects produced by the QueryBuilder since the beginning of Remote Query. That language was never exposed and specified until now. It evolved significantly over time and now it is ready for you to use it. The QueryBuilder / criteria-like API is still there as a convenience but it might go out of favor over time. It will be limited to non-full-text functionality only. As Ickle grows we’ll probably not be able to include some of the additions in the QueryBuilder in a backward compatible manner. If growing will cause too much pain we might consider deprecating it in favor of Ickle or if there is serious demand for it we might continue to evolve the QueryBuilder in a non compatible manner.

Being a string based query language, Ickle is very convenient for our REST endpoint, the CLI, and the administration console allowing you to quickly inspect the contents of the grid. You’ll be able to use it there pretty soon. We’ll also continue to expand Ickle with more advanced full-text features like spatial queries and faceting, but that’s a subject for another major version. Until then, why not grab the current 9.0 Beta1 and test drive the new query language yourself? We’d love to hear your feedback on the forum, on our issue tracker or on IRC on the #infinispan channel on Freenode.

Happy coding!

Posted by Unknown on 2016-12-08
Tags: JP-QL Hibernate-Search jpa lucene full-text indexing language query DSL

Friday, 17 April 2015

Infinispan 7.2.0.CR1 released

Dear community,

We are proud to announce the release of Infinispan 7.2.0.CR1!

This is the first release candidate of 7.2, bringing some exciting new features:

  • Faster bulk operations, putAll and getAll, for both embedded and HotRod client (ISPN-2183, ISPN-5264, ISPN-5266)

  • Cache creation and configuration changes without the need to restart the server (ISPN-5147)

  • Support for defining filters using the Query DSL for event listeners (ISPN-5349 and ISPN-5350)

  • Lock-free clear() operation (ISPN-5370)

For the complete list of features and bug fixes, please refer to the release notes

Feel free to join us and shape the future releases on our forums, our mailing lists or our #infinispan IRC channel.

Posted by Gustavo on 2015-04-17
Tags: getAll hotrod putAll listeners release candidate DSL

Thursday, 26 September 2013

Embedded and remote queries in Infinispan 6.0.0.Beta1

If you’re following Infinispan’s mailing lists you’ve probably caught a glimpse of the new developments in the Query land: a new DSL, remote querying via Hot Rod client, a new marshaller based on Google’s Protobuf. Time to unveil these properly!

==== The new Query DSL

Starting with version 6.0 Infinispan offers a new (experimental) way of running queries against your cached entities based on a simple filtering DSL. The aim of the new DSL is to simplify the way you write queries and to be agnostic of the underlying query mechanism(s) making it possible to provide alternative query engines in the future besides Lucene and still being able to use the same query language/API. The previous Hibernate Search & Lucene based approach is still in place and will continue to be supported and in fact the new DSL is currently implemented right on top of it. The future will surely bring index-less searching based on map-reduce and possibly other new cool search technologies.

Running DSL-based queries in embedded mode is almost identical to running the existing Lucene-based queries. All you need to do is have infinispan-query-dsl.jar and infinispan-query.jar in your classpath (besides Infinispan and its dependecies), enable indexing for your caches, annotate your POJO cache values and your’re ready.

__

ConfigurationBuilder cfg = new ConfigurationBuilder();
cfg.indexing().enable();

DefaultCacheManager cacheManager = new DefaultCacheManager(cfg.build());

Cache cache = cacheManager.getCache();

____Alternatively, indexing (and everything else) can also be configured via XML configuration, as already described in the user guide, so we’ll not delve into details here.

Your Hibernate Search annotated entity might look like this.

__

import org.hibernate.search.annotations.*;
...

@Indexed
public class User {

    @Field(store = Store.YES, analyze = Analyze.NO)
    private String name;

    @Field(store = Store.YES, analyze = Analyze.NO, indexNullAs = Field.DEFAULT_NULL_TOKEN)
    private String surname;

    @IndexedEmbedded(indexNullAs = Field.DEFAULT_NULL_TOKEN)
    private List addresses;

    // .. the rest omitted for brevity
}

___Running a DSL based query involves obtaining a _https://github.com/infinispan/infinispan/blob/6.0.0.Beta1/query-dsl/src/main/java/org/infinispan/query/dsl/QueryFactory.java[QueryFactory] from the (cache scoped) SearchManager and then constructing the query as follows:

__

import org.infinispan.query.Search;
import org.infinispan.query.dsl.QueryFactory;
import org.infinispan.query.dsl.Query;
...

QueryFactory qf = Search.getSearchManager(cache).getQueryFactory();

Query q = qf.from(User.class)
    .having("name").eq("John")
    .toBuilder().build();

List list = q.list();

assertEquals(1, list.size());
assertEquals("John", list.get(0).getName());
assertEquals("Doe", list.get(0).getSurname());

___That’s it! I’m sure this raised your curiosity as to what the DSL is actually capable of so you might want to look at the list of supported filter operators in _https://github.com/infinispan/infinispan/blob/6.0.0.Beta1/query-dsl/src/main/java/org/infinispan/query/dsl/FilterConditionEndContext.java[FilterConditionEndContext]. Combining multiple conditions with boolean operators, including sub-conditions, is also possible:

Query q = qf.from(User.class)
    .having("name").eq("John")
    .and().having("surname").eq("Doe")
    .and().not(qf.having("address.street").like("%Tanzania%").or().having("address.postCode").in("TZ13", "TZ22"))
    .toBuilder().build();

The DSL is pretty nifty right now and will surely be expanded in the future based on your feedback. It also provides support for result pagination, sorting, projections, embedded objects, all demonstrated in QueryDslConditionsTest which I encourage you to look at until the proper user guide is published. Still, this is not a relational database, so keep in mind that all queries are written in the scope of the single targeted entity (and its embedded entities). There are no joins (yet), no correlated subqueries, no grouping or aggregations.

Moving further, probably the most exciting thing about the new DSL is using it remotely via the Hot Rod client. But to make this leap we first had to adopt a common format for storing our cache entries and marshalling them over the wire that would also be cross-language and robust enough to support evolving object schemas. But probably most of all, this format had to have a schema rather than just being an opaque blob otherwise indexing and searching are meaningless. Enter Protocol Buffers.

The Protobuf marshaller

Configuring the RemoteCacheManager of the Java Hot Rod client to use it is straight forward: __

import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
...

ConfigurationBuilder clientBuilder = new ConfigurationBuilder();
clientBuilder.addServer()
    .host("127.0.0.1").port(11234)
    .marshaller(new ProtoStreamMarshaller());

___Now you’ll be able to store and get from the remote cache your _User instaces encoded in protobuf format provided that:

  1. a Protobuf type was declared for your entity in a .proto file which was then compiled into a .protobin binary descriptor

  2. the binary descriptor was registered with your RemoteCacheManager's ProtoStreamMarshaller instance like this: __

ProtoStreamMarshaller.getSerializationContext(remoteCacheManager)
    .registerProtofile("my-test-schema.protobin");

__3. a per-entity marshaller was registered:

ProtoStreamMarshaller.getSerializationContext(remoteCacheManager)
    .registerMarshaller(User.class, new UserMarshaller());

___Steps 2 and 3 are closely tied to the way Protosteam library works, which is pretty straight forward but cannot be detailed here. Having a look at our _UserMarshaller sample should clear this up.

Keeping your objects stored in protobuf format has the benefit of being able to consume them with compatible clients written in other languages. But if this does not sound enticing enough probably the fact they can now be easily indexed should be more appealing.

Remote querying via the Hot Rod client

Given a RemoteCacheManager configured as previously described the next steps to enable remote query over its caches are:

  1. add the DSL jar to client’s classpath, infinispan-remote-query-server.jar to server’s classpath and infinispan-remote-query-client.jar to both

  2. enable indexing in your cache configuration - same as for embedded mode

  3. register your protobuf binary descriptor by invoking the 'registerProtofile' method of the server’s ProtobufMetadataManager MBean (one instance per EmbeddedCacheManager)

All data placed in cache now is being indexed without the need to annotate your entities for Hibernate Search. In fact these classes are only meaningful to the Java client and do not even exist on the server.

Running the queries over the Hot Rod client is now very similar to embedded mode. The DSL is in fact the same. The only part that is slightly different is how you obtain the QueryFactory:

__

import org.infinispan.client.hotrod.Search;
import org.infinispan.query.dsl.QueryFactory;
import org.infinispan.query.dsl.Query;
...

remoteCache.put(2, new User("John", "Doe", 33));

QueryFactory qf = Search.getQueryFactory(remoteCache);

Query query = qf.from(User.class)
    .having("name").eq("John")
    .toBuilder().build();

List list = query.list();
assertEquals(1, list.size());
assertEquals("John", list.get(0).getName());
assertEquals("Doe", list.get(0).getSurname());

__

  

Voila! The end of our journey for today! Stay tuned, keep an eye on Infinispan Query and please share your comments with us.

Posted by Unknown on 2013-09-26
Tags: protostream hotrod lucene Protobuf remote query hibernate search embedded query Infinispan Query DSL

News

Tags

JUGs alpha as7 asymmetric clusters asynchronous beta c++ cdi chat clustering community conference configuration console data grids data-as-a-service database devoxx distributed executors docker event functional grouping and aggregation hotrod infinispan java 8 jboss cache jcache jclouds jcp jdg jpa judcon kubernetes listeners meetup minor release off-heap openshift performance presentations product protostream radargun radegast recruit release release 8.2 9.0 final release candidate remote query replication queue rest query security spring streams transactions vert.x workshop 8.1.0 API DSL Hibernate-Search Ickle Infinispan Query JP-QL JSON JUGs JavaOne LGPL License NoSQL Open Source Protobuf SCM administration affinity algorithms alpha amazon anchored keys annotations announcement archetype archetypes as5 as7 asl2 asynchronous atomic maps atomic objects availability aws beer benchmark benchmarks berkeleydb beta beta release blogger book breizh camp buddy replication bugfix c# c++ c3p0 cache benchmark framework cache store cache stores cachestore cassandra cdi cep certification cli cloud storage clustered cache configuration clustered counters clustered locks codemotion codename colocation command line interface community comparison compose concurrency conference conferences configuration console counter cpp-client cpu creative cross site replication csharp custom commands daas data container data entry data grids data structures data-as-a-service deadlock detection demo deployment dev-preview development devnation devoxx distributed executors distributed queries distribution docker documentation domain mode dotnet-client dzone refcard ec2 ehcache embedded embedded query equivalence event eviction example externalizers failover faq final fine grained flags flink full-text functional future garbage collection geecon getAll gigaspaces git github gke google graalvm greach conf gsoc hackergarten hadoop hbase health hibernate hibernate ogm hibernate search hot rod hotrod hql http/2 ide index indexing india infinispan infinispan 8 infoq internationalization interoperability interview introduction iteration javascript jboss as 5 jboss asylum jboss cache jbossworld jbug jcache jclouds jcp jdbc jdg jgroups jopr jpa js-client jsr 107 jsr 347 jta judcon kafka kubernetes lambda language learning leveldb license listeners loader local mode lock striping locking logging lucene mac management map reduce marshalling maven memcached memory migration minikube minishift minor release modules mongodb monitoring multi-tenancy nashorn native near caching netty node.js nodejs non-blocking nosqlunit off-heap openshift operator oracle osgi overhead paas paid support partition handling partitioning performance persistence podcast presentation presentations protostream public speaking push api putAll python quarkus query quick start radargun radegast react reactive red hat redis rehashing releaase release release candidate remote remote events remote query replication rest rest query roadmap rocksdb ruby s3 scattered cache scripting second level cache provider security segmented server shell site snowcamp spark split brain spring spring boot spring-session stable standards state transfer statistics storage store store by reference store by value streams substratevm synchronization syntax highlighting tdc testing tomcat transactions tutorial uneven load user groups user guide vagrant versioning vert.x video videos virtual nodes vote voxxed voxxed days milano wallpaper websocket websockets wildfly workshop xsd xsite yarn zulip

back to top