Getting started with Elasticsearch using Lucee Part 2
In Part 1 of this basic introduction to Elasticsearch (ES) for Lucee developers, we covered the basics of creating, populating, updating and deleting an ES index. We also briefly tested searching an index by passing a keyword in the request query string: /blogposts/_search?q=simplicity
However, to access the full power of ES our search requests will need to contain a lot more than just a "q=" parameter. ES provides a sophisticated API that lets us control precisely how the search should operate and, like the indexing requests we looked at before, these instructions need to be built up and passed to ES as JSON.
Executing a search
To take a very basic scenario, we again want to search our "blogposts" index for the term "simplicity", but we need the request to:
- search the three content fields: "title", "summary" and "body"
- prioritize in the results posts that have our search term in the title
- be able to display the title in our search results listing
- limit the results to the most relevant 10 posts
The following code will generate the request and convert the response to a CFML struct that we can work with:
searchQueryString = "simplicity";
// ^3 means "give this field a boost in terms of relevance scoring"
fieldsToSearch = [ "title^3", "summary", "body" ];
fieldsToReturnForDisplay = [ "title" ];
// build our search request instructions for ES
q = {
    size: 10
    ,_source: fieldsToReturnForDisplay
    ,query: {
        query_string: {
            query: searchQueryString
            ,fields: fieldsToSearch
        }
    }
};
// pass the instructions as JSON to ES
requestBody = serializeJson( q );
http url="http://localhost:9200/blogposts/_search" method="POST" result="result" {
    httpParam type="header" name="Content-Type" value="application/json";
    httpParam type="body" value=requestBody;
}
// convert the JSON response to a CFML struct
resultsData = deserializeJson( result.filecontent );
dump( resultsData );
Working with the results
In the dumped struct of the converted JSON response we can see the data that we'll need for our search results display page, but we first need to pull those bits out and assemble them as a simple data structure we can pass to a view.
// pull out the hits array from the ES response
elasticHitsData = resultsData.hits.hits;
// convert each hit to a struct with just 2 keys - "id" and "title" - which we can use for our results display
convertElasticHitsToDisplayData = function( hit ){
return { id: hit._id, title: hit._source.title };
};
displayData = elasticHitsData.map( convertElasticHitsToDisplayData );
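Each hit carries more than the id and source fields, and the response also reports how many documents matched overall. The following is only a rough sketch: the resultsData struct is a hand-written stand-in for a real response (its values are made up), and it assumes that "hits.total" is a plain number on older ES versions and a struct with a "value" key on newer ones.

```cfml
// hypothetical, hand-written stand-in for the deserialized ES response
resultsData = {
    "hits": {
        "total": { "value": 1, "relation": "eq" }
        ,"hits": [
            { "_id": "1", "_score": 1.5, "_source": { "title": "Example title" } }
        ]
    }
};

// older ES versions return a plain number here, newer ones a struct with a "value" key
total = resultsData.hits.total;
totalHits = isStruct( total ) ? total.value : total;

// keep the relevance score alongside the id and title
displayData = resultsData.hits.hits.map( function( hit ){
    return { id: hit._id, title: hit._source.title, score: hit._score };
} );

dump( totalHits );
dump( displayData );
```

The total is useful for a "showing 10 of N results" message, and the score can help when debugging why one post ranks above another.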
We could pass the display data directly to our search results page view to avoid further processing, but it can just as easily be converted to a query for ease of output.
// create a query from the simplified data
displayResultsData = queryNew( "id,title", "numeric,varchar", displayData );
dump( displayResultsData );
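As a minimal sketch of what the view might then do with that query, the code below loops over the rows and outputs a linked list of titles. The sample rows and the "/post/" URL pattern are assumptions made purely for illustration.

```cfml
// hypothetical sample rows standing in for the converted ES hits
displayData = [
    { id: 1, title: "First post" }
    ,{ id: 2, title: "Second & third thoughts" }
];
displayResultsData = queryNew( "id,title", "numeric,varchar", displayData );

// loop over the query; each row is exposed as a struct
writeOutput( "<ul>" );
for ( row in displayResultsData ) {
    // the "/post/" URL pattern is made up; encode the title for safe HTML output
    writeOutput( '<li><a href="/post/#row.id#">#encodeForHtml( row.title )#</a></li>' );
}
writeOutput( "</ul>" );
```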
More complex, but more powerful
Having been introduced to the basics of working with an ES index, you may be thinking that compared to the ColdFusion search tags it all seems quite a bit more complicated. And you'd be right, although modularizing the code into wrapper components will reduce the complexity and is highly recommended.
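To give an idea of what such a wrapper might look like, here is a rough sketch of a component that hides the request building and HTTP call behind a single search() method. The component name, method name and arguments are all hypothetical, and the status check is an assumption about how you might want to handle failures.

```cfml
// ElasticSearchClient.cfc (hypothetical component name)
component {

    public function init( string baseUrl = "http://localhost:9200" ){
        variables.baseUrl = arguments.baseUrl;
        return this;
    }

    // run a query_string search against an index and return the array of hits
    public array function search(
        required string index
        ,required string keywords
        ,required array fieldsToSearch
        ,required array fieldsToReturn
        ,numeric maxResults = 10
    ){
        var requestBody = serializeJson( {
            "size": arguments.maxResults
            ,"_source": arguments.fieldsToReturn
            ,"query": {
                "query_string": {
                    "query": arguments.keywords
                    ,"fields": arguments.fieldsToSearch
                }
            }
        } );
        http url="#variables.baseUrl#/#arguments.index#/_search" method="POST" result="local.result" {
            httpParam type="header" name="Content-Type" value="application/json";
            httpParam type="body" value=requestBody;
        }
        // assumption: anything other than a 200 response is treated as a failure
        if ( local.result.status_code != 200 ) {
            throw( type="ElasticSearchClient.RequestFailed", message="Search request failed", detail=local.result.filecontent );
        }
        var resultsData = deserializeJson( local.result.filecontent );
        return resultsData.hits.hits;
    }

}
```

With something like this in place, the calling code shrinks to a couple of lines:

```cfml
es = new ElasticSearchClient();
hits = es.search( "blogposts", "simplicity", [ "title^3", "summary", "body" ], [ "title" ] );
```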
But what you lose in comparative simplicity with ES you gain many times over in flexibility and control. No longer are you limited to the basic "title"/"description" plus 4 custom fields allowed by <cfindex>. You can put in what you like and get out what you like, and also tweak the way things work in many, many other ways.
I've only covered the absolute basics here, but ES supports all of the standard features you may be using with <cfsearch> such as pagination, context highlighting and "suggestions", as well as many other features not supported in CFML.
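As a taste of two of those features, the sketch below extends the earlier search request with the "from" parameter for pagination and a "highlight" section for context highlighting. The page variables are hypothetical, and the struct keys are quoted here simply to guarantee their case survives serialization.

```cfml
pageNumber = 2; // hypothetical: the results page the visitor asked for
pageSize = 10;

q = {
    // number of hits to skip before the first one returned
    "from": ( pageNumber - 1 ) * pageSize
    ,"size": pageSize
    ,"_source": [ "title" ]
    ,"query": {
        "query_string": {
            "query": "simplicity"
            ,"fields": [ "title^3", "summary", "body" ]
        }
    }
    // ask ES for highlighted fragments from the "body" field
    ,"highlight": {
        "fields": { "body": {} }
    }
};

requestBody = serializeJson( q );
dump( requestBody );
```

The request is sent exactly as before. Each hit with a match in the highlighted field should then include a "highlight" key containing an array of text fragments for that field, which can be read in the same map() callback used for the title.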