Searching with Apache Solr

This tutorial is adapted from the Web Age course Apache Solr for Data Engineers.

Part 1. Solr Sets

Solr offers many search capabilities, but which ones are useful depends on the data in the set. As with SQL, it is important to know and understand the data set before running high-level queries on it.

Let’s work again with the tech products data set to get a feel for types of queries that make sense with the data we have.

  1. To launch Solr, run:

bin/solr start -e techproducts

  1. Open the Solr Admin web interface and view the data by choosing techproducts and clicking the Execute Query button. Next, set the start and rows fields both to 0, check the box next to facet, and enter price in the facet.field field. Now click Execute Query again. What do you see?

The result is a count of how many times each unique price was found in the data. Notice the link at the top of the Solr Admin tool. This is the query we ran.

http://localhost:8983/solr/techproducts/select?facet.field=price&facet=on&q=*%3A*&rows=0&start=0&wt=json

Look familiar? Ever done a Google search?
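The same kind of URL can be assembled in code instead of through the Admin form. The following is a minimal sketch, assuming the local techproducts example from this lab; the helper name facet_count_url is hypothetical and not part of Solr. Note that Python's urlencode also escapes the asterisks (q=%2A%3A%2A), which is equivalent to the q=*%3A* shown above.

```python
from urllib.parse import urlencode

# Base URL taken from the tutorial's local techproducts example.
SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def facet_count_url(field: str) -> str:
    """Build a select URL that returns only facet counts for one field.

    Hypothetical helper for illustration; it only builds a string and
    does not contact Solr.
    """
    params = {
        "q": "*:*",            # match every document
        "rows": 0,             # return no documents, only the counts
        "start": 0,
        "facet": "on",
        "facet.field": field,  # the field whose unique values are counted
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


# Print one URL per field mentioned in the lab.
for name in ("price", "cat", "inStock"):
    print(facet_count_url(name))
```

Each printed URL can be pasted into a browser or passed to curl while Solr is running.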

  1. Try a few more fields like cat or inStock. To see all the field names replace the start, rows with new values of 0 and
  2. Next return to the terminal. Here is an example of a simple facet query being run through curl. Try this now:

curl http://localhost:8983/solr/techproducts/query -d '
{
  "query": "*:*",
  "facet": {
    "high_popularity": {
      "type": "query",
      "q": "popularity:[8 TO 10]",
      "facet": {
        "average_price": "avg(price)"
      }
    }
  }
}'
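For readers who prefer to script the request, here is a minimal sketch of the same facet request built as a Python dictionary. The function names high_popularity_request and post_json are hypothetical; post_json uses only the standard library and assumes Solr is running locally as in this lab, so it is defined but not called here.

```python
import json
import urllib.request

# Endpoint taken from the curl command in this lab.
SOLR_QUERY = "http://localhost:8983/solr/techproducts/query"


def high_popularity_request() -> dict:
    """Return the JSON body used in the curl example as a Python dict."""
    return {
        "query": "*:*",
        "facet": {
            "high_popularity": {
                "type": "query",
                "q": "popularity:[8 TO 10]",
                # Nested facet: computed only over documents in this bucket.
                "facet": {"average_price": "avg(price)"},
            }
        },
    }


def post_json(url: str, body: dict) -> dict:
    """POST the body as JSON and decode the reply (needs a running Solr)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Show the request body; this does not contact Solr.
print(json.dumps(high_popularity_request(), indent=2))
# With Solr running: post_json(SOLR_QUERY, high_popularity_request())
```

Building the body as a dictionary and serializing it with json.dumps avoids the quoting problems that come with hand-typed JSON in a shell.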

 

Scroll up in the terminal until you see the response header information. How many documents were searched for this query? What restrictions were made using facet?

 

Now scroll to the bottom. Here is the average price, rolled up over the products that fall in the high_popularity bucket, out of the 32 products in the set.
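Rather than scrolling through the terminal, the interesting values can be pulled out of the decoded reply. The sketch below assumes the usual reply layout for this kind of request: a responseHeader section and a facets section holding a top-level count plus one entry per named facet. The function name summarize_facets is hypothetical, and the sample dictionary is a placeholder: only the figure 32 comes from this lab, while the bucket count and the average price are invented so the code has something to run on.

```python
def summarize_facets(response: dict) -> dict:
    """Collect the header status and the facet figures from a decoded reply."""
    header = response.get("responseHeader", {})
    facets = response.get("facets", {})
    bucket = facets.get("high_popularity", {})
    return {
        "status": header.get("status"),
        "documents_searched": facets.get("count"),
        "high_popularity_count": bucket.get("count"),
        "average_price": bucket.get("average_price"),
    }


# Placeholder reply, NOT real Solr output: the bucket count (2) and the
# average price (100.0) are invented values for illustration only.
sample = {
    "responseHeader": {"status": 0},
    "facets": {
        "count": 32,
        "high_popularity": {"count": 2, "average_price": 100.0},
    },
}

print(summarize_facets(sample))
```

Using .get with defaults keeps the helper from raising an error if a section is missing from the reply.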

Part 2. Searching with Highlighting

Highlighting is the ability to pick out a particular search value in a set of data. Maybe it is a manufacturer like Apple or Corsair, or maybe something more specific like SATA or SDRAM that we are searching for.

  1. Build a query and pass it from

In this example, we are going to use highlighting to find the manufacturer Corsair. In the terminal, type the following:

curl "http://localhost:8983/solr/techproducts/select?fl=id%2Cname%2Cmanu%2Ccat&hl.fl=manu&hl=on&q=%27%3Dcorsair%27&rows=100&start=0&wt=json"

The result shows:

A perfect match of the criteria requested.
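The percent-encoded URL is hard to read, so it can help to decode it and see exactly which criteria were requested. This is a minimal sketch using only Python's standard library; the helper name decoded_params is hypothetical, and the code never contacts Solr.

```python
from urllib.parse import parse_qs, urlsplit

# The highlighting URL from this lab, split across lines for readability.
HIGHLIGHT_URL = (
    "http://localhost:8983/solr/techproducts/select"
    "?fl=id%2Cname%2Cmanu%2Ccat&hl.fl=manu&hl=on"
    "&q=%27%3Dcorsair%27&rows=100&start=0&wt=json"
)


def decoded_params(url: str) -> dict:
    """Return the query parameters of a URL with percent-encoding removed."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; each key appears once in this URL.
    return {key: values[0] for key, values in parse_qs(query).items()}


for key, value in decoded_params(HIGHLIGHT_URL).items():
    print(f"{key} = {value}")
```

Decoded, fl is the comma-separated list id,name,manu,cat and q is '=corsair' including the single quotes, which are the same values typed into the Admin interface fields.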

  1. Now let’s see if we can use the Solr Admin interface to produce the same result. Choose the techproducts data again and move to the Query screen (http://localhost:8983/solr/#/techproducts/query). Here we will begin to modify the query to find the results we want. First, change the q field to ‘=corsair’ and set start and rows to 0 and 100 respectively. To limit the fields being returned to id, name, manu, and cat, enter id,name,manu,cat in the fl field, with no spaces. Next, check the box next to hl. Finally, type manu in the hl.fl field and click Execute Query. What do you see?

 

  1. Try different queries for Apple, SDRAM, and SATA. Remember that you may need to match different fields for the later ones. Hint: name.
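One way to approach this exercise is to generate a highlighting URL per search term. The sketch below is an assumption-laden illustration: it uses Solr's field:term query syntax instead of the quoted form shown earlier, the helper name highlight_url is hypothetical, and the pairing of each term with a field (manu for Apple, name for SDRAM and SATA) is a guess based on the hint, to be checked against the actual data.

```python
from urllib.parse import urlencode

# Base URL taken from the tutorial's local techproducts example.
SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def highlight_url(term: str, field: str) -> str:
    """Build a select URL that searches one field and highlights matches in it."""
    params = {
        "q": f"{field}:{term}",    # field-qualified query (assumed syntax choice)
        "fl": "id,name,manu,cat",  # fields returned, as in the lab
        "hl": "on",
        "hl.fl": field,            # highlight the same field that is searched
        "rows": 100,
        "start": 0,
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


# Term/field pairs are guesses for illustration; adjust after inspecting the data.
for term, field in [("Apple", "manu"), ("SDRAM", "name"), ("SATA", "name")]:
    print(highlight_url(term, field))
```

If a URL returns no highlighting for a term, try pairing it with a different field.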

 

Press CTRL + c to exit the Shell window. This is the last step in this lab.

  1. Now let’s stop the Solr daemon. Type:

 

bin/solr stop -all

 

  1. Close the web

Part 3. Review

In this tutorial, we learned about the Solr lab environment and queried Solr.
