Search API

Use the search endpoint to get full-text search with advanced filtering options.

To use this feature, we need to synchronize your product database with our search index. See Indexing the data for more details.

Luigi's Box Search can learn the best results ordering. In order to enable learning, you need to integrate Luigi's Box Search Analytics service with your website by following the instructions.

The search endpoint is publicly available and requires no authentication.

Search

GET https://live.luigisbox.com/search

Required Parameters

   
q User input (the search query). Despite being listed here, q is optional: if you do not send it, the API applies only the filters (f[] parameter). This is useful for generating listing pages.
tracker_id Identifier of your site within Luigi's Box. You can see this identifier in every URL in the Luigi's Box app once you are logged in.

Optional Parameters

   
f[] (optional) Filter using key:value syntax, e.g., f[]=categories:Gadgets, to filter hits according to the chosen criteria.

Filtering on top of numerical and date attributes supports ranges, using pipe as a separator, e.g., f[]=price:5|7. This range can be left open from either side, e.g., f[]=price:6|.

If several filters are provided for the same field, they are combined with OR. E.g., the filters f[]=categories:jackets&f[]=categories:coats retrieve products that have either the jackets OR the coats category.
f_must[] (optional) The same logic applies as for the f[] parameter, except that several f_must[] filters for the same attribute are combined with boolean AND.
size (optional) How many hits you want the endpoint to return. Defaults to 10; the maximum is capped at 200.
sort (optional) Allows you to specify the ordering of the results, using attr:{asc|desc} syntax, e.g., sort=created_at:desc. When sorting by a geo field (e.g., sort=geo_location:asc), the search request must also contain context[geo_location] representing the visitor's location.
sort_type (optional) A type-specific sort, where the type part of the parameter name is the name of the requested type. Allows you to specify the ordering of results of that type, using attr:{asc|desc} syntax, e.g., sort_item=created_at:desc.

You can use several sorts in one request, e.g., sort_item=price_amount:asc together with sort_article=introduced_at:desc.
quicksearch_types (optional) A comma-separated list of other content types (e.g., category, brand, helpdesk content) that should also be searched alongside the main type (products). No facets are returned for these types.
facets (optional) A comma-separated list of facets you want included in the response. Each entry can be given as facet_name:values_count, e.g., facets=category,material:5 (the default values count is 30).
dynamic_facets_size (optional) If you wish our service to include additional, dynamically identified facets in the response, send the maximum number of such facets in this parameter. Defaults to 0, i.e., no dynamically identified facets are returned. Dynamic identification of facets is based mainly on the categories of retrieved items and their interesting attributes.
page (optional) Which page of the results you want the endpoint to return. Defaults to 1.
from (optional) If you prefer to use an offset instead of a page number, you can pass it as the from parameter, which should be a non-negative integer. The equivalent of page=1 is from=0.
use_fixits (optional) Controls the use of fixit rules. Use use_fixits=1 or use_fixits=true to explicitly enable fixit rules. Use other values (such as use_fixits=false) to disable fixit rules for the current request. The default value is true, so fixit rules are enabled by default. Look for suggested_url in the response to find out whether our system indicates that a redirect should be performed and what its destination should be (based on a matched fixit rule).
prefer[] (optional) Soft filter, using key:value syntax, e.g., prefer[]=category:Gadgets, to prefer hits according to the chosen criteria. See Query-time boosting for more details.
hit_fields (optional) A comma-separated list of attributes and product parameters. Only these fields (in addition to some default ones) will be retrieved and present in the results. If not provided, all fields will be present in the results.
remove_fields (optional) A comma-separated list of attributes and product parameters. If provided, these fields will be omitted from the results. If not provided, all fields will be present in the results.
user_id (optional) If supplied and equal to the user ID collected in analytics, it can drive personalization of search results. If you use identifiers of logged-in users (customer_id in analytics), put the ID of the logged-in user here and fill in the client_id parameter as well.
client_id (optional) Set this parameter to the client_id sent in analytics if you store the identifier of the logged-in user in user_id.
ctx[] (optional) Drives model selection, using key:value syntax, e.g., ctx[]=warehouse:berlin. You can provide multiple key:value pairs, which are combined into one context definition. The order of key:value pairs in the request is not important. However, please note that the key:value pairs must match one of the contexts that are reported to Luigi's Box Search Analytics. See the Multi-warehouse solution and Context in analytics for more details.
qu (optional) Controls the query understanding process. Use qu=1 or qu=0 to turn it on or off. This feature is currently off by default. Important: if you want to use this feature, you must also include the user_id parameter with the value of the _lb cookie from your site. Look for suggested_url in the response to find out whether our system indicates that a redirect should be performed and what its destination should be (based on the results of the query understanding process).
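
The parameters above combine into an ordinary query string. The following is a minimal sketch, not an official client: the helper name build_search_url and its keyword arguments are hypothetical, and it only assembles the URL using Ruby's standard library without sending a request.

```ruby
require 'uri'

# Hypothetical helper: builds a search URL. `filters` maps an attribute name
# to one value or to a list of values; every value becomes its own f[] parameter.
def build_search_url(tracker_id:, query: nil, filters: {}, size: nil, sort: nil)
  params = [['tracker_id', tracker_id]]
  params << ['q', query] if query
  filters.each do |attribute, values|
    Array(values).each { |value| params << ['f[]', "#{attribute}:#{value}"] }
  end
  params << ['size', size] if size
  params << ['sort', sort] if sort

  uri = URI('https://live.luigisbox.com/search')
  # encode_www_form percent-encodes brackets, colons and pipes for us.
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

url = build_search_url(
  tracker_id: '1234-5678',
  query: 'harry potter',
  filters: { 'categories' => %w[Gadgets Kids], 'price' => '5|7' },
  size: 20,
  sort: 'created_at:desc'
)
puts url
```

Passing the parameters as a list of pairs rather than a hash keeps repeated keys such as f[] intact and in order.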

Context Parameters

See the standard solutions for more information about context parameter usage.

Multi-warehouse
   
context[geo_location] (optional) A comma-separated list of geographical coordinates (lat, lon) representing the visitor's location, e.g., context[geo_location]=49.0448,18.5530. Allows the search to consider the distance between a visitor and the items they are searching for. To be able to consider geographical context in search, catalog objects also need to contain an attribute that holds geo coordinates. By default, we assume that these are stored in geo_location.
context[geo_location_field] (optional) The name of a custom field with geo coordinates to be used for geo search by context[geo_location]. If not defined, we assume that the coordinates are stored in the geo_location field, but you can override this by specifying context[geo_location_field]=my_field.
context[availability_field] (optional) Allows you to change or disable the influence of item availability on results ranking. Without a context definition, the default availability field is considered for ranking. Supply context[availability_field]=my_custom_field to override this with your custom field. This field must contain an integer value (0 for unavailable items or 1 for available items). If you want to disable the influence of item availability on results ranking, set this context explicitly to nil: context[availability_field]=nil.
context[availability_rank_field] (optional) Allows you to change or disable the influence of item availability_rank on results ranking. Without a context definition, the default availability_rank field is considered for ranking. Supply context[availability_rank_field]=my_custom_field to override this with your custom field. This field must contain an integer value (15 for unavailable items or 1-14 for available items with descending priority, where 1 is most available). If you want to disable the influence of item availability_rank on results ranking, set this context explicitly to nil: context[availability_rank_field]=nil. If both availability_rank_field and availability_field are defined, availability_rank_field takes priority. If either attribute is set to nil, availability will be disabled.
context[boost_field] (optional) Allows you to change the default field used for boosting or to disable boosting in results ranking. Without a context definition, the default boost field is considered for ranking. Provide context[boost_field]=my_custom_field to change this to your custom field. Make sure that your custom field contains integer values from the interval 0-3 (where a higher number means higher boosting priority). If you want to disable the influence of boosting on results ranking, set this context explicitly to nil: context[boost_field]=nil.
context[freshness_field] (optional) Allows you to change or disable the influence of item freshness (boosting of new items) on results ranking. Without a context definition, the default freshness field is considered for ranking. Provide context[freshness_field]=my_custom_field to change this to your custom field. Make sure that your custom field holds a date/timestamp value in ISO 8601 format. If you want to disable the influence of freshness on results ranking, set this context explicitly to nil: context[freshness_field]=nil.

Request Headers

Consider also sending the Accept-Encoding request header, listing the encoding methods your HTTP client supports, e.g., gzip for a single method or br, gzip, deflate for multiple methods. Compressed responses from the JSON API are considerably smaller and thus faster to transfer.

Example request

require 'faraday'
require 'faraday_middleware'
require 'json'

connection = Faraday.new(url: 'https://live.luigisbox.com') do |conn|
  conn.use FaradayMiddleware::Gzip
end

response = connection.get("/search?q=harry+potter&tracker_id=1234-5678")

if response.success?
  puts JSON.pretty_generate(JSON.parse(response.body))
else
  puts "Error, HTTP status #{response.status}"
  puts response.body
end

#!/bin/bash

curl -i -XGET --compressed \
  "https://live.luigisbox.com/search?q=harry+potter&tracker_id=1234-5678"



<?php

// Using Guzzle (http://guzzle.readthedocs.io/en/latest/overview.html#installation)
require 'GuzzleHttp/autoload.php';


$client = new GuzzleHttp\Client();
$res = $client->request('GET', "https://live.luigisbox.com/search?q=harry+potter&tracker_id=1234-5678", [
  'headers' => [
    'Accept-Encoding' => 'gzip'
  ]
]);

echo $res->getStatusCode();
echo $res->getBody();

// This endpoint requires no authentication

// This endpoint requires no body

HTTP Response

The response to a search request is structured JSON. It has two top-level fields: results and next_page. The results field contains all information about the requested results. The next_page field contains a link to the next page of results, used for pagination.

Results fields

   
query Requested query (q request parameter) as a string.
corrected_query (optional) This field is returned only if Luigi's Box altered the requested query. See corrected_query.
total_hits Number of hits found for requested type.
hits A list of results for requested type. Content of each result item depends on data stored in catalog.
facets A list of facets (requested or automatically identified) calculated for matched items.
filters A list of filters used for matching results.
quicksearch_types A list of results for all requested quicksearch_types.
suggested_facet Optional. Indicates one facet with its facet values which Luigi's Box evaluated as most useful for the current situation. Can be used to provide an "assistant-like" user interface, where a user is presented with one question in each step, allowing them to efficiently narrow down the result set.
suggested_url Optional. When the Luigi's Box algorithm recognizes the possibility of redirecting the requested query (based on query understanding or a fixit rule), it returns the redirect URL in this field.
offset Deprecated, please ignore.
campaigns (optional) A list of campaigns for the query. See banner campaigns.
{
  "results": {
    "total_hits": 223,
    "hits": [
      {
        "url": "http://www.e-shop.com/products/123456",
        "attributes": {
          "image_link": "http://www.e-shop.com/assets/imgs/products/123456.jpg",
          "description": "Description field from your product catalog",
          "categories": [
            "Gadgets",
            "Kids"
          ],
          "categories_count": 2,
          "title": "<em>Product</em> X",
          "title.untouched": "Product X",
          "availability_rank_text": "true",
          "price": "5.52 EUR",
          "condition": "new"
        },
        "type": "item",
        "updated_at": "2017-11-23T00:00:00+00:00"
      },
      {
        "url": "http://www.e-shop.com/products/456789",
        "attributes": {
          "image_link": "http://www.e-shop.com/assets/imgs/products/456789.jpg",
          "description": "Description field from your product catalog",
          "categories": [
            "Gadgets",
            "Kids"
          ],
          "categories_count": 2,
          "title": "<em>Product</em> Y",
          "title.untouched": "Product Y",
          "availability_rank_text": "preorder",
          "price": "12.14 EUR",
          "condition": "new"
        },
        "type": "item",
        "updated_at": "2017-11-23T00:00:00+00:00"
      }
    ],
    "facets": [
      {
        "name": "type",
        "type": "text",
        "values": [
          {
            "value": "item",
            "hits_count": 123
          },
          {
            "value": "article",
            "hits_count": 14
          }
        ]
      },
      {
        "name": "price",
        "type": "float",
        "values": [
          {
            "value": "0.0|9.0",
            "hits_count": 1
          },
          {
            "value": "9.0|18.0",
            "hits_count": 1
          }
        ]
      },
      {
        "name": "categories_count",
        "type": "float",
        "values": [
          {
            "value": "1.0|2.0",
            "hits_count": 147
          },
          {
            "value": "2.0|3.0",
            "hits_count": 71
          }
        ]
      },
      {
        "name": "created_at",
        "type": "date",
        "values": [
          {
            "value": "2017-10-23T00:00:00+00:00|2017-11-23T00:00:00+00:00",
            "hits_count": 18
          },
          {
            "value": "2017-11-23T00:00:00+00:00|2017-12-23T00:00:00+00:00",
            "hits_count": 80
          }
        ]
      }
    ],
    "offset": "20",
    "campaigns": [
      {
        "id": 13,
        "target_url": "https://www.e-shop.com/harry-potter",
        "banners": {
          "search_header": {
            "desktop_url": "https://www.e-shop.com/harry-potter-1.jpg",
            "mobile_url": "https://www.e-shop.com/harry-potter-2.jpg"
          },
          "search_footer": {
            "desktop_url": "https://www.e-shop.com/harry-potter-3.jpg",
            "mobile_url": "https://www.e-shop.com/harry-potter-4.jpg"
          }
        }
      }
    ]
  },
  "next_page": "https://live.luigisbox.com/search?q=harry+potter&tracker_id=1234-5678&page=2"
}
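
As a rough illustration of reading such a response, the sketch below parses a trimmed-down copy of the example body and pulls out the plain titles, the facet names, and the link to the next page. The variable names are arbitrary, and treating next_page as possibly absent is an assumption rather than documented behavior.

```ruby
require 'json'

# A trimmed-down response body in the shape shown above.
body = <<~BODY
  {
    "results": {
      "total_hits": 223,
      "hits": [
        {
          "url": "http://www.e-shop.com/products/123456",
          "attributes": {
            "title": "<em>Product</em> X",
            "title.untouched": "Product X",
            "price": "5.52 EUR"
          },
          "type": "item"
        }
      ],
      "facets": [
        { "name": "price", "type": "float",
          "values": [{ "value": "0.0|9.0", "hits_count": 1 }] }
      ]
    },
    "next_page": "https://live.luigisbox.com/search?q=harry+potter&tracker_id=1234-5678&page=2"
  }
BODY

response = JSON.parse(body)
results = response.fetch('results')

# "title.untouched" carries the title without highlighting markup.
titles = results.fetch('hits').map { |hit| hit.dig('attributes', 'title.untouched') }
facet_names = results.fetch('facets', []).map { |facet| facet['name'] }
next_page = response['next_page'] # assumption: may be nil when there is no further page

puts "#{results['total_hits']} hits, showing: #{titles.join(', ')}"
puts "Facets: #{facet_names.join(', ')}"
puts "Next page: #{next_page}" if next_page
```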

Facets

The returned facets are those available for the given query. Only the facet you are filtering on returns all of its available values; the other facets and their values are computed only from the filtered results.

Example:

Search query is "yamaha". Let's say the returned facet options are "Condition: new (822), used (1)" and "Category: guitars (423), pianos (400)". If you filter "Condition = used", the returned facet options will be "Category: guitars (1)" and "Condition: new (822), used (1)".
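
To render facet options in the "name: value (count)" style used in this example, something like the following sketch could be applied to each entry of the facets array. The helper name facet_label is hypothetical, and the input hash mirrors the facet structure from the response example.

```ruby
# Hypothetical helper: formats one facet from the response as
# "name: value (hits_count), value (hits_count)".
def facet_label(facet)
  options = facet['values'].map { |value| "#{value['value']} (#{value['hits_count']})" }
  "#{facet['name']}: #{options.join(', ')}"
end

type_facet = {
  'name' => 'type',
  'type' => 'text',
  'values' => [
    { 'value' => 'item', 'hits_count' => 123 },
    { 'value' => 'article', 'hits_count' => 14 }
  ]
}

puts facet_label(type_facet)
```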

Integration with other Luigi's Box services

Query rewrite

Query rewrite is a way to control your search and autocomplete results. You can set up query rewrites in Luigi's Box application and they will have an effect on autocomplete and search results.

If you are using search.js then query rewrite will work out of the box and no integration is necessary on your side.

If you are using the API to build search, you must adapt your code to incorporate some of the query rewrite functionality.

Each query rewrite responds to exactly one search query (diacritics and case do not matter). You can choose whether to rewrite the query or keep the original one. You can also define filters, which will be applied to your search requests for the given query.

When creating a query rewrite, you can choose whether to admit to the customer that you have rewritten the original query. You can also define a message, which will be shown when the query rewrite is applied.

{
  "query_rewrite": {
    "id": 9,
    "original_query": "mini guitar",
    "admit_rewrite": true,
    "message": "We rewrote your entered query to another with better search results for you."
  }
}
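
When building search on top of the API, one way to honor the admit_rewrite flag is sketched below. It assumes the query_rewrite object is available in the parsed JSON exactly as shown above; where it sits within a full search response is not specified here, and the helper name rewrite_notice is hypothetical.

```ruby
require 'json'

# Hypothetical helper: returns the message to show to the customer,
# or nil when no rewrite happened or it should not be admitted.
def rewrite_notice(payload)
  rewrite = payload['query_rewrite']
  return nil unless rewrite && rewrite['admit_rewrite']

  rewrite['message']
end

payload = JSON.parse(<<~BODY)
  {
    "query_rewrite": {
      "id": 9,
      "original_query": "mini guitar",
      "admit_rewrite": true,
      "message": "We rewrote your entered query to another with better search results for you."
    }
  }
BODY

notice = rewrite_notice(payload)
puts notice if notice
```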

Banners

Search API response will include data related to banner campaigns set up in the application. Refer to the Banner campaigns documentation for more details.

Scenarios

Filtering search results

To implement filtering, use the f[] and f_must[] parameters.

By default, filters of the same type are combined with OR and filters of different types are combined with AND. E.g., a request with the filters f[]=category:jackets&f[]=category:windproof will find products that have category jackets OR category windproof OR both, and a request with the filters f[]=category:jackets&f[]=protection:windproof will find products that have category jackets AND protection windproof.

If you want to combine two filters of the same type in an AND-like fashion, use f_must[] instead of f[]. E.g., suppose you want to find only products that match the query 'adidas' and have both category jackets and category windproof. Instead of using this request

GET https://live.luigisbox.com/search?tracker_id=*your_tracker_id*&f[]=type:item&f[]=category:jackets&f[]=category:windproof&q=adidas

you should use this request

GET https://live.luigisbox.com/search?tracker_id=*your_tracker_id*&f[]=type:item&f_must[]=category:jackets&f_must[]=category:windproof&q=adidas
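
The difference between the two requests comes down to which parameter name carries the category filters. A small sketch, with a hypothetical helper search_query and a placeholder tracker ID, that builds both query strings:

```ruby
require 'uri'

# Hypothetical helper: `or_filters` are sent as f[], `and_filters` as f_must[].
def search_query(tracker_id:, query:, or_filters: [], and_filters: [])
  params = [['tracker_id', tracker_id], ['q', query]]
  params += or_filters.map { |filter| ['f[]', filter] }
  params += and_filters.map { |filter| ['f_must[]', filter] }
  URI.encode_www_form(params)
end

# jackets OR windproof
either = search_query(
  tracker_id: 'your_tracker_id', query: 'adidas',
  or_filters: %w[type:item category:jackets category:windproof]
)

# jackets AND windproof
both = search_query(
  tracker_id: 'your_tracker_id', query: 'adidas',
  or_filters: %w[type:item],
  and_filters: %w[category:jackets category:windproof]
)

puts "https://live.luigisbox.com/search?#{either}"
puts "https://live.luigisbox.com/search?#{both}"
```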

Filtering using complex compound filters (OR, NOT)

You might have a use-case where you need to submit a more complex filter, perhaps a compound of nested conditions, mixing logic of and, or or not. You can achieve this by changing the request method to POST from default GET and submitting the complex filter within request body as JSON. Keep all the other parameters (tracker_id, q, ...) in the request URL. You can even put additional filters in the request URL. These will be combined using AND logic with the complex filter.

POST https://live.luigisbox.com/search?tracker_id=*your_tracker_id*&f[]=type:item&f[]=category:jackets&q=adidas

The payload must have filters at the top level, as a Hash/Dict. Within filters, you first name the type to which the filter should be applied (e.g., item or product) and then define the desired filter as a (possibly deeply) nested combination of and, or and not. Each individual low-level filter follows the same syntax as other filters (key and value separated by a colon, pipe used for ranges) and should be placed under a filter key where needed. See the example below:

{
  "filters": {
    "item": {
      "and": [
        {
          "or": [
            {
              "or": [
                {
                  "filter": "price:1|3"
                },
                {
                  "filter": "price:9|"
                }
              ]
            },
            {
              "and": [
                {
                  "filter": "category:foo"
                },
                {
                  "filter": "price:6"
                }
              ]
            }
          ]
        },
        {
          "not": [
            {
              "filter": "price:2"
            }
          ]
        }
      ]
    }
  }
}
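
Writing such nested payloads by hand is error-prone, so a few tiny helpers can make the structure easier to read. The sketch below rebuilds the payload above and prepares (but does not send) the POST request using Ruby's standard library. The helper names are hypothetical, and the Content-Type header value is an assumption, since the text only says the body is JSON.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helpers that mirror the keys of the payload format.
def leaf(expression)
  { 'filter' => expression }
end

def all_of(*conditions)
  { 'and' => conditions }
end

def any_of(*conditions)
  { 'or' => conditions }
end

def none_of(*conditions)
  { 'not' => conditions }
end

payload = {
  'filters' => {
    'item' => all_of(
      any_of(
        any_of(leaf('price:1|3'), leaf('price:9|')),
        all_of(leaf('category:foo'), leaf('price:6'))
      ),
      none_of(leaf('price:2'))
    )
  }
}

# The other parameters stay in the URL; only the complex filter goes in the body.
uri = URI('https://live.luigisbox.com/search')
uri.query = URI.encode_www_form(tracker_id: 'your_tracker_id', q: 'adidas')

request = Net::HTTP::Post.new(uri)
request['Content-Type'] = 'application/json' # assumption
request.body = JSON.generate(payload)

# To actually send it:
# response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
puts request.body
```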

Notice about complex filters and facets

Unlike traditional filters, which are made not to influence the facet of the same type (e.g., you can see other categories in a category facet despite having a category filter turned on), the complex filter is always applied and facets cannot show values outside the scope defined by the filter.

Filtering with geographical distance

To filter results based on geographical distance from the user's current location, for example to find results within 50km, use f[]=geo_range:|50km. This way, all results with a geo location farther than 50km will be filtered out. (For this filter to work, you must have a geo field indexed within your data, and provide the geo location context in the search parameters.)

The pattern for the value of the geo range filter is lower_range|upper_range, and both the lower and upper range need to match the pattern /\d+km/. You can also omit the lower or upper range to achieve an open interval.
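
A small sketch of producing a well-formed geo range value, together with the location context it depends on. The helper name geo_range_filter is hypothetical, and rejecting a call with no bounds at all is a choice made for this example, not a documented API rule.

```ruby
require 'uri'

# Hypothetical helper: builds "geo_range:<lower>|<upper>" where each bound is
# either empty (open interval) or a whole number of kilometres followed by "km".
def geo_range_filter(min_km: nil, max_km: nil)
  raise ArgumentError, 'at least one bound is required' if min_km.nil? && max_km.nil?

  lower = min_km ? "#{Integer(min_km)}km" : ''
  upper = max_km ? "#{Integer(max_km)}km" : ''
  "geo_range:#{lower}|#{upper}"
end

params = [
  ['tracker_id', 'your_tracker_id'],
  ['f[]', geo_range_filter(max_km: 50)],
  ['context[geo_location]', '49.0448,18.5530'] # the visitor's lat,lon
]
puts URI.encode_www_form(params)
```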

Filtering and allowing missing values

By default, when a filter is used, items that are missing the required attribute are filtered out. If you want to keep such items, you can use the special value 'value_missing' for the filter.

For example, if you want to get all the items that have the color attribute set to red OR do not have the color attribute specified at all, you can use this combination of filters:

f[]=color:red&f[]=color:value_missing

This special filter value is allowed for numeric, date, boolean and text filters.

Query correction

Luigi's Box search endpoint offers optional functionality that allows it to avoid no-results or low-relevance results for the search query. If it recognizes that the requested query would end in a no-result state, it automatically augments the query to provide higher chances of finding results. A query can be augmented in different ways, depending on the type of the entered query. If a query includes a typo, such as searching for sheos instead of shoes, Luigi's Box can "fix" the typo prior to the actual search, in order to avoid a fuzzy search with uncertain results.

In this case, the corrected_query would be a string looking like this:

<strike>sheos</strike> <b>shoes</b>

If there is no typo but a part of the query is causing the no-result state, for example if the query is shoes whiskey and there is no whiskey or whiskey shoes in the catalog, the corrected query would be this:

shoes <strike>whiskey</strike>

The last case is a search query consisting of a code, for example 6834a88asc, when there is no product in the catalog with this code and only one with 6834a77asb. Since Luigi's Box is strict with codes and does not allow fuzziness for them, the query would end in a no-result state. But Luigi's Box can try to get a match with a corrected query, in which case it would look like this:

6834a<strike>88asc</strike>

In every case, the corrected query is an HTML representation of the augmented query, which can be used to inform the user on the site that the original query was in fact altered in some way.
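
If you also need a plain-text version of the augmented query (for example, to prefill the search box), one possible approach is to drop the struck-through parts and strip the remaining tags. The sketch below assumes corrected_query only uses simple tags such as the strike and b tags in the examples above; the helper name effective_query is hypothetical.

```ruby
# Hypothetical helper: removes <strike>...</strike> segments, strips any
# remaining tags, and normalizes whitespace.
def effective_query(corrected_query)
  corrected_query
    .gsub(%r{<strike>.*?</strike>}m, '')
    .gsub(/<[^>]+>/, '')
    .split
    .join(' ')
end

puts effective_query('<strike>sheos</strike> <b>shoes</b>')
puts effective_query('shoes <strike>whiskey</strike>')
puts effective_query('6834a<strike>88asc</strike>')
```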

Best practices

Provide filter for the main type

Make sure that you are requesting only the type that you want to search in. The API searches in all types by default: you send a request with a query and we return a mix of results from all types. Even if you are not explicitly indexing multiple types, we always automatically index your users' queries (type = queries), so you will always get mixed results by default. We sometimes see clients requesting large numbers of results and then filtering out the relevant types locally, but there is a much simpler and more efficient way to do this. Simply request search results only for the relevant type by adding a type filter: f[]=type:item.

Request all types in a single HTTP request

Searching across multiple types is a very frequent requirement, e.g., you want to show search results for products, brands and categories on a single search results page. You can get results for several types by using quicksearch_types. Facets, sorting, filtering and pagination apply only to the main type that you specify via the f[] filter on the type attribute.

The results for the quicksearch_types will be in the quicksearch_hits structure in the JSON response.

GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&quicksearch_types=category,brand&q=ukulele


Use pagination

The API supports pagination: you can page through the result set by using the page parameter. Request a smaller number of results for better performance and let the user request more.

GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&size=30&page=2
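
If your application tracks page numbers but you would rather send an offset, the conversion can be sketched as below. The only documented equivalence is page=1 and from=0; the general formula (page - 1) * size is an assumption based on the usual meaning of an offset, and the helper name page_window is hypothetical.

```ruby
# Hypothetical helper: converts a 1-based page number into size/from values.
# The size is capped at 200, the documented maximum.
def page_window(page:, size: 10)
  raise ArgumentError, 'page must be 1 or greater' if page < 1

  capped_size = [size, 200].min
  { 'size' => capped_size, 'from' => (page - 1) * capped_size }
end

puts page_window(page: 1, size: 30)
puts page_window(page: 2, size: 30)
```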


Avoid default explicit sorting

The results are sorted by Luigi's Box AI by default. The AI models are only involved if you do not specify explicit sorting though. Once you set the sort parameters, the results are ordered by the sort field you requested and not by AI.

Use dynamic facets

For products with hundreds, perhaps thousands of different parameters, it is often impossible to settle on a static list of filters (facets) to show to the users. Use the dynamic_facets_size parameter to let the AI model choose the most suitable facets for the given phrase. Compare the two requests below. The search request for "ukulele" will compute facets such as "Bridge" or "Finish", while the search request for "piano" responds with facets such as "Number of Keys".

GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&facets=price_amount,category&dynamic_facets_size=3


GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=piano&facets=price_amount,category&dynamic_facets_size=3


Avoid loading unnecessary data

By default, the search API includes all visible product attributes in the search response. Most of the time, that is not necessary, and you can improve performance and decrease latency by asking only for the attributes that you will actually use. The hit_fields parameter drives the attribute selection. You pass it a comma-separated list of attributes that you require to be included in the API response, such as hit_fields=image_link,price. Note that title is always returned by default, whether you specify it or not.

GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&hit_fields=image_link,price


Note that the API response is 2.13 kB (at the time of writing), while the original unfiltered API response is 8.23 kB. That's roughly a 4-fold improvement.

Alternatively, you can use the reverse approach: instead of specifying what should be included, specify which attributes should be excluded by setting remove_fields. It is, again, a comma-separated list of attributes that you want to remove from the API response, such as remove_fields=image_link,price.

GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&remove_fields=image_link,price


Notice that the nested data is included in the API response implicitly and you can remove it via remove_fields. For the smallest possible API response size and the best latency, combine hit_fields with remove_fields=nested.

GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&hit_fields=image_link,price&remove_fields=nested


Notice that the API response is only 1.8kB for this scenario (at the time of writing).