Search API
Use the search endpoint to add full-text search with advanced filtering options to your site.
To use this feature, we need to synchronize your product database with our search index. See Indexing the data for more details.
Luigi's Box Search can learn the best results ordering. In order to enable learning, you need to integrate Luigi's Box Search Analytics service with your website by following the instructions.
Search
GET https://live.luigisbox.com/search
Required Parameters
q |
User input (the query). Although listed here, q may be omitted: if you do not send it, the API applies only the filters (f[] parameter). This is useful for generating listing pages. |
tracker_id |
Identifier of your site within Luigi's Box. You can see this identifier in every URL in the Luigi's Box app once you are logged-in. |
Optional Parameters
f[] optional
|
Filter using key:value syntax, e.g., f[]=categories:Gadgets , to filter hits according to chosen criteria. Filtering on numerical and date attributes supports ranges, using a pipe as a separator, e.g., f[]=price:5|7 . The range can be left open on either side, e.g., f[]=price:6| . If several filters for the same field are provided, they are applied with OR. E.g., filters f[]=categories:jackets&f[]=categories:coats will retrieve products that have either the jackets OR the coats category. |
f_must[] optional
|
The same logic applies as for the f[] parameter, except when there are several f_must for the same attribute, they are treated as boolean AND. |
size optional
|
How many hits you want the endpoint to return. Defaults to 10, maximum is capped to 200. |
sort optional
|
Allows you to specify the ordering of the results, using attr:{asc|desc} syntax, e.g., sort=created_at:desc . When sorting by a geo field (e.g., sort=geo_location:asc ), the search request must also contain context[geo_location] representing the visitor's location. |
quicksearch_types optional
|
A comma separated list of other content types (e.g., category, brand, helpdesk content), which should be (also) searched for alongside the main type (products). These will be without any facets though. |
facets optional
|
A comma separated list of facets you want to have included in the response. Any entry can be given as facet_name:values_count to set how many values are returned for that facet, e.g. facets=category,material:5 (default values count is 30). |
dynamic_facets_size optional
|
If you wish our service to include additional, dynamically identified facets in the response, send the maximum number of such facets in this parameter. Defaults to 0 , i.e., no dynamically identified facets are returned. Dynamic identification of facets is based mainly on categories of retrieved items and their interesting attributes. |
page optional
|
Which page of the results you want the endpoint to return. Defaults to 1. |
from optional
|
If you prefer to use an equivalent of offset instead of page number, you can pass it as from parameter, which should be a non-negative integer. An equivalent of page=1 would be from=0 . |
use_fixits optional
|
Controls the use of fixit rules. Use use_fixits=1 or use_fixits=true to explicitly enable fixit rules. Use other values (such as use_fixits=false ) to disable fixit rules for the current request. The default value is true , so fixit rules are enabled by default. Look for suggested_url in the response to find out whether our system indicates that a redirect should be performed and what the destination should be (based on a matched fixit rule). |
prefer[] optional
|
Soft filter, using key:value syntax e.g., prefer[]=category:Gadgets to prefer hits according to chosen criteria. See Query-time boosting for more details. |
hit_fields optional
|
A comma separated list of attributes and product parameters. Only these fields (in addition to some default ones) will be retrieved and present in results. If not provided, all fields will be present in results. |
remove_fields optional
|
A comma separated list of attributes and product parameters. If provided, these fields will be omitted from the results. If not provided, all fields will be present in results. |
user_id optional
|
If supplied and equal to the user id collected in analytics, it can drive personalization of search results. If you use identifiers of logged-in users (customer_id in analytics), put the ID of the logged-in user here and fill in the client_id parameter as well. |
client_id optional
|
Set this parameter to the client_id sent in analytics if you store the identifier of the logged-in user in user_id . |
ctx[] optional
|
Drives model selection, using key:value syntax, e.g., ctx[]=warehouse:berlin . You can provide multiple key:value pairs, which are combined into one context definition. The order of key:value pairs in the request is not important. However, please note that the key:value pairs must match one of the contexts that are being reported to Luigi's Box Search Analytics. See the multi-warehouse solution and context in analytics for more details. |
qu optional
|
Controls the query understanding process. Use qu=1 or qu=0 to turn it on or off. This feature is currently off by default. Important: if you want to use this feature, you must also include the user_id parameter with the value of the _lb cookie from your site. Look for suggested_url in the response to find out whether our system indicates that a redirect should be performed and what the destination should be (based on the results of the query understanding process). |
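The parameters above can be combined into a single query string. The following is a minimal sketch using only Ruby's standard library; build_search_url is a hypothetical helper (not part of any Luigi's Box SDK), and the tracker_id and filter values are placeholders. Note that URI.encode_www_form percent-encodes the brackets in f[] (as f%5B%5D), which is the standard URL-encoded form of the same parameter name.

```ruby
require 'uri'

# Hypothetical helper: builds a search URL from a query, repeated filter
# parameters and any additional single-valued parameters.
def build_search_url(tracker_id:, query: nil, filters: [], must_filters: [], extra: {})
  params = [['tracker_id', tracker_id]]
  params << ['q', query] if query                      # q may be omitted for listing pages
  filters.each { |f| params << ['f[]', f] }            # same-field f[] values are ORed
  must_filters.each { |f| params << ['f_must[]', f] }  # same-field f_must[] values are ANDed
  extra.each { |key, value| params << [key.to_s, value.to_s] }
  "https://live.luigisbox.com/search?#{URI.encode_www_form(params)}"
end

url = build_search_url(
  tracker_id: '1234-5678', # placeholder tracker_id
  query: 'harry potter',
  filters: ['type:item', 'price:5|7'],
  extra: { size: 20, sort: 'created_at:desc', facets: 'category,material:5' }
)
puts url
```

Keeping the parameters as a list of pairs, rather than a hash, is what allows the same key (f[] or f_must[]) to be repeated.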
Context Parameters
See the standard solutions for more information about context parameter usage.
Multi-warehouse
Read more →
context[geo_location] optional
|
A comma separated pair of geographical coordinates (lat, lon) representing the visitor's location, e.g., context[geo_location]=49.0448,18.5530 . Allows the search to consider the distance between a visitor and the items they are searching for. To be able to consider geographical context in search, catalog objects also need to contain an attribute which holds geo coordinates. By default, we assume that these are stored at geo_location . |
context[geo_location_field] optional
|
The name of a custom field with geo coordinates to be used for geo search by context[geo_location] . If not defined, we assume that the coordinates are stored in the geo_location field, but you can override this by specifying context[geo_location_field]=my_field . |
context[availability_field] optional
|
Changes or disables the influence of item availability on results ranking. Without a context definition, the default availability field is considered for ranking. Supply context[availability_field]=my_custom_field to use your custom field instead. This field must contain an integer value (0 for unavailable items or 1 for available items). If you want to disable the influence of item availability on results ranking, set this context explicitly to nil: context[availability_field]=nil . |
context[availability_rank_field] optional
|
Changes or disables the influence of item availability_rank on results ranking. Without a context definition, the default availability_rank field is considered for ranking. Supply context[availability_rank_field]=my_custom_field to use your custom field instead. This field must contain an integer value (15 for unavailable items, or 1-14 for available items with descending priority, where 1 is most available). If you want to disable the influence of availability_rank on results ranking, set this context explicitly to nil: context[availability_rank_field]=nil . If both availability_rank_field and availability_field are defined, availability_rank_field has priority. If either attribute is set to nil, availability will be disabled. |
context[boost_field] optional
|
Changes the default field used for boosting, or disables the influence of boosting on results ranking. Without a context definition, the default boost field is considered for ranking. Provide context[boost_field]=my_custom_field to use your custom field instead. Make sure that your custom field contains integer values from the interval 0-3 (where a higher number means higher boosting priority). If you want to disable the influence of boosting on results ranking, set this context explicitly to nil: context[boost_field]=nil . |
context[freshness_field] optional
|
Changes or disables the influence of item freshness (boosting of new items) on results ranking. Without a context definition, the default freshness field is considered for ranking. Provide context[freshness_field]=my_custom_field to use your custom field instead. Make sure that your custom field holds a date/timestamp value in ISO 8601 format. If you want to disable the influence of freshness on results ranking, set this context explicitly to nil: context[freshness_field]=nil . |
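As an illustration of how context parameters sit alongside the regular ones, here is a small sketch that sorts by distance and disables freshness boosting. The coordinates, query and tracker_id are placeholder values chosen for this example; only the parameter names come from the tables above.

```ruby
require 'uri'

# Placeholder visitor coordinates as "lat,lon" strings (kept as strings so
# trailing zeros are not lost when formatting).
visitor_location = ['49.0448', '18.5530'].join(',')

params = [
  ['tracker_id', '1234-5678'],                   # placeholder tracker_id
  ['q', 'coffee'],                               # placeholder query
  ['f[]', 'type:item'],
  ['sort', 'geo_location:asc'],                  # geo sorting requires context[geo_location]
  ['context[geo_location]', visitor_location],
  ['context[freshness_field]', 'nil']            # the literal string "nil" disables freshness
]

context_url = "https://live.luigisbox.com/search?#{URI.encode_www_form(params)}"
puts context_url
```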
Request Headers
Consider also sending the Accept-Encoding request header with the encoding methods your HTTP client supports, e.g. gzip , or br, gzip, deflate for multiple supported methods. Encodings make the response from the JSON API considerably smaller and thus faster to transfer.
Example request
require 'faraday'
require 'faraday_middleware'
require 'json'
connection = Faraday.new(url: 'https://live.luigisbox.com') do |conn|
  conn.use FaradayMiddleware::Gzip
end
response = connection.get("/search?q=harry+potter&tracker_id=1234-5678")
if response.success?
  puts JSON.pretty_generate(JSON.parse(response.body))
else
  puts "Error, HTTP status #{response.status}"
  puts response.body
end
#!/bin/bash
curl -i -XGET --compressed \
  "https://live.luigisbox.com/search?q=harry+potter&tracker_id=1234-5678"
<?php
// Using Guzzle (http://guzzle.readthedocs.io/en/latest/overview.html#installation)
require 'GuzzleHttp/autoload.php';
$client = new GuzzleHttp\Client();
$res = $client->request('GET', "https://live.luigisbox.com/search?q=harry+potter&tracker_id=1234-5678", [
'headers' => [
'Accept-Encoding' => 'gzip'
]
]);
echo $res->getStatusCode();
echo $res->getBody();
// This endpoint requires no authentication
// This endpoint requires no body
HTTP Response
The response to a search request is structured JSON.
You will see two top-level fields: results and next_page . The results field contains all information about the requested results. The next_page field contains a link used for pagination to the next page of results.
Results fields
query |
Requested query (q request parameter) as a string. |
corrected_query optional
|
This field is returned only if Luigi's Box altered the requested query. See corrected_query. |
total_hits |
Number of hits found for requested type . |
hits |
A list of results for requested type . Content of each result item depends on data stored in catalog. |
facets |
A list of facets (requested or automatically identified) calculated for matched items. |
filters |
A list of filters used for matching results. |
quicksearch_types |
A list of results for all requested quicksearch_types . |
suggested_facet |
Optional. Indicates one facet with its facet values which Luigi's Box evaluated as most useful for the current situation. Can be used to provide an "assistant-like" user interface, where a user is presented with one question in each step, allowing them to efficiently narrow down the result set. |
suggested_url |
Optional. When the Luigi's Box algorithm recognizes the possibility to redirect the requested query (through query understanding or a fixit rule), it returns the redirect URL in this field. |
offset |
Deprecated, please ignore. |
campaigns optional
|
A list of campaigns for the query. See banner campaigns |
{
"results": {
"total_hits": 223,
"hits": [
{
"url": "http://www.e-shop.com/products/123456",
"attributes": {
"image_link": "http://www.e-shop.com/assets/imgs/products/123456.jpg",
"description": "Description field from your product catalog",
"categories": [
"Gadgets",
"Kids"
],
"categories_count": 2,
"title": "<em>Product</em> X",
"title.untouched": "Product X",
"availability_rank_text": "true",
"price": "5.52 EUR",
"condition": "new"
},
"type": "item"
},
{
"url": "http://www.e-shop.com/products/456789",
"attributes": {
"image_link": "http://www.e-shop.com/assets/imgs/products/456789.jpg",
"description": "Description field from your product catalog",
"categories": [
"Gadgets",
"Kids"
],
"categories_count": 2,
"title": "<em>Product</em> Y",
"title.untouched": "Product Y",
"availability_rank_text": "preorder",
"price": "12.14 EUR",
"condition": "new"
},
"type": "item"
}
],
"facets": [
{
"name": "type",
"type": "text",
"values": [
{
"value": "item",
"hits_count": 123
},
{
"value": "article",
"hits_count": 14
}
]
},
{
"name": "price",
"type": "float",
"values": [
{
"value": "0.0|9.0",
"hits_count": 1
},
{
"value": "9.0|18.0",
"hits_count": 1
}
]
},
{
"name": "categories_count",
"type": "float",
"values": [
{
"value": "1.0|2.0",
"hits_count": 147
},
{
"value": "2.0|3.0",
"hits_count": 71
}
]
},
{
"name": "created_at",
"type": "date",
"values": [
{
"value": "2017-10-23T00:00:00+00:00|2017-11-23T00:00:00+00:00",
"hits_count": 18
},
{
"value": "2017-11-23T00:00:00+00:00|2017-12-23T00:00:00+00:00",
"hits_count": 80
}
]
}
],
"offset": "20",
"campaigns": [
{
"id": 13,
"target_url": "https://www.e-shop.com/harry-potter",
"banners": {
"search_header": {
"desktop_url": "https://www.e-shop.com/harry-potter-1.jpg",
"mobile_url": "https://www.e-shop.com/harry-potter-2.jpg"
},
"search_footer": {
"desktop_url": "https://www.e-shop.com/harry-potter-3.jpg",
"mobile_url": "https://www.e-shop.com/harry-potter-4.jpg"
}
}
}
]
},
"next_page": "https://live.luigisbox.com/search?q=harry+potter&tracker_id=1234-5678&page=2"
}
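A minimal sketch of reading such a response in Ruby follows. The embedded JSON is a trimmed-down, illustrative copy of the example above, not real API output; in practice the string would come from the HTTP response body.

```ruby
require 'json'

# Trimmed-down response in the shape shown above (illustrative values only).
body = <<~JSON
  {
    "results": {
      "total_hits": 223,
      "hits": [
        { "url": "http://www.e-shop.com/products/123456",
          "attributes": { "title": "<em>Product</em> X", "price": "5.52 EUR" },
          "type": "item" }
      ],
      "facets": [
        { "name": "type", "type": "text",
          "values": [ { "value": "item", "hits_count": 123 } ] }
      ]
    },
    "next_page": "https://live.luigisbox.com/search?q=harry+potter&tracker_id=1234-5678&page=2"
  }
JSON

response = JSON.parse(body)
results = response.fetch('results')

# Each hit carries its catalog data under "attributes".
titles = results.fetch('hits', []).map { |hit| hit.dig('attributes', 'title') }
facet_names = results.fetch('facets', []).map { |facet| facet['name'] }
next_page = response['next_page'] # nil if the key is not present

puts "#{results['total_hits']} hits, first titles: #{titles.inspect}"
puts "facets: #{facet_names.inspect}"
puts "next page: #{next_page}"
```

Using fetch with a default and dig keeps the code from raising when optional fields are missing from a particular response.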
Facets
The returned facets are those available for the given query. Only a facet that is currently being filtered on returns all of its available values; other facets and their values are computed only from the filtered results.
Example:
Search query is "yamaha". Let's say the returned facet options are "Condition: new (822), used (1)" and "Category: guitars (423), pianos (400)". If you filter "Condition = used", the returned facet options will be "Category: guitars (1)" and "Condition: new (822), used (1)".
Integration with other Luigi's Box services
Query rewrite
Query rewrite is a way to control your search and autocomplete results. You can set up query rewrites in Luigi's Box application and they will have an effect on autocomplete and search results.
If you are using search.js then query rewrite will work out of the box and no integration is necessary on your side.
If you are using the API to build search, you must adapt your code to incorporate some of the query rewrite functionality.
Each query rewrite responds to exactly one search query (diacritics and case do not matter). You can choose whether to rewrite the query or keep the original one. You can also define filters that will be applied to your search requests for the given query.
When creating a query rewrite, you can choose whether to admit to the customer that you have rewritten the original query. You can also define a message that will be shown when the query rewrite is applied.
{
"query_rewrite": {
"id": 9,
"original_query": "mini guitar",
"admit_rewrite": true,
"message": "We rewrote your entered query to another with better search results for you."
}
}
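One way an API-based integration might handle this structure is sketched below. It assumes the parsed object passed in has query_rewrite at its top level, as in the snippet above; where exactly this object appears in a full search response is not specified here, so treat the lookup path as an assumption. rewrite_notice is a hypothetical helper name.

```ruby
require 'json'

payload = JSON.parse(<<~JSON)
  {
    "query_rewrite": {
      "id": 9,
      "original_query": "mini guitar",
      "admit_rewrite": true,
      "message": "We rewrote your entered query to another with better search results for you."
    }
  }
JSON

# Hypothetical helper: returns the message to show to the customer,
# or nil when no rewrite happened or it should not be admitted.
def rewrite_notice(payload)
  rewrite = payload['query_rewrite']
  return nil unless rewrite && rewrite['admit_rewrite']

  rewrite['message']
end

notice = rewrite_notice(payload)
puts notice if notice
```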
Banners
Search API response will include data related to banner campaigns set up in the application. Refer to the Banner campaigns documentation for more details.
Scenarios
Filtering search results
To implement filtering, use the f[] and f_must[] parameters.
By default, filters of the same type are applied with OR and filters of different types are applied with AND. E.g., a request with filters f[]=category:jackets&f[]=category:windproof will find products that have category jackets OR category windproof OR both, and a request with filters f[]=category:jackets&f[]=protection:windproof will find products that have category jackets AND protection windproof .
If you want to combine two filters of the same type in an AND-like fashion, use f_must[] instead of f[] . E.g., you want to find only products that have category jackets and category windproof matching the query 'adidas'. So instead of using this request
GET https://live.luigisbox.com/search?tracker_id=*your_tracker_id*&f[]=type:item&f[]=category:jackets&f[]=category:windproof&q=adidas
you should use this request
GET https://live.luigisbox.com/search?tracker_id=*your_tracker_id*&f[]=type:item&f_must[]=category:jackets&f_must[]=category:windproof&q=adidas
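The difference between the two requests can be sketched in code as follows. filter_params is a hypothetical helper, your_tracker_id is a placeholder, and the query is sent in the q parameter as documented in the parameter list.

```ruby
require 'uri'

# Hypothetical helper: turns { attribute => value(s) } into repeated
# [name, "attribute:value"] pairs for the given parameter name.
def filter_params(name, filters)
  filters.flat_map do |attribute, values|
    Array(values).map { |value| [name, "#{attribute}:#{value}"] }
  end
end

or_filters  = filter_params('f[]', { 'type' => 'item' })
and_filters = filter_params('f_must[]', { 'category' => %w[jackets windproof] })

params = [['tracker_id', 'your_tracker_id'], ['q', 'adidas']] + or_filters + and_filters
and_url = "https://live.luigisbox.com/search?#{URI.encode_www_form(params)}"
puts and_url
```

Passing the same category values through f[] instead of f_must[] would switch the request back to OR semantics for that attribute.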
Filtering using complex compound filters (OR, NOT)
You might have a use case where you need to submit a more complex filter, perhaps a compound of nested conditions, mixing the logic of and , or and not .
You can achieve this by changing the request method from the default GET to POST and submitting the complex filter within the request body as JSON.
Keep all the other parameters (tracker_id, q, ...) in the request URL. You can even put additional filters in the request URL. These will be combined with the complex filter using AND logic.
POST https://live.luigisbox.com/search?tracker_id=*your_tracker_id*&f[]=type:item&f[]=category:jackets&q=adidas
The payload must have filters at the top level, which is a Hash/Dict. Within filters , you first define the type to which the filter should be applied (e.g., item or product ) and then define the desired filter as a compound of (possibly deeply) nested combinations of and , or and not . Each individual low-level filter follows the same syntax as other filters (key and value separated by a colon, pipe used for ranges) and should be placed under the filter key where needed.
See the example below:
{
"filters": {
"item": {
"and": [
{
"or": [
{
"or": [
{
"filter": "price:1|3"
},
{
"filter": "price:9|"
}
]
},
{
"and": [
{
"filter": "category:foo"
},
{
"filter": "price:6"
}
]
}
]
},
{
"not": [
{
"filter": "price:2"
}
]
}
]
}
}
}
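A payload like this can be assembled programmatically rather than written by hand. The sketch below builds a smaller compound filter and prepares (but does not send) a POST request with Ruby's standard library. The helper names leaf, all_of, any_of and none_of are hypothetical, your_tracker_id is a placeholder, and the Content-Type header is an assumption, since the required headers are not stated here.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical builders for the nested filter structure.
def leaf(expression)
  { 'filter' => expression }
end

def all_of(*nodes)
  { 'and' => nodes }
end

def any_of(*nodes)
  { 'or' => nodes }
end

def none_of(*nodes)
  { 'not' => nodes }
end

# (price in 1..3 OR price >= 9) AND NOT price = 2, applied to type "item".
payload = {
  'filters' => {
    'item' => all_of(
      any_of(leaf('price:1|3'), leaf('price:9|')),
      none_of(leaf('price:2'))
    )
  }
}

uri = URI('https://live.luigisbox.com/search')
uri.query = URI.encode_www_form([['tracker_id', 'your_tracker_id'],
                                 ['f[]', 'type:item'],
                                 ['q', 'adidas']])

request = Net::HTTP::Post.new(uri)
request['Content-Type'] = 'application/json' # assumption: JSON body is declared this way
request.body = JSON.generate(payload)

# Sending is left commented out so the sketch runs offline:
# response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
puts request.body
```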
Notice about complex filters and facets
Unlike traditional filters, which are made not to influence the facet of the same type (e.g., you can see other categories in a category facet despite having a category filter turned on), the complex filter is always applied and facets cannot show values outside the scope defined by the filter.
Filtering with geographical distance
To filter results based on geographical distance from the user's current location, for example to find results within 50km, use f[]=geo_range:|50km . This way, all results with a geo location farther than 50km will be filtered out. (For this filter to work, you must have a geo field indexed within your data, and provide the geo location context in the search parameters.)
The pattern for the value of the geo range filter is lower_range|upper_range , and the lower and upper range need to match the pattern /\d+km/ . You can also omit the lower or upper range to achieve an open interval.
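A small sketch of producing valid geo_range values follows. geo_range_filter is a hypothetical helper; it formats whole kilometres so that each bound matches the /\d+km/ pattern, and leaves a side empty when that bound is omitted.

```ruby
# Hypothetical helper that formats a geo_range filter value.
# Either bound may be nil to leave that side of the interval open.
def geo_range_filter(min_km: nil, max_km: nil)
  raise ArgumentError, 'at least one bound is required' if min_km.nil? && max_km.nil?

  bounds = [min_km, max_km].map { |km| km.nil? ? '' : "#{Integer(km)}km" }
  "geo_range:#{bounds.join('|')}"
end

puts geo_range_filter(max_km: 50)             # geo_range:|50km
puts geo_range_filter(min_km: 10, max_km: 50) # geo_range:10km|50km
puts geo_range_filter(min_km: 10)             # geo_range:10km|
```

The resulting string is passed as an f[] value, together with context[geo_location] for the visitor's coordinates.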
Filtering and allowing missing values
By default, when a filter is used, items that are missing the required attribute are filtered out. If you don't want to filter out such items, you can use the special value 'value_missing' for the filter.
For example, if you want to get all the items that have the color attribute set to red OR don't have the color attribute specified at all, you could use this combination of filters.
f[]=color:red&f[]=color:value_missing
This special filter value is allowed for numeric , date , boolean and text filters.
Query correction
Luigi's Box search endpoint offers optional functionality that allows it to avoid no-results or low-relevance results for the search query.
If it recognizes that the requested query would end in a no-result state, it automatically augments the query to provide higher chances of finding results.
There are several ways a query can be augmented, depending on the type of the entered query. If a query includes a typo, such as searching for sheos instead of shoes , Luigi's Box can "fix" the typo prior to the actual search, in order to avoid a fuzzy search with uncertain results.
In this case, the corrected_query would be a string looking like this:
<strike>sheos</strike> <b>shoes</b>
If there is no typo but a part of the query is causing the no-result state, for example if there is no whiskey or whiskey shoes in the catalog and the query is shoes whiskey , the corrected query would be this:
shoes <strike>whiskey</strike>
The last case is a search query consisting of a code, for example 6834a88asc , when there is no product in the catalog with this code, only one with 6834a77asb . Since Luigi's Box is strict with codes and does not allow fuzziness for them, the query would end in a no-result state. But Luigi's Box can try to get a match with a corrected query, in which case it would look like this:
6834a<strike>88asc</strike>
In every case, the corrected query is an HTML representation of the augmented query that can be used to inform the user on the site that the original query was in fact altered in some way.
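If you need more than just rendering the HTML, the fragment can be post-processed. The sketch below assumes, based on the three examples above, that struck-through text marks what was dropped from the query and that everything else is what was kept; effective_query and removed_parts are hypothetical helper names.

```ruby
# Drops struck-through parts and any remaining tags, leaving the text
# that (by the assumption above) was kept in the augmented query.
def effective_query(corrected_query)
  corrected_query
    .gsub(%r{<strike>.*?</strike>}m, '')
    .gsub(/<[^>]+>/, '')
    .split
    .join(' ')
end

# Lists the parts of the original query that were struck out.
def removed_parts(corrected_query)
  corrected_query.scan(%r{<strike>(.*?)</strike>}m).flatten
end

puts effective_query('<strike>sheos</strike> <b>shoes</b>') # shoes
puts effective_query('shoes <strike>whiskey</strike>')      # shoes
p removed_parts('6834a<strike>88asc</strike>')              # ["88asc"]
```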
Best practices
Provide filter for the main type
Make sure that you are requesting only the type that you want to search in. The API searches in all types by default: you send a request with a query and we return a mix of results from all types. Even if you are not explicitly indexing multiple types, we always automatically index your users' queries (type = queries), so you will always get mixed results by default. We sometimes see clients request large numbers of results and then filter for the relevant types locally, but there is a much simpler and more efficient way to do this. Simply request search results only for the relevant type by adding a type filter: f[]=type:item .
Request all types in a single HTTP request
Searching across multiple types is a very frequent requirement, e.g. you want to show search results for products, brands and categories on a single search results page. You can get results for several types by using quicksearch_types .
The facets, sorting, filtering and pagination apply only to the main type that you specify via the f[] filter on the type attribute.
The results for the quicksearch_types will be in the quicksearch_hits structure in the JSON response.
GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&quicksearch_types=category,brand&q=ukulele
Use pagination
The API supports pagination: you can page through the result set by using the page parameter. Request a smaller size of results for better performance and let the user request more.
GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&size=30&page=2
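Besides incrementing page yourself, you could follow the next_page link returned in each response. The sketch below does that with an injected fetch callable, so it runs without network access; collect_hits is a hypothetical helper, the canned pages are made-up data, and stopping when next_page is missing is an assumption about how the last page looks. The max_pages guard keeps the loop bounded either way.

```ruby
require 'json'

# Collects hits across pages by following next_page links.
# `fetch` is any callable that takes a URL and returns the response body.
def collect_hits(first_url, fetch, max_pages: 5)
  hits = []
  url = first_url
  max_pages.times do
    break if url.nil? # assumption: no next_page means there is nothing more to load

    page = JSON.parse(fetch.call(url))
    hits.concat(page.dig('results', 'hits') || [])
    url = page['next_page']
  end
  hits
end

# Stubbed fetcher with two canned pages (illustrative data only).
pages = {
  'page1' => { 'results' => { 'hits' => [{ 'url' => 'a' }] }, 'next_page' => 'page2' },
  'page2' => { 'results' => { 'hits' => [{ 'url' => 'b' }] } }
}
fetch = ->(url) { JSON.generate(pages.fetch(url)) }

all_hits = collect_hits('page1', fetch)
puts all_hits.length
```

In a real integration, fetch would wrap your HTTP client, and you would typically load the next page only when the user asks for more results.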
Avoid default explicit sorting
The results are sorted by Luigi's Box AI by default. The AI models are only involved if you do not specify explicit sorting, though. Once you set the sort parameter, the results are ordered by the sort field you requested and not by AI.
Use dynamic facets
For products with hundreds, perhaps thousands of different parameters, it is often impossible to settle on a static list of filters ( facets ) to show to the users. Use the dynamic_facets_size parameter to let the AI model choose the most suitable facets for the given phrase. Compare the two requests below. The search request for "ukulele" will compute facets such as "Bridge" or "Finish", while the search request for "piano" responds with facets such as "Number of Keys".
GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&facets=price_amount,category&dynamic_facets_size=3
GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=piano&facets=price_amount,category&dynamic_facets_size=3
Avoid loading unnecessary data
By default, the search API will include all visible product attributes in the search response. Most of the time, that is not necessary and you can improve performance and decrease latency by only asking for the attributes that you will need and use. The hit_fields parameter drives the attribute selection. You pass it a comma separated list of attributes that you require to be included in the API response, such as hit_fields=image_link,price . Note that title is always returned by default, whether you specify it or not.
GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&hit_fields=image_link,price
Note that the API response has 2.13kB (at the time of writing) while the original unfiltered API response has 8.23kB. That's roughly a 4-fold improvement.
Alternatively, you can use the reverse approach and, instead of specifying what should be included, specify which attributes should be excluded by setting remove_fields . It is, again, a comma separated list of attributes that you want to remove from the API response, such as remove_fields=image_link,price .
GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&remove_fields=image_link,price
Notice that the nested data is included in the API response implicitly and you can remove it via remove_fields . For the smallest possible API response size and the best latency, combine hit_fields with remove_fields=nested .
GET https://live.luigisbox.com/search?tracker_id=179075-204259&f[]=type:product&q=ukulele&hit_fields=image_link,price&remove_fields=nested
Notice that the API response is only 1.8kB for this scenario (at the time of writing).