Indexing Data overview
To use Luigi's Box product discovery APIs, you need to synchronize your product catalog with our service. Once we have your catalog, we continuously and automatically match its products with our analytics data and adjust their ranking. This ensures that your search results are not only accurate but also optimized based on user behavior. Luigi's Box supports two primary methods for catalog synchronization: near real-time updates via our Content Updates API and periodic processing of data feeds (XML, JSON, or CSV). Feeds are supported, but the API is the preferred method because it avoids the stale data that can build up between periodic feed runs.
Choose your integration path
Content Updates API
The preferred method for near real-time synchronization of your product catalog with our search index.
- Real-time synchronization: Keep your search index accurate and up-to-date, avoiding issues with stale data from periodic feed processing.
- Flexible updates: Atomically update individual product attributes (like price or stock) with Partial Content Updates without re-indexing the entire object.
- Bulk processing: Efficiently handle full catalog imports and large-scale changes using batching and the Content Generations feature to prevent stale items.
Feeds (XML, JSON, CSV)
Use XML, JSON, or CSV files for periodic, batch-based synchronization of your catalog.
- Broad format support: Works with any valid XML, JSON, or CSV file, making it easy to adapt existing data exports, such as a Google Merchant feed.
- Simple for batch systems: A straightforward choice if your product data is already managed in a system that generates periodic data dumps.
- Handles complex data: Can merge data from up to five separate feeds using a join attribute to construct complete product information from different sources.
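To make the join attribute idea concrete, here is a minimal sketch of how records from several feeds could be combined into one product record each. It only illustrates the concept and is not how Luigi's Box implements feed merging. The function name `merge_feeds`, the `sku` join attribute, the sample records, and the rule that later feeds overwrite earlier ones on conflicting keys are all assumptions made for this example.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def merge_feeds(feeds: List[Iterable[Record]], join_attribute: str) -> List[Record]:
    """Combine records that share the same join attribute value into one record.

    Hypothetical helper: later feeds overwrite earlier ones on conflicting keys.
    """
    if len(feeds) > 5:
        # Mirrors the five-feed limit mentioned in the text above.
        raise ValueError("at most five feeds can be merged")
    merged: Dict[Any, Record] = {}
    for feed in feeds:
        for record in feed:
            key = record.get(join_attribute)
            if key is None:
                continue  # a record without the join attribute cannot be matched
            merged.setdefault(key, {}).update(record)
    # Dicts preserve insertion order, so products appear in first-seen order.
    return list(merged.values())


# Illustrative data: three feeds describing the same products by "sku".
products = [{"sku": "A1", "title": "Espresso machine"}, {"sku": "B2", "title": "Milk frother"}]
prices = [{"sku": "A1", "price": "199.00 EUR"}, {"sku": "B2", "price": "29.00 EUR"}]
stock = [{"sku": "A1", "availability": 1}]

catalog = merge_feeds([products, prices, stock], join_attribute="sku")
```

In this sketch, a product that is missing from one feed (here `B2` has no stock record) still ends up in the merged catalog, just without the attributes that feed would have contributed.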
Core concepts
Regardless of the integration path you choose, understanding these core concepts is essential for a successful implementation.
- Data Layout: Luigi's Box follows a "convention over configuration" approach. While you have flexibility in naming attributes, several special fields have predefined behaviors that impact search results and ranking. For example, `availability` is used to sort available products first, `price` is used for faceting and value extraction, and `image_link` is used by frontend components to display product images.
- Object Identity: Before indexing, it's crucial to establish a stable, unique identity for every object (product, category, etc.). This identity is the key that links the indexed object in your catalog to the behavioral data collected through analytics events, forming the feedback loop that powers our AI-driven ranking.
- Content Export: Once your data is indexed, you can retrieve it at any time using the Content Export API. This endpoint allows you to download your entire catalog stored in Luigi's Box for verification, backup, or other purposes. The results are paginated and the API is designed for bulk export, not real-time consumption.
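As a rough illustration of how a paginated export can be used for verification, the sketch below walks every page of an export and reports which local object identities are missing from it. It deliberately makes no real HTTP calls. The response shape (an `objects` list plus a `next` cursor), the `identity` key, and the `fetch_page` callable are assumptions for this example, not the documented Content Export API contract, so check the API reference for the actual fields and authentication.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional

Page = Dict[str, Any]


def iter_exported_objects(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield every object from a paginated export, following cursors to the end.

    `fetch_page` is a hypothetical callable: it takes a cursor (None for the
    first page) and returns a dict with "objects" and an optional "next" cursor.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["objects"]
        cursor = page.get("next")
        if cursor is None:
            break


def missing_identities(local_ids: List[str], exported: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the local identities that do not appear in the exported objects."""
    indexed = {obj["identity"] for obj in exported}
    return [identity for identity in local_ids if identity not in indexed]


# Stand-in for a real HTTP client, so the sketch runs without network access.
FAKE_PAGES: Dict[Optional[str], Page] = {
    None: {"objects": [{"identity": "sku-1"}, {"identity": "sku-2"}], "next": "page-2"},
    "page-2": {"objects": [{"identity": "sku-3"}], "next": None},
}


def fake_fetch(cursor: Optional[str]) -> Page:
    return FAKE_PAGES[cursor]


missing = missing_identities(["sku-1", "sku-3", "sku-4"], iter_exported_objects(fake_fetch))
```

Because the export is meant for bulk use rather than real-time consumption, a check like this fits a scheduled job better than a request path. It also depends on the identities being stable: if they change between indexing and export, every object would be reported as missing.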