Content export

GET https://live.luigisbox.com/v1/content_export

The content export endpoint returns all objects stored in our catalog (except those of type query), in no particular order. It returns a list of records identified by their canonical (relative) URLs, along with their attributes and nested fields.

If you are only interested in certain types of records, you can use the requested_types parameter to control which types are present in the output.

The results returned by this API endpoint are paginated. To get to the next page, follow the href attribute of the link with "rel": "next" in the links section. When you receive a response that contains no link with "rel": "next", there are no more pages to fetch and you have downloaded the full export (see the pagination sketch below).

  • Output of the API is not sorted.
  • This API is not designed for real-time consumption. If you wish to search within the catalog, use our autocomplete and search endpoints.
  • You have 10 minutes to use the next page's link before it expires.
This endpoint requires HMAC authentication. Refer to the Authentication section for details.
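
To download the full export, keep following the next links until a response contains none. Below is a minimal Ruby sketch; signed_get is a hypothetical helper that performs an authenticated GET with the Date and Authorization headers built as in the sample requests further down.

require 'json'

# Minimal pagination sketch. signed_get(url) is a hypothetical helper performing
# an authenticated GET (Date + Authorization headers) as in the samples below.
next_url = "https://live.luigisbox.com/v1/content_export"

while next_url
  page = JSON.parse(signed_get(next_url).body)

  page["objects"].each do |object|
    # Process the exported record, e.g. store it in your own database.
  end

  # Follow the "next" link if present; a page without one is the last page.
  next_link = (page["links"] || []).find { |link| link["rel"] == "next" }
  next_url = next_link && next_link["href"]
end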

Query Parameters

Parameter Description
size Number of results in one response/page. Optional, with a default value of 300. Values greater than 500 are capped at 500.
hit_fields Optional. A comma-separated list of fields. Only these fields (in addition to the record identifier) will be retrieved and present in the results. If not provided, all fields will be present in the results.
requested_types Optional. A comma-separated list of types. Only records of these types will be retrieved and present in the results. If not provided, all types except the query type will be present in the results.
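
For example, the following request asks for pages of 500 records, restricted to records of the item and category types, and returns only the title and price fields (the field and type names here are illustrative):

GET https://live.luigisbox.com/v1/content_export?size=500&requested_types=item,category&hit_fields=title,price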

Request Headers

Consider also sending an Accept-Encoding request header listing the encoding methods your HTTP client supports, e.g. gzip, or br, gzip, deflate for multiple methods. Compressed responses from the Content export endpoint are considerably smaller and thus faster to transfer.
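
For example, a client that can decode Brotli, gzip and deflate compression would send:

Accept-Encoding: br, gzip, deflate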

Sample request

require 'faraday'
require 'faraday_middleware'
require 'json'
require 'time'
require 'openssl'
require 'base64'

def digest(key, method, endpoint, date)
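  # Compose the string to sign (method, content type, date and endpoint path,
  # separated by newlines), then compute its HMAC-SHA256 and Base64-encode the result.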
  content_type = 'application/json; charset=utf-8'

  data = "#{method}\n#{content_type}\n#{date}\n#{endpoint}"

  dg = OpenSSL::Digest.new('sha256')
  Base64.strict_encode64(OpenSSL::HMAC.digest(dg, key, data)).strip
end


public_key = "<your-public-key>"
private_key = "<your-private-key>"

date = Time.now.httpdate # HTTP-date format; the same value is signed and sent in the Date header

connection = Faraday.new(url: 'https://live.luigisbox.com') do |conn|
  conn.use FaradayMiddleware::Gzip
end

response = connection.get("/v1/content_export") do |req|
  req.headers['Content-Type'] = "application/json; charset=utf-8"
  req.headers['Date'] = date
  req.headers['Authorization'] = "faraday #{public_key}:#{digest(private_key, "GET", "/v1/content_export", date)}"
end

if response.success?
  puts JSON.pretty_generate(JSON.parse(response.body))
else
  puts "Error, HTTP status #{response.status}"
  puts response.body
end

#!/bin/bash

digest() {
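  # Sign "METHOD\nCONTENT-TYPE\nDATE\nENDPOINT" with HMAC-SHA256 and Base64-encode the result.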
  KEY=$1
  METHOD=$2
  CONTENT_TYPE="application/json; charset=utf-8"
  ENDPOINT=$3
  DATE=$4

  DATA="$METHOD\n$CONTENT_TYPE\n$DATE\n$ENDPOINT"

  printf "$DATA" | openssl dgst -sha256 -hmac "$KEY" -binary | base64
}


public_key="<your-public-key>"
private_key="<your-private-key>"

date=$(env LC_ALL=en_US date -u '+%a, %d %b %Y %H:%M:%S GMT')
signature=$(digest "$private_key" "GET" "/v1/content_export" "$date")

# --compressed asks the server for a compressed response and decompresses it automatically
curl -i -XGET --compressed \
  -H "Date: $date" \
  -H "Content-Type: application/json; charset=utf-8" \
  -H "Authorization: curl $public_key:$signature" \
  "https://live.luigisbox.com/v1/content_export"

<?php

// Using Guzzle (http://guzzle.readthedocs.io/en/latest/overview.html#installation)
require 'vendor/autoload.php';

function digest($key, $method, $endpoint, $date) {
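  // Sign the method, content type, date and endpoint path (newline-separated)
  // with HMAC-SHA256 and Base64-encode the result.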
  $content_type = 'application/json; charset=utf-8';

  $data = "{$method}\n{$content_type}\n{$date}\n{$endpoint}";

  $signature = trim(base64_encode(hash_hmac('sha256', $data, $key, true)));

  return $signature;
}


$date = gmdate('D, d M Y H:i:s T');

$public_key = "<your-public-key>";
$private_key = "<your-private-key>";

$signature = digest($private_key, 'GET', '/v1/content_export', $date);

$client = new GuzzleHttp\Client();
$res = $client->request('GET', "https://live.luigisbox.com/v1/content_export", [
  'headers' => [
    'Accept-Encoding' => 'gzip, deflate',
    'Content-Type' => 'application/json; charset=utf-8',
    'Date' => $date,
    'Authorization' => "guzzle {$public_key}:{$signature}",
  ],
]);

echo $res->getStatusCode();
echo $res->getBody();

// This configuration and code work with the Postman tool
// https://www.getpostman.com/
//
// Start by creating the required HTTP headers in the "Headers" tab
//  - Accept-Encoding: gzip, deflate
//  - Content-Type: application/json; charset=utf-8
//  - Authorization: {{authorization}}
//  - Date: {{date}}
//
// The {{variable}} is a postman variable syntax. It will be replaced
// by values precomputed by the following pre-request script.

var privateKey = "your-secret";
var publicKey = "your-tracker-id";

var requestPath = '/v1/content_export';
var timestamp = new Date().toUTCString();
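// The string to sign is the HTTP method, Content-Type, Date and request path joined by newlines.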
var signature = ['GET', "application/json; charset=utf-8", timestamp, requestPath].join("\n");

var encryptedSignature = CryptoJS.HmacSHA256(signature, privateKey).toString(CryptoJS.enc.Base64);

postman.setGlobalVariable("authorization", "ApiAuth " + publicKey + ":" + encryptedSignature);
postman.setGlobalVariable("date", timestamp);

// This endpoint requires no body

The above requests return JSON structured like this.

{
  "total": 14256,
  "objects": [
    {
      "url": "/item/1",
      "attributes":{
        "title": "Super product 1",
        ...
      },
      "nested": [],
      "type": "item",
      "exact": true
    },
    ...
  ],
  "links": [
    {
      "rel": "next",
      "href": "https://live.luigisbox.com/v1/content_export?cursor=23937182663"
    }
  ]
}

Tips

  • Make sure that you are requesting only the fields you want to export by using the hit_fields parameter. This is much simpler and more efficient than requesting all fields and filtering out the relevant ones afterwards.