Train data import

The behavioral recommenders are trained on user behavior. Luigi's Box analytics collects co-purchases (items bought together in the same session, by the same user, etc.). However, it takes some time from the start of data collection until there is enough data to learn good-quality recommendations. To shorten this learning period, we can import a log of historical transactions from a file.

The import file must be in JSON or CSV format. It has two mandatory attributes, session_id and identity, which serve as the basis for learning global (anonymous) co-purchases. The file can contain two additional attributes, user_id and created_at, which, if present, allow the transaction metadata to be stored in a user profile and improve personalization.

Attribute    Required   Description
session_id   required   Any value that identifies products (rows) purchased in the same session.
identity     required   Resource identifier of the purchased product.
user_id      optional   ID of the user who purchased the product.
created_at   optional   Timestamp of the purchase, used to order purchases in time.
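Before exporting historical transactions, it can help to check each row against the attributes listed above. The following is a minimal sketch of such a check. The function name validate_row and the message wording are hypothetical and are not part of any Luigi's Box tooling; the check only covers the presence of the two required attributes.

```python
# Attribute names taken from the table above.
REQUIRED = ("session_id", "identity")


def validate_row(row: dict) -> list[str]:
    """Return a list of problems found in one transaction row (empty if none).

    Hypothetical helper: it only checks that the required attributes are
    present and non-empty. It does not validate timestamp formats.
    """
    problems = []
    for name in REQUIRED:
        value = row.get(name)
        if value is None or str(value).strip() == "":
            problems.append(f"missing required attribute: {name}")
    return problems


if __name__ == "__main__":
    print(validate_row({"session_id": "1", "identity": "/p/123"}))
    print(validate_row({"session_id": "2", "user_id": "3"}))
```

Rows that fail the check can be fixed or dropped before the file is written.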

Example of an import file in the JSON format, with one JSON object per line:

{"session_id": "1","identity": "/p/123","user_id": "4", "created_at": "2023-04-22 15:04:30.12312"}
{"session_id": "1","identity": "/p/234","user_id": "4", "created_at": "2023-04-22 15:01:33.12345"}
{"session_id": "2","identity": "/p/123","user_id": "3", "created_at": "2023-04-21 00:04:38.12121"}
{"session_id": "2","identity": "/p/345","user_id": "3"}

Example of an import file in the CSV format. The file should not contain a header row; each row contains the fields in the following order: session_id, identity, optionally followed by user_id and created_at:

1,/p/123,4,"2023-04-22 15:04:30.12312"
1,/p/234,4,"2023-04-22 15:01:33.12345"
2,/p/123,3,"2023-04-21 00:04:38.12121"
2,/p/345,3
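The CSV variant could be produced with Python's standard csv module, as sketched below. The function name to_csv is hypothetical. Two points are assumptions rather than documented behavior: trailing optional fields with no value are dropped (as in the last row of the example above), and a missing user_id followed by a present created_at is written as an empty field to preserve column order. With its default settings, csv.writer quotes a field only when necessary, so the timestamps are written without the quotes shown in the example.

```python
import csv
import io


def to_csv(rows):
    """Serialize transactions as header-less CSV rows.

    Hypothetical helper. Column order: session_id, identity, user_id,
    created_at. Trailing empty optional fields are dropped (an assumption
    based on the example rows).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        fields = [
            row["session_id"],
            row["identity"],
            row.get("user_id"),
            row.get("created_at"),
        ]
        # Drop trailing optional fields that have no value.
        while len(fields) > 2 and fields[-1] in (None, ""):
            fields.pop()
        writer.writerow(["" if value is None else value for value in fields])
    return buffer.getvalue()


if __name__ == "__main__":
    transactions = [
        {"session_id": "1", "identity": "/p/123", "user_id": "4",
         "created_at": "2023-04-22 15:04:30.12312"},
        {"session_id": "2", "identity": "/p/345", "user_id": "3"},
    ]
    print(to_csv(transactions), end="")
```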