I need to create a recommendation system for a small company using its first 500 orders.
I received a .json file with the following structure:
{
  "data": [
    {
      "order_id": 1,
      "seller_id": 1,
      "customer_id": 4,
      "order_status_id": 6,
      "product_id": 2591,
      "quantity": 4
    },
    {
      "order_id": 1,
      "seller_id": 1,
      "customer_id": 4,
      "order_status_id": 6,
      "product_id": 2592,
      "quantity": 1
    },
    {
      "order_id": 2,
      "seller_id": 19,
      "customer_id": 5,
      "order_status_id": 6,
      "product_id": 1025,
      "quantity": 3
    },
    ...
I am new to machine learning and data analysis with Python. What I need is to group the records by order_id so that each order becomes one row of a DataFrame containing its products:
order id 1 product 1 product 2 product 3
order id 2 product 3 product 4 product 6 product 125
This is the input format I need for the Apriori algorithm.
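To make the target shape concrete, here is a minimal sketch of the grouping, assuming the records are already in a DataFrame. The names `records`, `baskets`, and `wide` are illustrative only, and the sample rows are copied from the JSON above with unused fields dropped.

```python
import pandas as pd

# Sample records taken from the JSON above (only the fields needed here).
records = [
    {"order_id": 1, "product_id": 2591, "quantity": 4},
    {"order_id": 1, "product_id": 2592, "quantity": 1},
    {"order_id": 2, "product_id": 1025, "quantity": 3},
]
df = pd.DataFrame(records)

# One list of product ids per order: the "transactions" form.
baskets = df.groupby("order_id")["product_id"].apply(list)

# One row per order, one column per product position.
# Shorter orders are padded with NaN on the right.
wide = pd.DataFrame(baskets.tolist(), index=baskets.index)

print(baskets)
print(wide)
```

Whether the list form (`baskets`) or the padded table (`wide`) is the better input probably depends on which Apriori implementation is used.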
For the data analysis itself, though, how can I split each record into its own row, with one column per field, in a Series or DataFrame?
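One possible approach, sketched under the assumption that every entry under "data" is a flat object with the same keys as in the sample: load the file with the standard `json` module and pass the list of records to `pd.DataFrame`. The in-memory string below is a stand-in for orders.json so the snippet runs on its own.

```python
import io
import json

import pandas as pd

# Stand-in for orders.json, using the three records shown above.
raw = io.StringIO("""
{"data": [
  {"order_id": 1, "seller_id": 1, "customer_id": 4,
   "order_status_id": 6, "product_id": 2591, "quantity": 4},
  {"order_id": 1, "seller_id": 1, "customer_id": 4,
   "order_status_id": 6, "product_id": 2592, "quantity": 1},
  {"order_id": 2, "seller_id": 19, "customer_id": 5,
   "order_status_id": 6, "product_id": 1025, "quantity": 3}
]}
""")

# With the real file this would be: with open("orders.json") as f: payload = json.load(f)
payload = json.load(raw)

# A list of dicts becomes one row per record and one column per key.
orders = pd.DataFrame(payload["data"])

print(orders)
```

The idea is to hand pandas the list under "data" directly, so that each key of a record becomes a column instead of the whole record ending up in a single cell.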
I did the following:
import pandas as pd
import numpy as np
data = pd.read_json('orders.json')
And I tried this:
data = pd.read_json('orders.json')['data']
But what I got is the following:
