What type of technique can be used to solve this question?

Question

Apologies for the ambiguous title; I do not know the right term.

I have data on some products, each with a few variables: origin, weight, brand. For example:

Product A = "China, 100g, Brand X"

Product B = "Japan, 50g, Brand Y"

Product C = "China, 30g, Brand Y"

... and so on. All products are homogeneous: you may assume they are all the same type of bread, just with different attributes.

These products are not sold separately, but in bundles, e.g.

Bundle 1: 3 x Product A + 1 x Product B + 2 x Product C => $500

Bundle 2: 1 x Product E + 2 x Product F => $700

(Usually each product appears in only one bundle.)

I have data on many bundles and their prices. Is there a way to estimate the individual price of each product?

==Edited==

If we represent the bundles directly as linear equations, they look like this:

3*A + 1*B + 2*C = 500

1*E + 2*F = 700

Since each variable (A, B, C, etc.) appears in only one equation, I do not know how to solve it. But we have the attributes of each product, and the price of each product can be determined by these attributes.

Siong Thye Goh · Accepted Answer · 2024-04-27T18:51:48.840

"The price of each product can be determined by these attributes."

I think we can try to make use of this information.

Suppose, for a start, we assume that the price of product $i$, $X_i$, is a linear combination of its attributes, $X_i = \sum_{j} a_{j}x_{ij}$, where the parameters $a_j$ are what we want to learn.

The equation for bundle $k$ has the form

$$\sum_{i}b_{ki}X_i=p_k$$

Substituting the expression for $X_i$ gives

$$\sum_{i}b_{ki}\left(\sum_{j} a_{j}x_{ij}\right)=p_k$$

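and hence we can try to solve for the $a_j$, say using Gaussian elimination (or least squares, when there are more bundles than attributes and the equations cannot all hold exactly).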
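A minimal sketch of the linear case, on made-up data. Swapping the order of summation gives $\sum_j a_j z_{kj} = p_k$ with $z_{kj} = \sum_i b_{ki} x_{ij}$, so each bundle contributes one linear equation in the $a_j$. Since there are more bundles than attributes here, the sketch solves the least-squares normal equations $(Z^\top Z)a = Z^\top p$ by Gaussian elimination rather than the bundle equations directly. The product features, the bundles, and `true_a` are all hypothetical and only serve to generate toy prices.

```python
# Hypothetical product features x_ij: [intercept, weight in 100 g, is_brand_X].
products = {
    "A": [1.0, 1.0, 1.0], "B": [1.0, 0.5, 0.0],
    "C": [1.0, 0.3, 0.0], "D": [1.0, 0.3, 1.0],
    "E": [1.0, 0.75, 0.0], "F": [1.0, 0.6, 1.0],
    "G": [1.0, 0.2, 0.0], "H": [1.0, 0.9, 1.0],
}
# Hypothetical bundles: {product: count b_ki}; each product is in one bundle.
bundles = [{"A": 3, "B": 1}, {"C": 2, "D": 1}, {"E": 1, "F": 2}, {"G": 4}, {"H": 2}]
true_a = [20.0, 100.0, 30.0]  # assumed coefficients, used only to make toy prices


def bundle_features(bundle):
    """z_kj = sum_i b_ki * x_ij: attribute totals for one bundle."""
    n = len(true_a)
    return [sum(count * products[name][j] for name, count in bundle.items())
            for j in range(n)]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[r]] for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


Z = [bundle_features(b) for b in bundles]
prices = [sum(z * a for z, a in zip(row, true_a)) for row in Z]  # toy p_k

m = len(true_a)
ZtZ = [[sum(row[i] * row[j] for row in Z) for j in range(m)] for i in range(m)]
Ztp = [sum(row[i] * p for row, p in zip(Z, prices)) for i in range(m)]
a_hat = solve(ZtZ, Ztp)

# Per-product price estimates X_i = sum_j a_j * x_ij.
product_prices = {name: sum(a * x for a, x in zip(a_hat, feats))
                  for name, feats in products.items()}
print(a_hat)
print(product_prices)
```

With real (noisy) prices the recovered $a_j$ will only be approximate, and the normal equations become singular if two attribute columns are linearly dependent.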

In general, $X_i$ can be a parametrized nonlinear function of $x_{ij}$ and we have to learn the corresponding parameters.

To handle the nonlinear case, one possibility is as follows.

Suppose the price $X_i$ is a nonlinear function of the features $x_{ij}$, where $j=1, \ldots, m$:

$$X_i = f(x_{i1}, \ldots, x_{im} | \theta)$$

Our goal is to learn $\theta$ such that a loss function $$\sum_k L\left(\sum_i b_{ki} f(x_{i1}, \ldots, x_{im} | \theta), p_k\right)$$ is minimized, where the $b_{ki}$ are given to us. Note that the sum over products sits inside the loss, since it is the whole bundle's predicted price that is compared with $p_k$.

For example, we can choose our loss function to be $$\frac1{K}\sum_k\left(\sum_i b_{ki} f(x_{i1}, \ldots, x_{im} | \theta) - p_k\right)^2$$ where $K$ is the number of bundles.

One possibility is a neural network:

Input layer: the features of each product, i.e. one node for each feature of the product, $x_{i1}, \ldots, x_{im}$.
Hidden layers: be creative and explore; this is where the parameters $\theta$ come in. Use nonlinear activation functions to introduce nonlinearity.
Output layer: one node giving $f(x_{i1}, \ldots, x_{im} | \theta)$ for each product; this is at the product level, where we estimate the price of each product.
Loss function: $\frac1{K}\sum_k\left(\sum_i b_{ki} f(x_{i1}, \ldots, x_{im} | \theta) - p_k\right)^2$; this is at the bundle level, where the real price is used to evaluate how good our prediction is.
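A rough sketch of the idea of predicting at the product level while scoring at the bundle level. To keep it short it does not build a neural network: it swaps in a small assumed parametric form, $f(x|\theta) = \exp(\theta \cdot x)$, and fits it by plain gradient descent with step halving. The features, bundles, and `true_theta` are invented purely to generate toy prices.

```python
import math

# Hypothetical product features: [intercept, weight in 100 g, is_brand_X].
products = {
    "A": [1.0, 1.0, 1.0], "B": [1.0, 0.5, 0.0],
    "C": [1.0, 0.3, 0.0], "D": [1.0, 0.3, 1.0],
    "E": [1.0, 0.75, 0.0], "F": [1.0, 0.6, 1.0],
    "G": [1.0, 0.2, 0.0], "H": [1.0, 0.9, 1.0],
}
# Hypothetical bundles: {product: count b_ki}.
bundles = [{"A": 3, "B": 1}, {"C": 2, "D": 1}, {"E": 1, "F": 2}, {"G": 4}, {"H": 2}]
true_theta = [0.0, 0.5, 0.3]  # assumed, used only to make toy bundle prices


def product_price(x, theta):
    """Stand-in nonlinear model f(x | theta) = exp(theta . x)."""
    return math.exp(sum(t * v for t, v in zip(theta, x)))


def bundle_price(bundle, theta):
    """Predicted bundle price: sum_i b_ki * f(x_i | theta)."""
    return sum(count * product_price(products[name], theta)
               for name, count in bundle.items())


prices = [bundle_price(b, true_theta) for b in bundles]  # toy p_k


def loss(theta):
    """Mean squared error between predicted and observed bundle prices."""
    return sum((bundle_price(b, theta) - p) ** 2
               for b, p in zip(bundles, prices)) / len(bundles)


def gradient(theta):
    """Analytic gradient of the bundle-level loss with respect to theta."""
    grad = [0.0] * len(theta)
    for b, p in zip(bundles, prices):
        residual = bundle_price(b, theta) - p
        for name, count in b.items():
            x = products[name]
            f = product_price(x, theta)
            for j, v in enumerate(x):
                grad[j] += 2.0 * residual * count * f * v / len(bundles)
    return grad


theta = [0.0, 0.0, 0.0]
initial_loss = loss(theta)
step = 0.01
for _ in range(2000):
    grad = gradient(theta)
    candidate = [t - step * g for t, g in zip(theta, grad)]
    if loss(candidate) < loss(theta):
        theta = candidate      # accept only steps that reduce the loss
    else:
        step *= 0.5            # otherwise shrink the step size
final_loss = loss(theta)

print(initial_loss, final_loss)
print({name: product_price(x, theta) for name, x in products.items()})
```

The same loop structure carries over if `product_price` is replaced by a network's forward pass and the gradient is obtained by backpropagation; only the bundle prices ever enter the loss.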

Darren Cook · Answer 2 · 2024-04-25T21:48:00.587

Substitute in the attributes of each bundle, using one-hot encoding for the country and brand categories.

So given:

A = China, 100g, Brand X
B = Japan, 50g, Brand Y
C = China, 30g, Brand Y
D = Germany, 30g, Brand Y
E = Germany, 75g, Brand Z

Now, instead of the equations in your question

 3*A + 1*B + 2*C = 500
 1*E + 2*F = 700

we can make the following (I'm using c/j/g for countries, w for weight in grams, x/y/z for brands; note that with the products listed above, the second row is what 1*D + 2*E gives):

5*c + 1*j + 0*g + 410*w + 3*x + 3*y + 0*z = 500
0*c + 0*j + 3*g + 180*w + 0*x + 1*y + 2*z = 700
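As a sketch of how such rows can be built, the snippet below sums one-hot counts and total weight over the items in a bundle. The helper name `bundle_row` is made up for illustration, and the second bundle is taken to be 1*D + 2*E, which is an assumption: it is the combination of the listed products that reproduces the second row.

```python
# Products as listed above: (country, weight in grams, brand).
products = {
    "A": ("China", 100, "X"),
    "B": ("Japan", 50, "Y"),
    "C": ("China", 30, "Y"),
    "D": ("Germany", 30, "Y"),
    "E": ("Germany", 75, "Z"),
}
countries = ["China", "Japan", "Germany"]  # columns c, j, g
brands = ["X", "Y", "Z"]                   # columns x, y, z


def bundle_row(bundle):
    """Return [c, j, g, w, x, y, z] for a bundle given as {product: count}."""
    row = [0] * (len(countries) + 1 + len(brands))
    for name, count in bundle.items():
        country, grams, brand = products[name]
        row[countries.index(country)] += count                   # one-hot country
        row[len(countries)] += count * grams                     # total weight
        row[len(countries) + 1 + brands.index(brand)] += count   # one-hot brand
    return row


row1 = bundle_row({"A": 3, "B": 1, "C": 2})
row2 = bundle_row({"D": 1, "E": 2})  # assumed contents of the second bundle
print(row1)  # [5, 1, 0, 410, 3, 3, 0]
print(row2)  # [0, 0, 3, 180, 0, 1, 2]
```

One thing to watch for: with full one-hot encoding, the country columns and the brand columns both add up to the number of items in the bundle, so the columns are linearly dependent. Dropping one category from one of the groups, or using a regularized model, is a common workaround.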

Obviously you are going to need more equations than you have attributes, to be able to solve it.

BTW, if you have an attribute with a lot of categories causing you trouble you can drop it, group it, or quantify it. For instance if you have 30 brands, you could make a judgement call for each and group them into high/low-value brands, so you only have two categories. Even better is to assign each brand a "brand quality" number from say 0 to 100. Now you've replaced 30 one-hot columns with a single number, and you will need much less data.

For your question about non-linear relations, once you have the data in the above format, you can run it through any machine learning algorithm, such as linear models, random forest, or neural net, giving it the training task of predicting the price.

EDIT based on OP's comment: As there is relatively little data (20-30 bundles), what I would do is first try and get the linear version to solve. Then check that it is giving sensible answers (giving more value to famous brands, and heavier products). Then I'd use random forest (or even a simple decision tree) to see what non-linear interactions it is discovering. I'd then manually add those as extra features to the data, and run the linear model again.

When data is limited like this, I'd also try the old-fashioned approach of asking domain experts. They might tell you Japanese bread is in fashion, but only in small sizes, so then you can add "Japan and <50g" as a non-linear feature.
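For instance, such a hand-made feature could be added as one extra column per bundle, counting the items that satisfy the condition. This is only a sketch: product G and the example bundle are hypothetical, and `interaction_count` is a made-up helper name.

```python
# Hypothetical products: (country, weight in grams, brand).
products = {
    "B": ("Japan", 50, "Y"),
    "G": ("Japan", 40, "Y"),   # hypothetical small Japanese product
    "C": ("China", 30, "Y"),
}


def is_small_japanese(product):
    """True if the product is from Japan and weighs under 50 g."""
    country, grams, _brand = product
    return country == "Japan" and grams < 50


def interaction_count(bundle):
    """Number of items in the bundle that are Japanese and under 50 g."""
    return sum(count for name, count in bundle.items()
               if is_small_japanese(products[name]))


# Hypothetical bundle: 1 x B + 2 x G + 3 x C.
extra_feature = interaction_count({"B": 1, "G": 2, "C": 3})
print(extra_feature)  # 2 (only the two G items qualify; B is exactly 50 g)
```

The resulting number is appended to the bundle's row alongside the other columns, and the linear model is then refitted.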
