I am having difficulty finding my error while building deep learning models. I typically run into problems when setting the input shape of the first layer.

This is my model:

model = Sequential([
    Dense(32, activation='relu', input_shape=(1461, 75)),
    Dense(32, activation='relu'),
    Dense(ytrain.size),
])

It is returning the following error:

ValueError: Error when checking input: expected dense_1_input to have 3 dimensions, but got array with shape (1461, 75)

The array is the training set from the Kaggle housing price competition; my dataset has 75 columns and 1461 rows. My array is 2-dimensional, so why are 3 dimensions expected? I have tried adding a redundant third dimension of 1, and flattening the array before the first dense layer, but the error simply becomes:

ValueError: Input 0 is incompatible with layer flatten_1: expected min_ndim=3, found ndim=2

How do you determine what the input size should be and why do the dimensions it expects seem so arbitrary?

For reference, I attached the rest of my code:

xtrain = pd.read_csv("pricetrain.csv")
test = pd.read_csv("pricetest.csv")
xtrain.fillna(xtrain.mean(), inplace=True)
xtrain.drop(["Alley"], axis=1, inplace=True)
xtrain.drop(["PoolQC"], axis=1, inplace=True)
xtrain.drop(["Fence"], axis=1, inplace=True)
xtrain.drop(["MiscFeature"], axis=1, inplace=True)
xtrain.drop(["PoolArea"], axis=1, inplace=True)
columns = list(xtrain)
for i in columns:
    if xtrain[i].dtypes == 'object':
        xtrain[i] = pd.Categorical(pd.factorize(xtrain[i])[0])
from sklearn import preprocessing

le = preprocessing.LabelEncoder()
for i in columns:
    if xtrain[i].dtypes == 'object':
        xtrain[i] = le.fit_transform(xtrain[i])
ytrain = xtrain["SalePrice"]
xtrain.drop(["SalePrice"], axis=1, inplace=True)
ytrain = ytrain.values
xtrain = xtrain.values
ytrain.astype("float32")

size = xtrain.size
print(ytrain)
model = Sequential(
    [Flatten(),
     Dense(32, activation='relu', input_shape=(109575,)),
     Dense(32, activation='relu'),
     Dense(ytrain.size),
     ])
model.compile(loss='mse', optimizer='adam')
model.fit(xtrain, ytrain, epochs=10, verbose=1)

Any advice would be incredibly helpful!

Thank you.

Ethan
Josh Zwiebel

2 Answers

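The number of rows in your training data is not part of the network's input shape, because the training process feeds the network one batch at a time (batch_size samples per batch), and each sample is a single row. The batch dimension is therefore not included in input_shape for the first layer. With input_shape=(1461, 75), Keras treats every sample as a 1461 x 75 matrix and so expects a 3-dimensional array of shape (batch, 1461, 75), which is why your 2-dimensional array is rejected. You can read more on this here.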

So, the input shape for your problem will be:

input_shape=(75, )
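
To see why only the column count matters, here is a minimal NumPy sketch of what a single dense layer computes. It is not Keras itself: the random array `X` merely stands in for `xtrain`, and the names `W`, `b` and `dense_relu` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_features, n_units = 1461, 75, 32

# Stand-in for xtrain: 1461 rows (samples) by 75 columns (features).
X = rng.normal(size=(n_samples, n_features)).astype("float32")

# The layer's weights depend only on the per-sample shape (75,),
# never on how many rows are fed in.
W = rng.normal(size=(n_features, n_units)).astype("float32")
b = np.zeros(n_units, dtype="float32")

def dense_relu(batch):
    """Apply one ReLU dense layer to a batch of shape (batch_size, 75)."""
    return np.maximum(batch @ W + b, 0.0)

full = dense_relu(X)       # the whole dataset as one batch
mini = dense_relu(X[:32])  # a mini-batch of 32 samples

print(full.shape)  # (1461, 32)
print(mini.shape)  # (32, 32)
```

The same weights handle both calls, so the number of rows is free to change from batch to batch, while the 75 features per sample are fixed by the layer.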
bkshi

Try this:

xtrain = xtrain.reshape(-1, 75, 1)

Do this before you train (fit) the model.

The negative one (-1) in reshape(-1, 75, 1) simply means that you leave the size of the first dimension to be inferred, while fixing the second dimension at 75 and the last one at 1.
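
As a small illustration of how -1 is resolved, the sketch below uses a zero-filled array with the same shape as the training data in the question. The names `X`, `X3` and `flat` are placeholders, not taken from the original code.

```python
import numpy as np

# Placeholder with the same shape as the training array: 1461 rows, 75 columns.
X = np.zeros((1461, 75), dtype="float32")

# -1 asks NumPy to infer that axis from the total element count:
# 1461 * 75 elements / (75 * 1) = 1461.
X3 = X.reshape(-1, 75, 1)
print(X3.shape)  # (1461, 75, 1)

# With -1 alone, everything collapses into one axis of 1461 * 75 elements.
flat = X.reshape(-1)
print(flat.shape)  # (109575,)
```

Note that a 3-dimensional array like `X3` only fits a model whose first layer declares a matching per-sample shape, which would be (75, 1) here.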

fuwiak