Questions tagged [gru]
22 questions
184
votes
6 answers
When to use GRU over LSTM?
The key difference between a GRU and an LSTM is that a GRU has two gates (reset and update gates) whereas an LSTM has three gates (namely input, output and forget gates).
Why do we make use of a GRU when we clearly have more control over the network…
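The gate difference shows up directly in the parameter count: an LSTM learns four kernel/recurrent-kernel blocks (three gates plus the cell candidate), a GRU only three (two gates plus the candidate state). A minimal Keras sketch, with purely illustrative shapes:

import tensorflow as tf

inputs = tf.keras.Input(shape=(10, 8))        # (timesteps, features), arbitrary
gru_out = tf.keras.layers.GRU(32)(inputs)     # 2 gates + candidate state -> 3 weight blocks
lstm_out = tf.keras.layers.LSTM(32)(inputs)   # 3 gates + cell candidate  -> 4 weight blocks

print(tf.keras.Model(inputs, gru_out).count_params())   # fewer parameters
print(tf.keras.Model(inputs, lstm_out).count_params())  # roughly 4/3 as many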
Sayali Sonawane
- 2,101
- 3
- 13
- 13
8
votes
1 answer
TensorFlow / Keras: What is stateful = True in LSTM layers?
Could you elaborate on this argument? I found the brief explanation from the docs unsatisfying:
stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i…
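In practice, stateful=True only matters when successive batches are successive slices of the same long sequences. A minimal sketch of the mechanics (tf.keras / Keras 2 API; shapes and data are illustrative):

import numpy as np
import tensorflow as tf

inp = tf.keras.Input(shape=(10, 3), batch_size=4)     # stateful layers need a fixed batch size
out = tf.keras.layers.LSTM(16, stateful=True)(inp)    # state of sample i carries over to sample i of the next batch
out = tf.keras.layers.Dense(1)(out)
model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(8, 10, 3).astype("float32")          # two batches of four sequences
y = np.random.rand(8, 1).astype("float32")
model.fit(x, y, batch_size=4, epochs=2, shuffle=False)   # shuffle=False keeps sample order aligned across batches
model.reset_states()                                     # clear the carried state between independent sequences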
Leevo
- 6,445
- 3
- 18
- 52
4
votes
1 answer
RNN performing worse than random guessing on large dataset
I have to start off by saying I am 100% a beginner here.
I trained an RNN model on a 30-class dataset with over 90,000 samples and it achieved less than 2% accuracy. Training the same model on a small subset of the same data (with only 3 classes), the…
adithom
- 41
- 1
2
votes
1 answer
GRU and LSTM do not "take risks" when predicting
I tested LSTM and GRU models to predict the exchange rate between currencies. I do not take the raw price but the delta from the previous day, so the data is stationary around zero.
My problem is that my model always predicts really close-to-zero…
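For reference, the differencing described above is one line of NumPy (prices is an assumed 1-D array of daily rates); note that with near-zero targets, a model trained on squared error can look overly cautious simply by predicting close to the mean:

import numpy as np

prices = np.array([1.10, 1.12, 1.11, 1.13])  # illustrative daily exchange rates
deltas = np.diff(prices)                      # day-over-day changes, roughly centered on zero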
alarty
- 21
- 1
2
votes
0 answers
Custom GRU With 3D Spatial Convolution Layer In Keras
I am trying to implement a custom GRU model that is shown in this paper, 3D-R2N2. The GRU pipeline looks like:
The original implementation is Theano-based and I am trying to apply the model in tf2/Keras.
I have tried to create a custom GRU Cell from…
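A minimal sketch of a custom GRU-style cell that tf.keras.layers.RNN can wrap; the dense matmuls stand in for the paper's 3D-convolutional recurrence, and all names are illustrative rather than taken from the 3D-R2N2 code:

import tensorflow as tf

class MinimalGRUCell(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.state_size = units  # lets tf.keras.layers.RNN wrap the cell

    def build(self, input_shape):
        dim = input_shape[-1]
        self.kernel = self.add_weight(shape=(dim, 3 * self.units), name="kernel")
        self.recurrent_kernel = self.add_weight(shape=(self.units, 3 * self.units), name="recurrent_kernel")
        self.bias = self.add_weight(shape=(3 * self.units,), initializer="zeros", name="bias")

    def call(self, inputs, states):
        h_prev = states[0]
        x = tf.matmul(inputs, self.kernel) + self.bias
        xz, xr, xh = tf.split(x, 3, axis=-1)
        hz, hr = tf.split(tf.matmul(h_prev, self.recurrent_kernel[:, :2 * self.units]), 2, axis=-1)
        z = tf.sigmoid(xz + hz)                                    # update gate
        r = tf.sigmoid(xr + hr)                                    # reset gate
        hh = tf.tanh(xh + tf.matmul(r * h_prev, self.recurrent_kernel[:, 2 * self.units:]))
        h = z * h_prev + (1.0 - z) * hh                            # new hidden state
        return h, [h]

layer = tf.keras.layers.RNN(MinimalGRUCell(32))
out = layer(tf.random.normal((2, 5, 8)))                           # (batch, time, features)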
b15h0y
- 21
- 3
2
votes
1 answer
Impact of varying sequence length in ensemble GRU model
I am using an ensemble of GRUs for my project and keeping different cell sizes for the different models. For example, the first GRU model has a cell size of 16, the second 8, and the third 4.
The model is running well but I don't see any difference in…
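A minimal sketch of that kind of ensemble (three GRU models with cell sizes 16, 8, and 4, predictions simply averaged; the shapes and the averaging scheme are assumptions for illustration):

import numpy as np
import tensorflow as tf

def make_gru_model(units, timesteps=24, features=1):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, features)),
        tf.keras.layers.GRU(units),
        tf.keras.layers.Dense(1),
    ])

members = [make_gru_model(u) for u in (16, 8, 4)]                    # one member per cell size
x = np.random.rand(32, 24, 1).astype("float32")
preds = np.mean([m.predict(x, verbose=0) for m in members], axis=0)  # simple average of the members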
Mogambo0001
- 21
- 1
2
votes
0 answers
Wiggle in the initial part of an LSTM prediction
I am working on using LSTMs and GRUs to make time series predictions. For the most part the predictions are pretty good.
However, there seems to be a wiggle (or initial up-then-down) before the prediction settles out similar to the left side of this…
AGirlHasNoUsername
- 21
- 1
2
votes
0 answers
GRU learns small-scale features, but misses large scales
Playing around with weather data, I have set up a simple RNN with one layer of GRUs. It is trained to recover the temperature of the next day, given weather data from the last 5 days at 1-hour intervals.
What I find peculiar is that after…
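One way to frame the windowing described above (5 days of hourly inputs, next-day temperature as the target); the array contents and the 24-hour horizon are illustrative assumptions:

import numpy as np
import tensorflow as tf

hourly = np.random.rand(2000, 4).astype("float32")  # stand-in for hourly weather features
temperature = hourly[:, 0]                          # assume column 0 is the temperature
window = 5 * 24                                     # 5 days at 1-hour intervals
delay = window + 24 - 1                             # target is 24 h after the last input step

dataset = tf.keras.utils.timeseries_dataset_from_array(
    data=hourly[:-delay],
    targets=temperature[delay:],
    sequence_length=window,
    batch_size=32,
)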
rugermini
- 21
- 2
2
votes
1 answer
Training 3 models in different order gave different results
I have the following loop to train some models on a time series.
my_seed = 7
time_frames = [4,5]
layers = [3,4,5]
# ----- basic data formatting, always gives the same output -----
x1 = numpy.concatenate((x1,x2), axis=0)
y1 =…
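Order-dependent results like this usually come from shared random state (weight initialization, shuffling, dropout) being consumed by whichever model trains first. A minimal sketch of re-seeding immediately before each model is built, assuming TensorFlow/Keras (the seed value and loop variables are illustrative):

import random
import numpy as np
import tensorflow as tf

def reset_seeds(seed=7):
    # re-seed every RNG a Keras model touches, so each model starts from the same state
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

for time_frame in [4, 5]:
    for n_layers in [3, 4, 5]:
        reset_seeds(7)   # same starting RNG state regardless of training order
        # build and fit the model for this configuration here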
Stefan
- 21
- 1
2
votes
0 answers
TF: What is the difference between the 'kernel weights' and the 'recurrent kernel weights' in LSTMs/GRUs?
Context:
I am trying to understand the differences between the GRU/LSTM cells from TensorFlow and PyTorch (for research reproducibility) and noticed that TensorFlow differentiates between the kernel_initializer and the recurrent_initializer (see…
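Concretely, in tf.keras the kernel is the weight matrix applied to the input x_t and the recurrent kernel is the one applied to the previous hidden state h_(t-1); the two initializer arguments simply target those two matrices. A small sketch (8 input features and 32 units are arbitrary):

import tensorflow as tf

layer = tf.keras.layers.LSTM(32)
_ = layer(tf.zeros((1, 5, 8)))     # build the layer on dummy input
kernel, recurrent_kernel, bias = layer.get_weights()
print(kernel.shape)                # (8, 128): multiplies x_t, the four gate blocks stacked
print(recurrent_kernel.shape)      # (32, 128): multiplies h_(t-1)
print(bias.shape)                  # (128,)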
Robin van Hoorn
- 160
- 12
1
vote
1 answer
How to add a Decoder & Attention Layer to Bidirectional Encoder with tensorflow 2.0
I am a beginner in machine learning and I'm trying to create a spelling-correction model that spell-checks a small vocabulary (approximately 1000 phrases). Currently, I am referring to the TensorFlow 2.0 tutorials for 1. NMT with Attention,…
Dom
- 11
- 2
1
vote
0 answers
LSTM / GRU prediction with hidden state?
I am trying to predict a value based on a time series of 24 periods (i.e., predict the 25th period).
While training, I have a validation set with which I babysit the training (RMSE), and each epoch I evaluate on the validation set:
I receive errors like:
Train RMSE:…
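For reference, one common Keras pattern for exposing and reusing the hidden state at prediction time (a sketch with illustrative shapes, not the asker's model):

import tensorflow as tf

gru = tf.keras.layers.GRU(32, return_state=True)

inp = tf.keras.Input(shape=(24, 1))            # the 24 observed periods
out, state_h = gru(inp)                        # last output and final hidden state
pred = tf.keras.layers.Dense(1)(out)           # forecast for the 25th period
model = tf.keras.Model(inp, [pred, state_h])

# the returned state can seed a later step explicitly, reusing the same weights
step_inp = tf.keras.Input(shape=(1, 1))
state_inp = tf.keras.Input(shape=(32,))
next_out, next_state = gru(step_inp, initial_state=state_inp)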
Yuval Asher
- 11
- 1
1
vote
1 answer
Using GRU with FeedForward layers in Python
I'm trying to reproduce the code in this paper here for a multi-label problem (11 classes), which uses
1- an embedding layer
2- a GRU
3- two feed-forward layers with the ReLU activation function
4- a sigmoid unit.
I've tried to run the…
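A minimal Keras sketch of the listed architecture, assuming a multi-label setup (11 independent sigmoid outputs trained with binary cross-entropy); vocabulary size, sequence length, and layer widths are illustrative guesses, not values from the paper:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,)),                                 # token ids, padded to length 100
    tf.keras.layers.Embedding(input_dim=20000, output_dim=128),   # 1- embedding layer
    tf.keras.layers.GRU(64),                                      # 2- GRU
    tf.keras.layers.Dense(64, activation="relu"),                 # 3- two feed-forward layers with ReLU
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(11, activation="sigmoid"),              # 4- one sigmoid unit per label
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])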
Zahra Hnn
- 33
- 4
1
vote
1 answer
Keras RNN (batch_size
I created an RNN model for text classification with the LSTM layer, but when I put the batch_size in the fit method, my model trained on the whole batch instead of just mini-batches of that size.
This also happened when I used GRU and Bidirectional layer…
cho_uc
- 38
- 4
1
vote
1 answer
Aside from trial and error, how do I select the number of layers and unit counts for LSTMs, GRUs, and Transformers for text and time series?
When deciding on the number of units and layers for text processing or time-series prediction, I rely heavily on trial and error. First, I look for a reference or paper on the topic, such as the white paper on transformers: Attention Is All You Need.…
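One way to make the trial and error systematic is a small sweep over candidate depths and widths, keeping whichever configuration does best on a validation split. A sketch (GRU-based, with arbitrary shapes and candidate values):

import tensorflow as tf

def build_model(num_layers, units, timesteps=50, features=16):
    stack = [tf.keras.Input(shape=(timesteps, features))]
    for i in range(num_layers):
        stack.append(tf.keras.layers.GRU(units, return_sequences=(i < num_layers - 1)))
    stack.append(tf.keras.layers.Dense(1))
    return tf.keras.Sequential(stack)

for num_layers in (1, 2, 3):
    for units in (32, 64, 128):
        model = build_model(num_layers, units)
        # fit on training data, score on a validation split, and keep the best configuration
        print(num_layers, units, model.count_params())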
Joachim Rives
- 153
- 5