Questions tagged [sequence-to-sequence]
115 questions
18
votes
3 answers
How to determine feature importance in a neural network?
I have a neural network to solve a time series forecasting problem. It is a sequence-to-sequence neural network and currently it is trained on samples each with ten features. The performance of the model is average and I would like to investigate…
Aesir
- 458
- 1
- 6
- 15
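For the feature-importance question above, one model-agnostic option is permutation importance: shuffle a single feature across the evaluation set and measure how much the error grows. A minimal sketch, assuming a generic `model.predict`, a `metric(y_true, y_pred)` where lower is better, and inputs shaped (samples, timesteps, features); none of these names come from the question itself.

```python
import numpy as np

def permutation_importance(model, X, y, metric):
    """Permutation importance for a sequence model with X of shape (samples, timesteps, features)."""
    baseline = metric(y, model.predict(X))
    importances = []
    for f in range(X.shape[-1]):
        X_perm = X.copy()
        # Shuffle feature f across samples, leaving all other features intact.
        perm = np.random.permutation(X.shape[0])
        X_perm[:, :, f] = X_perm[perm, :, f]
        # The larger the error increase, the more the model relied on feature f.
        importances.append(metric(y, model.predict(X_perm)) - baseline)
    return np.array(importances)
```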
17
votes
1 answer
Why do we need to add START + END symbols when using Recurrent Neural Nets for Sequence-to-Sequence Models?
In Sequence-to-Sequence models, we often see that the START (e.g. <s>) and END (e.g. </s>) symbols are added to the inputs and outputs before training the model and before inference/decoding on unseen data.
E.g.…
alvas
- 2,510
- 7
- 28
- 40
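The symbols the question refers to are usually just extra vocabulary tokens wrapped around every sequence; a minimal sketch (the `<s>`/`</s>` spellings are conventional placeholders, not taken from the question):

```python
START, END = "<s>", "</s>"

def add_markers(tokens):
    # The decoder learns to start generating after START and to emit END when the sequence is finished.
    return [START] + tokens + [END]

print(add_markers(["the", "cat", "sat"]))  # ['<s>', 'the', 'cat', 'sat', '</s>']

# At inference time, decoding typically starts from START and stops once END is produced:
# generated = [START]
# while generated[-1] != END and len(generated) < max_len:
#     generated.append(decode_next_token(generated))   # decode_next_token is hypothetical
```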
11
votes
2 answers
How do attention mechanisms in RNNs learn weights for a variable length input
Attention mechanisms in RNNs are reasonably common in sequence-to-sequence models.
I understand that the decoder learns a weight vector $\alpha$ which is applied as a weighted sum of the output vectors from the encoder network. This is used to…
davidparks21
- 433
- 1
- 4
- 18
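A minimal NumPy sketch of the weighted sum described in the excerpt: one score per encoder step, padding masked out, a softmax to produce $\alpha$, then a weighted average of the encoder outputs. This is a dot-product variant for illustration only; the question concerns learned (additive) attention, but the variable-length handling is the same because $\alpha$ always has one entry per encoder step.

```python
import numpy as np

def dot_product_attention(decoder_state, encoder_outputs, mask):
    """decoder_state: (d,), encoder_outputs: (T, d), mask: (T,) with 1 = real token, 0 = padding."""
    scores = encoder_outputs @ decoder_state           # one score per encoder time step
    scores = np.where(mask == 1, scores, -1e9)         # padding positions get ~zero attention
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()                        # attention weights sum to 1, length T
    context = alpha @ encoder_outputs                  # weighted sum of encoder outputs, shape (d,)
    return context, alpha
```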
10
votes
4 answers
How are Q, K, and V Vectors Trained in a Transformer Self-Attention?
I am new to transformers, so this may be a silly question, but I was reading about transformers and how they use attention, and it involves the usage of three special vectors. Most articles say that one will understand their purpose after reading…
arctic_hen7
- 201
- 1
- 2
- 3
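On the Q/K/V question above: the three vectors are not trained directly; each is produced by multiplying the token representation by a weight matrix ($W_Q$, $W_K$, $W_V$), and those matrices are ordinary parameters learned by backpropagation like any other layer. A minimal PyTorch sketch (single head, no dropout):

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, d_model):
        super().__init__()
        # These three projections are the trainable parameters behind Q, K and V.
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        self.w_k = nn.Linear(d_model, d_model, bias=False)
        self.w_v = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x):                               # x: (batch, seq_len, d_model)
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)
        scores = q @ k.transpose(-2, -1) / x.shape[-1] ** 0.5   # scaled dot-product
        return torch.softmax(scores, dim=-1) @ v
```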
9
votes
2 answers
Input for LSTM for financial time series directional prediction
I'm working on using an LSTM to predict the direction of the market for the next day.
My question concerns the input for the LSTM. My data is a financial time series $x_1 \ldots x_t$ where each $x_i$ represents a vector of features for day $i$, i.e…
articuno
- 99
- 3
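A common way to shape the input the question describes is a sliding window: each sample is the last k daily feature vectors and the label is the next day's direction. A small sketch, assuming the first feature column is the price used to define "direction" (that column choice is an assumption, not from the question):

```python
import numpy as np

def make_windows(series, k):
    """series: (T, n_features) daily feature vectors x_1..x_T; returns (X, y) for an LSTM."""
    # X[i] holds days i..i+k-1; y[i] is 1 if the (assumed) price column rose on day i+k.
    X = np.stack([series[i:i + k] for i in range(len(series) - k)])
    y = (series[k:, 0] > series[k - 1:-1, 0]).astype(int)
    return X, y                                    # X: (T-k, k, n_features), y: (T-k,)
```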
7
votes
1 answer
Minimal working example or tutorial showing how to use Pytorch's nn.TransformerDecoder for batch text generation in training and inference modes?
I want to solve a sequence-to-sequence text generation task (e.g. question answering, language translation, etc.).
For the purposes of this question, you may assume that I already have the input part handled. (I already have a tensor of…
Pablo Messina
- 197
- 1
- 3
- 11
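A minimal sketch of the training-mode call for nn.TransformerDecoder with teacher forcing and a causal mask; all sizes and tensor names below are illustrative, not taken from the question:

```python
import torch
import torch.nn as nn

d_model, nhead, num_layers = 512, 8, 6
layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead)
decoder = nn.TransformerDecoder(layer, num_layers=num_layers)

# Default shape convention is (seq_len, batch, d_model).
memory = torch.rand(20, 4, d_model)          # encoder output, assumed already computed
tgt = torch.rand(15, 4, d_model)             # embedded, right-shifted target tokens (teacher forcing)

# Causal mask: position i may only attend to positions <= i.
T = tgt.size(0)
tgt_mask = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)

out = decoder(tgt, memory, tgt_mask=tgt_mask)   # (15, 4, d_model), one prediction per target position
```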
5
votes
1 answer
How does the Transformer predict n steps into the future?
I have barely been able to find an implementation of the Transformer (that is not bloated nor confusing), and the one that I've used as a reference was the PyTorch implementation. However, the PyTorch implementation requires you to pass the input…
skevelis
- 53
- 1
- 3
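On the n-steps question above: at inference time the usual approach is greedy autoregressive decoding, feeding each prediction back in as the next decoder input. A hedged sketch, assuming a `model(src, tgt)` that returns one output per target position (the interface is an assumption):

```python
import torch

def predict_n_steps(model, src, start_value, n_steps):
    """src: encoder input (src_len, batch, d_model); start_value: first decoder input (1, batch, d_model)."""
    tgt = start_value
    for _ in range(n_steps):
        out = model(src, tgt)             # (current_len, batch, d_model)
        next_step = out[-1:]              # keep only the newest prediction
        tgt = torch.cat([tgt, next_step], dim=0)   # feed it back in as decoder input
    return tgt[1:]                        # drop the start value, keep the n predicted steps
```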
5
votes
1 answer
ValueError: Cannot convert a partially known TensorShape to a Tensor: (?, 256)
I'm working on a sequence to sequence approach using LSTM and a VAE with an attention mechanism.
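# Shuffle encoder inputs, teacher-forcing inputs, and targets with one shared permutation so pairs stay aligned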
p = np.random.permutation(len(input_data))
input_data = input_data[p]
teacher_data = teacher_data[p]
target_data = target_data[p]
BUFFER_SIZE =…
Kahina
- 644
- 1
- 9
- 24
5
votes
2 answers
Does this encoder-decoder LSTM make sense for time series sequence to sequence?
TASK
given $\vec x = [x_{t=-3}, x_{t=-2}, x_{t=-1}, x_{t=0}]$
predict $\vec y = [x_{t=1}, x_{t=2}]$
With an LSTM encoder-decoder (seq2seq)
MODEL
NOTE: the ? symbol in the shape of the tensors refers to batch_size, following tensorflow…
ignatius
- 1,696
- 9
- 22
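For the time-series encoder-decoder question above, a minimal Keras sketch of the usual layout (the encoder's final state seeds a decoder that unrolls over the forecast horizon with teacher forcing); the window lengths and hidden size below are illustrative only:

```python
import tensorflow as tf

n_in, n_out, n_features, latent = 4, 2, 1, 64      # 4 past steps in, 2 future steps out (assumed)

# Encoder: reads the input window and summarizes it in its final (h, c) state.
enc_in = tf.keras.Input(shape=(n_in, n_features))
_, h, c = tf.keras.layers.LSTM(latent, return_state=True)(enc_in)

# Decoder: unrolls over the forecast horizon, starting from the encoder state.
dec_in = tf.keras.Input(shape=(n_out, n_features))
dec_seq = tf.keras.layers.LSTM(latent, return_sequences=True)(dec_in, initial_state=[h, c])
preds = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1))(dec_seq)

model = tf.keras.Model([enc_in, dec_in], preds)
model.compile(optimizer="adam", loss="mse")
```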
5
votes
1 answer
Why do position embeddings work?
In the papers "Convolutional Sequence to Sequence Learning" and
"Attention Is All You Need", positions embeddings are simply added to the input words embeddings to give the model a sense of the order of the input sequence. These position embeddings…
Robin
- 1,347
- 9
- 20
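For reference, the sinusoidal encoding from "Attention Is All You Need" that gets added elementwise to the word embeddings; a minimal NumPy sketch (assumes an even d_model):

```python
import numpy as np

def positional_encoding(max_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))."""
    pos = np.arange(max_len)[:, None]                  # (max_len, 1)
    i = np.arange(0, d_model, 2)[None, :]              # even dimension indices 0, 2, 4, ...
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# word_embeddings = word_embeddings + positional_encoding(seq_len, d_model)   # elementwise add
```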
5
votes
1 answer
How/What to initialize the hidden states in RNN sequence-to-sequence models?
In an RNN sequence-to-sequence model, the encoder input hidden states and the output's hidden states need to be initialized before training.
What values should we initialize them with? How should we initialize them?
From the PyTorch tutorial, it…
alvas
- 2,510
- 7
- 28
- 40
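On the initialization question above: the common default (and what PyTorch does implicitly when no hidden state is passed) is all zeros; treating the initial state as a learned parameter is the usual alternative. A minimal sketch:

```python
import torch
import torch.nn as nn

input_size, hidden_size, num_layers, batch = 128, 256, 1, 32
gru = nn.GRU(input_size=input_size, hidden_size=hidden_size, num_layers=num_layers)

# Option 1: zero initialization (PyTorch uses this implicitly if h0 is omitted).
h0 = torch.zeros(num_layers, batch, hidden_size)

# Option 2: a learned initial state, trained together with the rest of the model.
h0_param = nn.Parameter(torch.zeros(num_layers, 1, hidden_size))
h0_learned = h0_param.expand(-1, batch, -1).contiguous()

x = torch.rand(10, batch, input_size)        # (seq_len, batch, input_size)
out, h_n = gru(x, h0)
```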
4
votes
2 answers
Sentences language translation with neural network, with a simple layer structure (if possible sequential)
Context: Many sentence translation systems (e.g. French to English) built with neural networks use a seq2seq structure:
"the cat sat on the mat" -> [Seq2Seq model] -> "le chat etait assis sur le tapis"
Example: A ten-minute introduction to…
Basj
- 180
- 2
- 18
4
votes
1 answer
Answer to Question
Looking for a system which can generate answers to questions. Most systems and blog posts on the internet cover question-to-answer, but not answer-to-question, paraphrasing, or keyword-to-question.
I tried Seq2Seq, and even after training for many…
Sandeep Bhutani
- 914
- 1
- 7
- 26
4
votes
1 answer
SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors
I am writing an Encoder-Decoder architecture with Bahdanau Attention using tf.keras with TensorFlow 2.0. Below is my code. It works with TensorFlow 1.15 but I get the error in 2.0; you can check the code in the Colab notebook here.
Can you please…
Uday
- 576
- 4
- 9
4
votes
2 answers
Is this a problem for a Seq2Seq model?
I'm struggling to find a tutorial/example which covers using a seq2seq model for sequential inputs other than text/translation.
I have a multivariate dataset with n input variables, each composed of sequences, and a single output sequence…
Ellio
- 103
- 1
- 8