Questions tagged [machine-translation]

Machine translation, in a data science context, is the use of machine learning techniques to translate text in one language into text in another. The tag covers topics such as text/corpus data, neural machine translation (NMT), deep learning, and speech recognition. A popular application is Google's Neural Machine Translation system, which uses artificial neural networks to translate text and powers Google Translate.

88 questions
29 votes, 7 answers

Why is the decoder not a part of the BERT architecture?

I can't see how BERT makes predictions without using a decoder unit, which was a part of all models before it including transformers and standard RNNs. How are output predictions made in the BERT architecture without using a decoder? How does it do…
hathalye7
21 votes, 3 answers

What is the BLEU score of professional human translators?

Machine translation models are usually evaluated using the BLEU score. I want to get some intuition for this score. What is the BLEU score of a professional human translator? I know it depends on the languages, the translator, etc. I just want to get the…
Amit Keinan
12 votes, 3 answers

BPE vs WordPiece Tokenization - when to use / which?

What's the general tradeoff between choosing BPE vs WordPiece Tokenization? When is one preferable to the other? Are there any differences in model performance between the two? I'm looking for a general overall answer, backed up with specific…
11 votes, 1 answer

Multi-head attention mechanism in the Transformer and the need for a feed-forward neural network

After reading the paper "Attention Is All You Need", I have two questions: 1. Why is a multi-head attention mechanism needed? The paper says that: "Multi-head attention allows the model to jointly attend to information from different…
9 votes, 1 answer

What is the BLEU score used in Google Brain's "Attention Is All You Need" paper?

Google Brain's "Attention Is All You Need" paper on sequence-to-sequence translation reports: Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2…
Imran
9 votes, 2 answers

What's an LSTM-LM formulation?

I am reading this paper "Sequence to Sequence Learning with Neural Networks" http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Under "2. The Model" it says: The LSTM computes this conditional probability by…
6 votes, 1 answer

What is Bits Per Character?

What is the Bits per Character (bpc) metric, which has been used to measure model accuracy on the text8 and enwik8 datasets? I encountered the term bpc in the Transformer-XL paper here. How does it differ from perplexity as a metric?
Ashwin Geet D'Sa
6 votes, 3 answers

Is there an "Attention Is All You Need" implementation in Keras?

Has anyone seen this model's implementation using Keras? inb4: tensorflow, pytorch
Anton
5 votes, 1 answer

What is context window size?

I am trying to implement a recurrent neural network machine translation system, and I am just starting to learn. I am creating a word embedding matrix. In order to do that, I should know my vocabulary size, dimension of the embedding space, and…
yusuf
4 votes, 2 answers

Sentence translation with a neural network, using a simple layer structure (sequential if possible)

Context: Many sentence translation systems (e.g. French to English) built with neural networks use a seq2seq structure: "the cat sat on the mat" -> [Seq2Seq model] -> "le chat etait assis sur le tapis" Example: A ten-minute introduction to…
4 votes, 2 answers

How to get phrase tables from word alignments?

The output of my word alignment file looks like this: I wish to say with regard to the initiative of the Portuguese Presidency that we support the spirit and the political intention behind it . In bezug auf die Initiative der portugiesischen…
alvas
3 votes, 3 answers

A good way to organize/store a lot of datasets

In machine translation, we often have bilingual datasets, e.g. for German-English and French-English we will have something that looks like this: /en-de train.de train.en dev.de dev.en test.de test.en /en-fr train.fr …
alvas
3 votes, 0 answers

Back-Translation model for German and English

Do you know of any pre-trained models for back translation between German and English? I am aware that there are ways to include a monolingual corpus into the training of a machine translation model (often referred to as back translation). I am…
3 votes, 1 answer

Implementing back translation as data augmentation for text classification

Since back translation (English -> other language -> English) seems like quite a useful data augmentation technique, I wanted to experiment with it. E.g. it occurred to me that languages from very different language families (but very well supported…
3 votes, 0 answers

How to train TensorFlow's Transformer model on my own data?

https://github.com/tensorflow/models/blob/master/official/transformer has an implementation of the Transformer model. I want to train the model on my own data (consisting of two files, src.txt and tgt.txt). However, I'm unable to figure out how to…