28

Given a sentence: "When I open the ?? door it starts heating automatically"

I would like to get the list of possible words in ?? with a probability.

The basic concept of the word2vec model is to "predict" a word given its surrounding context.

Once the model is built, what is the right context-vector operation to perform this prediction task on new sentences?

Is it simply a linear sum?

model.most_similar(positive=['When', 'I', 'open', 'the', 'door', 'it',
                             'starts', 'heating', 'automatically'])
Ethan • 1,657 • 9 • 25 • 39
DED • 345 • 1 • 3 • 7

2 Answers

13

Word2vec comes in two model variants, CBOW and skip-gram. Let's take the CBOW model, since your question matches its task exactly: predicting the target word given the surrounding words.

Fundamentally, the model learns input and output weight matrices, which connect the input context words to the output target word through a hidden layer. Back-propagation is used to update the weights based on the error between the predicted output vector and the actual one-hot target vector.

In other words, predicting the target word from the given context words is the training objective used to obtain the optimal weight matrices for the given data.

To answer the second part: it is a bit more complex than just a linear sum.

  1. Obtain all the word vectors of context words
  2. Average them to find out the hidden layer vector h of size Nx1
  3. Obtain the output matrix syn1(word2vec.c or gensim) which is of size VxN
  4. Multiply syn1 by h, the resulting vector will be z with size Vx1
  5. Compute the probability vector y = softmax(z) with size Vx1; the index with the highest probability corresponds to the target word in the vocabulary. Here V denotes the size of the vocabulary and N the size of the embedding vector.
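The steps above can be sketched in plain NumPy with toy matrices (the names `syn0`/`syn1` follow the word2vec.c / gensim convention mentioned in step 3; the dimensions and random weights are made up for illustration):

```python
import numpy as np

# Toy dimensions: V = vocabulary size, N = embedding size (as in the steps above)
V, N = 10, 4
rng = np.random.default_rng(0)

syn0 = rng.normal(size=(V, N))  # input embedding matrix, one row per word
syn1 = rng.normal(size=(V, N))  # output weight matrix, V x N

context_ids = [2, 5, 7]  # vocabulary indices of the context words

# Steps 1-2: average the context word vectors -> hidden vector h of size N
h = syn0[context_ids].mean(axis=0)

# Steps 3-4: multiply syn1 by h -> score vector z of size V
z = syn1 @ h

# Step 5: softmax over the vocabulary -> probability vector y of size V
y = np.exp(z - z.max())
y /= y.sum()

predicted = int(np.argmax(y))  # index of the most probable target word
```

A trained model would of course use learned weights rather than random ones; the point is only that the prediction itself is an average, a matrix multiply, and a softmax.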

Source : http://cs224d.stanford.edu/lecture_notes/LectureNotes1.pdf

Update: Long short-term memory (LSTM) models are currently doing a great job at predicting the next word. seq2seq models are explained in the TensorFlow tutorial, and there is also a blog post about text generation.

chmodsss • 1,974 • 2 • 19 • 37
6

Missing-word prediction has been added as a functionality in the latest version of Word2Vec. Of course, your sentence needs to match the input syntax used for training the model (lower-case letters, stop words, etc.).

Usage for predicting the top 3 words for "When I open ? door":

print(model.predict_output_word(['When', 'I', 'open', 'door'], topn=3))
Christof Henkel • 161 • 1 • 5