import numpy as np
import pandas as pd

vocabulary = ['I', 'like', 'to', 'play', 'football', 'rome', 'paris', 'mango', 'apple']

# Build a one-hot vector for each word: a list of length |vocabulary|
# with a single 1 at the word's index.
one_hot_matrix = {}
for i in range(len(vocabulary)):
    l = [0] * len(vocabulary)
    l[i] = 1
    one_hot_matrix[vocabulary[i]] = l

one_hot_matrix
The word2vec algorithm uses a neural network model to learn word associations from a large corpus of text. Once trained, such a model can detect synonymous words or suggest additional words for a partial sentence. During training, these models take into account the context in which each word occurs.
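As a quick illustration, here is a minimal sketch of training and querying such a model with the gensim library (assuming gensim 4.x is installed; the toy corpus and parameter values are arbitrary choices):

from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens.
sentences = [
    ['I', 'like', 'to', 'play', 'football'],
    ['I', 'like', 'mango', 'and', 'apple'],
]

# sg=1 selects Skip-gram, sg=0 selects CBOW.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(model.wv['football'])           # 50-dimensional word vector
print(model.wv.most_similar('like'))  # nearest words by cosine similarity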
In Skip-gram: each target word enters as a [1 x V] one-hot input, where V is the vocabulary size. It is projected through W1: [V x E] and then W2: [E x V], so the final [1 x V] output, after a softmax, gives the probability of each context word given the target word.
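A minimal numpy sketch of this forward pass, reusing the one-hot vectors built above (the embedding size E and the random weight initialisation are illustrative assumptions):

import numpy as np

V, E = len(vocabulary), 3                 # vocabulary size, embedding size (E chosen arbitrarily)
rng = np.random.default_rng(0)
W1 = rng.normal(size=(V, E))              # input -> hidden weights  [V x E]
W2 = rng.normal(size=(E, V))              # hidden -> output weights [E x V]

x = np.array(one_hot_matrix['football'])  # [1 x V] one-hot for the target word
h = x @ W1                                # [1 x E] hidden layer (selects one row of W1)
u = h @ W2                                # [1 x V] scores over the vocabulary
p = np.exp(u) / np.exp(u).sum()           # softmax: probability of each context word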
In CBOW: the same two weight matrices are used as in Skip-gram, but here the target word is predicted from the aggregation (typically the average) of the context word vectors.
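A corresponding sketch of the CBOW forward pass, averaging the context one-hot vectors before projection (reusing the same illustrative W1 and W2 from the sketch above):

# Context words around the target 'to' in "I like to play football".
context = ['I', 'like', 'play', 'football']

# Average the context one-hot vectors into a single [1 x V] input.
x_cbow = np.mean([one_hot_matrix[w] for w in context], axis=0)

h = x_cbow @ W1                      # [1 x E] averaged context embedding
u = h @ W2                           # [1 x V] scores over the vocabulary
p = np.exp(u) / np.exp(u).sum()      # softmax: probability of the target word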
W1 and W2 are also known as the word-vector lookup tables: after training, the rows of W1 can be read off directly as the word embeddings.
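Because the input is one-hot, multiplying by W1 simply selects a row, which is why W1 acts as a lookup table (again using the illustrative W1 defined above):

idx = vocabulary.index('football')
embedding = W1[idx]                  # same result as one_hot @ W1
assert np.allclose(embedding, np.array(one_hot_matrix['football']) @ W1)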
According to [1], Skip-gram works well with small datasets and better represents less frequent words, while CBOW trains faster than Skip-gram and better represents more frequent words.
[1] Efficient Estimation of Word Representations in Vector Space (Mikolov et al., 2013)