word2vec

2018


Models and Architectures in Word2vec

·3 mins
Generally, word2vec is a language model that predicts the probability of a word based on its context. While building the model, it creates an embedding for each word, and these word embeddings are widely used in many NLP tasks. Models: CBOW (Continuous Bag of Words): uses the context to predict the probability of the current word.
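
To make the CBOW idea concrete, here is a minimal sketch of how a probability for the current word could be computed from its context: average the context embeddings, score every vocabulary word, and apply a softmax. The function name `cbow_probabilities`, the toy vocabulary, and the random untrained vectors are assumptions for illustration only; real implementations train these vectors and usually replace the full softmax with negative sampling or hierarchical softmax.

```python
import math
import random


def cbow_probabilities(context_ids, input_vectors, output_vectors):
    """Return a probability for every vocabulary word given the context word ids."""
    dim = len(input_vectors[0])
    # Average the input embeddings of the context words (the "bag of words").
    hidden = [
        sum(input_vectors[i][d] for i in context_ids) / len(context_ids)
        for d in range(dim)
    ]
    # Score each vocabulary word by a dot product with the averaged vector.
    scores = [sum(h * w for h, w in zip(hidden, out)) for out in output_vectors]
    # Softmax, shifted by the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy setup (hypothetical): five words, 3-dimensional untrained vectors.
vocab = ["the", "quick", "brown", "fox", "jumps"]
rng = random.Random(0)
dim = 3
input_vectors = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]
output_vectors = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]

# Context around the current word "brown".
context_ids = [vocab.index(w) for w in ["the", "quick", "fox", "jumps"]]
probs = cbow_probabilities(context_ids, input_vectors, output_vectors)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
```

Because the vectors are untrained, the printed probabilities are close to uniform; training would push the probability of the true current word upward.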

2017


Parameters in doc2vec

·2 mins
Here are some parameters of gensim’s Doc2Vec class. window: window is the maximum distance between the predicted word and the context words used for prediction within a document. It looks both behind and ahead. In the skip-gram model, if the window size is 2, the training samples look like this (the blue word is the input word):
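
The effect of the window can be sketched in a few lines of Python. The helper `skipgram_pairs` below is a hypothetical illustration, not part of gensim: it assumes a fixed window and simply lists the (input word, context word) pairs that a window of 2 would produce for a short sentence.

```python
def skipgram_pairs(tokens, window=2):
    """List (input word, context word) pairs for a fixed window size."""
    pairs = []
    for i, center in enumerate(tokens):
        # Look up to `window` positions behind and ahead, clipped to the sentence.
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # the input word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "the quick brown fox jumps".split()
pairs = skipgram_pairs(sentence, window=2)
for input_word, context_word in pairs:
    print(input_word, "->", context_word)
```

For the first word, "the", only the two words ahead are available, so it yields the pairs (the, quick) and (the, brown); a word in the middle such as "brown" yields four pairs, two behind and two ahead.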