Word2Vec

Intro

“Just as Van Gogh’s painting of sunflowers is a two-dimensional mixture of oil on canvas that represents vegetable matter in a three-dimensional space in Paris in the late 1880s, so 500 numbers arranged in a vector can represent a word or group of words.” –DL4J

Word2Vec can estimate a word's association with other words, or cluster documents and group them by topic. It turns qualities into quantities: similar words and ideas land "close" to one another in its vector space (500-dimensional in the quote above, though the dimensionality is a tunable parameter).

Word2Vec is not usually classified as "deep learning" because it is a shallow, two-layer neural net.

Input → text corpus
Output → set of vectors (neural word embeddings)
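
As a minimal sketch of that input → output mapping, here is the idea in gensim (an assumption; this page itself only names DL4J and TensorFlow as implementations):

<code python>
from gensim.models import Word2Vec

# Input: a text corpus, as a list of tokenized sentences
corpus = [
    ["word2vec", "turns", "words", "into", "vectors"],
    ["similar", "words", "get", "similar", "vectors"],
]

# Output: one embedding vector per vocabulary word
model = Word2Vec(corpus, vector_size=100, window=5, min_count=1)
print(model.wv["words"].shape)  # (100,) -- a 100-dimensional word embedding
</code>

A real corpus needs far more text than this toy example for the vectors to be meaningful.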

Examples

Rome - Italy = Beijing - China, so Rome - Italy + China = Beijing
king : queen :: man : woman
house : roof :: castle : [dome, bell_tower, spire, crenellations, turrets]
China : Taiwan :: Russia : [Ukraine, Moscow, Moldova, Armenia]
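
These analogy queries can be run directly against a trained model. A sketch assuming gensim and the widely distributed 300-dimensional GoogleNews vectors (the file path is an assumption, not something this page provides):

<code python>
from gensim.models import KeyedVectors

# Assumed path to pre-trained GoogleNews vectors
wv = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

# Rome - Italy + China ~= Beijing
print(wv.most_similar(positive=["Rome", "China"], negative=["Italy"], topn=3))

# king - man + woman ~= queen
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
</code>

The bracketed lists in the examples above are exactly this kind of ranked most-similar output.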

Notation

Algebraic notation

knee - leg = elbow - arm

English logic

knee is to leg as elbow is to arm

Logical analogy notation

knee : leg :: elbow : arm
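
In practice an analogy is solved by vector arithmetic followed by a nearest-neighbor search over the vocabulary V, usually with cosine similarity. A standard formulation (not spelled out on this page):

<code latex>
\vec{v}_{\mathrm{knee}} - \vec{v}_{\mathrm{leg}} \approx \vec{v}_{\mathrm{elbow}} - \vec{v}_{\mathrm{arm}}
\;\Longrightarrow\;
\mathrm{knee} \approx \arg\max_{w \in V \setminus \{\mathrm{leg},\,\mathrm{elbow},\,\mathrm{arm}\}}
\cos\!\bigl(\vec{v}_w,\; \vec{v}_{\mathrm{elbow}} - \vec{v}_{\mathrm{arm}} + \vec{v}_{\mathrm{leg}}\bigr)
</code>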

Models

Continuous bag of words (CBOW) model

  • Uses the surrounding context to predict a target word.
  • Several times faster to train than skip-gram, with slightly better accuracy for frequent words.

Skip-gram model

  • Uses a word to predict a target context.
  • Works well with small amounts of training data and represents even rare words and phrases well.
  • Produces more accurate results on large datasets.
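
Both models are available in gensim (an assumption; this page names only DL4J and TensorFlow), selected by a single flag. A minimal sketch:

<code python>
from gensim.models import Word2Vec

sentences = [["the", "quick", "brown", "fox"], ["the", "lazy", "dog"]]  # toy corpus

# CBOW (sg=0, the default): context words predict the target word
cbow = Word2Vec(sentences, sg=0, vector_size=100, window=5, min_count=1)

# Skip-gram (sg=1): the target word predicts its context words
skipgram = Word2Vec(sentences, sg=1, vector_size=100, window=5, min_count=1)
</code>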

Implementation

Word2Vec can be implemented with DL4J or TensorFlow; gensim (used in the sketches above) also ships an implementation.

To research

  • Implementation
  • Cosine similarity and the dot product: the equations and how they are used (see the sketch below)
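
A minimal sketch of cosine similarity, which underlies the "closeness" and most-similar queries above (plain NumPy, an assumption):

<code python>
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """cos(theta) = (a . b) / (|a| * |b|); ranges from -1 to 1."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])
print(cosine_similarity(a, b))  # 1.0 -- parallel vectors are maximally similar
</code>

The dot product alone rewards long vectors; dividing by the norms makes the measure depend only on direction, which is why cosine similarity is the usual choice for comparing word embeddings.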