Shayaan

The Word2Vec Algorithm: What is it?

Updated: Aug 26, 2022

There are many burgeoning algorithms worldwide, serving many different purposes. Read on to learn more about the Word2Vec algorithm: a program that can represent words as vectors!




Introduction


Word2Vec is an algorithm that represents words as vectors. It works by looking at the words that surround a given word (the word’s context). There are two ways to implement Word2Vec: a continuous skip-gram model and a continuous bag-of-words (CBOW) model. To summarize, the skip-gram model tries to predict the context of a given word, while the continuous bag-of-words model tries to predict a word given its context. A neural network is trained on a large set of texts (a corpus) to make these predictions.
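To make the two prediction tasks concrete, here is a minimal sketch in plain Python of the training examples each model would see. The toy sentence and window size are purely illustrative:

```python
sentence = "the quick brown fox jumps over the lazy dog".split()
window = 2  # how many words on each side count as "context"

skip_gram_pairs = []   # (centre word -> one context word) examples
cbow_pairs = []        # (full context -> centre word) examples

for i, centre in enumerate(sentence):
    context = [sentence[j]
               for j in range(max(0, i - window), min(len(sentence), i + window + 1))
               if j != i]
    # Skip-gram: predict each context word from the centre word
    skip_gram_pairs.extend((centre, c) for c in context)
    # CBOW: predict the centre word from its whole context
    cbow_pairs.append((context, centre))

print(skip_gram_pairs[:4])  # [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ...]
print(cbow_pairs[1])        # (['the', 'brown', 'fox'], 'quick')
```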


The Advantage of Representing Words as Vectors


When we see a word written down, it is not the letters the word contains that give us the information; instead, it is our own preconceived ideas that give the word meaning and provide much more information. For example, we are able to tell that the words ‘dog’ and ‘puppy’ are related despite their spellings looking nothing alike.


Words that appear in similar contexts tend to have similar meanings, and since Word2Vec creates vectors by looking at context, such words will have similar vectors. This enables programs that use Word2Vec to pick up on semantics in a similar way to humans. Word2Vec is also able to recognise relationships between words. For example, if you were to take the vector for King, subtract the vector for Man and then add the vector for Woman, you would get (approximately) the vector for Queen (Woman + (King – Man) ≈ Queen). Similarly, subtracting England from London and adding the result to Japan would return Tokyo (Japan + (London – England) ≈ Tokyo).
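As an illustration, this arithmetic can be tried directly with the open-source gensim library (a common Word2Vec implementation, not part of the algorithm itself). The sketch below assumes gensim is installed and uses its downloadable pretrained “word2vec-google-news-300” vectors:

```python
# Requires `pip install gensim`; the pretrained vectors (~1.6 GB) are
# downloaded on first use.
import gensim.downloader as api

model = api.load("word2vec-google-news-300")

# Queen ≈ Woman + (King – Man): most_similar adds the positive vectors,
# subtracts the negative ones, and returns the nearest word to the result.
print(model.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# Tokyo ≈ Japan + (London – England)
print(model.most_similar(positive=["Japan", "London"], negative=["England"], topn=1))
```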


Word2Vec is also very useful in machine learning. Most machine learning algorithms require data to be represented in numerical form, and since Word2Vec’s vectors encode meaning, those algorithms are able to recognise more nuanced details.
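One common (illustrative, not the only) way to do this in practice is to average the vectors of a piece of text’s words into a single fixed-length feature vector for a downstream model. Here, `model` is assumed to be the gensim KeyedVectors object loaded in the previous snippet:

```python
import numpy as np

def sentence_vector(words, model):
    # Mean of the vectors for the words the model knows; zeros if none known.
    vecs = [model[w] for w in words if w in model]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

# features = sentence_vector("the quick brown fox".split(), model)
# `features` can now be passed to any classifier that expects numeric input.
```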


The Word2Vec Model


The neural networks used by Word2Vec have one input layer, one hidden layer and one output layer, all consisting of nodes. The input layer has one node for every word in the corpus, as does the output layer. Every node in one layer is connected to every node in the next layer by a weighted edge. During the training phase, each of the weights is adjusted to improve the overall performance of the network. The value at each node in the input layer is 0, apart from one node whose value is 1 (the position of the 1 indicates which word is being passed into the network). The network then computes the likelihood that the input word would appear in the context of each of the other words in the corpus. Once the network has been trained, the set of weights connecting a word’s input node to the hidden layer becomes the vector for that word.


Figure 2: A representation of a Word2Vec model. The blue circles represent nodes and the lines represent weighted edges. The red weights become the vector for ‘cat’, the orange weights the vector for ‘dog’, and the yellow weights the vector for ‘hat’. To pass the word ‘dog’ into the network, the dog node would be given a value of 1 and all other input nodes a value of 0.
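The model in Figure 2 can be sketched in a few lines of NumPy. The three-word vocabulary, the hidden-layer size and the random starting weights below are illustrative; real training would repeatedly adjust W_in and W_out to reduce the prediction error:

```python
import numpy as np

vocab = ["cat", "dog", "hat"]     # the toy vocabulary from Figure 2
V, N = len(vocab), 2              # vocabulary size, hidden-layer size

rng = np.random.default_rng(0)
W_in = rng.normal(size=(V, N))    # input-to-hidden weights (untrained)
W_out = rng.normal(size=(N, V))   # hidden-to-output weights (untrained)

def forward(word):
    x = np.zeros(V)
    x[vocab.index(word)] = 1.0    # one-hot input: a 1 at the word's node
    h = x @ W_in                  # hidden layer (linear, no activation)
    scores = h @ W_out
    return np.exp(scores) / np.exp(scores).sum()  # softmax over the vocabulary

print(forward("dog"))  # likelihood of each word appearing in the context of 'dog'

# Once trained, the weights from a word's input node to the hidden layer
# (i.e. a row of W_in) are that word's vector:
dog_vector = W_in[vocab.index("dog")]
```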


Personal Opinion


Personally, I feel that Word2Vec is an incredibly impressive and useful algorithm due to its ability to recognise relationships between words.


The Word2Vec model has also been adapted to produce vectors for biological sequences such as the base sequences in DNA and RNA.


