What is sentiment analysis? How is it related to machine learning?
Figure 1: A notebook and marker
Introduction
Sentiment analysis is the process of determining the sentiment behind a piece of text, for example whether a movie review is positive or negative. The same techniques also apply to related text classification tasks, such as deciding whether an email is spam.
The Bag of Words Model
One method of sentiment analysis uses the bag of words model, which treats a piece of text as an unordered collection of words, together with a simple probabilistic classifier (a combination usually called naive Bayes). We start with a large dataset of text (for example, movie reviews), each labelled as either positive or negative. We then look at the words in the text we are trying to classify and estimate, from the dataset, how likely each word is to appear in a positive review and in a negative review. We multiply together all the probabilities for the words appearing in a positive review and, separately, all the probabilities for the words appearing in a negative review. This leaves us with two scores: one indicates how likely the review is to be positive and the other how likely it is to be negative. Finally, we can normalise these scores so that they add up to 1, which makes them easier to compare.
The bag of words model has clear limitations. The main one is that it is unable to recognise semantic relationships between words. For example, it cannot tell that ‘good’ and ‘great’ are related words; at best it learns that both often appear in positive contexts.
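The procedure described above is commonly known as a naive Bayes classifier. Below is a minimal sketch of it, not a production implementation. The four training reviews are made up for illustration, and the function names `train` and `classify` are hypothetical. Two details go beyond the description above and are assumptions of this sketch: add-one smoothing, so that a word never seen with one label does not force the product to zero, and summing logarithms in place of multiplying many small probabilities, which gives the same ranking with less risk of numerical underflow.

```python
import math
from collections import Counter

# Tiny made-up training set (hypothetical data, for illustration only).
TRAIN = [
    ("a great film with a great cast", "pos"),
    ("good story and good acting", "pos"),
    ("a boring film with a bad script", "neg"),
    ("bad acting and a dull story", "neg"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        word_counts[label].update(text.split())
        doc_counts[label] += 1
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab

def classify(text, word_counts, doc_counts, vocab):
    """Return a probability for each label; the probabilities add up to 1."""
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        # Start from the share of training reviews that carry this label.
        score = math.log(doc_counts[label] / total_docs)
        for word in text.split():
            if word not in vocab:
                continue  # skip words never seen during training
            # Add-one smoothing (an assumption of this sketch).
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(p)  # adding logs == multiplying probabilities
        log_scores[label] = score
    # Normalise the two scores so that they add up to 1.
    highest = max(log_scores.values())
    unnormalised = {k: math.exp(v - highest) for k, v in log_scores.items()}
    total = sum(unnormalised.values())
    return {k: v / total for k, v in unnormalised.items()}

model = train(TRAIN)
probs = classify("a great story", *model)
print(probs)
```

With this toy data, ‘great’ only ever appears in positive reviews, so the positive probability comes out higher for "a great story".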
Sentiment Analysis using Machine Learning
A better way of performing sentiment analysis is to use neural networks, a family of machine learning models. Since neural networks operate on numbers rather than words, we need a way of representing words numerically. One option is a ‘one-hot representation’, in which each word is a vector containing a single 1 and 0s everywhere else (the position of the 1 identifies the word). A one-hot representation isn’t ideal, since the vector carries no information about the meaning of the word: every pair of different words looks equally unrelated. A better option is a model such as Word2Vec, which learns vectors that capture aspects of word meaning, so that words used in similar contexts end up with similar vectors.
We can take a labelled dataset, convert the words into their Word2Vec vectors and pass these into a neural network for training. One way to do this is to use a neural network with a fixed-size input and pass in the average of the word vectors in each review. Another way is to use a Recurrent Neural Network (RNN). RNNs process the data one time step at a time (here, one word per time step), and at each time step they also receive information carried over from the previous time step.
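To make the difference between the two representations concrete, here is a small sketch that compares words using cosine similarity, a standard measure of how closely two vectors point in the same direction. The three-word vocabulary and the two-dimensional vectors in `TOY_EMBEDDINGS` are hand-picked, hypothetical values standing in for learned embeddings. A real Word2Vec model learns its vectors from a large body of text and typically uses far more dimensions.

```python
import math

VOCAB = ["good", "great", "terrible"]

def one_hot(word):
    """Vector with a 1 at the word's position in VOCAB and 0s elsewhere."""
    return [1.0 if w == word else 0.0 for w in VOCAB]

# Hand-picked 2-D vectors (hypothetical values, NOT real Word2Vec output).
TOY_EMBEDDINGS = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.3],
    "terrible": [-0.7, 0.2],
}

def cosine(u, v):
    """Cosine similarity: 1 = same direction, 0 = unrelated, -1 = opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Any two different one-hot vectors have similarity 0.
one_hot_sim = cosine(one_hot("good"), one_hot("great"))

# The toy dense vectors place 'good' near 'great' and away from 'terrible'.
dense_sim = cosine(TOY_EMBEDDINGS["good"], TOY_EMBEDDINGS["great"])
dense_opposite = cosine(TOY_EMBEDDINGS["good"], TOY_EMBEDDINGS["terrible"])

print(one_hot_sim, dense_sim, dense_opposite)
```

The one-hot comparison always gives 0 for two different words, whichever words they are, whereas dense vectors can express degrees of similarity.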
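The first approach, averaging the word vectors of a review, can be sketched as follows. The three-dimensional vectors in `TOY_VECTORS` are hypothetical stand-ins for learned Word2Vec vectors, and `average_vector` is a name chosen for this example.

```python
# Hypothetical 3-D word vectors (made up; real vectors would be learned).
TOY_VECTORS = {
    "loved": [0.9, 0.2, 0.1],
    "this": [0.0, 0.1, 0.0],
    "film": [0.1, 0.0, 0.8],
}

def average_vector(words, vectors, size=3):
    """Average the vectors of the known words into one fixed-size vector."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return [0.0] * size  # no known words: fall back to a zero vector
    # zip(*known) groups the vectors by dimension, one column at a time.
    return [sum(column) / len(known) for column in zip(*known)]

short_review = average_vector("loved this".split(), TOY_VECTORS)
long_review = average_vector("loved this film".split(), TOY_VECTORS)
shuffled_review = average_vector("film this loved".split(), TOY_VECTORS)

print(short_review, long_review, shuffled_review)
```

Reviews of any length produce a vector of the same size, which is what a fixed-size network input needs. The cost is that word order is lost: the shuffled review averages to the same vector as the original, up to floating-point rounding.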
Figure 2: By using an RNN, the model would be passed information about every single word and might even be able to pick up on things such as common phrases.
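The way an RNN carries information from one word to the next can be sketched with a minimal forward pass. This is a plain ‘vanilla’ RNN with a tanh activation, written without any library. The weights `W_x`, `W_h` and `b` are arbitrary made-up numbers, not trained values, and the two-dimensional word vectors in `review` are hypothetical.

```python
import math

def rnn_step(x, h_prev, W_x, W_h, b):
    """One time step: combine the current word vector with the previous state."""
    h_new = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(total))  # tanh keeps each value in (-1, 1)
    return h_new

def run_rnn(sequence, W_x, W_h, b):
    """Feed the word vectors in one at a time and return the final state."""
    h = [0.0] * len(b)  # the state starts empty
    for x in sequence:
        h = rnn_step(x, h, W_x, W_h, b)
    return h

# Arbitrary, untrained weights (hypothetical values for illustration).
W_x = [[0.5, -0.3], [0.1, 0.8]]
W_h = [[0.4, 0.2], [-0.1, 0.6]]
b = [0.0, 0.0]

# A three-word review, one made-up 2-D vector per word.
review = [[0.9, 0.1], [0.0, 0.2], [0.1, 0.8]]

final_state = run_rnn(review, W_x, W_h, b)
reversed_state = run_rnn(review[::-1], W_x, W_h, b)
print(final_state, reversed_state)
```

Unlike an average, the final state depends on the order of the words: feeding the same three vectors in reverse gives a different result. In a full model, the final state would be passed to an output layer that predicts positive or negative.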
One issue with RNNs, however, is the ‘vanishing gradient problem’. During training, each weight in a neural network is adjusted in proportion to the derivative of the loss function (a measure of how poorly the network is performing) with respect to that weight. In an RNN, the contribution of an earlier time step to this derivative is a product of many factors, one for each time step that follows it, and this product can become very small. As a result, the network learns very little from the earlier time steps, and the model ends up considering later words more than earlier words.
A type of RNN called an LSTM (Long Short-Term Memory network) can be used to reduce the effect of the vanishing gradient problem. An LSTM unit contains a cell and three gates (input, output and forget). The cell can remember values over many time steps, and the gates control the flow of information into and out of the cell.
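The shrinking effect can be seen in a deliberately simplified case: an RNN whose state is a single number, h_t = tanh(w * h_{t-1} + x_t). The sketch below uses the chain rule to compute how strongly the final state depends on the state at each earlier time step. The weight `w = 0.5` and the constant inputs are arbitrary choices for illustration.

```python
import math

def gradient_wrt_each_step(inputs, w=0.5):
    """For h_t = tanh(w * h_{t-1} + x_t), return d h_T / d h_t for every t."""
    # Forward pass: record the state after every time step.
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w * h + x)
        states.append(h)
    # Backward pass: each step further back multiplies in one more factor.
    grads = [0.0] * len(states)
    grad = 1.0  # derivative of the final state with respect to itself
    for t in range(len(states) - 1, -1, -1):
        grads[t] = grad
        # d h_t / d h_{t-1} = w * (1 - h_t ** 2), from the derivative of tanh
        grad *= w * (1.0 - states[t] ** 2)
    return grads

grads = gradient_wrt_each_step([0.5] * 20)
print("last step: ", grads[-1])
print("first step:", grads[0])
```

Every factor here is below 0.5, so after 19 multiplications the influence of the first time step is smaller than 0.5 ** 19, which is less than one part in 100,000 of the influence of the last.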
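As a rough sketch of how the cell and gates interact, here is one time step of an LSTM unit in its commonly used form, reduced to single numbers to keep it readable. The parameter names (`w_*` for input weights, `u_*` for recurrent weights, `b_*` for biases) and all the values are hypothetical. The biases are chosen by hand so that the forget gate stays almost fully open and the input gate almost fully closed, which a trained network would have to learn for itself.

```python
import math

def sigmoid(z):
    """Squash a number into (0, 1), so it can act as a gate."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM unit; p is a dict of parameters."""
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate
    c = f * c_prev + i * g  # keep part of the old cell, add part of the new
    h = o * math.tanh(c)    # expose part of the cell as the unit's output
    return h, c

# Hypothetical, hand-picked parameters (not trained).
params = {name: 0.1 for name in
          ("w_f", "u_f", "w_i", "u_i", "w_o", "u_o", "w_g", "u_g")}
params.update({"b_f": 10.0, "b_i": -10.0, "b_o": 0.0, "b_g": 0.0})

h, c = 0.0, 0.8  # start with the value 0.8 stored in the cell
for _ in range(50):
    h, c = lstm_step(0.5, h, c, params)

print(c)
```

After 50 time steps the cell still holds a value close to 0.8. The cell is updated mainly by scaling and adding, not by passing through a squashing function at every step, which is a large part of why information (and gradients) can survive over longer spans.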
Personal Opinion
I believe that sentiment analysis is an incredibly useful procedure. It can be used in market research, recommendation systems and much more.