What is sentiment analysis?
Sentiment analysis is the process of extracting a particular mood (typically on the scale of positive to negative) from a piece of potentially subjective text. It is pervasive in the corporate world as a way to examine surveys, documents, and reviews. For instance, in my group’s microfiche project, we are using it to analyze the sentiment of the college’s own advertisements.
There is plenty of software that does form sentiment analysis. If you want to code and get technical, you can use the Python library nltk. I am going to use MonkeyLearn which uses its own trained models for sentiment analysis.
Get the text that you want to analyze:
Using whatever method you want, get a text file that you want to run sentiment analysis on. For the purposes of this tutorial, I am going to use Austin Mason’s syllabus for Hacking the Humanities which is accessible here or by pressing the “syllabus” tab on this very website. There are more advanced ways to collect data such as a CSV file of reviews, but that will not be covered in this tutorial.
Create an account on Monkey Learn:
Click here to go to the Monkey Lean. Create a free account by pressing the “sign up free” button. There are going to be several popups asking how much you will use it and for what.
Setting up Monkey Learn
After you verify your account information, enter the corresponding information appropriate to what you will use it for (the syllabus for context is around 1500 words).
Find the pre-made sentiment analysis:
When you first log in to your workspace, you may notice that there is nothing that says sentiment analysis here, and you may be tempted to create your “own workflow.” You do not want to create your own workflow because the Monkey Learn team has to manually approve it. Instead, press “pre-made models.” Then, you should see a button that says “Sentiment Analysis.”
Delete the existing text, and copy and paste your own text in the window. Finally, you should receive a percentage of positivity or negativity. The syllabus for this class is around 72% positive.
You can now run sentiment analysis. Nice job!
Monkey Learn has dozens of different analysis models that you can test. Be warned though, many require some form of a CSV file, and you may have to choose a different file to test. Another popular software for sentiment analysis is Lingmotif. If sentiment analysis is something that you want to explore on a more technical level, I highly recommend that you check out the Python package nltk as it is just a python library that you can use with your own code.
I have heard of sentiment analysis before taking this class and working on our project, but I have never heard of MonkeyLearn. It seems to have a bunch of other analysis models that I bet could be useful for our group or others in this class!
Have you ever used SpaCy? It’s another library similar to nltk but written in C so a lot faster.
James, this is Henry (from class).
I like what you’ve done here- it has left me thinking… what kinds of text do you think are most appropriate for sentiment analysis? I imagine the distinction between objective and subjective text could be an important factor. Or do you think the beauty of sentiment analysis is that it can extract textual tones from objective text, too?
Another thing that is important to consider is the algorithm that Monkey Learn uses. Are there biases? Is the documentation publicly available?
This is a cool tool! I used it to do a sentiment analysis on my group’s final project proposal, and it turned out to be 99.3% positive!
This is a good tutorial, you guided us in what to do and warned us about things we didn’t need to do. It was interesting, and I found out that my math class syllabus has a 71.5% positivity rate (so Austin is beating my math prof by a tiny bit!).
As for things that I think you could improve, you have one part that says “click here” but is not linked anywhere in the creating your account step. When I signed up I was asked why I needed to analyze texts, which I know see that it had nothing to do with the analysis at the end, but I would’ve liked to know that because I didn’t know what I was supposed to answer at first. Finally, I think you could add like one more line saying like [0,30) positivity means this, or [90,100] means that. So we can interpret our result better.
Maybe I’m just jumping on the bandwagon here with completing this tutorial, but I think sentiment analysis is a really neat tool and has applications in the conversation of accessibility in the digital humanities. For some, picking up sentiment and the thoughts behind a written post can be difficult to discern, so using this simple, pre-made model can aid individuals in understanding the information they’ve received and also ensuring that their messages and information are tonally and sentimentally what they intend to communicate.