Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials [1].
In the fourth lecture of Social Media Analysis, Professor Chan introduced the theory of sentiment analysis along with several cases. Sentiment analysis is not entirely new to us, though. Consumer websites already apply it, and we use it almost every day. When we buy goods on taobao.com or amazon.com, we can see ratings of a product on many aspects. When we choose a restaurant online, we can decide by counting how many stars it has. The next question, then, is how to conduct sentiment analysis.
Here I found an interesting case that analyzes smartphone-related Twitter reviews using opinion mining techniques. Why not take a brief look at the case and learn how a real sentiment analysis, or opinion mining, is conducted?
Figure-1
Figure-1 illustrates the opinion mining procedure for these smartphone-related Twitter reviews. It takes four steps altogether [3].
The first step is to retrieve the tweets that contain information about the Galaxy S4, iPhone 5 and BlackBerry Q10 from a specified period of time. Those tweets are retrieved through the Twitter Open API and saved to a local database.
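As a rough sketch of the "save to a local database" part of this step, the snippet below stores a few tweets in SQLite. The case does not say which database or schema was used, so the table layout, the function names and the sample tweets are all my own assumptions, and the retrieval from the Twitter API itself is left out.

```python
import sqlite3

# Hypothetical sample records (tweet_id, phone, created_at, text).
# In the real case these would come from the Twitter API.
SAMPLE_TWEETS = [
    ("1", "Galaxy S4", "2013-05-01", "The Galaxy S4 display is gorgeous"),
    ("2", "iPhone 5", "2013-05-02", "iPhone 5 camera is terrible in low light"),
    ("3", "BlackBerry Q10", "2013-05-02", "BlackBerry Q10 screen is too small"),
]


def save_tweets(conn, tweets):
    """Create the (assumed) tweets table if needed and insert new tweets."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "tweet_id TEXT PRIMARY KEY, phone TEXT, created_at TEXT, text TEXT)"
    )
    # INSERT OR IGNORE skips tweets whose ID is already stored.
    conn.executemany("INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?)", tweets)
    conn.commit()


def count_by_phone(conn):
    """Return how many stored tweets mention each phone."""
    rows = conn.execute("SELECT phone, COUNT(*) FROM tweets GROUP BY phone")
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
save_tweets(conn, SAMPLE_TWEETS)
save_tweets(conn, SAMPLE_TWEETS)  # saving the same tweets twice adds nothing
print(count_by_phone(conn))
```

Using the tweet ID as the primary key means the same tweet can be fetched more than once without being counted twice.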
The second step is to classify those tweets into six categories (Display, Network, AP, Size, Camera and Audio), which are predefined as six important attributes of a smartphone.
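The case does not say how the classification is done. One simple possibility is keyword matching, sketched below. The keyword lists are invented for illustration, and reading "AP" as the application processor is my own assumption.

```python
import re

# Hypothetical keyword lists for the six attributes named in the case.
ATTRIBUTE_KEYWORDS = {
    "Display": {"display", "screen", "resolution"},
    "Network": {"network", "lte", "wifi", "signal"},
    "AP": {"processor", "cpu", "chip", "speed"},
    "Size": {"size", "weight", "thin", "big", "small"},
    "Camera": {"camera", "photo", "lens"},
    "Audio": {"audio", "sound", "speaker"},
}


def classify(text):
    """Return every attribute whose keywords appear in the tweet text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return [attr for attr, keywords in ATTRIBUTE_KEYWORDS.items() if words & keywords]


print(classify("The screen is sharp but the speaker is weak"))
print(classify("Love my new phone"))
```

A tweet can land in more than one category, or in none at all, so the result is a list rather than a single label.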
The third step is to find the polarity of the opinions toward each attribute. To keep it simple, only positive and negative values are counted. In this case, a text analysis program called LIWC (Linguistic Inquiry and Word Count) [2] is used to normalize the degree of polarity.
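To get a feel for what "normalizing the degree of polarity" could mean, here is a toy stand-in. It is not LIWC and does not use LIWC's dictionaries: the two word lists are made up, and the score is simply (positive - negative) / (positive + negative), which always falls between -1 and 1.

```python
import re

# Tiny made-up word lists; a real lexicon would be far larger.
POSITIVE = {"good", "great", "love", "sharp", "fast", "gorgeous"}
NEGATIVE = {"bad", "poor", "hate", "slow", "weak", "terrible"}


def polarity(text):
    """Score a text in [-1, 1] from counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    if pos + neg == 0:
        return 0.0  # no opinion words found, treat as neutral
    return (pos - neg) / (pos + neg)


print(polarity("Great camera, great screen, but slow network"))
print(polarity("terrible speaker"))
```

Dividing by the number of opinion words keeps long and short tweets on the same scale, which is one plausible reading of "normalize" here.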
The fourth step is to display and analyze the opinion mining result. From the output shown in Figure-2, we can find out which smartphone has the highest rating over a certain period, while Figure-3 shows more detailed opinions toward each attribute.
Figure-2
Figure-3
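The numbers behind charts like Figure-2 and Figure-3 could be produced by averaging the polarity scores per phone and per attribute. The sketch below shows that aggregation on a few invented records; the scores and function names are hypothetical, not data from the case.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (phone, attribute, polarity) records from the earlier steps.
SCORED = [
    ("Galaxy S4", "Display", 1.0),
    ("Galaxy S4", "Display", 0.5),
    ("Galaxy S4", "Camera", -1.0),
    ("iPhone 5", "Camera", 1.0),
    ("iPhone 5", "Display", 0.0),
]


def average_by_attribute(records):
    """Mean polarity for each (phone, attribute) pair, as in a per-attribute chart."""
    grouped = defaultdict(list)
    for phone, attribute, score in records:
        grouped[(phone, attribute)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


def overall_by_phone(records):
    """Mean polarity for each phone over all attributes, as in an overall chart."""
    grouped = defaultdict(list)
    for phone, _attribute, score in records:
        grouped[phone].append(score)
    return {phone: mean(scores) for phone, scores in grouped.items()}


# Print a plain-text table of the per-attribute averages.
for (phone, attribute), avg in sorted(average_by_attribute(SCORED).items()):
    print(f"{phone:<12}{attribute:<10}{avg:+.2f}")
```

These dictionaries can then be handed to any charting tool to draw the bars.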
Well, this is only a very simple opinion mining case I found on the website springer.com. Based on this case and what we've learnt from the lectures, there are still some questions we can think about one or two steps further. Here I list some questions that I think deserve a second thought, and anyone who visits my homepage is welcome to share an opinion.
In the second step of the above case, what kind of method or program can be adopted to classify the Twitter reviews?
In the third step, instead of using the paid program LIWC, are there any other methods that can be used to normalize the opinion?
When displaying the result of the analysis, is there any tool or program that can show the result more directly, or maybe more beautifully?
[reference]
[1] http://en.wikipedia.org/wiki/Sentiment_analysis
[2] http://www.liwc.net
[3] http://link.springer.com/chapter/10.1007/978-3-319-05503-9_20
Hi Lin, thank you for visiting my blog~ After reading yours, I have a clearer view of the procedure of sentiment analysis, so the case study is very helpful! Besides, the questions you asked are interesting and reasonable. For your first question, I think the Twitter Open API may provide an interface to search tweets using one or more keywords (just like searching tweets in step 1). If we search the tweets in the database using keywords for the six predefined attributes, we can classify the tweets roughly by grouping each set of search results into one category. But this method may produce overlapping groups. It is just a guess, and I think there can be other ways. What do you think?
Hi Wenqin, I am so glad someone is discussing this with me on this blog. For the first question I raised, I think you are right. Twitter already provides the API to third parties, and we can even retrieve that information with the simple tool NodeXL. The overlap problem you mentioned is a good point; I also believe many tweets cover more than one of the six categories. I think what we have to do is build a DB module first, and 'copy' the tweets into different DB tables representing the different categories. Does that make sense to you?
Yes, I think you are right~ Some DB techniques should be used and matrices for classification should be defined, which I am not an expert in. But I think it is very interesting. So there is so much for me to learn!
Hi, Lin~ Thanks for sharing. After reading your blog, I have a deeper understanding of the opinion mining procedure. Your questions are all about programming knowledge. I suggest you ask some CS students or read the book Sentiment Analysis and Opinion Mining (Bing Liu, Morgan & Claypool Publishers, May 2012). Hope that helps you solve the problems :)
After reading this blog post of yours, I learned a lot. First, this is a very typical case in which data mining is used. Second, it also gives me inspiration for our class project, especially its future meaning or commercial value. Thank you very much for sharing :)
Hi Lynn! I found that what this blog talks about is my project task. You provide another angle for analyzing data, namely the sentiment toward different attributes. I think this is a good perspective, and our team will extract more useful information from the huge amount of data.
Hi jiang
I think you are on the same team as jiang yi because he also mentioned the project. I think the procedure in my post is typical for sentiment analysis, and I am very much looking forward to seeing your team's presentation.
Hi, jiang. The blog is very inspiring!
I have a question about the term "normalizing opinion" mentioned in your blog. What does it mean?
Hi, the article is very useful in my opinion. But I have a small question about the emotion rank of different words. I think when we talk about different objects, the same word may carry different emotion and information. How to make the model handle this problem is worth a deeper discussion.