Friday, October 17, 2014

Let's Start Analyzing with NodeXL

In the sixth lecture of Social Media Analysis, we learnt that a social network can be viewed as a graph that describes the relations among a group of people. In such a graph, each vertex represents an individual, while each edge represents a direct connection between two vertices.
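To make the idea concrete, here is a minimal sketch of how such a graph could be stored in Python. The people (alice, bob, carol, dave) and the helper build_adjacency are made up for illustration and do not come from the lecture.

```python
from collections import defaultdict

# Hypothetical people (vertices) and their direct connections (edges).
edges = [("alice", "bob"), ("alice", "carol"), ("bob", "carol"), ("carol", "dave")]

def build_adjacency(edge_list):
    """Return an undirected adjacency map: vertex -> set of neighbours."""
    adjacency = defaultdict(set)
    for a, b in edge_list:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)

graph = build_adjacency(edges)
for person in sorted(graph):
    print(person, "->", sorted(graph[person]))
```

Each key is a vertex and each entry in its set is the other end of one edge, so the number of neighbours is simply the size of the set.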

During the class, Prof. Chan introduced a useful tool called NodeXL, a free, open-source template for Microsoft Excel 2007, 2010 and 2013 that makes it easy to explore network graphs.

After the class, I used NodeXL to find the Twitter users who had sent tweets containing the keyword "CUHK" from Hong Kong in the past week, together with their network information.

Here is the first graph I produced with NodeXL. There are so many nodes (vertices) and lines (edges) that it is difficult to read any information from it.



Then I used the Groups function to group the vertices by cluster and lay out the graph cluster by cluster.


Now can you find the centrality, in-degree/out-degree and betweenness/closeness information in it?
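For readers who want to see what two of these measures mean, here is a rough Python sketch on a tiny made-up network. The user names and the functions degree_counts and closeness are my own assumptions for illustration, not what NodeXL runs internally. Betweenness needs a longer algorithm, so it is left out here.

```python
from collections import defaultdict, deque

# Hypothetical directed edges: (source, target) means "source mentions target".
mentions = [("amy", "ben"), ("amy", "cat"), ("ben", "cat"), ("cat", "dan"), ("dan", "cat")]

def degree_counts(edge_list):
    """Return {vertex: (in_degree, out_degree)} for a directed edge list."""
    in_degree, out_degree = defaultdict(int), defaultdict(int)
    for source, target in edge_list:
        out_degree[source] += 1
        in_degree[target] += 1
    vertices = set(in_degree) | set(out_degree)
    return {v: (in_degree[v], out_degree[v]) for v in vertices}

def closeness(edge_list, start):
    """Closeness of `start`, ignoring edge direction:
    (number of reachable vertices) / (sum of shortest-path lengths to them)."""
    neighbours = defaultdict(set)
    for a, b in edge_list:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Breadth-first search gives shortest path lengths in an unweighted graph.
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0

for user, (indeg, outdeg) in sorted(degree_counts(mentions).items()):
    print(user, "in:", indeg, "out:", outdeg, "closeness:", round(closeness(mentions, user), 2))
```

A user with a high in-degree is mentioned by many others, while a closeness near 1 means the user is only a few steps away from everyone else in the network.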

As a matter of fact, this is the first time I have used NodeXL. What surprised me is that NodeXL already integrates with the APIs of some popular social network websites.

If you find it interesting, you can watch this tutorial video online to learn more about NodeXL (http://www.youtube.com/watch?v=PC-PgkhpsNc) :)


Monday, October 6, 2014

A Sentiment Analysis Case Study

Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials [1].


In the fourth lecture of Social Media Analysis, Professor Chan introduced the theory of sentiment analysis along with several cases. Actually, sentiment analysis is not totally new to us. It is already applied on consumer websites, and we use it almost every day. When we purchase goods on taobao.com or amazon.com, we can see ratings of the goods on many aspects. When we choose a restaurant online, we can make our decision by counting how many stars the restaurant has. The next question, then, is how to conduct sentiment analysis.


I found an interesting case that analyzes smartphone-related Twitter reviews using opinion mining techniques. Why not take a brief look at it and learn how a real sentiment analysis, or opinion mining, is conducted?

Figure-1

Figure-1 illustrates the opinion mining procedure for these smartphone-related Twitter
reviews. It takes four steps altogether [3].

The first step is to retrieve the tweets containing information about the Galaxy S4,
iPhone 5 and BlackBerry Q10 from a specified period of time. Those tweets are retrieved
through the Twitter Open API and saved to a local database.

The second step is to classify those tweets into six categories (Display, Network, AP,
Size, Camera and Audio), which are predefined as six important attributes of a
smartphone.
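One simple way to picture this classification step is keyword matching, sketched below in Python. This is only my guess at a possible approach: the keyword lists, the reading of "AP" as the phone's processor, and the function classify are all assumptions, since the case does not say how the tweets were actually classified.

```python
import re

# Hypothetical keyword lists; the original study does not say which terms it used.
ATTRIBUTE_KEYWORDS = {
    "Display": ["screen", "display", "resolution"],
    "Network": ["lte", "wifi", "signal"],
    "AP": ["processor", "cpu", "lag"],  # assumes "AP" refers to the processor
    "Size": ["size", "thin", "weight"],
    "Camera": ["camera", "photo", "lens"],
    "Audio": ["speaker", "sound", "audio"],
}

def classify(tweet):
    """Return every attribute whose keywords appear as whole words in the tweet."""
    words = set(re.findall(r"[a-z0-9]+", tweet.lower()))
    return [attr for attr, keywords in ATTRIBUTE_KEYWORDS.items() if words & set(keywords)]

print(classify("The Galaxy S4 screen is gorgeous but the camera is slow"))
```

A tweet can fall into several categories at once, or into none if it mentions no attribute. A real system would probably need a much larger vocabulary or a trained classifier.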

The third step is to find the polarity of the opinions towards each attribute. To
keep it simple, only positive and negative values are counted. In this case, a
text analysis program called LIWC (Linguistic Inquiry and Word Count) [2] is used
to normalize the degree of the polarity.
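To show roughly what counting positive and negative words and normalizing the result can look like, here is a small Python sketch. It is not LIWC: the two word lists and the function polarity are made up by me, and the score is simply (positive - negative) / (positive + negative), which falls between -1 and +1.

```python
import re

# Tiny hypothetical lexicons; a real dictionary would be far larger.
POSITIVE = {"good", "great", "love", "sharp", "fast"}
NEGATIVE = {"bad", "poor", "hate", "blurry", "slow"}

def polarity(text):
    """Return a score in [-1, 1]; 0.0 when no opinion word is found."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    if positive + negative == 0:
        return 0.0
    return (positive - negative) / (positive + negative)

print(polarity("Love the sharp display, but the camera is slow"))
```

Dividing by the number of opinion words keeps long and short tweets on the same scale, which is one simple meaning of "normalizing" the polarity.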

The fourth step is to display and analyze the opinion mining result. From the output
shown in Figure-2, we can find which smartphone has the highest rating over
a certain period, while Figure-3 shows the more detailed opinions towards each attribute.

Figure-2
Figure-3
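As a rough idea of how the numbers behind such figures could be put together, the Python sketch below averages polarity scores per phone and per attribute. The records in scored are invented for illustration and are not data from the study, and average_by is a helper name I made up.

```python
from collections import defaultdict

# Hypothetical (phone, attribute, polarity) records; not data from the study.
scored = [
    ("Galaxy S4", "Display", 1.0),
    ("Galaxy S4", "Camera", -0.5),
    ("iPhone 5", "Display", 0.5),
    ("iPhone 5", "Display", 1.0),
    ("BlackBerry Q10", "Audio", -1.0),
]

def average_by(records, key_index):
    """Average the polarity (last field) of records grouped by the field at key_index."""
    totals = defaultdict(lambda: [0.0, 0])
    for record in records:
        bucket = totals[record[key_index]]
        bucket[0] += record[-1]
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

per_phone = average_by(scored, 0)
for phone, score in sorted(per_phone.items(), key=lambda item: item[1], reverse=True):
    print(f"{phone:15s} {score:+.2f}")
```

Calling average_by(scored, 1) groups the same records by attribute instead, which corresponds to the more detailed, per-attribute view.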

Well, this is only a very simple opinion mining case I found on link.springer.com [3]. Based on this case and what we’ve learnt from the lectures, there are still some questions we can think about a step or two further. Here I list some questions that I think deserve a second thought, and those who visit my homepage are welcome to share their own opinions.

In the second step of the above case, what kind of method or program can be used to classify the Twitter reviews?

In the third step, instead of using the paid program LIWC, are there any other methods that can be
used to normalize the opinions?

When displaying the result of the analysis, is there any tool or program that can help show the
result more directly, or maybe more beautifully?

[References]
[1] http://en.wikipedia.org/wiki/Sentiment_analysis
[2] http://www.liwc.net
[3] http://link.springer.com/chapter/10.1007/978-3-319-05503-9_20