A comparison study on active learning integrated ensemble approaches in sentiment analysis

Research paper by Deniz Aldoğan, Yusuf Yaslan

Indexed on: 24 Nov '16
Published on: 23 Nov '16
Published in: Computers & Electrical Engineering


One of the most challenging problems in sentiment analysis on social media is that labelling large numbers of instances can be very expensive. Active learning has been proposed to overcome this problem by providing a means of choosing the most useful training instances. In this study, we introduce active learning into a framework comprising the most popular base and ensemble approaches for sentiment analysis. In addition, the implemented framework contains two ensemble approaches, i.e. a probabilistic algorithm and a derived version of the Behavior Knowledge Space (BKS) algorithm. The Shannon entropy approach was used to select training data during the active learning process, and it was compared with the maximum disagreement method and random selection of instances. The Shannon entropy approach was observed to yield better accuracy in fewer iterations. These methods were tested on the Cornell movie review dataset and a popular multi-domain product review dataset.
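The two query strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which distribution the entropy is computed over or how disagreement is measured, so the sketch assumes that entropy is taken over each instance's predicted class probabilities and that disagreement is the fraction of ensemble members voting against the majority label. All function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in bits) of a discrete class distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def select_by_entropy(pool_probs: List[Sequence[float]], k: int) -> List[int]:
    """Indices of the k unlabelled instances with the highest predictive entropy.

    Assumption: pool_probs[i] holds the class probabilities predicted for
    instance i (for example, averaged over the ensemble members).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: shannon_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


def vote_disagreement(votes: Sequence[str]) -> float:
    """Fraction of ensemble members that do not vote for the majority label.

    Assumption: this is one simple proxy for disagreement, not necessarily
    the measure used in the paper.
    """
    majority = max(votes.count(label) for label in set(votes))
    return 1.0 - majority / len(votes)


def select_by_disagreement(pool_votes: List[Sequence[str]], k: int) -> List[int]:
    """Indices of the k unlabelled instances the ensemble disagrees on most."""
    ranked = sorted(
        range(len(pool_votes)),
        key=lambda i: vote_disagreement(pool_votes[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy pool of four unlabelled reviews (made-up numbers, for illustration only).
pool_probs = [[0.95, 0.05], [0.55, 0.45], [0.70, 0.30], [0.50, 0.50]]
pool_votes = [
    ["pos", "pos", "pos"],
    ["pos", "neg", "pos"],
    ["neg", "neg", "neg"],
    ["pos", "neg", "neg"],
]

entropy_picks = select_by_entropy(pool_probs, k=2)
disagreement_picks = select_by_disagreement(pool_votes, k=2)
```

In an active learning loop, the selected instances would be sent for labelling, added to the training set, and the classifiers retrained before the next round. Random selection, the third baseline in the abstract, would replace the ranking step with a uniform draw from the pool.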
