Murtadha Talib AL-Sharuee, Fei Liu, Mahardhika Pratama


Products reviews are one of the major resources to determine the public sentiment. The existing literature on reviews sentiment analysis mainly utilizes supervised paradigm, which needs labeled data to be trained on and suffers from domain-dependency. This article addresses these issues by describes a completely automatic approach for sentiment analysis based on unsupervised ensemble learning. The method consists of two phases. The first phase is contextual analysis, which has five processes, namely (1) data preparation; (2) spelling correction; (3) intensifier handling; (4) negation handling and (5) contrast handling. The second phase comprises the unsupervised learning approach, which is an ensemble of clustering classifiers using a majority voting mechanism with different weight schemes. The base classifier of the ensemble method is a modified k-means algorithm. The base classifier is modified by extracting initial centroids from the feature set via using SentWordNet (SWN). We also introduce new sentiment analysis problems of Australian airlines and home builders which offer potential benchmark problems in the sentiment analysis field. Our experiments on datasets from different domains show that contextual analysis and the ensemble phases improve the clustering performance in term of accuracy, stability and generalization ability.