Semi-supervised collective extraction of opinion target and opinion word from online reviews based on active labeling

Abstract

Online reviews play an important role in many web applications, such as e-business and government intelligence, because this user-generated content (UGC) is rich in user opinions. Opinion targets and opinion words are the two core elements through which users express opinions in reviews, so extracting them is crucial for opinion mining. However, traditional extraction methods have several limitations: they ignore the opinion relation between targets and words, restrict the word span, and propagate errors through iterative expansion, all of which reduce extraction performance. To address these deficiencies, we first propose a supervised method based on a constrained word alignment model that extracts opinion targets and opinion words collectively. Because the supervised method depends on manual annotation, which is time-consuming and error-prone, we further devise a semi-supervised extraction method based on active learning. This method uses two sampling strategies, one based on sample uncertainty and one based on feature evidence, to choose the most informative samples for manual labeling. Finally, a series of experiments on a real-world dataset shows that our approaches significantly outperform several state-of-the-art baselines.

Keywords

Collective extraction, opinion target, opinion word, active learning, uncertainty measurement
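
The abstract names an uncertainty-based sampling strategy but does not define its uncertainty measure. As a rough illustration only, the sketch below uses generic entropy-based uncertainty sampling, a common active-learning baseline, and should not be read as the authors' strategy. The function names, the three-label set (target, opinion word, other), and the probability values are all assumptions made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool, k):
    """Return the ids of the k unlabeled samples with the highest entropy.

    `pool` maps a sample id to the model's predicted label distribution.
    """
    ranked = sorted(pool, key=lambda sid: entropy(pool[sid]), reverse=True)
    return ranked[:k]


# Hypothetical predictions over three assumed labels:
# (opinion target, opinion word, other).
pool = {
    "s1": [0.98, 0.01, 0.01],  # confident prediction, low entropy
    "s2": [0.34, 0.33, 0.33],  # near-uniform prediction, high entropy
    "s3": [0.60, 0.30, 0.10],  # moderately uncertain
}

# The two samples the annotator would be asked to label next.
queries = select_most_uncertain(pool, k=2)
print(queries)
```

In an active-learning loop, the selected samples would be labeled manually, added to the training set, and the model retrained before the next round of selection.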
