Pattern and semantic analysis to improve unsupervised techniques for opinion target identification
Keywords:
Information retrieval, machine learning, natural language processing, opinion mining, text mining.
Abstract
This research employs patterns and semantic analysis to improve an existing unsupervised technique for extracting opinion targets. Opinion targets are identified in two steps: candidate selection and opinion target selection. For candidate selection, a combined lexical-based syntactic pattern is identified. For opinion target selection, a hybrid approach is proposed that combines the existing likelihood ratio test technique with semantic-based relatedness. The existing approach essentially extracts targets that are observed frequently in the text. However, analysis shows that not all target features occur frequently. The hybrid technique is therefore proposed to extract both frequent and infrequent targets. The proposed algorithm takes an incremental approach: it improves the performance of existing unsupervised feature mining by extracting infrequent features through their semantic relatedness to frequent features, based on a lexical dictionary. Empirical results show that the hybrid technique with combined patterns outperforms the existing techniques.
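The two-stage target selection described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the likelihood ratio test is taken to be Dunning's log-likelihood ratio comparing a candidate's frequency in the review corpus against a background corpus; the threshold values, the function names (`llr`, `select_targets`, `toy_relatedness`), and all counts are hypothetical; and a small hand-written lookup table stands in for relatedness scores that would come from a lexical dictionary.

```python
import math


def _xlogx(k):
    """k * ln(k), with the convention 0 * ln(0) = 0."""
    return k * math.log(k) if k > 0 else 0.0


def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for a 2x2 contingency table."""
    total = k11 + k12 + k21 + k22
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    return 2.0 * (cells + _xlogx(total) - rows - cols)


def select_targets(candidates, review_counts, background_counts,
                   relatedness, llr_threshold=10.83, rel_threshold=0.5):
    """Return (frequent targets, all targets). Thresholds are assumptions."""
    n_rev = sum(review_counts.values())
    n_bg = sum(background_counts.values())

    # Stage 1: keep candidates significantly over-represented in the reviews.
    frequent = set()
    for c in candidates:
        k11 = review_counts.get(c, 0)
        k21 = background_counts.get(c, 0)
        score = llr(k11, n_rev - k11, k21, n_bg - k21)
        if score >= llr_threshold and k11 / n_rev > k21 / n_bg:
            frequent.add(c)

    # Stage 2: incrementally add infrequent candidates related to any
    # already accepted target, repeating until nothing new is added.
    targets = set(frequent)
    remaining = [c for c in candidates if c not in targets]
    changed = True
    while changed:
        changed = False
        for c in list(remaining):
            if any(relatedness(c, t) >= rel_threshold for t in targets):
                targets.add(c)
                remaining.remove(c)
                changed = True
    return frequent, targets


# Hypothetical relatedness scores standing in for a lexical dictionary.
_TOY_SCORES = {
    frozenset(("display", "screen")): 0.9,
    frozenset(("charger", "battery")): 0.7,
    frozenset(("adapter", "charger")): 0.8,
    frozenset(("thing", "screen")): 0.1,
}


def toy_relatedness(a, b):
    return _TOY_SCORES.get(frozenset((a, b)), 0.0)


# Hypothetical term counts: 1,000 review tokens, 10,000 background tokens.
review_counts = {"battery": 40, "screen": 35, "display": 2, "charger": 1,
                 "adapter": 1, "thing": 30, "other": 891}
background_counts = {"battery": 5, "screen": 8, "display": 1, "charger": 1,
                     "adapter": 1, "thing": 300, "other": 9684}
candidates = ["battery", "screen", "display", "adapter", "charger", "thing"]

frequent, targets = select_targets(candidates, review_counts,
                                   background_counts, toy_relatedness)
print(sorted(frequent))
print(sorted(targets))
```

In this toy data, "battery" and "screen" pass the likelihood ratio stage, while "display" and "charger" are recovered through relatedness to them. "adapter" is related only to "charger", so it is picked up on a later pass, which is the reason for looping until no new target is added. The default threshold of 10.83 is the chi-square critical value for one degree of freedom at p = 0.001; any other cut-off could be substituted.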