Pattern and semantic analysis to improve unsupervised techniques for opinion target identification


  • Khairullah Khan, University of Science and Technology Bannu
  • Ashraf Ullah
  • Baharum Baharudin


Information retrieval, machine learning, natural language processing, opinion mining, text mining.


This research employs patterns and semantic analysis to improve the existing unsupervised opinion target extraction technique. Opinion targets are identified in two steps: candidate selection and opinion target selection. For candidate selection, a combined lexical-based syntactic pattern is identified. For opinion target selection, a hybrid approach is proposed that combines the existing likelihood ratio test technique with semantic-based relatedness. The existing approach essentially extracts targets that are frequently observed in text. However, analysis shows that not all target features occur frequently in the texts. Hence, the hybrid technique is proposed to extract both frequent and infrequent targets. The proposed algorithm employs an incremental approach to improve the performance of existing unsupervised feature mining by extracting infrequent features through their semantic relatedness to frequent features, based on a lexical dictionary. Empirical results show that the hybrid technique with combined patterns outperforms the existing techniques.
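The two-stage target selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the counting scheme, the thresholds, or the relatedness measure, so everything below is assumed for illustration. The function `llr` computes the log-likelihood ratio statistic (G²) for a 2x2 contingency table; the hypothetical `select_targets` keeps candidates that pass the test and then, in a single pass, adds infrequent candidates that are sufficiently related to a frequent one. The toy counts, the relatedness scores (a stand-in for a lexical dictionary), and both thresholds are invented.

```python
import math


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table of counts."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))
    g2 = 0.0
    for k, i, j in cells:
        if k > 0:  # empty cells contribute nothing to the sum
            g2 += k * math.log(k * n / (rows[i] * cols[j]))
    return 2.0 * g2


def select_targets(counts, relatedness, llr_threshold, rel_threshold):
    """Return (frequent, selected) target sets.

    counts maps each candidate to an assumed 2x2 table (k11, k12, k21, k22).
    relatedness is any function scoring a pair of words in [0, 1].
    """
    # Stage 1: candidates that pass the likelihood ratio test.
    frequent = {c for c, table in counts.items() if llr(*table) >= llr_threshold}
    # Stage 2: infrequent candidates related to at least one frequent target.
    selected = set(frequent)
    for c in counts:
        if c not in frequent and any(
            relatedness(c, f) >= rel_threshold for f in frequent
        ):
            selected.add(c)
    return frequent, selected


# Toy data, invented for illustration only.
toy_counts = {
    "battery": (30, 5, 20, 945),
    "screen": (25, 10, 25, 940),
    "display": (1, 2, 49, 948),
    "weekend": (1, 9, 49, 941),
}
toy_scores = {
    frozenset(("display", "screen")): 0.9,
    frozenset(("weekend", "battery")): 0.1,
}


def toy_relatedness(a, b):
    # Stand-in for a dictionary-based relatedness measure.
    return toy_scores.get(frozenset((a, b)), 0.0)


frequent, selected = select_targets(
    toy_counts, toy_relatedness, llr_threshold=10.83, rel_threshold=0.5
)
print(sorted(frequent))
print(sorted(selected))
```

With these toy values, "battery" and "screen" pass the likelihood ratio test, "display" is added in the second stage because of its assumed relatedness to "screen", and "weekend" is rejected. The incremental behaviour mentioned in the abstract is simplified here to a single pass over the candidates.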

Author Biography

Khairullah Khan, University of Science and Technology Bannu

Assistant Professor, ICES, UST Bannu




