Pointwise Mutual Information for Opinion Mining Feature Level Product Reviews

Indira Syawanodya

Abstract


Whether a product is regarded positively or negatively can be inferred from the reviews left by previous customers. In recent years, many websites have emerged that offer product reviews, in which a product is evaluated through user-generated ratings and textual comments. However, the sheer volume of reviews often makes it difficult for prospective customers to judge the overall sentiment accurately. To address this issue, a classification approach can be employed to determine the polarity of product reviews. Opinion mining, also known as sentiment analysis, is the field of study that analyzes people's opinions toward entities, individuals, issues, events, topics, and their attributes. Performing feature extraction before classification has been shown to significantly improve the accuracy of sentiment assessment. One effective method for feature extraction is Pointwise Mutual Information (PMI), which uses search engine statistics to identify meaningful term associations in real time. PMI allows the system to capture semantic relationships between words, thereby improving the reliability of sentiment classification.
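
As a rough illustration of the PMI idea mentioned above, the sketch below estimates PMI from search-engine-style hit counts using the standard definition PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ). It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function name `pmi_from_hits`, the use of document hit counts as probability estimates, and all numbers in the example are hypothetical.

```python
import math


def pmi_from_hits(hits_xy, hits_x, hits_y, total_docs):
    """Estimate PMI (base 2) between two terms from hit counts.

    Assumption: probabilities are approximated as hits / total_docs,
    so PMI = log2(hits_xy * total_docs / (hits_x * hits_y)).
    """
    if hits_x <= 0 or hits_y <= 0 or total_docs <= 0:
        raise ValueError("hits_x, hits_y and total_docs must be positive")
    if hits_xy <= 0:
        # The terms never co-occur: PMI tends to negative infinity.
        return float("-inf")
    return math.log2((hits_xy * total_docs) / (hits_x * hits_y))


# Hypothetical counts: a candidate feature term and a product term.
score = pmi_from_hits(hits_xy=250, hits_x=1000, hits_y=500, total_docs=1_000_000)
print(round(score, 3))
```

A score above zero means the two terms co-occur more often than independence would predict, a score near zero means they look unrelated, and a negative score means they co-occur less often than chance. A feature extractor could keep candidate terms whose score exceeds some threshold; the threshold itself would have to be chosen empirically.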

Keywords


Feature extraction; Opinion mining; Pointwise mutual information; Product review; Review opinion

DOI: https://doi.org/10.17509/seict.v6i1.89411


Copyright (c) 2025 Journal of Software Engineering, Information and Communication Technology (SEICT)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Journal of Software Engineering, Information and Communication Technology (SEICT) (e-ISSN: 2774-1699 | p-ISSN: 2744-1656), published by Program Studi Rekayasa Perangkat Lunak, Kampus UPI di Cibiru.

