Mining Twitter Multi-word Product Opinions with Most Frequent Sequences of Aspect Terms

Document Type

Conference Proceeding

Publication Date

1-1-2022

Publication Title

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume

13635 LNCS

First Page

126

Keywords

Aspect based opinion mining, Sequential pattern mining, Topic modeling, Twitter sentiment analysis

Last Page

136

Abstract

Given a corpus of microblog texts from a social media platform such as Twitter (e.g., “the new iPhone battery life is good, but camera quality is bad”), mining the multi-word aspects (e.g., battery life, camera quality) and opinions (e.g., good, bad) expressed about the products discussed is challenging due to the vast amount of data being generated. Aspect-Based Opinion Mining (ABOM) combines automatic aspect extraction with opinion mining, allowing an enterprise to analyze the data on relevant product features in detail while saving time and money. Existing Twitter ABOM systems such as Hate Crime Twitter Sentiment (HCTS) and Microblog Aspect Miner (MAM) generally follow a four-step approach: obtaining microblog posts, identifying frequent nouns (candidate aspects), pruning the candidate aspects, and determining opinion polarity. However, they differ in how well they prune their candidate aspects. This paper proposes a system called Microblog Aspect Sequence Miner (MASM), an extension of MAM that replaces the Apriori algorithm with a modified frequent sequential pattern mining algorithm based on CM-SPAM, which also enables more efficient mining of multi-word aspects. The proposed system summarizes the most common aspects (Aspect Category) of a product and their sentiments. Experimental results, evaluated on execution time, precision, recall, and F1-measure, indicate that our approach achieves higher recall and precision than these existing systems on the Sanders Twitter corpus.
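
The idea of finding multi-word aspects as frequent sequences of aspect terms can be illustrated with a minimal sketch. This is not the MASM algorithm and not CM-SPAM: it simply counts contiguous term sequences by the number of posts that contain them, and it assumes each post has already been reduced to an ordered list of candidate aspect nouns. The function name `frequent_sequences`, the parameters `min_support` and `max_len`, and the sample posts are all hypothetical.

```python
from collections import Counter


def frequent_sequences(posts, min_support=2, max_len=2):
    """Return contiguous term sequences that occur in at least min_support posts.

    posts: list of token lists (assumed to be candidate aspect nouns, in order).
    Support is counted once per post, even if a sequence repeats inside it.
    """
    support = Counter()
    for tokens in posts:
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                seen.add(tuple(tokens[i:i + n]))
        support.update(seen)  # each distinct sequence counts once for this post
    return {seq: count for seq, count in support.items() if count >= min_support}


# Hypothetical candidate aspect nouns extracted from three posts.
posts = [
    ["battery", "life", "camera", "quality"],
    ["battery", "life", "screen"],
    ["camera", "quality", "price"],
]

patterns = frequent_sequences(posts, min_support=2, max_len=2)

# Keep only the multi-word candidates (sequences of length 2 or more).
multi_word = {seq: count for seq, count in patterns.items() if len(seq) > 1}
print(multi_word)
```

With these sample posts, only "battery life" and "camera quality" reach the support threshold as two-word sequences. A sequential pattern miner such as CM-SPAM differs from this sketch in that it can also match terms that are not adjacent and prunes the search space rather than enumerating every sequence.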

DOI

10.1007/978-3-031-21047-1_12

ISSN

0302-9743

E-ISSN

1611-3349

ISBN

9783031210464
