Date of Award
5-16-2025
Publication Type
Thesis
Degree Name
M.Sc.
Department
Computer Science
Keywords
Aspect Sentiment Triplet Extraction; ASTE; BERT Joint Fine-Tuning; Data Augmentation; GPT-4
Supervisor
Christie Ezeife
Rights
info:eu-repo/semantics/embargoedAccess
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Abstract
Aspect-Based Opinion Mining (ABOM) analyzes customer feedback (reviews) to identify sentiments linked to specific product or service features (aspects). Foundational ABOM techniques, which rely on pre-defined words labelled with sentiments (e.g., positive, negative, or neutral) or on simple grammatical processing of a text review (e.g., “pizza” (noun), “fresh” (adjective)), struggle when a single review expresses multiple aspects and sentiments. In contrast, Aspect Sentiment Triplet Extraction (ASTE), a subtask of ABOM, provides a more precise representation by extracting interconnected elements as a triplet of aspect, opinion, and sentiment from the text. For example, for the review “The burger’s patty was tasty.”, ASTE produces the triplet in the format (aspect, opinion, sentiment): (patty, tasty, positive). Existing ASTE solutions such as ASTE-RL21, MSFAN22, and BMRC-DA-TF24 employ various deep neural network embeddings and transformer-based models (such as BERT). ASTE-RL21 employs a two-step tagging approach. First, it uses SpaCy’s part-of-speech (POS) tagging vectors, BERT embeddings, and previous-token (word) embeddings to classify each token’s sentiment. Then, it applies the BIO (Beginning, Inside, Outside) tagging scheme for aspect and opinion identification, using each token’s input embeddings from step one along with the previously classified sentiment to mark the beginning, inside, or outside of an aspect or opinion span. However, when SpaCy misclassifies domain-specific terms (e.g., tagging “lightweight” as a noun instead of an adjective in the review “The software is lightweight during multitasking.”), it produces incorrect POS embeddings. In addition, the model’s reliance on previous-token (word) embeddings in its sequential framework leads to error propagation and degraded performance. MSFAN22 employs a convolutional neural network (CNN) based grid tagging scheme to capture closely linked aspects and opinions as pairs but struggles to associate distantly positioned terms.
BMRC-DA-TF24 attempts data augmentation (synthesizing new data from existing data) but fails to modify key aspect-opinion terms, limiting the model’s adaptability to diverse datasets. This thesis proposes a Transformer-Based Adaptive Tagging Framework for Aspect Sentiment Triplet Extraction (ATF-ASTE), which replaces ASTE-RL21’s separate POS and BIO tagging steps by assigning unified numerical tags to aspects, opinions, and sentiments from annotated (aspect position, opinion position, sentiment) reviews. The numeric tags are defined as: 0 (non-aspect/opinion/sentiment), 1 (single-word aspect/opinion), 2 (multi-word aspect/opinion), and sentiment tags of 1 (positive), 2 (negative), and 3 (neutral). Unlike ASTE-RL21, which requires extensive training (55 iterations) to update semantic token representations, the proposed approach uses the defined numeric tagging to generate refined semantic token embeddings by combining BERT’s contextual embeddings with relational embeddings from a graph neural network, reducing computational cost. ATF-ASTE’s GPT-4-driven data augmentation module, absent in ASTE-RL21, strengthens the framework’s domain adaptability. The proposed system first prepares the data by tokenizing reviews, tagging tokens separately for aspects, opinions, and sentiments using the unified numerical scheme, and constructing relational edges between ASTE elements based on the assigned tags. Next, it generates contextual embeddings by processing the tagged tokens in parallel through BERT and relational embeddings from the edges via a graph neural layer, concatenating the two for joint fine-tuning with weighted task-specific and contrastive losses to perform triplet extraction. Finally, an independent augmentation module applies controlled paraphrasing and polarity inversion to the Lap14 and Res15 datasets, providing diverse review variations to revalidate the model’s performance.
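To make the unified numerical tagging concrete, the following is a minimal sketch of how the scheme described above could be applied to the abstract’s own example review. The function name, the span-based input format, and the convention of placing the sentiment tag on aspect tokens are illustrative assumptions, not the thesis’s implementation; only the tag values (0/1/2 for aspect and opinion spans, 1/2/3 for positive/negative/neutral) come from the abstract.

```python
def tag_review(tokens, triplets):
    """Illustrative unified numeric tagging (not the thesis implementation).

    Aspect/opinion tags: 0 = none, 1 = single-word span, 2 = multi-word span.
    Sentiment tags: 1 = positive, 2 = negative, 3 = neutral (placed on aspect
    tokens here as an assumed convention).
    `triplets` holds (aspect_span, opinion_span, polarity), where each span is
    a half-open (start, end) pair of token indices.
    """
    aspect = [0] * len(tokens)
    opinion = [0] * len(tokens)
    sentiment = [0] * len(tokens)
    polarity_tag = {"positive": 1, "negative": 2, "neutral": 3}
    for (a_start, a_end), (o_start, o_end), polarity in triplets:
        a_tag = 1 if a_end - a_start == 1 else 2  # single- vs multi-word aspect
        o_tag = 1 if o_end - o_start == 1 else 2  # single- vs multi-word opinion
        for i in range(a_start, a_end):
            aspect[i] = a_tag
            sentiment[i] = polarity_tag[polarity]
        for i in range(o_start, o_end):
            opinion[i] = o_tag
    return aspect, opinion, sentiment

# "The burger's patty was tasty." → triplet (patty, tasty, positive)
tokens = ["The", "burger's", "patty", "was", "tasty", "."]
aspect, opinion, sentiment = tag_review(tokens, [((2, 3), (4, 5), "positive")])
# aspect    → [0, 0, 1, 0, 0, 0]
# opinion   → [0, 0, 0, 0, 1, 0]
# sentiment → [0, 0, 1, 0, 0, 0]
```

The three parallel tag sequences could then be embedded separately (e.g., via BERT and a graph layer over the relational edges) before concatenation, as the pipeline above describes.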
Experimental results on ASTE-DATA-V2 show that ATF-ASTE achieves gains of up to 12.04% in Precision, 4.10% in Recall, and 4.88% in F1-score over ASTE-RL21, MSFAN22, BMRC-DA-TF24, and other state-of-the-art methods, indicating superior extraction of triplets from reviews.
Recommended Citation
Ahluwalia, Gurpartap Singh, "Transformer-Based Adaptive Tagging Framework for Aspect Sentiment Triplet Extraction (ATF-ASTE)" (2025). Electronic Theses and Dissertations. 9707.
https://scholar.uwindsor.ca/etd/9707