Date of Award
2-15-2024
Publication Type
Thesis
Degree Name
M.Sc.
Department
Computer Science
Keywords
Aspect detection; Backtranslation augmentation; Review analysis
Supervisor
Hossein Fani
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
Within the context of review analytics, aspects are the features of products and services at which customers target their opinions and sentiments. Aspect detection helps product owners and service providers identify shortcomings and prioritize customers' needs. Existing methods focus on detecting the surface form of an aspect and fall short when aspects are latent in reviews, especially in informal contexts such as social posts. In this work, we propose data augmentation via natural language backtranslation to extract latent occurrences of aspects. We presume that backtranslation (1) can reveal latent aspects because they may not be commonly known in the target language and can be generated through backtranslation; (2) augments context-aware synonymous aspects from a target language to the original language, hence addressing the out-of-vocabulary issue; and (3) helps with the semantic disambiguation of polysemous words and collocations. Through our experiments on well-known aspect detection methods across SemEval datasets of restaurant and laptop reviews and a Twitter dataset of unsolicited reviews, we demonstrate that review augmentation via backtranslation yields a steady performance boost for the baselines. We also present LADy, a Python-based framework designed to facilitate research in aspect detection. Although aspect detection research has increased significantly, different works come with their own implementations, which are scarcely publicly available and cannot accommodate new methods. Moreover, preprocessing the datasets into a form that can be readily fed to the algorithm of choice is time-consuming. LADy is a publicly accessible system that: i) can efficiently preprocess data with diverse formats and structures, ii) can be easily extended or customized to new methods, and iii) is extensible to experiments on new datasets from other domains.
LADy hosts various canonical aspect detection methods and benchmark datasets and incorporates an object-oriented design for integrating new methods and metrics.
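The backtranslation augmentation described in the abstract can be illustrated with a minimal sketch. This is not LADy's actual API: the names `backtranslate`, `augment`, and `make_word_translator` are hypothetical, and the word-level lookup tables stand in for a real machine translation model so that the example runs on its own. It only shows the general idea that translating a review to another language and back may surface a more common synonym of an aspect term.

```python
from typing import Callable, Dict, List

# A translator is any function mapping a sentence to a sentence.
# (Assumption: in practice this would wrap a machine translation model.)
Translator = Callable[[str], str]


def backtranslate(review: str, forward: Translator, backward: Translator) -> str:
    """Translate a review into a pivot language and back to the original one."""
    return backward(forward(review))


def augment(reviews: List[str], forward: Translator, backward: Translator) -> List[str]:
    """Return the original reviews followed by backtranslated versions that differ."""
    augmented = list(reviews)
    for review in reviews:
        candidate = backtranslate(review, forward, backward)
        # Keep only backtranslations that actually add a new surface form.
        if candidate.strip().lower() != review.strip().lower():
            augmented.append(candidate)
    return augmented


def make_word_translator(table: Dict[str, str]) -> Translator:
    """Build a toy word-by-word translator from a lookup table (illustration only)."""
    def translate(text: str) -> str:
        return " ".join(table.get(word, word) for word in text.split())
    return translate


# Toy English -> French and French -> English tables (hypothetical data).
to_french = make_word_translator(
    {"the": "le", "grub": "nourriture", "was": "était", "great": "excellente"}
)
to_english = make_word_translator(
    {"le": "the", "nourriture": "food", "était": "was", "excellente": "excellent"}
)

reviews = ["the grub was great"]
augmented = augment(reviews, to_french, to_english)
print(augmented)
```

In this toy run the informal term "grub" comes back as "food", so an aspect detector trained on the augmented set sees both surface forms of the same aspect.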
Recommended Citation
Hemmati Zadeh, Farinam, "LADy: Latent Aspect Detection via Backtranslation Augmentation" (2024). Electronic Theses and Dissertations. 9439.
https://scholar.uwindsor.ca/etd/9439