Date of Award

6-13-2024

Publication Type

Dissertation

Degree Name

Ph.D.

Department

Computer Science

Keywords

Automatic Text Summarization;Compression;Data Augmentation;Natural Language Processing

Supervisor

Robin Gras

Creative Commons License

Creative Commons Attribution 4.0 International License

Abstract

Automatic text summarization is an influential field because it condenses information into its critical points. The current landscape of the Internet and the explosion of information in recent years make it an exciting research area. Summarization can save significant time when readers are unsure whether a book or article interests them, or when they want to stay up to date with cutting-edge research and recent publications. Advances in artificial intelligence, in particular models that handle multiple modalities, also make it possible to transcribe audio files or generate captions from videos, greatly extending the range of content that can be summarized. The field nevertheless faces numerous challenges and unresolved issues. To highlight a few:
i) The reliance on n-gram-based evaluation metrics, such as the ROUGE score, which compare generated summaries to a "golden" summary. This approach raises several concerns, notably the subjective nature of summaries. A practical summary should be evaluated on its faithfulness to the original article rather than its proximity to a single, preferred summary.
ii) The trend toward increasing the number of parameters in neural networks enhances the models' capabilities but simultaneously makes them less accessible to practitioners and researchers because of computational and resource constraints.
iii) The scarcity of high-quality, human-curated datasets significantly limits the quality of models. This issue is particularly pronounced in domain-specific areas, where the availability of specialized datasets is critical for model performance.
This research explores these challenges across several aspects of the text summarization field.
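
The concern about n-gram-based metrics can be illustrated with a minimal sketch of a ROUGE-N style score. This is a simplified illustration, not the official ROUGE implementation and not the dissertation's evaluation code: it assumes whitespace tokenization with lowercasing and no stemming, and the function names `ngrams` and `rouge_n` as well as the example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: clipped n-gram overlap between two strings.

    Assumption: whitespace tokenization and lowercasing only.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical "golden" summary and a paraphrase that uses different words.
reference = "the model cuts decoder size with little quality loss"
paraphrase = "decoder parameters shrink while accuracy barely drops"

print(rouge_n(reference, reference))   # identical text scores perfectly
print(rouge_n(paraphrase, reference))  # the paraphrase shares only one unigram
```

In this toy case the paraphrase receives a low score even though a reader might judge it a reasonable summary, which is the subjectivity problem described in point i).
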
Initial experiments aimed to minimize the overall size of the neural networks, with particular attention to the full encoder-decoder transformer architecture. This led to the use of an autoencoder to compress the latent representation of the encoder, which in turn allows a significantly smaller decoder. This approach achieved a 60% reduction in decoder size with a minimal loss of 4.5% in the ROUGE score for summarization. The results also showed a 54% increase in speed during inference and a 57% decrease in GPU memory usage during fine-tuning. The approach was further applied to additional tasks, including translation and classification, to assess how well it generalizes across similar or different tasks that use varying architectures (a comparison of encoder-decoder models with encoder-only models).
Furthermore, this research includes a comprehensive examination of the effects of data augmentation on the automatic text summarization task. It highlights the obstacles to the widespread adoption of augmentation techniques in natural language processing (NLP). The primary challenge is that text augmentation requires context-aware filters to preserve the original meaning and critical details such as names, dates, and locations. Consequently, these transformations demand more computational resources than augmentation in other fields, such as vision. The study on data augmentation culminated in over 60 experiments, which introduced new augmentation techniques and also examined established methods and their combinations to assess their impact. The effectiveness of these methods was evaluated across various contexts using metrics such as novelty and length. The study also used GPT-4 to analyze the summaries from multiple aspects, including relevancy, consistency, fluency, and coherence.
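
As a rough illustration of the novelty and length metrics mentioned above, the sketch below computes the fraction of summary n-grams that do not occur in the source, plus a summary-to-source length ratio. The exact definitions used in the dissertation and in the SumEvaluator Package are not given here, so this is one common formulation under stated assumptions: whitespace tokenization, lowercasing, bigrams by default, and the hypothetical function names `novelty` and `length_ratio`.

```python
def novelty(source, summary, n=2):
    """Fraction of the summary's n-grams that never appear in the source.

    Assumption: whitespace tokenization and lowercasing; 0.0 means fully
    extractive at the n-gram level, 1.0 means every n-gram is new.
    """
    src_tokens = source.lower().split()
    sum_tokens = summary.lower().split()
    src_ngrams = {
        tuple(src_tokens[i:i + n]) for i in range(len(src_tokens) - n + 1)
    }
    sum_ngrams = [
        tuple(sum_tokens[i:i + n]) for i in range(len(sum_tokens) - n + 1)
    ]
    if not sum_ngrams:
        return 0.0
    novel = sum(1 for gram in sum_ngrams if gram not in src_ngrams)
    return novel / len(sum_ngrams)


def length_ratio(source, summary):
    """Summary length divided by source length, both counted in tokens."""
    src_len = len(source.split())
    return len(summary.split()) / src_len if src_len else 0.0


# Hypothetical toy inputs.
source = "the cat sat on the mat near the door"
extractive = "the cat sat on the mat"
abstractive = "a cat rested on the mat"

print(novelty(source, extractive))       # copies the source, so no new bigrams
print(novelty(source, abstractive))      # three of its five bigrams are new
print(length_ratio(source, extractive))
```

A score like this rewards rewording but says nothing about faithfulness, which is why it is reported alongside other measures rather than on its own.
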
The effects of back-translation, diverse augmentation, and masking are also discussed. The experiments were conducted to establish guidelines for choosing among these methods based on the desired objective, be it achieving a 158% enhancement in novelty, a 2% improvement in sentence quality as measured by the ROUGE score, or modifying the lengths of summaries. Lastly, these experiments led to the development and release of two open-source libraries. The first, the AttentionVisualizer Package [url], visualizes attention mechanism scores, enabling researchers to identify the segments on which the model's attention heads focus while generating representations. The second, the SumEvaluator Package [url], streamlines the evaluation of generated summaries by incorporating the previously mentioned metrics. Additionally, the architectures and weights of models from the most successful experiments have been made publicly available on GitHub [url] to support and encourage future research in this area.

Available for download on Wednesday, December 11, 2024
