Date of Award

2-4-2025

Publication Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

Abstractive Summarization; Encoder-Decoder Architectures; Pause Tokens; ROUGE Metrics; Transformer Models

Supervisor

Robin Gras

Rights

info:eu-repo/semantics/openAccess

Abstract

In natural language processing, text summarization is widely explored with transformer-based models because of their strong ability to capture and generate meaningful representations of input text. Traditional transformer models for summarization generate tokens sequentially, with each output token produced immediately after its preceding token is processed. While this approach has proven effective, it may limit the model's capacity to fully refine its understanding before committing to each output token. This research explores an alternative approach that introduces a learnable pause token to the encoder's input and, optionally, the decoder's input during finetuning and inference of the encoder-decoder transformer models BART, T5, and Pegasus. These pause tokens give the model additional processing time to refine its intermediate representations before generating the next token. Experiments conducted on the SAMSum, BillSum, and BBC datasets and evaluated with ROUGE metrics demonstrate that adding pause tokens to the encoder significantly enhances summarization performance, particularly for BART and T5. Comparative analyses show that the pause-token approach delivers measurable gains in summarization quality over standard finetuning. These findings highlight the potential of pause tokens to enhance intermediate computation and improve the overall quality of generated summaries.
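
The following is a minimal sketch of how such a pause token could be appended to the encoder input of an encoder-decoder model using the Hugging Face transformers library. It is not the thesis's exact implementation; the token string "<pause>", the number of pause tokens, and their placement at the end of the source text are illustrative assumptions.

# Minimal sketch: appending a learnable pause token to the encoder input.
# The token string "<pause>", the pause count, and the end-of-input placement
# are assumptions for illustration, not the thesis's reported configuration.
from transformers import BartTokenizer, BartForConditionalGeneration

model_name = "facebook/bart-base"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

# Register the pause token; its embedding is learned during finetuning.
tokenizer.add_special_tokens({"additional_special_tokens": ["<pause>"]})
model.resize_token_embeddings(len(tokenizer))

def encode_with_pauses(text, num_pauses=10, max_length=1024):
    # Append pause tokens so the encoder can spend extra computation
    # on the input before the decoder begins generating.
    augmented = text + " " + " ".join(["<pause>"] * num_pauses)
    return tokenizer(augmented, truncation=True, max_length=max_length,
                     return_tensors="pt")

inputs = encode_with_pauses("Dialogue or document to summarize ...")
summary_ids = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))

During finetuning, the same augmentation would be applied to the training inputs so that the pause token's embedding is actually learned rather than left at its random initialization.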
