Date of Award

5-16-2025

Publication Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

Back Translation; Large Language Models; Membership Inference Attacks

Supervisor

Dima Alhadidi

Rights

info:eu-repo/semantics/embargoedAccess

Abstract

Given a machine learning model and a record, Membership Inference Attacks (MIAs) determine whether the record was part of the model's training dataset, raising serious privacy concerns. MIAs pose a significant threat to the privacy of machine learning models, particularly when the training dataset contains sensitive or confidential information. They often exploit a model's tendency to overfit its training data, which yields lower loss values on training samples than on non-training samples. Recently, a new MIA against language models was proposed whose decision rule compares, against a threshold, the difference between the loss of the target sample under the target model and the average loss of its neighboring samples. That attack generates neighbors through simple word replacements, using Masked Language Models (MLMs) to preserve the semantics and fit the context of the original word. In this thesis, we propose Back Translation and Dynamic Thresholding (BTDT), a novel MIA. BTDT generates more realistic and diverse neighbor samples using back translation and introduces a dynamic thresholding mechanism, resulting in more adaptive and accurate membership inference. The results indicate that dynamic thresholding effectively controls the attack's false positive and false negative rates, thereby enhancing its robustness and efficiency.
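The neighborhood-comparison decision rule described in the abstract can be sketched as follows. This is a minimal illustration with made-up loss values and a fixed illustrative threshold; the actual loss computation, neighbor generation (MLM replacement or back translation), and the thesis's dynamic thresholding mechanism are not reproduced here.

```python
# Sketch of a neighborhood-comparison MIA decision rule. All numbers and the
# threshold below are illustrative assumptions, not values from the thesis.

def neighborhood_score(target_loss: float, neighbor_losses: list[float]) -> float:
    """Loss of the target sample minus the mean loss of its neighbor samples.

    For a training member, the model's loss on the original sample tends to be
    lower than on its (unseen) neighbors, so the score is more negative.
    """
    mean_neighbor_loss = sum(neighbor_losses) / len(neighbor_losses)
    return target_loss - mean_neighbor_loss

def predict_member(target_loss: float, neighbor_losses: list[float],
                   threshold: float = -0.1) -> bool:
    # Predict "member" when the target's loss falls sufficiently below the
    # average loss of its neighbors, i.e. the score is under the threshold.
    return neighborhood_score(target_loss, neighbor_losses) < threshold

# Illustrative losses: a memorized training sample vs. a non-member sample.
print(predict_member(1.2, [1.8, 1.9, 1.7]))  # True  (loss well below neighbors)
print(predict_member(2.0, [1.9, 2.1, 2.0]))  # False (loss matches neighbors)
```

A fixed threshold, as above, is what BTDT's dynamic thresholding replaces: adapting the threshold per sample is what lets the attack manage its false positive and false negative rates.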

Available for download on Friday, May 15, 2026
