Date of Award

6-2-2023

Publication Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

Blockchain; Data Evaluation; Federated Learning; Smart Contracts

Supervisor

Saeed Samet

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Deep Learning is one of the most revolutionary concepts in the field of Artificial Intelligence, allowing us to train a Machine Learning model for almost any type of problem using almost any type of data. Federated Learning (FL) is a distributed Deep Learning framework in which the model is trained locally on each device, and only the resulting gradients, also known as "local updates", are sent to a central server that aggregates them into a global model. This helps preserve the user's data privacy, because the local data never leaves the local device. FL has applications in healthcare, supply chain, finance, and many other fields. However, its heavy reliance on a central server causes several problems, such as communication bottlenecks, a single point of failure, and trust issues arising from a lack of transparency. Another major concern in Federated Learning is ensuring the quality of the training data. Since there is no control over the training data, FL models tend to be highly susceptible to model poisoning attacks. To address these issues, we propose a decentralized FL framework built on blockchain. Blockchain provides a decentralized (no reliance on a central server), transparent, immutable, traceable, and trustless environment. Miners validate every local model by running it against a secret testing dataset and checking its accuracy; this validation is carried out by a smart contract. A local model is aggregated with the global model only if it passes a preset accuracy threshold. We test our proposed method on two datasets: the Brain Tumor Classification dataset from Kaggle, comprising 7,000 MRI images divided into 2 classes (Tumor / No Tumor), and the Medical MNIST dataset, comprising 58,954 images across 6 classes (AbdomenCT, BreastMRI, ChestCT, CXR (chest X-ray), Hand (X-ray), HeadCT). Our results show that our method performs better than the original Federated Learning approach across all evaluation metrics for both datasets.
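
The validation rule described in the abstract (score each local model on a held-out test set, and aggregate it only if it clears a preset accuracy threshold) can be illustrated with a minimal off-chain sketch. This is not the thesis's smart contract: the toy linear classifier, the function names (predict, accuracy, validate_and_aggregate), the use of ">=" for the threshold check, and the plain unweighted averaging are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Weights = List[float]
Sample = Tuple[Sequence[float], int]  # (features, label in {0, 1})


def predict(weights: Weights, features: Sequence[float]) -> int:
    """Toy linear classifier (assumption): class 1 if the dot product is positive."""
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def accuracy(weights: Weights, test_set: Sequence[Sample]) -> float:
    """Fraction of test samples the model classifies correctly."""
    correct = sum(predict(weights, x) == y for x, y in test_set)
    return correct / len(test_set)


def validate_and_aggregate(
    global_weights: Weights,
    local_models: Sequence[Weights],
    test_set: Sequence[Sample],
    threshold: float,
) -> Tuple[Weights, int]:
    """Keep only local models whose accuracy meets the threshold, then
    average them with the current global model (hypothetical aggregation rule).

    Returns the new global weights and the number of accepted local models.
    """
    accepted = [m for m in local_models if accuracy(m, test_set) >= threshold]
    if not accepted:
        # No local model passed validation: the global model is unchanged.
        return list(global_weights), 0
    models = [list(global_weights)] + accepted
    averaged = [sum(column) / len(models) for column in zip(*models)]
    return averaged, len(accepted)


# Hypothetical data: the label is 1 when the first feature exceeds the second.
secret_test_set = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([2.0, 1.0], 1), ([1.0, 2.0], 0)]
honest_update = [1.0, -1.0]    # classifies every test sample correctly
poisoned_update = [-1.0, 1.0]  # inverted weights, classifies every sample wrongly

new_global, n_accepted = validate_and_aggregate(
    [0.5, -0.5], [honest_update, poisoned_update], secret_test_set, threshold=0.75
)
print(new_global, n_accepted)
```

In this sketch the poisoned update never reaches the averaging step, so it cannot shift the global weights. In the framework described above, the equivalent check would be performed by miners through a smart contract rather than by a single trusted process.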
