Date of Award

2023

Publication Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

CatBoost; Credit scoring; Decision trees; Machine learning; Tabular data; Tree-based methods

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Abstract

The lending industry commonly relies on assessing borrowers’ repayment performance to make lending decisions, in order to safeguard its assets and maintain its profitability. With the rise of Artificial Intelligence, lenders have turned to Machine Learning (ML) algorithms to solve this problem.

The novelty of this study is the application of tree-based ML methods to a large dataset to accurately predict financial repayment performance without using any repayment history, which was used in all of the literature reviewed. Instead, the only attributes used were the applicants’ demographics and psychographics. The study’s proprietary US-based dataset, whose owner does not wish to be disclosed, covers an anonymous population of about half a million beneficiaries and has a well-balanced binary target distribution.

An Area Under the Receiver Operating Characteristic Curve (ROC-AUC) of 85% was achieved on a binary classification target using the CatBoost API. The study also experimented with a given tri-class target. Furthermore, this research used ML to gain insight into which attributes contribute most to the repayment prediction, and tested whether similar results can be achieved with fewer attributes, to make the model more practical for the data owner to apply. For verification, the best model was applied to one of the largest publicly available financial datasets. The original research on that dataset reported an accuracy of 82%; this study achieved 79% using 5-fold Cross-Validation (CV). This result was achieved with tree-based models with a complexity of O(log n), compared to O(2n) in the original research, which is a significant efficiency enhancement.

Recommended Citation

Abouhassan, Ahmed Shafeek, "Tree-Based Approaches for Predicting Financial Performance" (2023). Electronic Theses and Dissertations. 9048.
https://scholar.uwindsor.ca/etd/9048

Included in

Computer Sciences Commons

Tree-Based Approaches for Predicting Financial Performance
