Date of Award
2025
Publication Type
Thesis
Degree Name
M.A.Sc.
Department
Industrial and Manufacturing Systems Engineering
Keywords
Combinatorial Optimization; Deep Learning; Graph Neural Networks; Policy Gradient Algorithms; Quadratic Assignment Problem; Reinforcement Learning
Supervisor
Guoqing Zhang
Rights
info:eu-repo/semantics/embargoedAccess
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Abstract
The Quadratic Assignment Problem (QAP) is an NP-hard combinatorial optimization problem with applications in facility layout and scheduling. This thesis presents a novel approach to solving QAP by formulating it as a Markov Decision Process (MDP) and applying Proximal Policy Optimization (PPO) within an Actor Critic framework. The methodology employs Graph Attention Networks (GAT) to learn optimal policies from a bipartite graph representation of QAP. In this work, PPO agents were trained on small (n = 10), medium (n = 25), and large (n = 50) problem instances. Performance was evaluated against established baselines, including heuristics. Experiments demonstrate the scalability and efficacy of the proposed approach, particularly for large-scale QAPs. The application of an improvement heuristic to PPO-generated solutions further enhances solution quality, illustrating the benefits of combining reinforcement learning (RL) with local search techniques. The thesis presents a detailed analysis of experimental results, including performance comparisons across problem sizes, convergence studies, and sensitivity analyses. This research contributes to the integration of machine learning and operations research, offering insights into AI applications for combinatorial optimization. The methodology provides a competitive alternative to traditional QAP solving techniques and opens avenues for applying RL to similar problems. Future work may explore adaptation to dynamic QAP variants, integration with other heuristics, application to related combinatorial optimization problems, and theoretical analysis of RL convergence in discrete optimization spaces. This thesis demonstrates the potential of combining RL, graph neural networks (GNNs), and optimization heuristics to address the Quadratic Assignment Problem effectively, paving the way for further advancements in the field of combinatorial optimization.
Recommended Citation
Pichka, Ebrahim, "A Policy Gradient Approach to the Quadratic Assignment Problem" (2025). Electronic Theses and Dissertations. 9635.
https://scholar.uwindsor.ca/etd/9635