Date of Award
1-27-2016
Publication Type
Master Thesis
Degree Name
M.A.Sc.
Department
Electrical and Computer Engineering
Keywords
FPGA, Hardware Acceleration, High Level Synthesis, OpenCL
Supervisor
Khalid, Mohammed
Rights
info:eu-repo/semantics/openAccess
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Abstract
FPGAs have shown great promise for accelerating computationally intensive algorithms. However, FPGA-based accelerator design is tedious and time-consuming when it relies on traditional HDL-based design methods. The recent introduction of the Altera SDK for OpenCL (AOCL), a high-level synthesis tool, enables developers to exploit the potential of FPGAs without long development times or extensive hardware knowledge. In this thesis, AOCL is used to accelerate computationally intensive algorithms from machine learning and scientific computing: k-means clustering, k-nearest neighbour search, N-body simulation and LU decomposition. The performance and power consumption of the algorithms synthesized for FPGA using AOCL are evaluated against state-of-the-art CPU and GPU implementations. The k-means clustering and k-nearest neighbour kernels designed for FPGA significantly outperformed optimized CPU implementations while achieving power efficiency similar to or better than that of the GPU implementations.
Recommended Citation
Tang, Qing Yun, "FPGA Based Acceleration of Matrix Decomposition and Clustering Algorithm Using High Level Synthesis" (2016). Electronic Theses and Dissertations. 5669.
https://scholar.uwindsor.ca/etd/5669