Date of Award

2018

Publication Type

Doctoral Thesis

Degree Name

Ph.D.

Department

Electrical and Computer Engineering

Keywords

Action Recognition; Computer Vision; Video Content Clustering

Supervisor

Wu, Q. M. Jonathan

Supervisor

Saif, Mehrdad

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Abstract

Human action recognition plays a crucial role in visual learning applications such as video understanding and surveillance, video retrieval, human-computer interaction, and autonomous driving systems. Many methods have been proposed for human action recognition that develop low-level features in combination with bag-of-visual-words models. However, much less research has addressed the combination of the pre-processing, encoding, and classification stages. This dissertation focuses on improving action recognition performance through ensemble learning, a hybrid classifier, hierarchical feature representation, and key action perception. Action variation is one of the crucial challenges in video analysis and action recognition. We address this problem by proposing a hybrid classifier (HC) that discriminates between actions with similar motion features, such as walking, running, and jogging. In addition, we show that fusing various appearance-based and motion features can boost recognition performance for both simple and complex actions. The next part of the dissertation introduces the pooled-feature representation (PFR), which is derived from a double-phase encoding (DPE) framework. Considering that a given unconstrained video is composed of a sequence of simple frames, the first phase of DPE generates temporal sub-volumes from the video and represents each of them individually using the proposed improved rank pooling (IRP) method. The second phase constructs a pool of features by fusing the vectors produced in the first phase. The pool is compressed and then encoded to produce the video-parts vector (VPV). The DPE framework distills the video representation and extracts new information hierarchically. Compared with recent video encoding approaches, VPV can preserve higher-level information through standard encoding of low-level features in two phases. Furthermore, the encoded vectors from both phases of DPE are fused, together with a compression stage, to form the PFR.

Recommended Citation

Mohammadi Nejad, Eman, "Simple and Complex Human Action Recognition in Constrained and Unconstrained Videos" (2018). Electronic Theses and Dissertations. 7383.
https://scholar.uwindsor.ca/etd/7383
