Date of Award

1998

Degree Type

Dissertation

Degree Name

Ph.D.

Department

Electrical and Computer Engineering

First Advisor

Admadi, M.

Keywords

Engineering, Electronics and Electrical.

Rights

CC BY-NC-ND 4.0

Abstract

An important problem in office automation is the machine extraction and recognition of filled-in information in form documents. This thesis addresses the extraction of user-entered information from forms when an original blank form (hereinafter the master) is available as a scanned document. Starting with a filled-in form (hereinafter the input form), it addresses the problem of converting the coordinate representation of the input form so that it matches the master. In addition to containing the filled-in data, the input form may be a translated, non-uniformly scaled, rotated, or sheared version of the master. If the master and the input form were scanned under similar conditions, the two would be nearly identical except for the filled-in items. In this thesis, no constraints are imposed on the magnitude of the translation, scaling, rotation, or shear of the input form relative to the master; the only requirement is that a master form be available. The principal problem is determining the correspondence between the features selected in both images. The features selected are points on the convex hull and straight lines inside the form. It is assumed that the geometric distortion can be modeled as an affine transformation and that point and line features are adequate for establishing correspondence. The correspondence is solved by applying the Best Find strategy to a set of five-point geometric invariants and the Geometric Hashing matching strategy to the four different affine invariants proposed in this thesis. The voting results are then integrated on a statistical basis. Using this correspondence result, the transformation parameters are found by a least-squares solution and are then verified and updated. The input form is mapped to its corresponding template using the transformation parameters obtained. The performance and robustness of the proposed method are then evaluated using sensitivity analysis of the method and OCR (Optical Character Recognition) analysis of the registered forms.

Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis1997 .S23. Source: Dissertation Abstracts International, Volume: 61-09, Section: B, page: 4898. Adviser: M. Admadi. Thesis (Ph.D.)--University of Windsor (Canada), 1998.
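The least-squares step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that matched point pairs between the input form and the master are already known and that the distortion is affine, as the abstract states. The function names (`solve3`, `fit_affine`, `apply_affine`) and the sample coordinates are hypothetical and are not taken from the thesis. The feature extraction, invariant matching, voting, and verification stages are not shown.

```python
def solve3(m, v):
    """Solve the 3x3 linear system m * p = v by Gauss-Jordan elimination."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]  # augmented matrix
    for col in range(3):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("points are collinear; affine map is not determined")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares affine map taking src points to dst points.

    Returns (a, b, c, d, e, f) with x' = a*x + b*y + c and y' = d*x + e*y + f.
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three matched point pairs")
    rows = [(float(x), float(y), 1.0) for x, y in src]
    # Normal equations (A^T A) p = A^T t; A is shared by both output coordinates.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atx = [sum(r[i] * q[0] for r, q in zip(rows, dst)) for i in range(3)]
    aty = [sum(r[i] * q[1] for r, q in zip(rows, dst)) for i in range(3)]
    return tuple(solve3(ata, atx) + solve3(ata, aty))


def apply_affine(params, point):
    """Map one (x, y) point through the affine parameters."""
    a, b, c, d, e, f = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


if __name__ == "__main__":
    # Hypothetical feature points on the input form and their master positions.
    input_pts = [(0, 0), (100, 0), (0, 100), (100, 100), (50, 20)]
    distortion = (1.1, 0.2, 5.0, -0.1, 0.9, -3.0)  # made-up test parameters
    master_pts = [apply_affine(distortion, p) for p in input_pts]
    print(fit_affine(input_pts, master_pts))
```

Three non-collinear pairs determine the six affine parameters exactly. With more pairs the system is overdetermined, and the least-squares solution averages out small localization errors in the matched features.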
