The mathematical structure of semantic distances in language analysis.
Date of Award: 2005
Department: Mathematics and Statistics
Rights: CC BY-NC-ND 4.0
Studies of visual word recognition have focused on several characteristics of words, or of words relative to other words. Notable among these characteristics are (a) orthographic: the appearance of the written form of a word, (b) phonological: the sound of a word in its spoken form, and (c) semantic: the relative position of a word with respect to other words in either written or spoken form. We will be focusing exclusively on semantic characteristics.

Within the area of semantics, object-based semantic measurements are taken after first manually grouping words into categories. Language-based semantic measurements, on the other hand, are taken by using statistical properties of the semantics of words grouped according to usage, for example, in books. This is the path that will be taken here.

Consider the following sentence: The quick brown fox jumped over the lazy dog. Look at the word "fox" and its surrounding words, e.g. quick, brown, jumped. If the words that appear around "fox" do so relatively often, then we would like to assign them numerical values (called closeness) that are larger than the values assigned to words that appear around "fox" less often. But how do we assign such a value? Ideally we would like to examine all the words that appear before "fox" and all the words that appear after "fox" each time "fox" appears. We consider it infeasible to examine every surrounding word for each word in each sample we use, because the amount of computation involved and data generated would be too great. (Abstract shortened by UMI.)

Dept. of Mathematics and Statistics. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .C37. Source: Masters Abstracts International, Volume: 44-01, page: 0365. Thesis (M.Sc.)--University of Windsor (Canada), 2005.
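The counting idea in the abstract's "fox" example can be illustrated with a minimal Python sketch. It is not the thesis's method: the shortened abstract does not say how closeness is actually defined, so the fixed window size, the use of raw co-occurrence counts as a stand-in for closeness, and the function name `window_counts` are all assumptions made for illustration.

```python
from collections import Counter


def window_counts(tokens, target, window=2):
    """Count the words found within `window` positions of each `target` occurrence.

    Hypothetical helper: raw counts stand in for "closeness" here, which is an
    assumption and not the measure defined in the thesis.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        # Restrict attention to a bounded window rather than every surrounding word.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


tokens = "the quick brown fox jumped over the lazy dog".split()
counts = window_counts(tokens, "fox", window=2)
```

Over a larger sample, words that fall inside the window around "fox" more often would accumulate larger counts. Bounding the window is one simple way to avoid examining every surrounding word, which the abstract describes as infeasible.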
Casey, Jon., "The mathematical structure of semantic distances in language analysis." (2005). Electronic Theses and Dissertations. 4095.