Date of Award

3-10-2019

Publication Type

Master Thesis

Degree Name

M.C.Sc.

Department

Computer Science

Keywords

distributional word representations, dynamic mutual information, word co-occurrences

Supervisor

Jianguo Lu

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Abstract

Semantic relations between words are crucial for information retrieval and natural language processing tasks. Distributional representations are based on word co-occurrence, and have been proven successful. Recent neural network approaches such as Word2vec and Glove are all derived from co-occurrence information. In particular, they are based on Shifted Positive Pointwise Mutual Information (SPPMI). In SPPMI, PMI values are shifted uniformly by a constant, which is typically five. Although SPPMI is effective in practice, it lacks theoretical explanation, and has space for improvement. Intuitively, shifting is to remove co-occurrence pairs that could have co-occurred due to randomness, i.e., the pairs whose expected co-occurrence count is close to its observed appearances. We propose a new shifting scheme, called Dynamic Mutual Information (DMI), where the shifting is based on the variance of co-occurrences and Chebyshev's Inequality. Intuitively, DMI shifts more aggressively for rare word pairs. We demonstrate that DMI outperforms the state-of-the-art SPPMI in a variety of word similarity evaluation tasks.

Recommended Citation

Li, Yaxin, "Capturing Word Semantics From Co-occurrences Using Dynamic Mutual Information" (2019). Electronic Theses and Dissertations. 7644.
https://scholar.uwindsor.ca/etd/7644

Download

COinS

Scholarship at UWindsor

Electronic Theses and Dissertations

Capturing Word Semantics From Co-occurrences Using Dynamic Mutual Information

Date of Award

Publication Type

Degree Name

Department

Keywords

Supervisor

Rights

Creative Commons License

Abstract

Recommended Citation

Search

Browse

Author Corner

Scholarship at UWindsor

Electronic Theses and Dissertations

Capturing Word Semantics From Co-occurrences Using Dynamic Mutual Information

Author

Date of Award

Publication Type

Degree Name

Department

Keywords

Supervisor

Rights

Creative Commons License

Abstract

Recommended Citation

Share

Search

Browse

Author Corner