Date of Award


Publication Type

Master Thesis

Degree Name



Computer Science


Christie Ezeife


Applied sciences




An uncertain data sequence is a sequence of data that exist with some level of doubt or probability. Each data item in the uncertain sequence is represented with a label and probability values, referred to as existential probability, ranging from 0 to 1.

Existing algorithms are either unsuitable or inefficient for discovering frequent sequences in uncertain data. This thesis presents mining of uncertain Web sequences with a method that combines access history probabilities from several Web log sessions with features of the PLWAP web sequential miner. The method is Uncertain Position Coded Pre-order Linked Web Access Pattern (U-PLWAP) algorithm for mining frequent sequential patterns in uncertain web logs. While PLWAP only considers a session of weblogs, U-PLWAP takes more sessions of weblogs from which existential probabilities are generated. Experiments show that U-PLWAP is at least 100% faster than U-apriori, and 33% faster than UF-growth. The UF-growth algorithm also fails to take into consideration the order of the items, thereby making U-PLWAP a richer algorithm in terms of the information its result contains.