Mining Multiple Related Data Sources Using Object-Oriented Model
Contribution to Book
Transactions on Large-Scale Data- and Knowledge-Centered Systems XIII
An object-oriented database is represented by a set of classes connected through a class inheritance hierarchy of superclass and subclass relationships. Such a database is well suited to capturing the complexity of real-world data in greater detail, for example multiple related tables representing the data schemas of a retail store web site, or multiple databases such as those of several retail store web sites. Modeling web and other data as a number of object database schemas would enable derived, historical, and comparative mining of multiple databases and tables.
This paper proposes an object-oriented class model and database schema, together with a series of class methods, including an object-oriented join (OOJoin), for mining multiple data sources through an object-oriented model. The OOJoin procedure joins superclass and subclass tables by matching their type and super type relationships. Hierarchical frequent patterns are mined from multiple integrated databases by the MineHFPs algorithm, which applies an extended TidFP technique that specifies the object class hierarchy by traversing the inheritance hierarchy of the multiple databases. This paper also extends the map-gen join method used in the TidFP algorithm to an oomap-gen join for generating k-itemset object candidate patterns. The oomap-gen join reduces the number of candidate itemsets generated by indexing each (k-1)-itemset candidate pattern with start and end position codes for its inheritance hierarchy level. Experimental results show that the proposed MineHFPs algorithm for mining hierarchical frequent patterns is effective and efficient for complex queries.
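As a rough illustration of joining superclass and subclass tables on their type and super type relationship, the following Python sketch may help. It is not the paper's OOJoin procedure: the ClassTable structure, the oo_join function, the "oid" key, and the Computer/Laptop example data are all hypothetical, and the sketch assumes that a subclass row shares an object identifier with its superclass row.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ClassTable:
    """Hypothetical class table: a type name, its super type, and its rows."""
    type_name: str
    super_type: Optional[str]
    rows: List[Dict]


def oo_join(superclass: ClassTable, subclass: ClassTable, key: str = "oid") -> List[Dict]:
    """Join two class tables when the subclass's super type matches the superclass's type.

    Assumption (not from the paper): rows of both tables carry a shared
    object identifier under `key`.
    """
    if subclass.super_type != superclass.type_name:
        raise ValueError(
            f"{subclass.type_name} is not a direct subclass of {superclass.type_name}"
        )
    # Index superclass rows by object identifier for constant-time lookup.
    parents = {row[key]: row for row in superclass.rows}
    joined = []
    for row in subclass.rows:
        parent = parents.get(row[key])
        if parent is not None:
            # Inherited attributes first, then the subclass's own attributes.
            joined.append({**parent, **row})
    return joined


# Hypothetical example data for a retail store schema.
computer = ClassTable("Computer", None, [
    {"oid": 1, "brand": "A"},
    {"oid": 2, "brand": "B"},
])
laptop = ClassTable("Laptop", "Computer", [
    {"oid": 2, "weight_kg": 1.5},
])

laptop_objects = oo_join(computer, laptop)
```

Each joined row carries the attributes inherited from the superclass alongside those declared by the subclass, which is the form a hierarchy-aware mining step would need as input.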
Ezeife, C. I. and Zhang, D. (2014). Mining Multiple Related Data Sources Using Object-Oriented Model. Transactions on Large-Scale Data- and Knowledge-Centered Systems XIII, 158-186.