Document Type

Article

Publication Date

2007

Publication Title

International Journal of Software Engineering and Knowledge Engineering

Volume

17

Issue

5

First Page

575

Keywords

Software reuse, software component search, schema matching, XML schema, tree matching algorithm, data integration

Abstract

The XML Schema matching problem can be formulated as follows: given two XML Schemas, find the best mapping between their elements and attributes, along with the overall similarity between the schemas. XML Schema matching is an important problem in data integration, schema evolution, and software reuse. This paper describes a matching system that finds accurate matches and scales to large XML Schemas with hundreds of nodes. In our system, XML Schemas are modeled as labeled, unordered trees, and the schema matching problem is turned into a tree matching problem. We propose the concept of Approximate Common Structures in trees and develop a tree matching algorithm based on it. Compared with the traditional tree edit-distance algorithm and other schema matching systems, our algorithm is faster and more suitable for matching large XML Schemas.
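
To make the tree model in the abstract concrete, the following is a minimal sketch in Python of how a schema might be viewed as a labeled, unordered tree and compared with another. It is not the paper's Approximate Common Structures algorithm, which the abstract does not specify; as a stand-in it uses a simple Jaccard score over root-to-node label paths. The function names `label_paths` and `path_similarity` and the three sample schemas are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace prefix that ElementTree attaches to XML Schema tags.
XS = "{http://www.w3.org/2001/XMLSchema}"


def label_paths(xsd_text):
    """Collect root-to-node label paths for named elements and attributes.

    Storing the paths in a set discards sibling order, which gives an
    unordered view of the schema tree. (Illustrative helper, not from the paper.)
    """
    root = ET.fromstring(xsd_text)
    paths = set()

    def walk(node, prefix):
        for child in node:
            named = child.tag in (XS + "element", XS + "attribute")
            if named and "name" in child.attrib:
                path = prefix + (child.attrib["name"],)
                paths.add(path)
                walk(child, path)
            else:
                # Wrappers such as complexType or sequence contribute no label.
                walk(child, prefix)

    walk(root, ())
    return paths


def path_similarity(xsd_a, xsd_b):
    """Jaccard similarity of the two path sets (a stand-in measure only)."""
    a, b = label_paths(xsd_a), label_paths(xsd_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


SCHEMA_A = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="isbn" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Same content as SCHEMA_A with the two child elements swapped.
SCHEMA_B = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="isbn" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Shares "book" and "title" with SCHEMA_A but differs elsewhere.
SCHEMA_C = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="publisher" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

if __name__ == "__main__":
    print(path_similarity(SCHEMA_A, SCHEMA_B))
    print(path_similarity(SCHEMA_A, SCHEMA_C))
```

Because sibling order is ignored, SCHEMA_A and SCHEMA_B are treated as identical under this measure. Exact path matching is deliberately crude: it cannot relate differently named but equivalent nodes, which is the kind of approximate correspondence a real schema matcher has to handle.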

DOI

10.1142/S0218194007003446

Comments

The article available for download is a postprint. The definitive version is published in the International Journal of Software Engineering and Knowledge Engineering and is available here. Copyright (2012) World Scientific Publishing Co.
