Schema Matching using Machine Learning

Tanvi Sahay, Ankita Mehta, Shruti Jadon

Nov-23-2019–arXiv.org Artificial Intelligence

Schema matching is a method of finding attributes that are either linguistically similar to each other or represent the same information. In this project, we take a hybrid approach to this problem, using both the provided data and the schema names to perform one-to-one schema matching, and we introduce the creation of a global dictionary to achieve one-to-many schema matching. We experiment with two methods of one-to-one matching and compare them on F-score, precision, and recall. We also compare our method with previously suggested ones and highlight the differences between them. The schema of a database is the skeleton that represents its logical view. In other words, a schema describes the data contained in a database: a relation's schema gives the name and data type of each attribute in the relation. The need to match the schemas of multiple relations arises whenever the tables maintained by a peer management system must be linked to each other, when one branch of a company closes and its data must be redistributed to the databases maintained by other branches, or when one company takes over another and all data of the child company must be integrated with that of the parent company. Consider Tables I and II. Here, the ideal schema mappings would be: {FName, LName} ↔ Name, Major ↔ Maj Stream, and Address ↔ {House No, St Name}.
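To make the idea of hybrid one-to-one matching and its evaluation concrete, here is a minimal sketch in Python. It is not the authors' method: the string-similarity measure (`difflib.SequenceMatcher`), the Jaccard value overlap, the weight `alpha`, the `threshold`, and the greedy pairing are all assumptions chosen for illustration, and the sample column values are invented. It covers only one-to-one matching, not the global-dictionary step for one-to-many matching.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Linguistic similarity of two attribute names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def value_overlap(xs, ys) -> float:
    """Jaccard overlap of the distinct values stored under two attributes."""
    sx, sy = set(xs), set(ys)
    union = sx | sy
    return len(sx & sy) / len(union) if union else 0.0


def hybrid_score(name_a, vals_a, name_b, vals_b, alpha=0.5) -> float:
    """Weighted mix of name similarity and data overlap (alpha is an assumption)."""
    return (alpha * name_similarity(name_a, name_b)
            + (1 - alpha) * value_overlap(vals_a, vals_b))


def match_one_to_one(source, target, threshold=0.3):
    """Greedy one-to-one matching: accept the best-scoring unused pairs first."""
    scored = sorted(
        ((hybrid_score(a, va, b, vb), a, b)
         for a, va in source.items()
         for b, vb in target.items()),
        reverse=True,
    )
    used_a, used_b, matches = set(), set(), set()
    for score, a, b in scored:
        if score < threshold:
            break  # remaining pairs score even lower
        if a not in used_a and b not in used_b:
            matches.add((a, b))
            used_a.add(a)
            used_b.add(b)
    return matches


def precision_recall_f1(predicted, truth):
    """Score a set of predicted attribute pairs against the ideal mapping."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical sample columns; the attribute names echo the example above.
source = {"Major": ["CS", "Math"], "Address": ["12 Elm St"]}
target = {"Maj Stream": ["CS", "Physics"], "Name": ["Ann Lee"]}
predicted = match_one_to_one(source, target)
scores = precision_recall_f1(predicted, {("Major", "Maj Stream")})
```

Combining the two signals lets attributes with dissimilar names still match when their data overlaps, and vice versa; the greedy pairing is only one simple way to enforce the one-to-one constraint.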

mapping, matching, schema
