Schema Matching using Machine Learning
Sahay, Tanvi, Mehta, Ankita, Jadon, Shruti
arXiv.org Artificial Intelligence
Schema matching is the task of finding attributes that are either linguistically similar to each other or represent the same information. In this project, we take a hybrid approach to this problem, using both the provided data and the schema names to perform one-to-one schema matching, and we introduce a global dictionary to achieve one-to-many schema matching. We experiment with two methods of one-to-one matching and compare them on F-score, precision, and recall. We also compare our method with previously proposed ones and highlight the differences between them.

The schema of a database is the skeleton that represents its logical view. In other words, a schema describes the data contained in a database: the schema of each relation gives the name and data type of every attribute in that relation. The need to match the schemas of multiple relations arises whenever the tables maintained by a peer management system must be linked to each other, when one branch of a company closes and its data must be redistributed to the databases maintained by other branches, or when one company takes over another and all data of the child company must be integrated with that of the parent company. Consider Tables I and II. Here, the ideal schema mappings would be: FName LName Name, Major Maj Stream and Address House No St Name.
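To make the idea of one-to-one matching on attribute names concrete, the following is a minimal sketch, not the authors' method. It assumes a plain string-similarity score from Python's standard `difflib` module and a greedy pairing rule; the function names, the threshold of 0.5, and the two attribute lists are hypothetical and chosen only for illustration. It ignores the data-based signal that the hybrid approach also uses.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1], computed by difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_one_to_one(source, target, threshold=0.5):
    """Greedily pair each source attribute with at most one target attribute.

    Candidate pairs are visited from most to least similar; a pair is kept
    only if neither attribute has been matched already. The threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    scored = sorted(
        ((name_similarity(s, t), s, t) for s in source for t in target),
        reverse=True,
    )
    used_source, used_target, pairs = set(), set(), {}
    for score, s, t in scored:
        if score < threshold:
            break  # all remaining candidates are even less similar
        if s not in used_source and t not in used_target:
            pairs[s] = t
            used_source.add(s)
            used_target.add(t)
    return pairs


# Hypothetical attribute lists, loosely based on the names mentioned above.
table_a = ["FName", "LName", "Major", "Address"]
table_b = ["Name", "Maj", "Stream"]

matches = match_one_to_one(table_a, table_b)
for src, dst in matches.items():
    print(f"{src} -> {dst} ({name_similarity(src, dst):.2f})")
```

In this sketch `Name` can be paired with only one of `FName` and `LName`, which illustrates why a purely one-to-one, name-based matcher is not sufficient on its own and why one-to-many matching is needed for such attributes.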
Nov-23-2019
- Country:
  - North America
    - Puerto Rico (0.04)
    - United States
      - District of Columbia > Washington (0.04)
      - Massachusetts > Hampshire County
        - Amherst (0.14)
      - California > San Francisco County
        - San Francisco (0.14)
  - Europe > United Kingdom
    - England > Greater London > London (0.04)
- Genre:
  - Research Report (0.64)
- Technology:
  - Information Technology > Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language (1.00)
    - Machine Learning
      - Neural Networks (0.94)
      - Statistical Learning > Clustering (0.69)
      - Performance Analysis > Accuracy (0.49)