Data Transformation Strategies to Remove Heterogeneity
Yoo, Sangbong, Lee, Jaeyoung, Yoon, Chanyoung, Son, Geonyeong, Hong, Hyein, Seo, Seongbum, Yim, Soobin, Jung, Chanyoung, Park, Jungsoo, Kim, Misuk, Jang, Yun
arXiv.org Artificial Intelligence
Data heterogeneity is a prevalent issue that stems from various conflicting factors and makes data difficult to use. The resulting uncertainty, particularly when it arises from disparities in data formats, frequently requires expert involvement to resolve. Current methodologies primarily address conflicts related to data structures and schemas, often overlooking the pivotal role of data transformation. As the use of artificial intelligence (AI) continues to expand, demand is growing for a more streamlined data preparation process, and data transformation becomes paramount. Transformation customizes training data to enhance AI learning efficiency and adapts input formats to suit diverse AI models. Selecting an appropriate transformation technique is essential to preserving crucial data details. Despite the widespread integration of AI across various industries, comprehensive reviews of contemporary data transformation approaches are scarce. This survey explores the intricacies of data heterogeneity and its underlying sources. It systematically categorizes and presents strategies to address heterogeneity stemming from differences in data formats, shedding light on the inherent challenges associated with each strategy.
Jul-18-2025
- Country:
- Asia
- China
- Japan
- Hokkaidō > Hokkaidō Prefecture
- Sapporo (0.04)
- Honshū > Kansai
- Osaka Prefecture > Osaka (0.04)
- Middle East
- Singapore (0.04)
- South Korea > Seoul
- Seoul (0.05)
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- United Kingdom
- England
- Greater London > London (0.04)
- Greater Manchester > Manchester (0.04)
- Scotland > City of Edinburgh
- Edinburgh (0.04)
- Finland > Uusimaa
- Helsinki (0.04)
- Italy
- France
- Auvergne-Rhône-Alpes > Isère
- Grenoble (0.04)
- Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
- Marseille (0.04)
- Île-de-France > Paris
- Paris (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Greece > Attica
- Athens (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Spain
- Basque Country > Biscay Province
- Bilbao (0.04)
- Catalonia > Barcelona Province
- Barcelona (0.04)
- Galicia > Madrid (0.04)
- Germany
- Bavaria > Upper Bavaria
- Munich (0.04)
- Berlin (0.04)
- Bremen > Bremen (0.04)
- North Rhine-Westphalia > Münster Region
- Münster (0.04)
- Netherlands > North Holland
- Amsterdam (0.04)
- Switzerland > Zürich
- Zürich (0.14)
- North America
- Canada
- Dominican Republic (0.04)
- Puerto Rico > San Juan
- San Juan (0.04)
- United States
- New York > New York County
- New York City (0.04)
- California
- Alameda County > Berkeley (0.04)
- Los Angeles County
- Long Beach (0.04)
- Los Angeles (0.04)
- San Francisco County > San Francisco (0.14)
- San Luis Obispo County > San Luis Obispo (0.04)
- Santa Clara County > Palo Alto (0.04)
- Massachusetts
- Middlesex County > Cambridge (0.04)
- Suffolk County > Boston (0.04)
- District of Columbia > Washington (0.04)
- Washington > King County
- Georgia > Fulton County
- Atlanta (0.04)
- Oregon > Multnomah County
- Portland (0.04)
- Tennessee > Davidson County
- Nashville (0.04)
- Utah > Salt Lake County
- Salt Lake City (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Maryland > Baltimore (0.04)
- Nevada > Clark County
- Las Vegas (0.04)
- Florida
- Miami-Dade County > Miami (0.04)
- Orange County > Orlando (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Michigan > Washtenaw County
- Ann Arbor (0.04)
- Texas > Travis County
- Austin (0.04)
- Oceania > Australia
- New South Wales > Sydney (0.14)
- Queensland > Brisbane (0.04)
- Victoria > Melbourne (0.04)
- South America > Chile
- Genre:
- Overview (1.00)
- Research Report (1.00)
- Industry:
- Health & Medicine > Therapeutic Area (0.46)
- Information Technology (0.67)
- Leisure & Entertainment (0.92)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning
- Learning Graphical Models > Directed Networks
- Bayesian Learning (0.67)
- Neural Networks > Deep Learning (1.00)
- Statistical Learning (1.00)
- Natural Language
- Chatbot (1.00)
- Information Retrieval (0.68)
- Large Language Model (1.00)
- Text Processing (1.00)
- Representation & Reasoning
- Expert Systems (0.67)
- Information Fusion (0.93)
- Data Science > Data Mining (1.00)