Kayali, Moe
Mind the Data Gap: Bridging LLMs to Enterprise Data Integration
Kayali, Moe, Wenz, Fabian, Tatbul, Nesime, Demiralp, Çağatay
Leading large language models (LLMs) are trained on public data. However, most of the world's data is dark data: data that is not publicly accessible, chiefly private organizational or enterprise data. We show that the performance of LLM-based methods seriously degrades when tested on real-world enterprise datasets. Current benchmarks, based on public data, overestimate the performance of LLMs. We release a new benchmark dataset, the Goby Benchmark, to advance discovery in enterprise data integration. Based on our experience with this enterprise benchmark, we propose techniques to improve the performance of LLMs on enterprise data, including (1) hierarchical annotation, (2) runtime class-learning, and (3) ontology synthesis. We show that, once these techniques are deployed, performance on enterprise data becomes on par with that on public data. The Goby Benchmark can be obtained at https://goby-benchmark.github.io/.
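To give a flavor of what runtime class-learning might look like in practice, the sketch below harvests candidate table classes from labeled tables accumulated at runtime (e.g., from earlier pipeline stages or user feedback) and injects them into the prompt, rather than relying on a fixed vocabulary drawn from public ontologies. All function names, the prompt wording, and the stub model are illustrative assumptions, not the Goby pipeline's actual implementation.

# Hypothetical sketch of runtime class-learning for table-class detection.
# Candidate classes are collected from the enterprise corpus at runtime
# instead of coming from a fixed public ontology. Names and prompt text
# are assumptions for illustration, not the paper's implementation.

from collections import Counter

def learn_classes(labeled_tables, top_k=20):
    """Harvest the most frequent table classes observed so far at runtime."""
    counts = Counter(label for _, label in labeled_tables if label)
    return [label for label, _ in counts.most_common(top_k)]

def detect_table_class(llm, table_preview, learned_classes):
    """Prompt the model with runtime-learned classes, not a public vocabulary."""
    prompt = (
        "Classify this table into one of the following classes, "
        f"observed in this organization's data: {', '.join(learned_classes)}.\n"
        f"Table preview:\n{table_preview}\n"
        "Answer with the class name only."
    )
    return llm(prompt).strip()

if __name__ == "__main__":
    # Stub model for demonstration; replace with a real LLM call.
    fake_llm = lambda prompt: "purchase_orders"
    classes = learn_classes(
        [("t1", "purchase_orders"), ("t2", "invoices"), ("t3", "purchase_orders")]
    )
    print(detect_table_class(fake_llm, "PO_ID | VENDOR | AMOUNT", classes))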
CHORUS: Foundation Models for Unified Data Discovery and Exploration
Kayali, Moe, Lykov, Anton, Fountalis, Ilias, Vasiloglou, Nikolaos, Olteanu, Dan, Suciu, Dan
We apply foundation models to data discovery and exploration tasks. Foundation models are large language models (LLMs) that show promising performance on a range of diverse tasks unrelated to their original training objective. We show that these models are highly applicable to the data discovery and data exploration domain. When carefully used, they exhibit superior capability on three representative tasks: table-class detection, column-type annotation, and join-column prediction. On all three tasks, we show that a foundation-model-based approach outperforms the task-specific models and, hence, the state of the art. Further, our approach often surpasses human-expert task performance. We investigate the fundamental characteristics of this approach, including generalizability across several foundation models, the impact of non-determinism on the outputs, and the role of syntactic versus semantic signals. All in all, this suggests a future direction in which disparate data-management tasks can be unified under foundation models.
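As one illustration of how such a task can be cast as a foundation-model prompt, the sketch below performs column-type annotation by serializing a column's header and sample values and asking a model to choose from a fixed type vocabulary. The llm callable, the prompt wording, and the type list are assumptions for illustration, not the prompts or vocabulary used in CHORUS.

# Illustrative sketch: column-type annotation with a foundation model.
# The `llm` argument is any text-in/text-out callable (e.g., a thin
# wrapper around a hosted LLM API); the prompt and type vocabulary
# below are assumptions, not the ones used in CHORUS.

CANDIDATE_TYPES = ["person name", "country", "date", "price", "isbn"]

def annotate_column(llm, header, values, candidate_types=CANDIDATE_TYPES):
    """Ask the model to pick the semantic type of one table column."""
    sample = ", ".join(str(v) for v in values[:10])  # truncate long columns
    prompt = (
        "You are annotating the semantic type of a table column.\n"
        f"Column header: {header}\n"
        f"Sample values: {sample}\n"
        f"Choose exactly one type from: {', '.join(candidate_types)}.\n"
        "Answer with the type only."
    )
    answer = llm(prompt).strip().lower()
    # Guard against free-form answers: fall back to substring matching.
    for t in candidate_types:
        if t in answer:
            return t
    return None

if __name__ == "__main__":
    # Stub model for demonstration; replace with a real LLM call.
    fake_llm = lambda prompt: "country"
    print(annotate_column(fake_llm, "nation", ["France", "Japan", "Peru"]))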