Data-centric Artificial Intelligence: A Survey
Zha, Daochen, Bhat, Zaid Pervaiz, Lai, Kwei-Herng, Yang, Fan, Jiang, Zhimeng, Zhong, Shaochen, Hu, Xia
Artificial Intelligence (AI) is making a profound impact in almost every domain. A vital enabler of its great success is the availability of abundant and high-quality data for building machine learning models. Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI. The attention of researchers and practitioners has gradually shifted from advancing model design to enhancing the quality and quantity of the data. In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals (training data development, inference data development, and data maintenance) and the representative methods. We also organize the existing literature from automation and collaboration perspectives, discuss the challenges, and tabulate the benchmarks for various tasks. We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle. We hope it can help the readers efficiently grasp a broad picture of this field, and equip them with the techniques and further research ideas to systematically engineer data for building AI systems. A companion list of data-centric AI resources will be regularly updated on https://github.com/daochenzha/data-centric-AI
What Are the Data-Centric AI Concepts behind GPT Models?
Artificial Intelligence (AI) has made incredible strides in transforming the way we live, work, and interact with technology. Recently, one area that has seen significant progress is the development of Large Language Models (LLMs), such as GPT-3, ChatGPT, and GPT-4. These models can perform tasks such as language translation, text summarization, and question answering with impressive accuracy. While it is hard to ignore the ever-increasing size of LLMs, it is equally important to recognize that their success owes much to the large volume of high-quality data used to train them. In this article, we present an overview of recent advancements in LLMs from a data-centric AI perspective, drawing on insights from our recent survey papers [1,2] with corresponding technical resources on GitHub.
Data-centric AI: Perspectives and Challenges
Zha, Daochen, Bhat, Zaid Pervaiz, Lai, Kwei-Herng, Yang, Fan, Hu, Xia
The role of data in building AI systems has recently been significantly magnified by the emerging concept of data-centric AI (DCAI), which advocates a fundamental shift from model advancements to ensuring data quality and reliability. Although our community has continuously invested efforts into enhancing data in different aspects, they are often isolated initiatives on specific tasks. To facilitate the collective initiative in our community and push forward DCAI, we draw a big picture and bring together three general missions: training data development, inference data development, and data maintenance. We provide a top-level discussion on representative DCAI tasks and share perspectives. Finally, we list open challenges. More resources are summarized at https://github.com/daochenzha/data-centric-AI