DataPerf: Benchmarks for Data-Centric AI Development
Mazumder, Mark
Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks without regard to the breadth, difficulty, and faithfulness of the underlying problems. Neglecting the fundamental importance of data has given rise to inaccuracy, bias, and fragility in real-world applications, and research is hindered by saturation across existing dataset benchmarks.
Designing Reputation Systems for Manufacturing Data Trading Markets: A Multi-Agent Evaluation with Q-Learning and IRL-Estimated Utilities
Yamamoto, Kenta, Hayashi, Teruaki
Recent advances in machine learning and big data analytics have intensified the demand for high-quality cross-domain datasets and accelerated the growth of data trading across organizations. As data become increasingly recognized as an economic asset, data marketplaces have emerged as a key infrastructure for data-driven innovation. However, unlike mature product or service markets, data-trading environments remain nascent and suffer from pronounced information asymmetry. Buyers cannot verify the content or quality of data before purchasing, making trust and quality assurance central challenges. To address these issues, this study develops a multi-agent data-market simulator that models participant behavior and evaluates institutional mechanisms for trust formation. Focusing on the manufacturing sector, where initiatives such as GAIA-X and Catena-X are advancing, the simulator integrates reinforcement learning (RL) for adaptive agent behavior and inverse reinforcement learning (IRL) to estimate utility functions from empirical behavioral data. Using the simulator, we examine the market-level effects of five representative reputation systems (Time-decay, Bayesian-beta, PageRank, PowerTrust, and PeerTrust) and find that PeerTrust achieves the strongest alignment between data price and quality while preventing monopolistic dominance. Building on these results, we develop a hybrid reputation mechanism that integrates the strengths of existing systems to achieve improved price-quality consistency and overall market stability. This study extends simulation-based data-market analysis by incorporating trust and reputation as endogenous mechanisms, offering methodological and institutional insights into the design of reliable and efficient data ecosystems.
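The reputation schemes compared in this study aggregate feedback in different ways. As a rough sketch (assuming ratings in [0, 1] and binary feedback counts; not the paper's implementation), the two simplest schemes, Time-decay and Bayesian-beta, can be written as:

```python
import math

def time_decay_reputation(ratings, half_life=10.0):
    """Time-decay scheme: exponentially down-weight older ratings.
    `ratings` is ordered oldest-first; each rating lies in [0, 1]."""
    if not ratings:
        return 0.5  # neutral score when there is no history
    n = len(ratings)
    weights = [math.exp(-math.log(2) * (n - 1 - i) / half_life)
               for i in range(n)]
    return sum(w * r for w, r in zip(weights, ratings)) / sum(weights)

def bayesian_beta_reputation(positives, negatives, alpha=1.0, beta=1.0):
    """Bayesian-beta scheme: posterior mean of a Beta(alpha, beta)
    prior updated with counts of positive/negative transactions."""
    return (positives + alpha) / (positives + negatives + alpha + beta)
```

With a uniform Beta(1, 1) prior, a seller with 9 positive and 1 negative transactions scores (9+1)/(10+2) ≈ 0.83, while a newcomer starts at the neutral 0.5.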
LLM-based Multi-Agent System for Simulating Strategic and Goal-Oriented Data Marketplaces
Sashihara, Jun, Fujita, Yukihisa, Nakamura, Kota, Kuwahara, Masahiro, Hayashi, Teruaki
Data marketplaces, which mediate the purchase and exchange of data from third parties, have attracted growing attention for reducing the cost and effort of data collection while enabling the trading of diverse datasets. However, a systematic understanding of the interactions between market participants, data, and regulations remains limited. To address this gap, we propose a Large Language Model-based Multi-Agent System (LLM-MAS) for data marketplaces. In our framework, buyer and seller agents powered by LLMs operate with explicit objectives and autonomously perform strategic actions such as planning, searching, purchasing, pricing, and updating data. These agents can reason about market dynamics, forecast future demand, and adapt their strategies accordingly. Unlike conventional model-based simulations, which are typically constrained to predefined rules, LLM-MAS supports broader and more adaptive behavior selection through natural language reasoning. We evaluated the framework via simulation experiments using three distribution-based metrics: (1) the number of purchases per dataset, (2) the number of purchases per buyer, and (3) the number of repeated purchases of the same dataset. The results demonstrate that LLM-MAS reproduces trading patterns observed in real data marketplaces more faithfully than traditional approaches, and further captures the emergence and evolution of market trends.

Data have emerged as a tradable economic resource, and data marketplaces that mediate the purchase and exchange of datasets from third parties have rapidly expanded [1]. These marketplaces streamline data collection that previously required substantial cost and effort, while also providing organizations and researchers with access to diverse, high-quality datasets. As a result, they are increasingly recognized as critical infrastructures that accelerate innovation based on data that were once closed within individual organizations [2].
Despite this progress, our understanding of how interactions among market participants, data, and regulations shape market dynamics remains limited. Smooth and efficient data transactions require well-designed and robust data marketplaces [3].
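The three distribution-based metrics used in the evaluation can be computed directly from a transaction log. A minimal sketch, assuming the log is a list of (buyer_id, dataset_id) records (this field layout is an assumption, not the paper's data format):

```python
from collections import Counter

def marketplace_metrics(purchases):
    """purchases: list of (buyer_id, dataset_id) transaction records."""
    per_dataset = Counter(d for _, d in purchases)   # metric (1)
    per_buyer = Counter(b for b, _ in purchases)     # metric (2)
    pair_counts = Counter(purchases)
    # metric (3): purchases beyond the first of the same dataset by the same buyer
    repeated = sum(c - 1 for c in pair_counts.values() if c > 1)
    return dict(per_dataset), dict(per_buyer), repeated
```

Comparing the simulated and real-market histograms of these three quantities is then a distribution-matching exercise.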
A Cramér-von Mises Approach to Incentivizing Truthful Data Sharing
Clinton, Alex, Zeng, Thomas, Chen, Yiding, Zhu, Xiaojin, Kandasamy, Kirthevasan
Modern data marketplaces and data sharing consortia increasingly rely on incentive mechanisms to encourage agents to contribute data. However, schemes that reward agents based on the quantity of submitted data are vulnerable to manipulation, as agents may submit fabricated or low-quality data to inflate their rewards. Prior work has proposed comparing each agent's data against others' to promote honesty: when others contribute genuine data, the best way to minimize discrepancy is to do the same. Yet prior implementations of this idea rely on very strong assumptions about the data distribution (e.g. Gaussian), limiting their applicability. In this work, we develop reward mechanisms based on a novel, two-sample test inspired by the Cramér-von Mises statistic. Our methods strictly incentivize agents to submit more genuine data, while disincentivizing data fabrication and other types of untruthful reporting. We establish that truthful reporting constitutes a (possibly approximate) Nash equilibrium in both Bayesian and prior-agnostic settings. We theoretically instantiate our method in three canonical data sharing problems and show that it relaxes key assumptions made by prior work. Empirically, we demonstrate that our mechanism incentivizes truthful data sharing via simulations and on real-world language and image data.
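The classical two-sample Cramér-von Mises statistic that inspires the mechanism sums the squared difference between the two empirical CDFs over the pooled sample. A minimal sketch of the statistic itself (the paper's reward mechanism builds on it but is not reproduced here):

```python
import bisect

def ecdf(sorted_sample, x):
    """Empirical CDF of a pre-sorted sample, evaluated at x."""
    return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

def cramer_von_mises(xs, ys):
    """Two-sample Cramér-von Mises statistic:
    T = nm/(n+m)^2 * sum over pooled points z of (F_n(z) - G_m(z))^2."""
    xs_s, ys_s = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    total = sum((ecdf(xs_s, z) - ecdf(ys_s, z)) ** 2 for z in xs_s + ys_s)
    return n * m / (n + m) ** 2 * total
```

Identical samples score 0 and the statistic grows as the two distributions separate, which is what makes fabricated or off-distribution data costly when rewards decrease in the measured discrepancy.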
Learn then Decide: A Learning Approach for Designing Data Marketplaces
Gao, Yingqi, Zhou, Jin, Zhou, Hua, Chen, Yong, Dai, Xiaowu
As data marketplaces become increasingly central to the digital economy, it is crucial to design efficient pricing mechanisms that optimize revenue while ensuring fair and adaptive pricing. We introduce the Maximum Auction-to-Posted Price (MAPP) mechanism, a novel two-stage approach that first estimates the bidders' value distribution through auctions and then determines the optimal posted price based on the learned distribution. We establish that MAPP is individually rational and incentive-compatible, ensuring truthful bidding while balancing revenue maximization with minimal price discrimination. MAPP achieves a regret of $O_p(n^{-1})$ when incorporating historical bid data, where $n$ is the number of bids in the current round. It outperforms existing methods while imposing weaker distributional assumptions. For sequential dataset sales over $T$ rounds, we propose an online MAPP mechanism that dynamically adjusts pricing across datasets with varying value distributions. Our approach achieves no-regret learning, with the average cumulative regret converging at a rate of $O_p(T^{-1/2}(\log T)^2)$. We validate the effectiveness of MAPP through simulations and real-world data from the FCC AWS-3 spectrum auction.
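The second stage of an auction-to-posted-price mechanism can be illustrated by treating observed bids as an empirical value distribution and searching for the revenue-maximizing posted price. This is a generic sketch of the idea, not the paper's MAPP estimator:

```python
def optimal_posted_price(bids):
    """Pick the posted price p maximizing empirical revenue
    p * Pr(value >= p), where the probability is estimated from bids
    collected in the auction stage. Candidate prices are the bids
    themselves (the empirical revenue curve only changes there)."""
    desc = sorted(bids, reverse=True)
    n = len(desc)
    best_price, best_revenue = desc[0], desc[0] / n
    for k, p in enumerate(desc):
        revenue = p * (k + 1) / n  # (k+1)/n bidders would accept price p
        if revenue > best_revenue:
            best_price, best_revenue = p, revenue
    return best_price, best_revenue
```

Because one uniform price is posted to everyone, price discrimination is minimized; the statistical work lies in how well the auction-stage bids estimate the true value distribution.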
Private, Augmentation-Robust and Task-Agnostic Data Valuation Approach for Data Marketplace
Jahani-Nezhad, Tayyebeh, Moradi, Parsa, Maddah-Ali, Mohammad Ali, Caire, Giuseppe
Evaluating datasets in data marketplaces, where buyers aim to purchase valuable data, is a critical challenge. In this paper, we introduce PriArTa, a task-agnostic data valuation method that computes the distance between the distribution of the buyer's existing dataset and the seller's dataset, allowing the buyer to determine how effectively the new data can enhance its dataset. PriArTa is communication-efficient, enabling the buyer to evaluate datasets without needing access to each seller's entire dataset. Instead, the buyer requests that sellers perform specific preprocessing on their data and send back the results. Using this information and a scoring metric, the buyer can evaluate the dataset. The preprocessing is designed to allow the buyer to compute the score while preserving the privacy of each seller's dataset, mitigating the risk of information leakage before the purchase. A key feature of PriArTa is its robustness to common data transformations, ensuring consistent value assessment and reducing the risk of purchasing redundant data. The effectiveness of PriArTa is demonstrated through experiments on real-world image datasets, showing its ability to perform privacy-preserving, augmentation-robust data valuation in data marketplaces.
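The protocol shape (buyer-specified preprocessing, seller-side summaries, buyer-side scoring) can be sketched generically. The random projection and per-coordinate means below are illustrative assumptions, not PriArTa's actual construction:

```python
import random

def random_projection(dim_in, dim_out, seed=0):
    """Buyer-specified preprocessing: a shared random projection matrix."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1 / dim_out ** 0.5) for _ in range(dim_in)]
            for _ in range(dim_out)]

def project_and_summarize(data, proj):
    """Seller-side step (sketch): project each record, then return only
    per-coordinate means, so raw records never leave the seller."""
    projected = [[sum(w * x for w, x in zip(row, rec)) for row in proj]
                 for rec in data]
    k = len(proj)
    return [sum(p[i] for p in projected) / len(projected) for i in range(k)]

def score(buyer_summary, seller_summary):
    """Buyer-side distance between summaries (smaller = more similar)."""
    return sum((a - b) ** 2
               for a, b in zip(buyer_summary, seller_summary)) ** 0.5
```

Only low-dimensional summaries cross the trust boundary, which is the property the actual scheme formalizes with privacy and augmentation-robustness guarantees.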
Disentangled Structural and Featural Representation for Task-Agnostic Graph Valuation
Falahati, Ali, Amiri, Mohammad Mohammadi
With the emergence of data marketplaces, the demand for methods to assess the value of data has increased significantly. While numerous techniques have been proposed for this purpose, none have specifically addressed graphs as the main data modality. Graphs are widely used across various fields, ranging from chemical molecules to social networks. In this study, we break down graphs into two main components: structural and featural, and we focus on evaluating data without relying on specific task-related metrics, making it applicable in practical scenarios where validation requirements may be lacking. We introduce a novel framework called blind message passing, which aligns the seller's and buyer's graphs using a shared node permutation based on graph matching. This allows us to utilize the graph Wasserstein distance to quantify the differences in the structural distribution of graph datasets, called the structural disparities. We then consider featural aspects of buyers' and sellers' graphs for data valuation and capture their statistical similarities and differences, referred to as relevance and diversity, respectively. Our approach ensures that buyers and sellers remain unaware of each other's datasets. Our experiments on real datasets demonstrate the effectiveness of our approach in capturing the relevance, diversity, and structural disparities of seller data for buyers, particularly in graph-based data valuation scenarios.
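As a rough illustration of the featural side only, relevance can be proxied by the similarity of aggregate feature statistics. The cosine-of-mean-features measure below is a hypothetical stand-in for the paper's definition, and the structural (graph Wasserstein) component is omitted:

```python
import math

def mean_feature(vectors):
    """Coordinate-wise mean of a list of node-feature vectors."""
    d = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(d)]

def relevance(buyer_features, seller_features):
    """Cosine similarity between the mean feature vectors of the
    buyer's and seller's node sets (1.0 = identical direction)."""
    mb = mean_feature(buyer_features)
    ms = mean_feature(seller_features)
    dot = sum(a * b for a, b in zip(mb, ms))
    norm_b = math.sqrt(sum(a * a for a in mb))
    norm_s = math.sqrt(sum(b * b for b in ms))
    return dot / (norm_b * norm_s)
```

A real graph-valuation pipeline would compute such featural statistics only after aligning node orderings (the blind-message-passing step), so that neither side reveals its raw graph.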
Data Acquisition: A New Frontier in Data-centric AI
Chen, Lingjiao, Acun, Bilge, Ardalani, Newsha, Sun, Yifan, Kang, Feiyang, Lyu, Hanrui, Kwon, Yongchan, Jia, Ruoxi, Wu, Carole-Jean, Zaharia, Matei, Zou, James
Datasets, the cornerstone of modern machine learning (ML) systems, are increasingly sold and purchased for different ML pipelines [2]. Several data marketplaces have emerged to serve different stages of building ML-enhanced data applications. For example, NASDAQ Data Link [3] offers financial datasets cleaned and structured for model training, Amazon AWS Data Exchange [4] focuses on generic tabular datasets, and Databricks Marketplace [5] integrates raw datasets and ML pipelines to deliver insights. The data-as-a-service market was worth more than $30 billion and is expected to double in the next five years [6]. Yet even as data marketplaces expand, data acquisition for ML remains challenging, partly due to its ad-hoc nature: based on discussions with real-world users, data acquirers often need to first negotiate varying contracts with different data providers, then purchase multiple datasets in different formats, and finally filter out unnecessary data from the purchased datasets.
A Survey of Data Pricing for Data Marketplaces
Zhang, Mengxiao, Beltran, Fernando, Liu, Jiamou
A data marketplace is an online venue that brings data owners, data brokers, and data consumers together and facilitates commoditisation of data amongst them. Data pricing, as a key function of a data marketplace, demands quantifying the monetary value of data. A considerable number of studies on data pricing can be found in the literature. This paper comprehensively reviews the state of the art in data pricing to provide a general understanding of this emerging research area. Our key contribution lies in a new taxonomy of data pricing studies that unifies different attributes determining data prices. The basis of our framework categorises these studies by the kind of market structure, be it sell-side, buy-side, or two-sided. Then in a sell-side market, the studies are further divided by query type, which defines the way a data consumer accesses data, while in a buy-side market, the studies are divided according to privacy notion, which defines the way to quantify privacy of data owners. In a two-sided market, both privacy notion and query type are used as criteria. We systematically examine the studies falling into each category in our taxonomy. Lastly, we discuss gaps within the existing research and define future research directions.
AI Trends: How Will AI Impact You?
As we approach the end of the first quarter, what does the future hold for AI? We already know that artificial intelligence (AI) has an impact on every industry around the globe. These are the areas where AI will matter most in our lives in 2022. AI is a data-hungry beast, and it has created new avenues for data collection that have increased the value of data as an asset to businesses and governments. There are also initiatives to educate the general public about how data can be used.