AITopics | Liu, Jinfei

Collaborating Authors

Liu, Jinfei

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CoKV: Optimizing KV Cache Allocation via Cooperative Game

Sun, Qiheng, Zhang, Hongwei, Xia, Haocheng, Zhang, Jiayao, Liu, Jinfei, Ren, Kui

arXiv.org Artificial IntelligenceFeb-21-2025

Large language models (LLMs) have achieved remarkable success on various aspects of human life. However, one of the major challenges in deploying these models is the substantial memory consumption required to store key-value pairs (KV), which imposes significant resource demands. Recent research has focused on KV cache budget allocation, with several approaches proposing head-level budget distribution by evaluating the importance of individual attention heads. These methods, however, assess the importance of heads independently, overlooking their cooperative contributions within the model, which may result in a deviation from their true impact on model performance. In light of this limitation, we propose CoKV, a novel method that models the cooperation between heads in model inference as a cooperative game. By evaluating the contribution of each head within the cooperative game, CoKV can allocate the cache budget more effectively. Extensive experiments show that CoKV achieves state-of-the-art performance on the LongBench benchmark using LLama-3-8B-Instruct and Mistral-7B models.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.17501

Country:

Asia (0.67)
North America > United States > Illinois (0.14)
Europe > Austria > Vienna (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

A Survey on Data Markets

Zhang, Jiayao, Bi, Yuran, Cheng, Mengye, Liu, Jinfei, Ren, Kui, Sun, Qiheng, Wu, Yihang, Cao, Yang, Fernandez, Raul Castro, Xu, Haifeng, Jia, Ruoxi, Kwon, Yongchan, Pei, Jian, Wang, Jiachen T., Xia, Haocheng, Xiong, Li, Yu, Xiaohui, Zou, James

arXiv.org Artificial IntelligenceNov-9-2024

Data is the new oil of the 21st century. The growing trend of trading data for greater welfare has led to the emergence of data markets. A data market is any mechanism whereby the exchange of data products including datasets and data derivatives takes place as a result of data buyers and data sellers being in contact with one another, either directly or through mediating agents. It serves as a coordinating mechanism by which several functions, including the pricing and the distribution of data as the most important ones, interact to make the value of data fully exploited and enhanced. In this article, we present a comprehensive survey of this important and emerging direction from the aspects of data search, data productization, data transaction, data pricing, revenue allocation as well as privacy, security, and trust issues. We also investigate the government policies and industry status of data markets across different countries and different domains. Finally, we identify the unresolved challenges and discuss possible future directions for the development of data markets.

kamalika chaudhuri and ruslan salakhutdinov, knowledge management, machine learning, (31 more...)

arXiv.org Artificial Intelligence

2411.07267

Country:

Asia > China (1.00)
Africa (1.00)
Europe > France (0.67)
(6 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.45)
Research Report > New Finding (0.45)

Industry:

Telecommunications (1.00)
Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety (1.00)
(13 more...)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Software Engineering (1.00)
Information Technology > Security & Privacy (1.00)
(20 more...)

Add feedback

Cross-silo Federated Learning with Record-level Personalized Differential Privacy

Liu, Junxu, Lou, Jian, Xiong, Li, Liu, Jinfei, Meng, Xiaofeng

arXiv.org Artificial IntelligenceJan-29-2024

Federated learning enhanced by differential privacy has emerged as a popular approach to better safeguard the privacy of client-side data by protecting clients' contributions during the training process. Existing solutions typically assume a uniform privacy budget for all records and provide one-size-fits-all solutions that may not be adequate to meet each record's privacy requirement. In this paper, we explore the uncharted territory of cross-silo FL with record-level personalized differential privacy. We devise a novel framework named rPDP-FL, employing a two-stage hybrid sampling scheme with both client-level sampling and non-uniform record-level sampling to accommodate varying privacy requirements. A critical and non-trivial problem is to select the ideal per-record sampling probability q given the personalized privacy budget {\epsilon}. We introduce a versatile solution named Simulation-CurveFitting, allowing us to uncover a significant insight into the nonlinear correlation between q and {\epsilon} and derive an elegant mathematical model to tackle the problem. Our evaluation demonstrates that our solution can provide significant performance gains over the baselines that do not consider personalized privacy preservation.

artificial intelligence, machine learning, probability, (15 more...)

arXiv.org Artificial Intelligence

2401.16251

Country: North America > United States (0.71)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Prompt Valuation Based on Shapley Values

Liu, Hanxi, Mao, Xiaokai, Xia, Haocheng, Lou, Jian, Liu, Jinfei

arXiv.org Artificial IntelligenceDec-23-2023

Large language models (LLMs) excel on new tasks without additional training, simply by providing natural language prompts that demonstrate how the task should be performed. Prompt ensemble methods comprehensively harness the knowledge of LLMs while mitigating individual biases and errors and further enhancing performance. However, more prompts do not necessarily lead to better results, and not all prompts are beneficial. A small number of high-quality prompts often outperform many low-quality prompts. Currently, there is a lack of a suitable method for evaluating the impact of prompts on the results. In this paper, we utilize the Shapley value to fairly quantify the contributions of prompts, helping to identify beneficial or detrimental prompts, and potentially guiding prompt valuation in data markets. Through extensive experiments employing various ensemble methods and utility functions on diverse tasks, we validate the effectiveness of using the Shapley value method for prompts as it effectively distinguishes and quantifies the contributions of each prompt.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2312.15395

Country:

Europe (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Shapley Value on Probabilistic Classifiers

Li, Xiang, Xia, Haocheng, Liu, Jinfei

arXiv.org Artificial IntelligenceJun-12-2023

Data valuation has become an increasingly significant discipline in data science due to the economic value of data. In the context of machine learning (ML), data valuation methods aim to equitably measure the contribution of each data point to the utility of an ML model. One prevalent method is Shapley value, which helps identify data points that are beneficial or detrimental to an ML model. However, traditional Shapley-based data valuation methods may not effectively distinguish between beneficial and detrimental training data points for probabilistic classifiers. In this paper, we propose Probabilistic Shapley (P-Shapley) value by constructing a probability-wise utility function that leverages the predicted class probabilities of probabilistic classifiers rather than binarized prediction results in the traditional Shapley value. We also offer several activation functions for confidence calibration to effectively quantify the marginal contribution of each data point to the probabilistic classifiers. Extensive experiments on four real-world datasets demonstrate the effectiveness of our proposed P-Shapley value in evaluating the importance of data for building a high-usability and trustworthy ML model.

artificial intelligence, game theory, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2306.07171

Country:

North America > United States (0.15)
North America > Canada (0.14)
Asia > Japan (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Quantifying and Defending against Privacy Threats on Federated Knowledge Graph Embedding

Hu, Yuke, Liang, Wei, Wu, Ruofan, Xiao, Kai, Wang, Weiqiang, Li, Xiaochen, Liu, Jinfei, Qin, Zhan

arXiv.org Artificial IntelligenceApr-6-2023

Knowledge Graph Embedding (KGE) is a fundamental technique that extracts expressive representation from knowledge graph (KG) to facilitate diverse downstream tasks. The emerging federated KGE (FKGE) collaboratively trains from distributed KGs held among clients while avoiding exchanging clients' sensitive raw KGs, which can still suffer from privacy threats as evidenced in other federated model trainings (e.g., neural networks). However, quantifying and defending against such privacy threats remain unexplored for FKGE which possesses unique properties not shared by previously studied models. In this paper, we conduct the first holistic study of the privacy threat on FKGE from both attack and defense perspectives. For the attack, we quantify the privacy threat by proposing three new inference attacks, which reveal substantial privacy risk by successfully inferring the existence of the KG triple from victim clients. For the defense, we propose DP-Flames, a novel differentially private FKGE with private selection, which offers a better privacy-utility tradeoff by exploiting the entity-binding sparse gradient property of FKGE and comes with a tight privacy accountant by incorporating the state-of-the-art private selection technique. We further propose an adaptive privacy budget allocation policy to dynamically adjust defense magnitude across the training procedure. Comprehensive evaluations demonstrate that the proposed defense can successfully mitigate the privacy threat by effectively reducing the success rate of inference attacks from $83.1\%$ to $59.4\%$ on average with only a modest utility decrease.

artificial intelligence, machine learning, relation, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3543507.3583450

2304.02932

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback