Li, Beibei
Fairshare Data Pricing for Large Language Models
Zhang, Luyang, Jiao, Cathy, Li, Beibei, Xiong, Chenyan
Training data is a pivotal resource for building large language models (LLMs), but unfair pricing in data markets poses a serious challenge for both data buyers (e.g., LLM builders) and sellers (e.g., human annotators): it discourages market participation, reducing data quantity and quality. In this paper, we propose a fairshare pricing framework that sets training data prices using data valuation methods to quantify their contribution to LLMs. In our framework, buyers make purchasing decisions using data valuation, and sellers set prices to maximize their profits based on the anticipated buyer purchases. We theoretically show that the pricing derived from our framework is tightly linked to data valuation and buyers' budgets, and is optimal for both buyers and sellers. Through market simulations using current LLMs and datasets (math problems, medical diagnosis, and physical reasoning), we show that our framework is fairshare for buyers, ensuring that purchased data reflects its value for model training and yields higher LLM task performance per dollar spent on data, and fairshare for sellers, ensuring they sell their data at optimal prices. Our framework lays the foundation for future research on equitable and sustainable data markets for large-scale AI.
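As a rough illustration of the buyer-seller interaction described in this abstract, the sketch below simulates a greedy buyer who purchases data by value-per-dollar under a budget while sellers scale prices with valuation. The valuation scores, budget, and pricing rule are made-up placeholders, not the paper's actual framework.

```python
import numpy as np

rng = np.random.default_rng(0)
valuations = rng.uniform(0.1, 1.0, size=20)  # hypothetical per-item data valuation scores
budget = 5.0                                 # hypothetical buyer budget

def buyer_purchase(prices, valuations, budget):
    """Greedy buyer: buy items in decreasing value-per-dollar order until the budget runs out."""
    order = np.argsort(-valuations / prices)
    bought, spent = [], 0.0
    for i in order:
        if spent + prices[i] <= budget:
            bought.append(i)
            spent += prices[i]
    return bought

# Sellers anticipate the buyer's rule and scale prices with valuation;
# the scale factor stands in for a profit-maximizing response.
for scale in (0.5, 1.0, 2.0):
    prices = scale * valuations
    bought = buyer_purchase(prices, valuations, budget)
    print(f"scale={scale:.1f}  items bought={len(bought)}  "
          f"buyer value={valuations[bought].sum():.2f}  seller revenue={prices[bought].sum():.2f}")
```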
Grimm: A Plug-and-Play Perturbation Rectifier for Graph Neural Networks Defending against Poisoning Attacks
Liu, Ao, Li, Wenshan, Li, Beibei, Ma, Wengang, Li, Tao, Zhou, Pan
Recent studies have revealed the vulnerability of graph neural networks (GNNs) to adversarial poisoning attacks on node classification tasks. Current defensive methods require replacing the original GNN with a defense model, regardless of the original's type. This approach, while targeting adversarial robustness, compromises the enhancements developed in prior research to boost GNNs' practical performance. Here we introduce Grimm, the first plug-and-play defense model. Requiring only a minimal interface for extracting features from any layer of the protected GNN, Grimm can seamlessly rectify perturbations. Specifically, we utilize the feature trajectories (FTs) generated by GNNs as they evolve through epochs to reflect the training status of the networks. We then theoretically prove that the FTs of victim nodes will inevitably exhibit discriminable anomalies. Consequently, inspired by the natural parallelism between the biological nervous and immune systems, we construct Grimm as a comprehensive artificial immune system for GNNs. Grimm not only detects abnormal FTs and rectifies adversarial edges during training but also operates efficiently in parallel, mirroring the concurrent functionalities of its biological counterparts. We experimentally confirm that Grimm offers four advantages: 1) Harmlessness, as it does not actively interfere with GNN training; 2) Parallelism, as its monitoring, detection, and rectification functions operate independently of the GNN training process; 3) Generalizability, as it is compatible with mainstream GNNs such as GCN, GAT, and GraphSAGE; and 4) Transferability, as the detectors for abnormal FTs can be efficiently transferred across different systems for one-step rectification.
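To make the feature-trajectory idea concrete, the sketch below collects a per-node trajectory of hidden features across epochs and flags trajectories that are statistically anomalous. The data are synthetic and the z-score rule is an illustrative stand-in, not Grimm's actual immune-system detector.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, num_epochs, dim = 100, 30, 16

# Simulated feature trajectories: hidden features of every node at every epoch (synthetic stand-in).
fts = rng.normal(0.0, 0.1, size=(num_nodes, num_epochs, dim)).cumsum(axis=1)
fts[:5] += rng.normal(0.0, 1.0, size=(5, num_epochs, dim))  # five "victim" nodes drift abnormally

# Summarize each trajectory by its total path length across epochs.
path_len = np.linalg.norm(np.diff(fts, axis=1), axis=2).sum(axis=1)

# Simple z-score rule: flag trajectories whose length is far from the population mean.
z = (path_len - path_len.mean()) / path_len.std()
print("suspected victim nodes:", np.flatnonzero(np.abs(z) > 3.0))
```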
Magnetic Field Data Calibration with Transformer Model Using Physical Constraints: A Scalable Method for Satellite Missions, Illustrated by Tianwen-1
Li, Beibei, Chi, Yutian, Wang, Yuming
Magnetometer data often suffer from disturbances caused by satellite dynamics, onboard instrument interference, and environmental noise. For instance, changes in satellite orientation can produce anomalies in magnetic field measurements due to interference from electric currents within the satellite's instruments. These disturbances necessitate careful data correction to ensure the accuracy and reliability of measurements. Traditional correction methods rely heavily on human expertise and are rooted in well-established physical and mathematical principles. While these methods have proven effective, they are inherently limited by long processing times and delays in real-time prediction [1, 2, 4, 6, 7]. In contrast, machine learning models, though rarely applied in this field, offer strong predictive capabilities and the potential for faster computation. This study addresses these limitations by combining the strengths of traditional correction methods with the adaptability and efficiency of machine learning models, ensuring physical consistency while improving timeliness and real-time performance. It bridges the gap between data-driven models and physics-based understanding by integrating Maxwell's equations into the neural network architecture as physical information.
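To illustrate how a Maxwell-equation constraint can enter a neural network's training objective, the sketch below adds a divergence-free penalty (div B = 0) to a data-fit loss via automatic differentiation. The small MLP, input layout, and penalty weight are illustrative assumptions, not the paper's Transformer architecture or exact loss.

```python
import torch

# A tiny MLP stands in for the paper's Transformer; input is (x, y, z, t), output is (Bx, By, Bz).
model = torch.nn.Sequential(torch.nn.Linear(4, 64), torch.nn.Tanh(), torch.nn.Linear(64, 3))

def physics_informed_loss(xyzt, b_measured, lambda_div=0.1):
    xyzt = xyzt.clone().requires_grad_(True)
    b_pred = model(xyzt)
    data_loss = torch.mean((b_pred - b_measured) ** 2)

    # Divergence of the predicted field via autograd: dBx/dx + dBy/dy + dBz/dz,
    # penalized so the prediction respects Maxwell's constraint div B = 0.
    div = 0.0
    for i in range(3):
        grads = torch.autograd.grad(b_pred[:, i].sum(), xyzt, create_graph=True)[0]
        div = div + grads[:, i]
    return data_loss + lambda_div * torch.mean(div ** 2)

xyzt = torch.randn(32, 4)          # placeholder positions/times
b_measured = torch.randn(32, 3)    # placeholder magnetometer readings
print(physics_informed_loss(xyzt, b_measured))
```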
An Unbiased Risk Estimator for Partial Label Learning with Augmented Classes
Hu, Jiayu, Shu, Senlin, Li, Beibei, Xiang, Tao, He, Zhongshi
Partial Label Learning (PLL) is a typical weakly supervised learning task, which assumes each training instance is annotated with a set of candidate labels containing the ground-truth label. Recent PLL methods adopt identification-based disambiguation to alleviate the influence of false positive labels and achieve promising performance. However, they require all classes in the test set to have appeared in the training set, ignoring the fact that new classes keep emerging in real applications. To address this issue, in this paper we focus on the problem of Partial Label Learning with Augmented Classes (PLLAC), where one or more augmented classes are not visible in the training stage but appear in the inference stage. Specifically, we propose an unbiased risk estimator with theoretical guarantees for PLLAC, which estimates the distribution of augmented classes by differentiating the distribution of known classes from unlabeled data and can be equipped with arbitrary PLL loss functions. In addition, we provide a theoretical analysis of the estimation error bound of the estimator, which guarantees the convergence of the empirical risk minimizer to the true risk minimizer as the number of training examples tends to infinity. Furthermore, we add a risk-penalty regularization term to the optimization objective to alleviate the over-fitting issue caused by negative empirical risk. Extensive experiments on benchmark, UCI, and real-world datasets demonstrate the effectiveness of the proposed approach.
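The sketch below illustrates only the negative-risk issue mentioned in the abstract: when an empirical risk term goes negative on finite samples, a penalty keeps the optimizer from driving it further negative. The wrapper is a generic placeholder, not the paper's PLLAC risk decomposition or its exact regularizer.

```python
import torch

def corrected_risk(per_class_risks, penalty_weight=1.0):
    """Keep non-negative risk terms as-is and penalize the magnitude of negative ones
    instead of letting the optimizer minimize them without bound."""
    positive_part = torch.clamp(per_class_risks, min=0.0).sum()
    negative_part = torch.clamp(-per_class_risks, min=0.0).sum()
    return positive_part + penalty_weight * negative_part

risks = torch.tensor([0.35, 0.12, -0.08])  # placeholder per-class empirical risks; the last went negative
print(corrected_risk(risks))               # 0.35 + 0.12 + 1.0 * 0.08
```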
Tuning Vision-Language Models with Candidate Labels by Prompt Alignment
Zhang, Zhifang, Li, Beibei
Vision-language models (VLMs) can learn high-quality representations from a large-scale training dataset of image-text pairs. Prompt learning is a popular approach to fine-tuning VLMs to adapt them to downstream tasks. Despite its satisfactory performance, a major limitation of prompt learning is its demand for labelled data. In real-world scenarios, we may only obtain candidate labels (in which the true label is included) instead of true labels, due to data privacy or sensitivity issues. In this paper, we provide the first study on prompt learning with candidate labels for VLMs. We empirically demonstrate that prompt learning is more advantageous than other fine-tuning methods for handling candidate labels. Nonetheless, its performance drops as label ambiguity increases. To improve its robustness, we propose a simple yet effective framework that better leverages the prior knowledge of VLMs to guide the learning process with candidate labels. Specifically, our framework disambiguates candidate labels by aligning the model output with the mixed class posterior jointly predicted by both the learnable and the handcrafted prompts. Moreover, our framework can be equipped with various off-the-shelf training objectives for learning with candidate labels to further improve their performance. Extensive experiments demonstrate the effectiveness of our proposed framework.
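A minimal sketch of the alignment step, assuming the learnable and handcrafted prompts each produce class logits for an image: a mixed posterior is restricted to the candidate set and renormalized, and a KL term aligns the learnable-prompt output with it. Shapes, the mixing weight, and the loss form are illustrative assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def alignment_loss(learn_logits, hand_logits, candidate_mask, alpha=0.5):
    # Mixed class posterior from the learnable and handcrafted prompts,
    # restricted to each instance's candidate label set and renormalized.
    mixed = alpha * F.softmax(learn_logits, dim=-1) + (1 - alpha) * F.softmax(hand_logits, dim=-1)
    mixed = mixed * candidate_mask
    mixed = mixed / mixed.sum(dim=-1, keepdim=True)
    # Align the learnable-prompt prediction with the (detached) mixed posterior.
    return F.kl_div(F.log_softmax(learn_logits, dim=-1), mixed.detach(), reduction="batchmean")

# Toy batch: two images, four classes, candidate sets {0, 2} and {1, 2, 3}.
learn_logits = torch.randn(2, 4, requires_grad=True)
hand_logits = torch.randn(2, 4)
candidate_mask = torch.tensor([[1., 0., 1., 0.], [0., 1., 1., 1.]])
print(alignment_loss(learn_logits, hand_logits, candidate_mask))
```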
Towards Inductive Robustness: Distilling and Fostering Wave-induced Resonance in Transductive GCNs Against Graph Adversarial Attacks
Liu, Ao, Li, Wenshan, Li, Tao, Li, Beibei, Huang, Hanyuan, Zhou, Pan
Graph neural networks (GNNs) have recently been shown to be vulnerable to adversarial attacks, where slight perturbations in the graph structure can lead to erroneous predictions. However, current robust models for defending against such attacks inherit the transductive limitations of graph convolutional networks (GCNs). As a result, they are constrained by fixed structures and do not naturally generalize to unseen nodes. Here, we discover that transductive GCNs inherently possess a distillable robustness, achieved through a wave-induced resonance process. Building on this finding, we foster this resonance to facilitate inductive and robust learning. Specifically, we first prove that the signal formed by GCN-driven message passing (MP) is equivalent to the edge-based Laplacian wave; within a wave system, resonance can naturally emerge between the signal and its transmitting medium. This resonance provides inherent resistance to malicious perturbations inflicted on the signal system. We then prove that merely three MP iterations within GCNs can induce signal resonance between nodes and edges, manifesting as a coupling between nodes and their distillable surrounding local subgraphs. Consequently, we present the Graph Resonance-fostering Network (GRN), which fosters this resonance by learning node representations from their distilled resonating subgraphs. By capturing the edge-transmitted signals within this subgraph and integrating them with the node signal, GRN embeds these combined signals into the central node's representation. This node-wise embedding approach allows for generalization to unseen nodes. We validate our theoretical findings with experiments and demonstrate that GRN generalizes robustness to unseen nodes, whilst maintaining state-of-the-art classification accuracy on perturbed graphs.
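As a toy illustration of node-wise embedding from a local subgraph, the sketch below mixes a node's own signal with signals carried on its incident edges over three iterations. The toy graph and the aggregation rule are made up for illustration and do not implement GRN's resonance-fostering operator.

```python
import numpy as np

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)   # toy 4-node graph
x = rng.normal(size=(4, 8))                   # toy node signals

def embed_node(v, adj, x, iters=3):
    """Embed node v from its 1-hop subgraph by mixing its signal with incident-edge signals."""
    nbrs = np.flatnonzero(adj[v])
    h = x.copy()
    for _ in range(iters):
        # Edge signal = mean of the two endpoint signals (illustrative choice).
        edge_sig = np.stack([(h[v] + h[u]) / 2.0 for u in nbrs])
        h[v] = 0.5 * h[v] + 0.5 * edge_sig.mean(axis=0)
    return h[v]

print(embed_node(2, adj, x))
```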
Graph Agent Network: Empowering Nodes with Decentralized Communications Capabilities for Adversarial Resilience
Liu, Ao, Li, Wenshan, Li, Tao, Li, Beibei, Huang, Hanyuan, Xu, Guangquan, Zhou, Pan
End-to-end training with global optimization has popularized graph neural networks (GNNs) for node classification, yet it has inadvertently introduced vulnerabilities to adversarial edge-perturbing attacks. Adversaries can exploit the inherently open input and output interfaces of GNNs, perturbing critical edges and thus manipulating the classification results. Current defenses, because they still rely on global-optimization-based end-to-end training schemes, inherently encapsulate the vulnerabilities of GNNs; this is specifically evidenced by their inability to defend against targeted secondary attacks. In this paper, we propose the Graph Agent Network (GAgN) to address these vulnerabilities. GAgN is a graph-structured agent network in which each node is designed as a 1-hop-view agent. Through decentralized interactions, agents learn to infer global perceptions and perform tasks such as inferring embeddings, degrees, and neighbor relationships for given nodes. This empowers nodes to filter adversarial edges while carrying out classification tasks. Furthermore, the agents' limited view prevents malicious messages from propagating globally in GAgN, thereby resisting global-optimization-based secondary attacks. We prove that single-hidden-layer multilayer perceptrons (MLPs) are theoretically sufficient to achieve these functionalities. Experimental results show that GAgN effectively implements all its intended capabilities and, compared to state-of-the-art defenses, achieves the best classification accuracy on perturbed datasets.
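A minimal sketch of a 1-hop-view agent, assuming each agent sees only its own features and a neighbor's: a single-hidden-layer MLP scores each incident edge so the node can down-weight suspicious ones. The architecture and scoring rule are illustrative, not GAgN's trained agents.

```python
import torch

class EdgeAgent(torch.nn.Module):
    """Single-hidden-layer MLP acting as a node's 1-hop-view agent (illustrative architecture)."""
    def __init__(self, dim, hidden=32):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2 * dim, hidden), torch.nn.ReLU(), torch.nn.Linear(hidden, 1))

    def forward(self, own_feat, nbr_feat):
        # Score in (0, 1): how likely the edge to this neighbor is genuine rather than adversarial.
        return torch.sigmoid(self.net(torch.cat([own_feat, nbr_feat], dim=-1)))

agent = EdgeAgent(dim=16)
own = torch.randn(5, 16)    # the node's own features, repeated for its 5 neighbors
nbrs = torch.randn(5, 16)   # 1-hop neighbor features
print(agent(own, nbrs).squeeze(-1))  # per-edge keep scores
```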
Inclusive FinTech Lending via Contrastive Learning and Domain Adaptation
Hu, Xiyang, Huang, Yan, Li, Beibei, Lu, Tian
FinTech lending (e.g., micro-lending) has played a significant role in facilitating financial inclusion. It has reduced processing times and costs, enhanced the user experience, and made it possible for people who may not have qualified for credit from traditional lenders to obtain loans. However, there are concerns about potentially biased algorithmic decision-making during loan screening. Machine learning algorithms used to evaluate credit quality can be affected by representation bias in the training data, as we only have access to the default outcome labels of approved loan applications, whose borrowers have better socioeconomic characteristics than those of rejected ones. In this case, a model trained on the labeled data performs well on the historically approved population but does not generalize well to borrowers of low socioeconomic background. In this paper, we investigate the problem of representation bias in loan screening for a real-world FinTech lending platform. We propose a new Transformer-based sequential loan screening model with self-supervised contrastive learning and domain adaptation to tackle this challenging issue. We use contrastive learning to train our feature extractor on unapproved (unlabeled) loan applications and use domain adaptation to generalize the performance of our label predictor. We demonstrate the effectiveness of our model through extensive experimentation in a real-world micro-lending setting. Our results show that our model significantly promotes the inclusiveness of funding decisions, while also improving loan screening accuracy and profit by 7.10% and 8.95%, respectively. We also show that incorporating the test data into contrastive learning and domain adaptation, and labeling a small ratio of the test data, can further boost model performance.
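To show how the two ingredients could fit together, the sketch below pairs an NT-Xent-style contrastive loss on unlabeled loan representations with a gradient-reversal domain classifier. The encoder, feature sizes, and loss choices are placeholders, not the paper's Transformer model or its exact training objective.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """Contrastive loss between two augmented views of the same unlabeled loan sequences."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    sim = z @ z.t() / tau
    sim = sim - torch.eye(sim.size(0)) * 1e9          # exclude self-similarity
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])  # each view's positive is the other view
    return F.cross_entropy(sim, targets)

class GradReverse(torch.autograd.Function):
    """Gradient reversal for domain adaptation: identity forward, negated gradient backward."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad

encoder = torch.nn.Linear(10, 16)      # placeholder feature extractor
domain_head = torch.nn.Linear(16, 2)   # approved vs. unapproved population

x1, x2 = torch.randn(8, 10), torch.randn(8, 10)   # two views of unlabeled loan features
z1, z2 = encoder(x1), encoder(x2)
contrastive = nt_xent(z1, z2)

domain_logits = domain_head(GradReverse.apply(z1))
adaptation = F.cross_entropy(domain_logits, torch.randint(0, 2, (8,)))
print(contrastive + adaptation)
```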
FedCliP: Federated Learning with Client Pruning
Li, Beibei, Shao, Zerui, Liu, Ao, Wang, Peiran
Prevalent communication-efficient federated learning (FL) frameworks usually take advantage of model gradient compression or model distillation. However, the unbalanced local data distributions (in either quantity or quality) of participating clients, which contribute unequally to global model training, still pose a major challenge to these approaches. In this paper, we propose FedCliP, a novel communication-efficient FL framework that enables faster model training by adaptively learning which clients should remain active for further model training and pruning those that should be inactive due to their lower potential contributions. We also introduce an alternative optimization method with a newly defined contribution score measure to facilitate the determination of active and inactive clients. We empirically evaluate the communication efficiency of FL frameworks with extensive experiments on three benchmark datasets under both IID and non-IID settings. Numerical results demonstrate that the proposed FedCliP framework outperforms state-of-the-art FL frameworks: FedCliP saves 70% of communication overhead with only 0.2% accuracy loss on MNIST, and saves 50% and 15% of communication overhead with less than 1% accuracy loss on FMNIST and CIFAR-10, respectively.
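A toy sketch of contribution-score-based client pruning, assuming a per-round score is available for each client: clients whose score falls below a percentile threshold become inactive and stop communicating. The score and the threshold rule here are placeholders, not FedCliP's contribution measure or its optimization method.

```python
import numpy as np

rng = np.random.default_rng(0)
num_clients, rounds = 10, 5
active = np.ones(num_clients, dtype=bool)

for r in range(rounds):
    # Placeholder per-client contribution scores for this round (e.g., local improvement of the global model).
    scores = rng.uniform(0.0, 1.0, size=num_clients) * active
    threshold = np.percentile(scores[active], 30)   # prune roughly the weakest 30% of active clients
    active &= scores >= threshold
    print(f"round {r}: {active.sum()} active clients remain")
```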
Uncovering the Source of Machine Bias
Hu, Xiyang, Huang, Yan, Li, Beibei, Lu, Tian
We develop a structural econometric model to capture the decision dynamics of human evaluators on an online micro-lending platform and estimate the model parameters using a real-world dataset. We find that two types of gender bias, preference-based bias and belief-based bias, are present in human evaluators' decisions, and both favor female applicants. Through counterfactual simulations, we quantify the effect of gender bias on loan granting outcomes and on the welfare of the company and the borrowers. Our results imply that both the preference-based bias and the belief-based bias reduce the company's profits: when the preference-based bias is removed, the company earns more profits, and when the belief-based bias is removed, the company's profits also increase. Both increases result from raising the approval probability for borrowers, especially male borrowers, who eventually pay back their loans. For borrowers, the elimination of either bias decreases the gender gap in the true positive rates of the credit risk evaluation. We also train machine learning algorithms on both the real-world data and the data from the counterfactual simulations, and compare the decisions made by those algorithms to see how evaluators' biases are inherited by the algorithms and reflected in machine-based decisions. We find that machine learning algorithms can mitigate both the preference-based bias and the belief-based bias.