AITopics | Cheng, Lu

Collaborating Authors

Cheng, Lu

API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access

Su, Jiayuan, Luo, Jing, Wang, Hongwei, Cheng, Lu

arXiv.org Artificial Intelligence, Apr-3-2024

This study aims to address the pervasive challenge of quantifying uncertainty in large language models (LLMs) without logit-access. Conformal Prediction (CP), known for its model-agnostic and distribution-free features, is a desired approach for various LLMs and data distributions. However, existing CP methods for LLMs typically assume access to the logits, which are unavailable for some API-only LLMs. In addition, logits are known to be miscalibrated, potentially leading to degraded CP performance. To tackle these challenges, we introduce a novel CP method that (1) is tailored for API-only LLMs without logit-access; (2) minimizes the size of prediction sets; and (3) ensures a statistical guarantee of the user-defined coverage. The core idea of this approach is to formulate nonconformity measures using both coarse-grained (i.e., sample frequency) and fine-grained uncertainty notions (e.g., semantic similarity). Experimental results on both close-ended and open-ended Question Answering tasks show our approach can mostly outperform the logit-based CP baselines.
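As a rough illustration of the coarse-grained half of that idea, the sketch below runs split conformal prediction with a nonconformity score based only on sample frequency. It is a minimal sketch under stated assumptions, not the authors' method: the function names are hypothetical, the lists of sampled answers stand in for repeated API calls, answers are compared by exact string match, and the fine-grained semantic-similarity term is left out entirely.

```python
import math
from collections import Counter


def frequency_scores(samples):
    """Map each distinct sampled answer to the score 1 - (its sample frequency)."""
    counts = Counter(samples)
    total = len(samples)
    return {answer: 1.0 - count / total for answer, count in counts.items()}


def calibrate(calibration_set, alpha):
    """Return the conformal threshold from (samples, true_answer) pairs."""
    scores = []
    for samples, true_answer in calibration_set:
        # A true answer that was never sampled gets the worst score, 1.0.
        scores.append(frequency_scores(samples).get(true_answer, 1.0))
    scores.sort()
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for a finite threshold at this alpha.
        return float("inf")
    return scores[rank - 1]


def prediction_set(samples, threshold):
    """Keep every sampled answer whose score does not exceed the threshold."""
    return {a for a, s in frequency_scores(samples).items() if s <= threshold}


# Hypothetical calibration data: sampled answers plus the known correct answer.
calibration = [
    (["A", "A", "A", "B"], "A"),
    (["B", "B", "C", "C"], "B"),
    (["C", "C", "C", "C"], "C"),
    (["A", "B", "B", "B"], "B"),
]
threshold = calibrate(calibration, alpha=0.5)
print(prediction_set(["A", "A", "A", "B"], threshold))
```

Note that a set built this way can only contain answers that were actually sampled, so a correct answer the model never produces cannot be covered; a frequency-only score is also coarse, since many answers tie.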

large language model, machine learning, natural language, (18 more...)

arXiv:2403.01216

Country:

Europe (1.00)
North America > United States > Illinois (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (0.68)
Education > Curriculum > Subject-Specific Education (0.47)
Leisure & Entertainment > Sports > Hockey (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning

Tahir, Anique, Cheng, Lu, Liu, Huan

arXiv.org Artificial Intelligence, Mar-19-2024

The scaling of Large Language Models (LLMs) for retrieval-based tasks, particularly in Retrieval Augmented Generation (RAG), faces significant memory constraints, especially when fine-tuning extensive prompt sequences. Current open-source libraries support full-model inference and fine-tuning across multiple GPUs but fall short of accommodating the efficient parameter distribution required for retrieved context. Addressing this gap, we introduce a novel framework for PEFT-compatible fine-tuning of Llama-2 models, leveraging distributed training. Our framework uniquely utilizes JAX's just-in-time (JIT) compilation and tensor-sharding for efficient resource management, thereby enabling accelerated fine-tuning with reduced memory requirements. This advancement significantly improves the scalability and feasibility of fine-tuning LLMs for complex RAG applications, even on systems with limited GPU resources. Our experiments show more than 12x improvement in runtime compared to Hugging Face/DeepSpeed implementation with four GPUs while consuming less than half the VRAM per GPU.
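For readers unfamiliar with the LoRA technique named in the title, the sketch below shows the standard low-rank update W + (alpha / r) * B A and why it cuts the number of trainable parameters. It is a plain-Python illustration with hypothetical function names; it does not use JAX and shows nothing of the library's tensor sharding or JIT compilation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A for W (d_out x d_in), B (d_out x r), A (r x d_in)."""
    r = len(a)  # the rank is the number of rows of A
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with the same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Return (full fine-tuning, LoRA) trainable parameter counts for one matrix."""
    return d_out * d_in, r * (d_in + d_out)


# Illustrative sizes only: one 4096 x 4096 projection adapted at rank 8.
full, lora = trainable_params(4096, 4096, 8)
print(full, lora)
```

Only A and B are trained while W stays frozen, which is where the memory saving during fine-tuning comes from.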

large language model, machine learning, natural language, (18 more...)

arXiv:2403.11366

Country:

North America > United States > Arizona (0.14)
North America > United States > Illinois (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Media Bias Matters: Understanding the Impact of Politically Biased News on Vaccine Attitudes in Social Media

Jiang, Bohan, Cheng, Lu, Tan, Zhen, Guo, Ruocheng, Liu, Huan

arXiv.org Artificial Intelligence, Mar-6-2024

News media has been utilized as a political tool to stray from facts, presenting biased claims without evidence. Amid the COVID-19 pandemic, politically biased news (PBN) has significantly undermined public trust in vaccines, despite strong medical evidence supporting their efficacy. In this paper, we analyze: (i) how inherent vaccine stances subtly influence individuals' selection of news sources and participation in social media discussions; and (ii) the impact of exposure to PBN on users' attitudes toward vaccines. In doing so, we first curate a comprehensive dataset that connects PBN with related social media discourse. Utilizing advanced deep learning and causal inference techniques, we reveal distinct user behaviors between social media groups with various vaccine stances. Moreover, we observe that individuals with moderate stances, particularly the vaccine-hesitant majority, are more vulnerable to the influence of PBN compared to those with extreme views. Our findings provide critical insights to foster this line of research.

artificial intelligence, machine learning, social media, (16 more...)

arXiv:2403.04009

Country: North America > United States > Illinois (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Vaccines (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)

Overcoming Pitfalls in Graph Contrastive Learning Evaluation: Toward Comprehensive Benchmarks

Ma, Qian, Chi, Hongliang, Zhang, Hengrui, Liu, Kay, Zhang, Zhiwei, Cheng, Lu, Wang, Suhang, Yu, Philip S., Ma, Yao

arXiv.org Artificial Intelligence, Feb-23-2024

The rise of self-supervised learning, which operates without the need for labeled data, has garnered significant interest within the graph learning community. This enthusiasm has led to the development of numerous Graph Contrastive Learning (GCL) techniques, all aiming to create a versatile graph encoder that leverages the wealth of unlabeled data for various downstream tasks. However, the current evaluation standards for GCL approaches are flawed due to the need for extensive hyper-parameter tuning during pre-training and the reliance on a single downstream task for assessment. These flaws can skew the evaluation away from the intended goals, potentially leading to misleading conclusions. In our paper, we thoroughly examine these shortcomings and offer fresh perspectives on how GCL methods are affected by hyper-parameter choices and the choice of downstream tasks for their evaluation. Additionally, we introduce an enhanced evaluation framework designed to more accurately gauge the effectiveness, consistency, and overall capability of GCL methods.

artificial intelligence, graph contrastive learning evaluation, machine learning, (6 more...)

arXiv:2402.1568

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Survey on Safe Multi-Modal Learning System

Zhao, Tianyi, Zhang, Liangliang, Ma, Yao, Cheng, Lu

arXiv.org Artificial Intelligence, Feb-7-2024

With the wide deployment of multimodal learning systems (MMLS) in real-world scenarios, safety concerns have become increasingly prominent. The absence of systematic research into their safety is a significant barrier to progress in this field. To bridge the gap, we present the first taxonomy for MMLS safety, identifying four essential pillars of these concerns. Leveraging this taxonomy, we conduct in-depth reviews for each pillar, highlighting key limitations based on the current state of development. Finally, we pinpoint unique challenges in MMLS safety and provide potential directions for future research.

large language model, machine learning, natural language, (18 more...)

arXiv:2402.05355

Country:

North America > United States > Illinois (0.14)
North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models

Liang, Yueqing, Cheng, Lu, Payani, Ali, Shu, Kai

arXiv.org Artificial Intelligence, Dec-5-2023

This work investigates the potential of undermining both fairness and detection performance in abusive language detection. In a dynamic and complex digital world, it is crucial to investigate the vulnerabilities of these detection models to adversarial fairness attacks to improve their fairness robustness. We propose a simple yet effective framework, FABLE, that leverages backdoor attacks because they allow targeted control over fairness and detection performance. FABLE explores three types of trigger designs (i.e., rare, artificial, and natural triggers) and novel sampling strategies. Specifically, the adversary can inject triggers into samples in the minority group with the favored outcome (i.e., "non-abusive") and flip their labels to the unfavored outcome (i.e., "abusive"). Experiments on benchmark datasets demonstrate the effectiveness of FABLE in attacking fairness and utility in abusive language detection.

detection, machine learning, natural language, (15 more...)

arXiv:2311.09428

Country: North America > United States > Illinois (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Interpreting Pretrained Language Models via Concept Bottlenecks

Tan, Zhen, Cheng, Lu, Wang, Song, Bo, Yuan, Li, Jundong, Liu, Huan

arXiv.org Artificial Intelligence, Nov-8-2023

Pretrained language models (PLMs) have made significant strides in various natural language processing tasks. However, the lack of interpretability due to their "black-box" nature poses challenges for responsible implementation. Although previous studies have attempted to improve interpretability by using, e.g., attention weights in self-attention layers, these weights often lack clarity, readability, and intuitiveness. In this research, we propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans. For example, we learn the concept of "Food" and investigate how it influences the prediction of a model's sentiment towards a restaurant review. We introduce C³M, which combines human-annotated and machine-generated concepts to extract hidden neurons designed to encapsulate semantically meaningful and task-specific concepts. Through empirical evaluations on real-world datasets, we show that our approach offers valuable insights to interpret PLM behavior, helps diagnose model failures, and enhances model robustness amidst noisy concept labels.

artificial intelligence, interpreting pretrained language model, natural language, (1 more...)

arXiv:2311.05014

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Equal Opportunity of Coverage in Fair Regression

Wang, Fangxin, Cheng, Lu, Guo, Ruocheng, Liu, Kay, Yu, Philip S.

arXiv.org Artificial Intelligence, Nov-3-2023

We study fair machine learning (ML) under predictive uncertainty to enable reliable and trustworthy decision-making. The seminal work of "equalized coverage" proposed an uncertainty-aware fairness notion. However, it does not guarantee equal coverage rates across more fine-grained groups (e.g., low-income females) conditioning on the true label and is biased in the assessment of uncertainty. To tackle these limitations, we propose a new uncertainty-aware fairness notion, Equal Opportunity of Coverage (EOC), that aims to achieve two properties: (1) coverage rates for different groups with similar outcomes are close, and (2) the coverage rate for the entire population remains at a predetermined level. Further, the prediction intervals should be narrow to be informative. We propose Binned Fair Quantile Regression (BFQR), a distribution-free post-processing method to improve EOC with reasonable width for any trained ML model. It first calibrates a hold-out set to bound deviation from EOC, then leverages conformal prediction to maintain EOC on a test set while optimizing prediction interval width. Experimental results demonstrate the effectiveness of our method in improving EOC. Our code is publicly available at https://github.com/fangxin-wang/bfqr.
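The two EOC properties can be checked empirically once prediction intervals exist. The sketch below measures overall coverage and coverage per (group, outcome bin) cell. It is only an evaluation sketch under assumptions: the function names, record layout, and binning rule are hypothetical, and it is not the BFQR procedure itself.

```python
from collections import defaultdict


def coverage_by_group_and_bin(records, bin_edges):
    """Return (overall coverage, {(group, bin_index): coverage}).

    Each record is (group, y_true, lower, upper).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, y, lower, upper in records:
        # Bin index = number of edges at or below the true outcome.
        bin_index = sum(1 for edge in bin_edges if y >= edge)
        key = (group, bin_index)
        totals[key] += 1
        hits[key] += int(lower <= y <= upper)
    per_cell = {key: hits[key] / totals[key] for key in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, per_cell


def max_coverage_gap(per_cell):
    """Largest coverage difference between groups that share an outcome bin."""
    by_bin = defaultdict(list)
    for (_group, bin_index), coverage in per_cell.items():
        by_bin[bin_index].append(coverage)
    return max(max(values) - min(values) for values in by_bin.values())


# Hypothetical records: (group, true outcome, interval lower, interval upper).
records = [
    ("a", 1.0, 0.0, 2.0),
    ("a", 2.0, 3.0, 4.0),
    ("b", 1.5, 1.0, 2.0),
    ("b", 2.5, 2.0, 3.0),
    ("a", 6.0, 5.0, 7.0),
    ("b", 7.0, 6.0, 8.0),
]
overall, per_cell = coverage_by_group_and_bin(records, bin_edges=[5.0])
print(overall, max_coverage_gap(per_cell))
```

A small gap across groups within each bin corresponds to property (1), and the overall rate corresponds to property (2).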

artificial intelligence, coverage rate, machine learning, (15 more...)

arXiv:2311.02243

Country: North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Quality (0.93)

A Theoretical Approach to Characterize the Accuracy-Fairness Trade-off Pareto Frontier

Tang, Hua, Cheng, Lu, Liu, Ninghao, Du, Mengnan

arXiv.org Artificial Intelligence, Oct-19-2023

While the accuracy-fairness trade-off has been frequently observed in the literature of fair machine learning, rigorous theoretical analyses have been scarce. To demystify this long-standing challenge, this work seeks to develop a theoretical framework by characterizing the shape of the accuracy-fairness trade-off Pareto frontier (FairFrontier), determined by a set of all optimal Pareto classifiers that no other classifiers can dominate. Specifically, we first demonstrate the existence of the trade-off in real-world scenarios and then propose four potential categories to characterize the important properties of the accuracy-fairness Pareto frontier. For each category, we identify the necessary conditions that lead to corresponding trade-offs. Experimental results on synthetic data suggest insightful findings of the proposed framework: (1) When sensitive attributes can be fully interpreted by non-sensitive attributes, FairFrontier is mostly continuous. (2) Accuracy can suffer a sharp decline when over-pursuing fairness. (3) The trade-off can be eliminated via a two-step streamlined approach. The proposed research enables an in-depth understanding of the accuracy-fairness trade-off, pushing current fair machine-learning research to a new frontier.
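The notion of classifiers "that no other classifiers can dominate" can be made concrete with a small empirical filter. The sketch below is an assumption-laden illustration, not the paper's theoretical construction: the function name and tuple layout are hypothetical, and each classifier is summarized by a single accuracy and a single unfairness score where lower unfairness is better.

```python
def pareto_frontier(classifiers):
    """Return the classifiers that no other classifier dominates.

    Each classifier is (name, accuracy, unfairness); higher accuracy and
    lower unfairness are better.
    """
    def dominates(p, q):
        # p dominates q if it is no worse on both axes and strictly better on one.
        no_worse = p[1] >= q[1] and p[2] <= q[2]
        strictly_better = p[1] > q[1] or p[2] < q[2]
        return no_worse and strictly_better

    frontier = [
        c for c in classifiers
        if not any(dominates(other, c) for other in classifiers)
    ]
    # Sort by accuracy so the frontier reads left to right.
    return sorted(frontier, key=lambda c: c[1])


# Hypothetical classifiers: (name, accuracy, unfairness).
candidates = [
    ("c1", 0.90, 0.30),
    ("c2", 0.85, 0.10),
    ("c3", 0.80, 0.20),
    ("c4", 0.70, 0.05),
    ("c5", 0.85, 0.15),
]
print([name for name, _, _ in pareto_frontier(candidates)])
```

The pairwise check is quadratic in the number of classifiers, which is fine for a handful of candidates; a sort-based sweep would scale better.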

artificial intelligence, classifier, machine learning, (17 more...)

arXiv:2310.12785

Country:

North America > United States > Illinois (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

STANCE-C3: Domain-adaptive Cross-target Stance Detection via Contrastive Learning and Counterfactual Generation

Kim, Nayoung, Mosallanezhad, David, Cheng, Lu, Mancenido, Michelle V., Liu, Huan

arXiv.org Artificial Intelligence, Sep-26-2023

Stance detection is the process of inferring a person's position or standpoint on a specific issue to deduce prevailing perceptions toward topics of general or controversial interest, such as health policies during the COVID-19 pandemic. Existing models for stance detection are trained to perform well for a single domain (e.g., COVID-19) and a specific target topic (e.g., masking protocols), but are generally ineffectual in other domains or targets due to distributional shifts in the data. However, constructing high-performing, domain-specific stance detection models requires an extensive corpus of labeled data relevant to the targeted domain, yet such datasets are not readily available. This poses a challenge as the process of annotating data is costly and time-consuming. To address these challenges, we introduce a novel stance detection model coined domain-adaptive Cross-target STANCE detection via Contrastive learning and Counterfactual generation (STANCE-C3) that uses counterfactual data augmentation to enhance domain-adaptive training by enriching the target domain dataset during the training process and requiring significantly less information from the new domain. We also propose a modified self-supervised contrastive learning as a component of STANCE-C3 to prevent overfitting for the existing domain and target and enable cross-target stance detection. Through experiments on various datasets, we show that STANCE-C3 improves on existing state-of-the-art methods.
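For context on the contrastive component, the sketch below computes a generic InfoNCE-style loss for one anchor embedding. It is an assumption for illustration only: the abstract does not specify the modified loss used in STANCE-C3, and the function names, the temperature value, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: low when the anchor is closer to the positive than to the negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy embeddings: the first pair is aligned, the second pair is mismatched.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
mismatched = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned < mismatched)
```

Minimizing a loss of this shape pulls an example toward its positive view and away from the negatives, which is the general mechanism contrastive methods rely on.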

artificial intelligence, contrastive learning and counterfactual generation, machine learning, (2 more...)

arXiv:2309.15176

Genre: Research Report (0.69)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.44)
Health & Medicine > Therapeutic Area > Immunology (0.44)
Health & Medicine > Epidemiology (0.44)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
