AITopics | Yu, Yifan

Yu, Yifan

EchoLM: Accelerating LLM Serving with Real-time Knowledge Distillation

Yu, Yifan, Gan, Yu, Tsai, Lillian, Sarda, Nikhil, Shen, Jiaming, Zhou, Yanqi, Krishnamurthy, Arvind, Lai, Fan, Levy, Henry M., Culler, David

arXiv.org Artificial Intelligence, Jan-24-2025

Large language models (LLMs) have excelled in various applications, yet serving them at scale is challenging due to their substantial resource demands and high latency. Our real-world studies reveal that over 60% of user requests to LLMs have semantically similar counterparts, suggesting the potential for knowledge sharing among requests. However, naively caching and reusing past responses leads to large quality degradation. In this paper, we introduce EchoLM, an in-context caching system that leverages historical requests as examples to guide response generation, enabling selective offloading of requests to more efficient LLMs. Enabling this real-time knowledge transfer, however, leads to intricate tradeoffs between response quality, latency, and system throughput at scale. For a new request, EchoLM identifies similar, high-utility examples and efficiently prepends them to the input for a better response. At scale, EchoLM adaptively routes requests to LLMs of varying capabilities, accounting for response quality and serving loads. EchoLM employs a cost-aware cache replay mechanism to improve example quality and coverage offline, maximizing cache utility and runtime efficiency. Evaluations on millions of open-source requests demonstrate that EchoLM improves throughput by 1.4-5.9x while reducing latency by 28-71% without hurting response quality on average.

large language model, machine learning, natural language

2501.12689

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.63)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
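
The example-selection step described in the EchoLM abstract above can be pictured with a small sketch. It is not EchoLM's implementation: the bag-of-words cosine similarity is a toy stand-in for whatever semantic similarity the system actually uses, and `select_examples`, `build_prompt`, `k`, and `min_similarity` are hypothetical names chosen for illustration. Routing across models and cache replay are left out.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a toy stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_examples(request, cache, k=2, min_similarity=0.3):
    """Return up to k cached (request, response) pairs most similar to `request`.

    `cache` is a list of (past_request, past_response) pairs (hypothetical layout).
    """
    query = bag_of_words(request)
    scored = [(cosine(query, bag_of_words(past)), past, resp) for past, resp in cache]
    scored = [item for item in scored if item[0] >= min_similarity]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(past, resp) for _, past, resp in scored[:k]]


def build_prompt(request, examples):
    """Prepend the selected examples to the new request as in-context examples."""
    parts = [f"Example request: {p}\nExample response: {r}" for p, r in examples]
    parts.append(f"Request: {request}")
    return "\n\n".join(parts)
```

A caller would pass the resulting prompt to whichever model the request is routed to; unrelated cached requests fall below `min_similarity` and are never prepended.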

Continual Novel Class Discovery via Feature Enhancement and Adaptation

Yu, Yifan, Wang, Shaokun, He, Yuhang, Chen, Junzhe, Gong, Yihong

arXiv.org Artificial Intelligence, May-10-2024

Continual Novel Class Discovery (CNCD) aims to continually discover novel classes without labels while maintaining the recognition capability for previously learned classes. The main challenges faced by CNCD include the feature-discrepancy problem and the inter-session confusion problem, among others. In this paper, we propose a novel Feature Enhancement and Adaptation method for CNCD to tackle the above challenges, which consists of a guide-to-novel framework, a centroid-to-samples similarity constraint (CSS), and a boundary-aware prototype constraint (BAP). More specifically, the guide-to-novel framework is established to continually discover novel classes under the guidance of prior distribution. Afterward, the CSS is designed to constrain the relationship between centroid-to-samples similarities of different classes, thereby enhancing the distinctiveness of features among novel classes. Finally, the BAP is proposed to keep novel class features aware of the positions of other class prototypes during incremental sessions, and better adapt novel class features to the shared feature space. Experimental results on three benchmark datasets demonstrate the superiority of our method, especially in more challenging protocols with more incremental sessions.

artificial intelligence, machine learning, novel class

2405.06389

Country: Asia (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning the Market: Sentiment-Based Ensemble Trading Agents

Ye, Andrew, Xu, James, Wang, Yi, Yu, Yifan, Yan, Daniel, Chen, Ryan, Dong, Bosheng, Chaudhary, Vipin, Xu, Shuai

arXiv.org Artificial Intelligence, Feb-2-2024

We propose the integration of sentiment analysis and deep reinforcement learning ensemble algorithms for stock trading, and design a strategy capable of dynamically altering its employed agent given concurrent market sentiment. In particular, we create a simple-yet-effective method for extracting news sentiment and combine this with general improvements upon existing works, resulting in automated trading agents that effectively consider both qualitative market factors and quantitative stock data. We show that our approach results in a strategy that is profitable, robust, and risk-minimal, outperforming the traditional ensemble strategy as well as single-agent algorithms and market metrics. Our findings determine that the conventional practice of switching ensemble agents every fixed number of months is sub-optimal, and that a dynamic sentiment-based framework unlocks substantial additional performance within these agents. Furthermore, as we have designed our algorithm with simplicity and efficiency in mind, we hypothesize that the transition of our method from historical evaluation towards real-time trading with live data should be relatively simple.

machine learning, natural language, reinforcement learning

2402.01441

Country: North America > United States > California (0.28)

Genre: Research Report (0.70)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

Li, Yixiao, Yu, Yifan, Liang, Chen, He, Pengcheng, Karampatziakis, Nikos, Chen, Weizhu, Zhao, Tuo

arXiv.org Artificial Intelligence, Nov-28-2023

Quantization is an indispensable technique for serving Large Language Models (LLMs) and has recently found its way into LoRA fine-tuning. In this work, we focus on the scenario where quantization and LoRA fine-tuning are applied together on a pre-trained model. In such cases, it is common to observe a consistent gap in downstream task performance between full fine-tuning and the quantization plus LoRA fine-tuning approach. In response, we propose LoftQ (LoRA-Fine-Tuning-aware Quantization), a novel quantization framework that simultaneously quantizes an LLM and finds a proper low-rank initialization for LoRA fine-tuning. Such an initialization alleviates the discrepancy between the quantized and full-precision model and significantly improves generalization in downstream tasks. We evaluate our method on natural language understanding, question answering, summarization, and natural language generation tasks. Experiments show that our method is highly effective and outperforms existing quantization methods, especially in the challenging 2-bit and 2/4-bit mixed precision regimes. The code is available at https://github.com/yxli2123/LoftQ.

large language model, machine learning, quantization

2310.08659

Country:

North America > United States > Texas (0.14)
North America > United States > Louisiana (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation

Li, Yixiao, Yu, Yifan, Zhang, Qingru, Liang, Chen, He, Pengcheng, Chen, Weizhu, Zhao, Tuo

arXiv.org Artificial Intelligence, Jun-26-2023

Transformer models have achieved remarkable results in various natural language tasks, but they are often prohibitively large, requiring massive memory and computational resources. To reduce the size and complexity of these models, we propose LoSparse (Low-Rank and Sparse approximation), a novel model compression technique that approximates a weight matrix by the sum of a low-rank matrix and a sparse matrix. Our method combines the advantages of both low-rank approximations and pruning, while avoiding their limitations. Low-rank approximation compresses the coherent and expressive parts in neurons, while pruning removes the incoherent and non-expressive parts in neurons. Pruning enhances the diversity of low-rank approximations, and low-rank approximation prevents pruning from losing too many expressive neurons. We evaluate our method on natural language understanding, question answering, and natural language generation tasks. We show that it significantly outperforms existing compression methods.

artificial intelligence, machine learning, natural language

2306.11222

Country:

North America > United States > Texas (0.14)
North America > United States > Louisiana (0.14)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.82)
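
The core idea in the LoSparse abstract above, approximating a weight matrix by a low-rank matrix plus a sparse matrix, can be illustrated with a one-shot toy decomposition. This is a simplified sketch under stated assumptions, not the paper's procedure: it uses a rank-1 term found by power iteration, keeps the `k` largest-magnitude residual entries individually (unstructured, whereas the paper's title refers to structured compression), and involves no training. All function names are hypothetical.

```python
import math


def rank_one_approximation(w, iterations=100):
    """Approximate the leading singular pair of w (list of rows) by power iteration
    and return the corresponding rank-1 matrix."""
    rows, cols = len(w), len(w[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        u = [sum(w[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        v = [sum(w[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            break  # w is (numerically) zero along this direction
        v = [x / norm for x in v]
    u = [sum(w[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]


def sparse_residual(w, low_rank, k):
    """Keep the k largest-magnitude entries of (w - low_rank); zero out the rest."""
    rows, cols = len(w), len(w[0])
    residual = [[w[i][j] - low_rank[i][j] for j in range(cols)] for i in range(rows)]
    positions = sorted(
        ((i, j) for i in range(rows) for j in range(cols)),
        key=lambda p: abs(residual[p[0]][p[1]]),
        reverse=True,
    )
    keep = set(positions[:k])
    return [
        [residual[i][j] if (i, j) in keep else 0.0 for j in range(cols)]
        for i in range(rows)
    ]


def frobenius_error(w, low_rank, sparse):
    """Frobenius norm of w - (low_rank + sparse)."""
    return math.sqrt(
        sum(
            (w[i][j] - low_rank[i][j] - sparse[i][j]) ** 2
            for i in range(len(w))
            for j in range(len(w[0]))
        )
    )
```

On a matrix with one dominant direction plus an isolated large entry, the rank-1 term captures the former and the sparse term the latter, which is the complementary behaviour the abstract attributes to the two components.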

Informative Policy Representations in Multi-Agent Reinforcement Learning via Joint-Action Distributions

Yu, Yifan, Jiang, Haobin, Lu, Zongqing

arXiv.org Artificial Intelligence, Jun-10-2021

In multi-agent reinforcement learning, the inherent non-stationarity of the environment caused by other agents' actions poses significant difficulties for an agent to learn a good policy independently. One way to deal with non-stationarity is agent modeling, by which the agent takes into consideration the influence of other agents' policies. Most existing work relies on predicting other agents' actions or goals, or discriminating between their policies. However, such modeling fails to capture the similarities and differences between policies simultaneously and thus cannot provide useful information when generalizing to unseen policies. To address this, we propose a general method to learn representations of other agents' policies via the joint-action distributions sampled in interactions. The similarities and differences between policies are naturally captured by the policy distance inferred from the joint-action distributions and deliberately reflected in the learned representations. Agents conditioned on the policy representations can generalize well to unseen agents. We empirically demonstrate that our method outperforms existing work in multi-agent tasks when facing unseen agents.

artificial intelligence, computer game, opponent

2106.05802

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
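
To make the notion of a policy distance inferred from joint-action distributions more concrete, the sketch below estimates an empirical joint-action distribution from sampled (own action, other agent's action) pairs and compares two such distributions. The abstract does not say which distance the authors use; total variation distance is an assumption made here purely for illustration, and the function names are hypothetical. Learning representations from these distances is not shown.

```python
from collections import Counter


def joint_action_distribution(samples):
    """Empirical distribution over (own_action, other_action) pairs."""
    if not samples:
        raise ValueError("need at least one sampled joint action")
    counts = Counter(samples)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts).

    0.0 means identical distributions; 1.0 means disjoint support.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Joint actions sampled while interacting with three hypothetical opponents.
opponent_a = [("L", "L")] * 3 + [("L", "R")]
opponent_b = [("L", "L")] * 3 + [("L", "R")]
opponent_c = [("L", "R")] * 4

dist_a = joint_action_distribution(opponent_a)
dist_b = joint_action_distribution(opponent_b)
dist_c = joint_action_distribution(opponent_c)
```

Here opponents A and B induce the same joint-action distribution and so sit at distance zero, while opponent C is far from both; a representation trained to respect such distances would place A and B together and C apart.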

CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

Irvin, Jeremy, Rajpurkar, Pranav, Ko, Michael, Yu, Yifan, Ciurea-Ilcus, Silviana, Chute, Chris, Marklund, Henrik, Haghgoo, Behzad, Ball, Robyn, Shpanskaya, Katie, Seekins, Jayne, Mong, David A., Halabi, Safwan S., Sandberg, Jesse K., Jones, Ricky, Larson, David B., Langlotz, Curtis P., Patel, Bhavik N., Lungren, Matthew P., Ng, Andrew Y.

arXiv.org Artificial Intelligence, Jan-21-2019

Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies which were manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare the performance of our model to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark to evaluate performance of chest radiograph interpretation models. The dataset is freely available at https://stanfordmlgroup.github.io/competitions/chexpert.

dataset, deep learning, neural network

1901.07031

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
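
One way to read "different approaches to using the uncertainty labels" in the CheXpert abstract above is as a choice of how an uncertain label enters the training loss. The sketch below assumes each observation is labeled 1 (positive), 0 (negative), or -1 (uncertain), and shows three simple policies: treat uncertain as positive, treat it as negative, or mask it out of the loss. The function names and policy strings are hypothetical, and the loss is a plain masked binary cross-entropy, not the authors' training code.

```python
import math

UNCERTAIN = -1  # assumed encoding for an uncertain label


def apply_uncertainty_policy(labels, policy):
    """Map raw labels to (targets, mask) under the given policy.

    policy: "ones"   -> uncertain treated as positive
            "zeros"  -> uncertain treated as negative
            "ignore" -> uncertain excluded from the loss (mask = 0)
    """
    targets, mask = [], []
    for label in labels:
        if label != UNCERTAIN:
            targets.append(float(label))
            mask.append(1.0)
        elif policy == "ones":
            targets.append(1.0)
            mask.append(1.0)
        elif policy == "zeros":
            targets.append(0.0)
            mask.append(1.0)
        elif policy == "ignore":
            targets.append(0.0)  # placeholder value; masked out below
            mask.append(0.0)
        else:
            raise ValueError(f"unknown policy: {policy}")
    return targets, mask


def masked_bce(probs, targets, mask):
    """Mean binary cross-entropy over unmasked entries; probs must lie in (0, 1)."""
    total, count = 0.0, 0.0
    for p, t, m in zip(probs, targets, mask):
        total += -m * (t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
        count += m
    return total / count if count else 0.0
```

Because the policy only changes targets and mask, a different policy can be chosen per observation, which matches the abstract's finding that different approaches suit different pathologies.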
