AITopics | Hu, Yi-Qi

Hu, Yi-Qi

EFPC: Towards Efficient and Flexible Prompt Compression

Cao, Yun-Hao, Wang, Yangsong, Hao, Shuzheng, Li, Zhenxing, Zhan, Chengjun, Liu, Sichao, Hu, Yi-Qi

arXiv.org Artificial Intelligence, Mar-10-2025

The emergence of large language models (LLMs) like GPT-4 has revolutionized natural language processing (NLP), enabling diverse, complex tasks. However, extensive token counts lead to high computational and financial burdens. To address this, we propose Efficient and Flexible Prompt Compression (EFPC), a novel method unifying task-aware and task-agnostic compression for a favorable accuracy-efficiency trade-off. EFPC uses GPT-4 to generate compressed prompts and integrates them with original prompts for training. During training and inference, we selectively prepend user instructions and compress prompts based on predicted probabilities. EFPC is highly data-efficient, achieving significant performance with minimal data. Compared to the state-of-the-art method LLMLingua-2, EFPC achieves a 4.8% relative improvement in F1-score with 1% additional data at a 4x compression rate, and an 11.4% gain with 10% additional data on the LongBench single-doc QA benchmark. EFPC's unified framework supports broad applicability and enhances performance across various models, tasks, and domains, offering a practical advancement in NLP.

large language model, llmlingua-2, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.07956

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
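
The EFPC abstract above mentions compressing prompts "based on predicted probabilities". The following is a minimal sketch of that one step only, assuming a per-token keep-probability is already available. The function name `compress_prompt`, the `rate` parameter, and the probability values are hypothetical and not taken from the paper; the trained model that would produce the probabilities, and the conditioning on a user instruction, are not modeled here.

```python
import math


def compress_prompt(tokens, keep_probs, rate=4.0):
    """Keep the ceil(len(tokens) / rate) most probable tokens, in original order."""
    if len(tokens) != len(keep_probs):
        raise ValueError("tokens and keep_probs must have the same length")
    if rate < 1:
        raise ValueError("rate must be >= 1")
    budget = math.ceil(len(tokens) / rate)
    # Rank positions by probability (ties go to the earlier position).
    ranked = sorted(range(len(tokens)), key=lambda i: (-keep_probs[i], i))
    # Restore the original token order for the positions that survive.
    kept = sorted(ranked[:budget])
    return [tokens[i] for i in kept]


tokens = "the quick brown fox jumps over the lazy dog today".split()
# Hypothetical keep-probabilities; a trained token classifier would supply these.
probs = [0.1, 0.4, 0.3, 0.9, 0.8, 0.2, 0.1, 0.5, 0.95, 0.05]
compressed = compress_prompt(tokens, probs, rate=4.0)
print(compressed)  # ['fox', 'jumps', 'dog']
```

With ten tokens and a 4x rate the budget is three tokens, so the three highest-probability tokens are kept in their original order.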

Rethinking Optimization and Architecture for Tiny Language Models

Tang, Yehui, Liu, Fangcheng, Ni, Yunsheng, Tian, Yuchuan, Bai, Zheyuan, Hu, Yi-Qi, Liu, Sichao, Jui, Shangling, Han, Kai, Wang, Yunhe

arXiv.org Artificial Intelligence, Feb-6-2024

The power of large language models (LLMs) has been demonstrated with vast amounts of data and computing resources. However, deploying language models on mobile devices faces huge challenges in computation and memory costs; that is, tiny language models with high performance are urgently required. Because the training process is highly complex, many details of optimizing language models are seldom studied carefully. In this study, based on a tiny language model with 1B parameters, we carefully design a series of empirical studies to analyze the effect of each component. Three perspectives are mainly discussed, i.e., neural architecture, parameter initialization, and optimization strategy. Several design formulas are empirically shown to be especially effective for tiny language models, including tokenizer compression, architecture tweaking, parameter inheritance, and multiple-round training. Then we train PanGu-$\pi$-1B Pro and PanGu-$\pi$-1.5B Pro on 1.6T multilingual corpora, following the established formulas. Experimental results demonstrate that the improved optimization and architecture yield a notable average improvement of 8.87 on benchmark evaluation sets for PanGu-$\pi$-1B Pro. Besides, PanGu-$\pi$-1.5B Pro surpasses a range of SOTA models with larger model sizes, validating its superior performance. The code is available at https://github.com/YuchuanTian/RethinkTinyLM.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2402.02791

Country:

North America > United States (0.14)
Europe > Spain (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Cascaded Algorithm-Selection and Hyper-Parameter Optimization with Extreme-Region Upper Confidence Bound Bandit

Hu, Yi-Qi, Yu, Yang, Liao, Jun-Da

arXiv.org Machine Learning, May-31-2019

An automatic machine learning (AutoML) task is to select the best algorithm and its hyper-parameters simultaneously. Previously, the hyper-parameters of all algorithms were joined into a single search space, which is not only huge but also redundant, because many hyper-parameter dimensions are irrelevant to the selected algorithm. In this paper, we propose a cascaded approach for algorithm selection and hyper-parameter optimization. While a search procedure is employed at the level of hyper-parameter optimization, a bandit strategy runs at the level of algorithm selection to allocate the budget based on the search feedback. Since the bandit is required to select the algorithm with the maximum performance, instead of the average performance, we propose the extreme-region upper confidence bound (ER-UCB) strategy, which focuses on the extreme region of the underlying feedback distribution. We show theoretically that the ER-UCB has a regret upper bound $O\left(K \ln n\right)$ with independent feedback, which is as efficient as the classical UCB bandit. We also conduct experiments on a synthetic problem as well as a set of AutoML tasks. The results verify the effectiveness of the proposed method.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1905.13703

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
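
The cascaded structure described in the abstract above (a bandit allocating budget across algorithms, with a search step inside each algorithm) can be sketched as follows. This is an illustration under stated assumptions, not the paper's ER-UCB: the arm index used here, mean plus `beta` times the standard deviation plus a classical UCB exploration bonus, is a stand-in that merely favors arms with a promising upper tail. The function `cascaded_search`, the parameter `beta`, and the two toy "algorithms" are hypothetical.

```python
import math
import random
import statistics


def cascaded_search(arms, budget, beta=1.0, seed=0):
    """arms maps name -> (sample_config, evaluate).

    Returns (best_score, best_name, best_config, pulls_per_arm).
    """
    rng = random.Random(seed)
    rewards = {name: [] for name in arms}
    best = (-math.inf, None, None)

    def pull(name):
        nonlocal best
        sample_config, evaluate = arms[name]
        config = sample_config(rng)  # lower level: one random-search step
        score = evaluate(config)
        rewards[name].append(score)
        if score > best[0]:
            best = (score, name, config)

    def index(name, t):
        # Stand-in index (NOT the paper's ER-UCB formula).
        r = rewards[name]
        spread = statistics.pstdev(r) if len(r) > 1 else 0.0
        bonus = math.sqrt(2.0 * math.log(t) / len(r))
        return statistics.fmean(r) + beta * spread + bonus

    names = list(arms)
    for t in range(1, budget + 1):
        # Upper level: try every arm once, then follow the index.
        untried = [n for n in names if not rewards[n]]
        chosen = untried[0] if untried else max(names, key=lambda n: index(n, t))
        pull(chosen)
    return best[0], best[1], best[2], {n: len(r) for n, r in rewards.items()}


# Two toy "algorithms", each with one hyper-parameter x in [0, 1].
arms = {
    "steady": (lambda rng: rng.random(), lambda x: 0.5 - 0.1 * abs(x - 0.5)),
    "peaky": (lambda rng: rng.random(), lambda x: max(0.0, 1.0 - 10.0 * abs(x - 0.3))),
}
best_score, best_name, best_config, pulls = cascaded_search(arms, budget=60)
print(best_name, round(best_score, 3), pulls)
```

The "peaky" arm has a low average score but a high maximum, which is the situation the abstract argues an average-based bandit handles poorly.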

ZOOpt: Toolbox for Derivative-Free Optimization

Liu, Yu-Ren, Hu, Yi-Qi, Qian, Hong, Yu, Yang, Qian, Chao

arXiv.org Machine Learning, Feb-6-2018

Recent advances in derivative-free optimization allow efficient approximation of the global optimal solutions of sophisticated functions, such as functions with many local optima, non-differentiable functions, and non-continuous functions. This article describes the ZOOpt (https://github.com/eyounx/ZOOpt) toolbox, which provides efficient derivative-free solvers that are designed to be easy to use. ZOOpt provides a Python package for single-thread optimization, and a lightweight distributed version, built with the help of the Julia language, for functions described in Python. The ZOOpt toolbox particularly focuses on optimization problems in machine learning, addressing high-dimensional, noisy, and large-scale problems. The toolbox is being maintained toward a ready-to-use tool for real-world machine learning tasks.

optimization, optimization problem, survey article, (15 more...)

arXiv.org Machine Learning

1801.00329

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Sequential Classification-Based Optimization for Direct Policy Search

Hu, Yi-Qi (Nanjing University) | Qian, Hong (Nanjing University) | Yu, Yang (Nanjing University)

AAAI Conferences, Feb-14-2017

Classification-based optimization is a recently developed framework for derivative-free optimization, which has been shown to be effective for non-convex optimization problems with many local optima. This framework requires sampling a batch of solutions for every update of the search model. However, in reinforcement learning, direct policy search often offers only sequential policy evaluation. Thus, classification-based optimization is not efficient for direct policy search, where solutions have to be sampled sequentially. In this paper, we adapt classification-based optimization to sequentially sampled solutions by forming the batch from reused historical solutions. Experiments on a helicopter hovering control task and reinforcement learning benchmark tasks in OpenAI Gym show that the new algorithm is superior to state-of-the-art derivative-free optimization approaches.

air transportation, algorithm, optimization problem, (19 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country:

Asia (0.28)
North America > United States > Massachusetts > Middlesex County (0.14)

Industry: Transportation > Air (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Derivative-Free Optimization via Classification

Yu, Yang (Nanjing University) | Qian, Hong (Nanjing University) | Hu, Yi-Qi (Nanjing University)

AAAI Conferences, Apr-19-2016

Many randomized heuristic derivative-free optimization methods share a framework that iteratively learns a model for promising search areas and samples solutions from the model. This paper studies a particular setting of such a framework, where the model is implemented by a classification model discriminating good solutions from bad ones. This setting allows a general theoretical characterization, in which critical factors of the optimization are identified. We also prove that optimization problems with local Lipschitz continuity can be solved in polynomial time by proper configurations of this framework. Following the critical factors, we propose the randomized coordinate shrinking classification algorithm to learn the model, forming the RACOS algorithm, for optimization in continuous and discrete domains. Experiments on the testing functions as well as on machine learning tasks, including spectral clustering and classification with Ramp loss, demonstrate the effectiveness of RACOS.

algorithm, artificial intelligence, optimization problem, (20 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States > Massachusetts > Middlesex County (0.14)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
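
The framework in the abstract above (learn a classifier that separates good solutions from bad ones, then sample from the region it labels good) can be sketched for a continuous domain as follows. This is a simplified illustration, not the published RACOS: the "classifier" is an axis-aligned box shrunk on randomly chosen coordinates until it excludes every bad sample, and for brevity the model is relearned after every single new sample. All names and parameters (`shrink_box`, `classification_based_minimize`, `pop`, `n_pos`, `explore`) are hypothetical.

```python
import random


def sphere(x):
    """Toy objective: minimum 0 at the origin."""
    return sum(v * v for v in x)


def shrink_box(positive, negatives, bounds, rng):
    """Shrink a box around `positive` until it excludes every negative."""
    box = [list(b) for b in bounds]
    # Points identical to the positive cannot be separated; ignore them.
    remaining = [n for n in negatives if n != positive]
    while remaining:
        k = rng.randrange(len(bounds))  # random coordinate
        neg = rng.choice(remaining)     # random bad sample still inside
        if neg[k] == positive[k]:
            continue                    # cannot separate on this coordinate
        if neg[k] > positive[k]:
            box[k][1] = min(box[k][1], rng.uniform(positive[k], neg[k]))
        else:
            box[k][0] = max(box[k][0], rng.uniform(neg[k], positive[k]))
        remaining = [
            n for n in remaining
            if all(lo <= v <= hi for v, (lo, hi) in zip(n, box))
        ]
    return box


def classification_based_minimize(f, bounds, budget=300, pop=20, n_pos=2,
                                  explore=0.05, seed=0):
    rng = random.Random(seed)

    def uniform(box):
        return [rng.uniform(lo, hi) for lo, hi in box]

    samples = [(f(x), x) for x in (uniform(bounds) for _ in range(pop))]
    samples.sort(key=lambda p: p[0])
    for _ in range(budget - pop):
        positives, negatives = samples[:n_pos], samples[n_pos:]
        if rng.random() < explore:
            x = uniform(bounds)  # occasional sample from the whole space
        else:
            anchor = rng.choice(positives)[1]
            box = shrink_box(anchor, [n[1] for n in negatives], bounds, rng)
            x = uniform(box)     # sample from the region labeled "good"
        samples.append((f(x), x))
        samples.sort(key=lambda p: p[0])
        samples.pop()            # drop the worst to keep the size fixed
    return samples[0]


bounds = [(-1.0, 1.0)] * 5
best_value, best_x = classification_based_minimize(sphere, bounds)
print(best_value)
```

Because the box always contains the chosen good sample, each model update yields a non-empty region to sample from.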
