Li, Zongqian
General Scales Unlock AI Evaluation with Explanatory and Predictive Power
Zhou, Lexin, Pacchiardi, Lorenzo, Martínez-Plumed, Fernando, Collins, Katherine M., Moros-Daval, Yael, Zhang, Seraphina, Zhao, Qinlin, Huang, Yitian, Sun, Luning, Prunty, Jonathan E., Li, Zongqian, Sánchez-García, Pablo, Chen, Kexin Jiang, Casares, Pablo A. M., Zu, Jiyun, Burden, John, Mehrbakhsh, Behzad, Stillwell, David, Cebrian, Manuel, Wang, Jindong, Henderson, Peter, Wu, Sherry Tongshuang, Kyllonen, Patrick C., Cheke, Lucy, Xie, Xing, Hernández-Orallo, José
Ensuring safe and effective use of AI requires understanding and anticipating its performance on novel tasks, from advanced scientific challenges to transformed workplace activities. So far, benchmarking has guided progress in AI, but it has offered limited explanatory and predictive power for general-purpose AI systems, given the low transferability across diverse tasks. In this paper, we introduce general scales for AI evaluation that can explain what common AI benchmarks really measure, extract ability profiles of AI systems, and predict their performance on new task instances, both in- and out-of-distribution. Our fully automated methodology builds on 18 newly crafted rubrics that place instance demands on general scales that do not saturate. Illustrated with 15 large language models and 63 tasks, inspecting the demand and ability profiles yields high explanatory power, bringing insights into the sensitivity and specificity exhibited by different benchmarks, and into how knowledge, metacognition and reasoning are affected by model size, chain-of-thought and distillation. Surprisingly, these demand levels also enable high predictive power at the instance level, providing estimates superior to black-box baseline predictors based on embeddings or finetuning, especially in out-of-distribution settings (new tasks and new benchmarks). The scales, rubrics, battery, techniques and results presented here represent a major step for AI evaluation, underpinning the reliable deployment of AI in the years ahead. (Collaborative platform: https://kinds-of-intelligence-cfi.github.io/ADELE.)
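To make the demand-based prediction idea concrete, here is a minimal, hypothetical sketch in Python: it fits a simple logistic model mapping per-instance demand levels to success for one system. This is not the paper's actual method; the demand dimensions, data, and choice of classifier are illustrative assumptions only.

# Illustrative sketch (not the paper's method): predict instance-level
# success from demand levels on general scales. All dimension names,
# data, and the logistic-regression choice are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical demand profiles: each row is a task instance, each column a
# demand level (0-5) on one general scale (e.g., knowledge, reasoning, ...).
demands = np.array([
    [1, 2, 0],   # lightly demanding instance
    [3, 4, 2],
    [5, 5, 4],   # highly demanding instance
    [2, 1, 1],
    [4, 3, 5],
])
# Hypothetical per-instance outcomes for one model (1 = instance solved).
success = np.array([1, 1, 0, 1, 0])

# Fit a simple demand -> success predictor; the fitted coefficients can be
# read as a crude ability profile of the model over the scales.
predictor = LogisticRegression().fit(demands, success)

# Estimate success probability for a new, unseen instance from its demands.
new_instance = np.array([[3, 2, 3]])
print(predictor.predict_proba(new_instance)[0, 1])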
ReasonGraph: Visualisation of Reasoning Paths
Li, Zongqian, Shareghi, Ehsan, Collier, Nigel
The reasoning processes of Large Language Models (LLMs) are challenging to analyze due to their complexity and the lack of organized visualization tools. We present ReasonGraph, a web-based platform for visualizing and analyzing LLM reasoning processes. It supports both sequential and tree-based reasoning methods while integrating with major LLM providers and over fifty state-of-the-art models. ReasonGraph incorporates an intuitive UI with meta reasoning method selection, configurable visualization parameters, and a modular framework that facilitates efficient extension. Our evaluation shows high parsing reliability, efficient processing, and strong usability across various downstream applications. By providing a unified visualization framework, ReasonGraph reduces the cognitive load of analyzing complex reasoning paths, improves error detection in logical processes, and enables more effective development of LLM-based applications. The platform is open source, promoting accessibility and reproducibility in LLM reasoning analysis.
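As an illustration of the kind of parsing such a platform relies on, the following sketch turns step-tagged LLM output into a simple node structure that could feed a sequential visualization. The tag format, the StepNode class, and the parsing logic are assumptions for demonstration and are not taken from the ReasonGraph codebase.

# Hypothetical sketch: parse step-tagged model output into nodes for display.
import re
from dataclasses import dataclass, field

@dataclass
class StepNode:
    text: str
    children: list = field(default_factory=list)  # would hold branches in tree-based methods

def parse_sequential(raw: str) -> list[StepNode]:
    """Extract 'Step N: ...' segments into an ordered list of nodes."""
    steps = re.findall(r"Step \d+:\s*(.+)", raw)
    return [StepNode(text=s.strip()) for s in steps]

raw_output = """Step 1: Restate the problem.
Step 2: Try a direct computation.
Step 3: Verify the result."""

for i, node in enumerate(parse_sequential(raw_output), start=1):
    print(f"{i}. {node.text}")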
Prompt Compression for Large Language Models: A Survey
Li, Zongqian, Liu, Yinhong, Su, Yixuan, Collier, Nigel
Leveraging large language models (LLMs) for complex natural language tasks typically requires long-form prompts to convey detailed requirements and information, which increases memory usage and inference costs. To mitigate these challenges, multiple efficient methods have been proposed, with prompt compression gaining significant research interest. This survey provides an overview of prompt compression techniques, categorized into hard prompt methods and soft prompt methods. First, the technical approaches of these methods are compared, followed by an exploration of various ways to understand their mechanisms, including the perspectives of attention optimization, Parameter-Efficient Fine-Tuning (PEFT), modality integration, and new synthetic language. We also examine the downstream adaptations of various prompt compression techniques. Finally, the limitations of current prompt compression methods are analyzed, and several future directions are outlined, such as optimizing the compression encoder, combining hard and soft prompt methods, and leveraging insights from multimodality.
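To ground the hard/soft distinction, the sketch below contrasts a toy hard-prompt compressor (shortening the prompt in token space) with a soft-prompt setup (trainable continuous vectors prepended to the input embeddings). The stopword heuristic, tensor shapes, and names are illustrative assumptions rather than any specific surveyed method.

# Hypothetical sketch of the two families covered by the survey.
import torch

# Hard prompt compression: shorten the prompt as text, here by dropping
# low-information words (real methods score tokens with a model instead).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "please"}

def compress_hard(prompt: str) -> str:
    return " ".join(w for w in prompt.split() if w.lower() not in STOPWORDS)

print(compress_hard("Please summarize the main findings of the report"))
# -> "summarize main findings report"

# Soft prompt compression: replace long textual context with a handful of
# trainable continuous vectors prepended to the input embeddings.
hidden_size, num_soft_tokens = 768, 8
soft_prompt = torch.nn.Parameter(torch.randn(num_soft_tokens, hidden_size))

input_embeds = torch.randn(120, hidden_size)              # embedded user prompt
compressed_input = torch.cat([soft_prompt, input_embeds], dim=0)
print(compressed_input.shape)                             # torch.Size([128, 768])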