
Collaborating Authors

 Qiu, Luyu


Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated impressive capabilities in natural language processing tasks, such as text generation and semantic understanding. However, their performance on numerical reasoning tasks, such as basic arithmetic, numerical retrieval, and magnitude comparison, remains surprisingly poor. This gap arises from their reliance on surface-level statistical patterns rather than understanding numbers as continuous magnitudes. Existing benchmarks primarily focus on either linguistic competence or structured mathematical problem-solving, neglecting the fundamental numerical reasoning required in real-world scenarios. To bridge this gap, we propose NumericBench, a comprehensive benchmark to evaluate six fundamental numerical capabilities: number recognition, arithmetic operations, contextual retrieval, comparison, summary, and logical reasoning. NumericBench includes datasets ranging from synthetic number lists to crawled real-world data, addressing challenges like long contexts, noise, and multi-step reasoning. Extensive experiments on state-of-the-art LLMs, including GPT-4 and DeepSeek, reveal persistent weaknesses in numerical reasoning, highlighting the urgent need to improve numerically-aware language modeling. The benchmark is released at: https://github.com/TreeAI-Lab/NumericBench.
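The capability categories named in the abstract (e.g., magnitude comparison and contextual retrieval over number lists) can be illustrated with a minimal synthetic item generator. This is only a sketch of the kind of probe the abstract describes, not the released NumericBench code; all function names and item formats below are our own assumptions.

```python
import random

def make_comparison_item(rng, digits=6):
    """Build one magnitude-comparison question: which of two numbers is larger?"""
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    a = rng.randint(lo, hi)
    b = rng.randint(lo, hi)
    while b == a:  # ensure a unique correct answer
        b = rng.randint(lo, hi)
    question = f"Which number is larger: {a} or {b}? Answer with the number only."
    return {"question": question, "answer": str(max(a, b))}

def make_retrieval_item(rng, length=20):
    """Build one contextual-retrieval question: find the k-th number in a list."""
    numbers = [rng.randint(0, 999) for _ in range(length)]
    k = rng.randrange(length)
    question = (
        f"Given the list {numbers}, what is the number at position {k} "
        f"(0-indexed)? Answer with the number only."
    )
    return {"question": question, "answer": str(numbers[k])}

def grade(predicted: str, item: dict) -> bool:
    """Exact-match grading on the normalized answer string."""
    return predicted.strip() == item["answer"]

# A tiny mixed evaluation set; a real benchmark would scale this up
# and add noise, long contexts, and multi-step items.
rng = random.Random(0)
items = [make_comparison_item(rng) for _ in range(3)] + [make_retrieval_item(rng)]
```

In an actual evaluation loop, each `item["question"]` would be sent to the model under test and the reply scored with `grade`.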


Towards Fine-Grained Explainability for Heterogeneous Graph Neural Network

arXiv.org Artificial Intelligence

Recently, heterogeneous graph neural networks (HGNs) have become one of the standard paradigms for modeling rich semantics of heterogeneous graphs in various application domains such as e-commerce, finance, and healthcare (Lv et al. 2021; Wang et al. 2022). In parallel with the proliferation of HGNs, understanding the reasons behind the predictions from HGNs is urgently demanded in order to build trust and confidence in the models for both users and stakeholders. For example, a customer would be satisfied if an HGN-based recommender system accompanies recommended items with explanations; a bank manager may want

Their goal is to learn or search for optimal graph objects that maximize mutual information with the predictions. While such explanations answer the question "what is salient to the prediction", they fail to unveil "how the salient objects affect the prediction". In particular, there may exist multiple paths in the graph to propagate the information of the salient objects to the target object and affect its prediction. Without distinguishing these different influential paths, the answer to the "how" question remains unclear, which could compromise the utility of the explanation. This issue becomes more prominent when it comes to explaining HGNs due to the complex semantics of heterogeneous graphs.


Resisting Out-of-Distribution Data Problem in Perturbation of XAI

arXiv.org Artificial Intelligence

With the rapid development of eXplainable Artificial Intelligence (XAI), perturbation-based XAI algorithms have become quite popular due to their effectiveness and ease of implementation. The vast majority of perturbation-based XAI techniques face the challenge of Out-of-Distribution (OoD) data -- an artifact of randomly perturbed data becoming inconsistent with the original dataset. OoD data leads to the over-confidence problem in model predictions, making existing XAI approaches unreliable. To the best of our knowledge, the OoD data problem in perturbation-based XAI algorithms has not been adequately addressed in the literature. In this work, we address this OoD data problem by designing an additional module that quantifies the affinity between the perturbed data and the original dataset distribution, which is integrated into the aggregation process. Our solution is shown to be compatible with the most popular perturbation-based XAI algorithms, such as RISE, OCCLUSION, and LIME. Experiments confirm that our method delivers a significant improvement in general cases under both computational and cognitive metrics. Especially in the case of degradation, our proposed approach demonstrates outstanding performance compared to the baselines. Moreover, our solution also resolves a fundamental problem with the faithfulness indicator, a commonly used evaluation metric for XAI algorithms that appears to be sensitive to the OoD issue.
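The affinity-weighted aggregation described in the abstract can be sketched in a RISE-style toy setting on tabular features: each randomly masked sample contributes to the saliency map in proportion to its affinity with the original data distribution, so OoD perturbations are down-weighted. The Gaussian/Mahalanobis affinity, the mean-imputation masking, and all names below are our own illustrative assumptions, not the paper's actual module.

```python
import numpy as np

def affinity(x, data_mean, data_cov_inv):
    """Affinity of a perturbed sample to the original data distribution,
    here a Gaussian kernel on the Mahalanobis distance (an illustrative choice)."""
    d = x - data_mean
    return np.exp(-0.5 * d @ data_cov_inv @ d)

def weighted_saliency(model, x, data, n_masks=500, p_keep=0.5, seed=0):
    """RISE-style saliency with affinity-weighted aggregation: features are
    randomly masked, and each perturbed sample's contribution is down-weighted
    when it falls far from the original data distribution."""
    rng = np.random.default_rng(seed)
    data_mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False) + 1e-6 * np.eye(data.shape[1])
    cov_inv = np.linalg.inv(cov)

    d = x.shape[0]
    saliency = np.zeros(d)
    total = np.zeros(d)
    for _ in range(n_masks):
        mask = rng.random(d) < p_keep           # which features to keep
        x_pert = np.where(mask, x, data_mean)   # masked features -> dataset mean
        w = affinity(x_pert, data_mean, cov_inv)
        saliency += w * model(x_pert) * mask    # weighted RISE accumulation
        total += w * mask
    return saliency / np.maximum(total, 1e-12)
```

On a toy model that depends only on the first feature, the weighted saliency should concentrate on that feature while OoD-heavy perturbations contribute little.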