AITopics | He, Qianyu

Collaborating Authors

He, Qianyu

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation

He, Qianxi, He, Qianyu, Liang, Jiaqing, Xiao, Yanghua, Zhou, Weikang, Sun, Zeye, Yu, Fei

arXiv.org Artificial IntelligenceFeb-27-2025

Logical reasoning is essential for large language models (LLMs) to ensure accurate and coherent inference. However, LLMs struggle with reasoning order variations and fail to generalize across logically equivalent transformations. LLMs often rely on fixed sequential patterns rather than true logical understanding. To address this issue, we introduce an order-centric data augmentation framework based on commutativity in logical reasoning. We first randomly shuffle independent premises to introduce condition order augmentation. For reasoning steps, we construct a directed acyclic graph (DAG) to model dependencies between steps, which allows us to identify valid reorderings of steps while preserving logical correctness. By leveraging order-centric augmentations, models can develop a more flexible and generalized reasoning process. Finally, we conduct extensive experiments across multiple logical reasoning benchmarks, demonstrating that our method significantly enhances LLMs' reasoning performance and adaptability to diverse logical structures. We release our codes and augmented data in https://anonymous.4open.science/r/Order-Centric-Data-Augmentation-822C/.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.19907

Country:

North America (0.14)
Asia (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Application of machine learning algorithm in temperature field reconstruction

He, Qianyu, Sun, Huaiwei, Li, Yubo, You, Zhiwen, Zheng, Qiming, Huang, Yinghan, Zhu, Sipeng, Wang, Fengyu

arXiv.org Artificial IntelligenceFeb-18-2025

This study focuses on the stratification patterns and dynamic evolution of reservoir water temperatures, aiming to estimate and reconstruct the temperature field using limited and noisy local measurement data. Due to complex measurement environments and technical limitations, obtaining complete temperature information for reservoirs is highly challenging. Therefore, accurately reconstructing the temperature field from a small number of local data points has become a critical scientific issue. To address this, the study employs Proper Orthogonal Decomposition (POD) and sparse representation methods to reconstruct the temperature field based on temperature data from a limited number of local measurement points. The results indicate that satisfactory reconstruction can be achieved when the number of POD basis functions is set to 2 and the number of measurement points is 10. Under different water intake depths, the reconstruction errors of both POD and sparse representation methods remain stable at around 0.15, fully validating the effectiveness of these methods in reconstructing the temperature field based on limited local temperature data. Additionally, the study further explores the distribution characteristics of reconstruction errors for POD and sparse representation methods under different water level intervals, analyzing the optimal measurement point layout scheme and potential limitations of the reconstruction methods in this case. This research not only effectively reduces measurement costs and computational resource consumption but also provides a new technical approach for reservoir temperature analysis, holding significant theoretical and practical importance.

algorithm, application, temperature field reconstruction

arXiv.org Artificial Intelligence

2502.1319

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models

Ren, Qingyu, Zeng, Jie, He, Qianyu, Liang, Jiaqing, Xiao, Yanghua, Zhou, Weikang, Sun, Zeye, Yu, Fei

arXiv.org Artificial IntelligenceJan-13-2025

It is crucial for large language models (LLMs) to follow instructions that involve multiple constraints. However, soft constraints are semantically related and difficult to verify through automated methods. These constraints remain a significant challenge for LLMs. To enhance the ability of LLMs to follow soft constraints, we initially design a pipeline to obtain high-quality outputs automatically. Additionally, to fully utilize the acquired data, we introduce a training paradigm based on curriculum learning. We experimentally evaluate the effectiveness of our methods in improving LLMs' soft constraint following ability and analyze the factors driving the improvements. The datasets and code are publicly available at https://github.com/Rainier-rq/FollowSoftConstraints.

constraint, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.04945

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

QUILL: Quotation Generation Enhancement of Large Language Models

Xiao, Jin, Zhang, Bowei, He, Qianyu, Liang, Jiaqing, Wei, Feng, Chen, Jinglei, Liang, Zujie, Yang, Deqing, Xiao, Yanghua

arXiv.org Artificial IntelligenceNov-6-2024

While Large language models (LLMs) have become excellent writing assistants, they still struggle with quotation generation. This is because they either hallucinate when providing factual quotations or fail to provide quotes that exceed human expectations. To bridge the gap, we systematically study how to evaluate and improve LLMs' performance in quotation generation tasks. We first establish a holistic and automatic evaluation system for quotation generation task, which consists of five criteria each with corresponding automatic metric. To improve the LLMs' quotation generation abilities, we construct a bilingual knowledge base that is broad in scope and rich in dimensions, containing up to 32,022 quotes. Moreover, guided by our critiria, we further design a quotation-specific metric to rerank the retrieved quotations from the knowledge base. Extensive experiments show that our metrics strongly correlate with human preferences. Existing LLMs struggle to generate desired quotes, but our quotation knowledge base and reranking metric help narrow this gap. Our dataset and code are publicly available at https://github.com/GraceXiaoo/QUILL.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2411.03675

Country:

North America > United States (0.14)
Europe (0.14)
Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models

Du, Chengyu, Han, Jinyi, Ying, Yizhou, Chen, Aili, He, Qianyu, Zhao, Haokun, Xia, Sirui, Guo, Haoran, Liang, Jiaqing, Chen, Zulong, Li, Liangyue, Xiao, Yanghua

arXiv.org Artificial IntelligenceOct-17-2024

Recent advancements in large language models (LLMs) have demonstrated that progressive refinement, rather than providing a single answer, results in more accurate and thoughtful outputs. However, existing methods often rely heavily on supervision signals to evaluate previous responses, making it difficult to assess output quality in more open-ended scenarios effectively. Additionally, these methods are typically designed for specific tasks, which limits their generalization to new domains. To address these limitations, we propose Progressive Thought Refinement (PTR), a framework that enables LLMs to refine their responses progressively. PTR operates in two phases: (1) Thought data construction stage: We propose a weak and strong model collaborative selection strategy to build a high-quality progressive refinement dataset to ensure logical consistency from thought to answers, and the answers are gradually refined in each round. (2) Thought-Mask Fine-Tuning Phase: We design a training structure to mask the "thought" and adjust loss weights to encourage LLMs to refine prior thought, teaching them to implicitly understand "how to improve" rather than "what is correct." Experimental results show that PTR significantly enhances LLM performance across ten diverse tasks (avg. from 49.6% to 53.5%) without task-specific fine-tuning. Notably, in more open-ended tasks, LLMs also demonstrate substantial improvements in the quality of responses beyond mere accuracy, suggesting that PTR truly teaches LLMs to self-improve over time.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.13413

Country: Asia > China (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Transportation > Air (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large Language Models

He, Qianyu, Zeng, Jie, He, Qianxi, Liang, Jiaqing, Xiao, Yanghua

arXiv.org Artificial IntelligenceJun-18-2024

It is imperative for Large language models (LLMs) to follow instructions with elaborate requirements (i.e. Complex Instructions Following). Yet, it remains under-explored how to enhance the ability of LLMs to follow complex instructions with multiple constraints. To bridge the gap, we initially study what training data is effective in enhancing complex constraints following abilities. We found that training LLMs with instructions containing multiple constraints enhances their understanding of complex instructions, especially those with lower complexity levels. The improvement can even generalize to compositions of out-of-domain constraints. Additionally, we further propose methods addressing how to obtain and utilize the effective training data. Finally, we conduct extensive experiments to prove the effectiveness of our methods in terms of overall performance and training efficiency. We also demonstrate that our methods improve models' ability to follow instructions generally and generalize effectively across out-of-domain, in-domain, and adversarial settings, while maintaining general capabilities.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2404.15846

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Light Up the Shadows: Enhance Long-Tailed Entity Grounding with Concept-Guided Vision-Language Models

Zhang, Yikai, He, Qianyu, Wang, Xintao, Yuan, Siyu, Liang, Jiaqing, Xiao, Yanghua

arXiv.org Artificial IntelligenceJun-16-2024

Multi-Modal Knowledge Graphs (MMKGs) have proven valuable for various downstream tasks. However, scaling them up is challenging because building large-scale MMKGs often introduces mismatched images (i.e., noise). Most entities in KGs belong to the long tail, meaning there are few images of them available online. This scarcity makes it difficult to determine whether a found image matches the entity. To address this, we draw on the Triangle of Reference Theory and suggest enhancing vision-language models with concept guidance. Specifically, we introduce COG, a two-stage framework with COncept-Guided vision-language models. The framework comprises a Concept Integration module, which effectively identifies image-text pairs of long-tailed entities, and an Evidence Fusion module, which offers explainability and enables human verification. To demonstrate the effectiveness of COG, we create a dataset of 25k image-text pairs of long-tailed entities. Our comprehensive experiments show that COG not only improves the accuracy of recognizing long-tailed image-text pairs compared to baselines but also offers flexibility and explainability.

artificial intelligence, long-tailed entity, natural language, (18 more...)

arXiv.org Artificial Intelligence

2406.10902

Country:

Europe > Austria > Vienna (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Reason from Fallacy: Enhancing Large Language Models' Logical Reasoning through Logical Fallacy Understanding

Li, Yanda, Wang, Dixuan, Liang, Jiaqing, Jiang, Guochao, He, Qianyu, Xiao, Yanghua, Yang, Deqing

arXiv.org Artificial IntelligenceApr-4-2024

Large Language Models (LLMs) have demonstrated good performance in many reasoning tasks, but they still struggle with some complicated reasoning tasks including logical reasoning. One non-negligible reason for LLMs' suboptimal performance on logical reasoning is their overlooking of understanding logical fallacies correctly. To evaluate LLMs' capability of logical fallacy understanding (LFU), we propose five concrete tasks from three cognitive dimensions of WHAT, WHY, and HOW in this paper. Towards these LFU tasks, we have successfully constructed a new dataset LFUD based on GPT-4 accompanied by a little human effort. Our extensive experiments justify that our LFUD can be used not only to evaluate LLMs' LFU capability, but also to fine-tune LLMs to obtain significantly enhanced performance on logical reasoning.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2404.04293

Country:

Asia (0.94)
North America > United States (0.67)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Is There a One-Model-Fits-All Approach to Information Extraction? Revisiting Task Definition Biases

Huang, Wenhao, He, Qianyu, Li, Zhixu, Liang, Jiaqing, Xiao, Yanghua

arXiv.org Artificial IntelligenceMar-24-2024

Definition bias is a negative phenomenon that can mislead models. Definition bias in information extraction appears not only across datasets from different domains but also within datasets sharing the same domain. We identify two types of definition bias in IE: bias among information extraction datasets and bias between information extraction datasets and instruction tuning datasets. To systematically investigate definition bias, we conduct three probing experiments to quantitatively analyze it and discover the limitations of unified information extraction and large language models in solving definition bias. To mitigate definition bias in information extraction, we propose a multi-stage framework consisting of definition bias measurement, bias-aware fine-tuning, and task-specific bias mitigation. Experimental results demonstrate the effectiveness of our framework in addressing definition bias. Resources of this paper can be found at https://github.com/EZ-hwh/definition-bias

definition bias, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2403.16396

Country:

Asia > China (0.28)
North America > United States > Alaska (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Materials > Chemicals (0.46)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Data Science > Data Mining > Text Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks

Huang, Yuncheng, He, Qianyu, Xu, Yipei, Liang, Jiaqing, Xiao, Yanghua

arXiv.org Artificial IntelligenceMar-14-2024

Current language models have demonstrated their capability to develop basic reasoning, but struggle in more complicated reasoning tasks that require a combination of atomic skills, such as math word problem requiring skills like arithmetic and unit conversion. Previous methods either do not improve the inherent atomic skills of models or not attempt to generalize the atomic skills to complex reasoning tasks. In this paper, we first propose a probing framework to investigate whether the atomic skill can spontaneously generalize to complex reasoning tasks. Then, we introduce a hierarchical curriculum learning training strategy to achieve better skill generalization. In our experiments, we find that atomic skills can not spontaneously generalize to compositional tasks. By leveraging hierarchical curriculum learning, we successfully induce generalization, significantly improve the performance of open-source LMs on complex reasoning tasks. Promisingly, the skill generalization exhibit effective in cross-dataset and cross-domain scenarios. Complex reasoning can also help enhance atomic skills. Our findings offer valuable guidance for designing better training strategies for complex reasoning tasks.

atomic skill, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2403.09479

Country:

Europe (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)

Add feedback