Yu, Peng
Self-Modeling Robots by Photographing
Hu, Kejun, Yu, Peng, Tan, Ning
Self-modeling enables robots to build task-agnostic models of their morphology and kinematics based on data that can be automatically collected, with minimal human intervention and prior information, thereby enhancing machine intelligence. Recent research has highlighted the potential of data-driven technology in modeling the morphology and kinematics of robots. However, existing self-modeling methods suffer from either low modeling quality or excessive data acquisition costs. Beyond morphology and kinematics, texture is also a crucial component of robots; it is challenging to model and remains unexplored. In this work, a high-quality, texture-aware, and link-level method is proposed for robot self-modeling. We utilize three-dimensional (3D) Gaussians to represent the static morphology and texture of robots, and cluster the 3D Gaussians to construct neural ellipsoid bones, whose deformations are controlled by the transformation matrices generated by a kinematic neural network. The 3D Gaussians and kinematic neural network are trained using data pairs composed of joint angles, camera parameters and multi-view images without depth information. By feeding the kinematic neural network with joint angles, we can utilize the well-trained model to describe the corresponding morphology, kinematics and texture of robots at the link level, and render robot images from different perspectives with the aid of 3D Gaussian splatting. Furthermore, we demonstrate that the established model can be exploited to perform downstream tasks such as motion planning and inverse kinematics.
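To make the link-level control idea concrete, the sketch below shows one plausible kinematic network that maps joint angles to per-bone rigid transforms and applies them to clustered Gaussian centers. It is not the paper's implementation; the layer sizes, the 3x4 transform output, and the omission of rotation orthogonalization are assumptions made for illustration only.

```python
# Illustrative sketch only: an MLP from joint angles to per-bone rigid transforms,
# in the spirit of the kinematic network described in the abstract above.
import torch
import torch.nn as nn

class KinematicNet(nn.Module):
    def __init__(self, num_joints: int, num_bones: int, hidden: int = 256):
        super().__init__()
        self.num_bones = num_bones
        self.mlp = nn.Sequential(
            nn.Linear(num_joints, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            # 3x3 rotation block (flattened, not orthogonalized here) + translation per bone
            nn.Linear(hidden, num_bones * 12),
        )

    def forward(self, joint_angles: torch.Tensor) -> torch.Tensor:
        """joint_angles: (B, num_joints) -> per-bone 3x4 transforms (B, num_bones, 3, 4)."""
        return self.mlp(joint_angles).view(-1, self.num_bones, 3, 4)

def deform_gaussians(centers: torch.Tensor, transforms: torch.Tensor,
                     bone_ids: torch.Tensor) -> torch.Tensor:
    """Apply each Gaussian's bone transform to its center for one robot pose.
    centers: (N, 3); transforms: (num_bones, 3, 4); bone_ids: (N,) long indices."""
    T = transforms[bone_ids]               # (N, 3, 4) per-Gaussian transform
    R, t = T[:, :, :3], T[:, :, 3]         # rotation block and translation
    return torch.einsum('nij,nj->ni', R, centers) + t
```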
Towards a Unified Paradigm: Integrating Recommendation Systems as a New Language in Large Models
Zheng, Kai, Sun, Qingfeng, Xu, Can, Yu, Peng, Guo, Qingwei
This paper explores the use of Large Language Models (LLMs) for sequential recommendation, which predicts users' future interactions based on their past behavior. We introduce a new concept, "Integrating Recommendation Systems as a New Language in Large Models" (RSLLM), which combines the strengths of traditional recommenders and LLMs. RSLLM uses a unique prompting method that combines ID-based item embeddings from conventional recommendation models with textual item features. It treats users' sequential behaviors as a distinct language and aligns the ID embeddings with the LLM's input space using a projector. We also propose a two-stage LLM fine-tuning framework that refines a pretrained LLM using a combination of two contrastive losses and a language modeling loss. The LLM is first fine-tuned using text-only prompts, followed by target domain fine-tuning with unified prompts. This trains the model to incorporate behavioral knowledge from the traditional sequential recommender into the LLM. Our empirical results validate the effectiveness of our proposed framework.
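A minimal sketch of the projector idea described above, assuming a simple two-layer MLP that maps ID-based item embeddings from a conventional recommender into the LLM's input-embedding space; the dimensions and layer choices are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class IDProjector(nn.Module):
    """Maps ID-based item embeddings from a conventional recommender into the
    LLM's token-embedding space so they can be interleaved with textual item
    features in the unified prompt (hedged sketch, not RSLLM's exact module)."""
    def __init__(self, rec_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(rec_dim, llm_dim), nn.GELU(), nn.Linear(llm_dim, llm_dim)
        )

    def forward(self, item_id_emb: torch.Tensor) -> torch.Tensor:
        # item_id_emb: (seq_len, rec_dim) -> (seq_len, llm_dim) "soft tokens"
        return self.proj(item_id_emb)
```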
Large Language Model Can Be a Foundation for Hidden Rationale-Based Retrieval
Ji, Luo, Guo, Feixiang, Chen, Teng, Gu, Qingqing, Wang, Xiaoyu, Xi, Ningyuan, Wang, Yihong, Yu, Peng, Zhao, Yue, Lei, Hongyang, Jiang, Zhonglin, Chen, Yong
Despite recent advancements in Retrieval-Augmented Generation (RAG) systems, most retrieval methodologies are developed for factual retrieval, which assumes that the query and positive documents are semantically similar. In this paper, we instead propose and study a more challenging type of retrieval task, called hidden rationale retrieval, in which the query and document are not semantically similar but their relevance can be inferred through reasoning chains, logical relationships, or empirical experience. To address such problems, an instruction-tuned large language model (LLM) with a cross-encoder architecture could be a reasonable choice. To further strengthen pioneering LLM-based retrievers, we design a special instruction that transforms the retrieval task into a generative task by prompting the LLM to answer a binary-choice question. The model can be fine-tuned with direct preference optimization (DPO). The framework is also optimized for computational efficiency with no performance degradation. We name this retrieval framework RaHoRe and verify its superior zero-shot and fine-tuned performance on Emotional Support Conversation (ESC) compared with previous retrieval works. Our study suggests the potential of employing LLMs as a foundation for a wider scope of retrieval tasks. Our codes, models, and datasets are available at https://github.com/flyfree5/LaHoRe.
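The snippet below illustrates the binary-choice scoring idea in zero-shot form: a (query, document) pair is scored by the probability the LLM assigns to "Yes". The prompt wording and the placeholder model name are assumptions, not the paper's exact setup, and DPO fine-tuning is omitted.

```python
# Hedged sketch of LLM-as-retriever scoring via a binary-choice question.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; RaHoRe uses an instruction-tuned LLM
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def relevance_score(query: str, document: str) -> float:
    prompt = (f"Document: {document}\nQuery: {query}\n"
              f"Can the document help respond to the query? Answer Yes or No: ")
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]          # next-token logits
    yes_id = tok.encode(" Yes", add_special_tokens=False)[0]
    no_id = tok.encode(" No", add_special_tokens=False)[0]
    # Relative preference for "Yes" over "No" as the retrieval score.
    return torch.softmax(logits[[yes_id, no_id]], dim=-1)[0].item()
```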
KaLM: Knowledge-aligned Autoregressive Language Modeling via Dual-view Knowledge Graph Contrastive Learning
Yu, Peng, Deng, Cheng, Dai, Beiya, Wang, Xinbing, Wen, Ying
Autoregressive large language models (LLMs) pre-trained by next token prediction are inherently proficient in generative tasks. However, their performance on knowledge-driven tasks such as factual knowledge querying remains unsatisfactory. Knowledge graphs (KGs), as high-quality structured knowledge bases, can provide reliable knowledge for LLMs, potentially compensating for their knowledge deficiencies. Aligning LLMs with explicit, structured knowledge from KGs has been a challenge; previous attempts either failed to effectively align knowledge representations or compromised the generative capabilities of LLMs, leading to less-than-optimal outcomes. This paper proposes \textbf{KaLM}, a \textit{Knowledge-aligned Language Modeling} approach, which fine-tunes autoregressive LLMs to align with KG knowledge via the joint objective of explicit knowledge alignment and implicit knowledge alignment. The explicit knowledge alignment objective aims to directly optimize the knowledge representation of LLMs through dual-view knowledge graph contrastive learning. The implicit knowledge alignment objective focuses on incorporating textual patterns of knowledge into LLMs through triple completion language modeling. Notably, our method achieves a significant performance boost in evaluations of knowledge-driven tasks, specifically embedding-based knowledge graph completion and generation-based knowledge graph question answering.
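A hedged sketch of a dual-view contrastive objective of the kind described above, assuming an InfoNCE-style symmetric loss over paired embeddings of the same batch of KG triples (for example, head-plus-relation text versus tail description); the exact views, temperature, and loss weighting used by KaLM may differ.

```python
import torch
import torch.nn.functional as F

def dual_view_contrastive_loss(view_a: torch.Tensor, view_b: torch.Tensor,
                               temperature: float = 0.05) -> torch.Tensor:
    """view_a, view_b: (batch, dim) embeddings of the same triples under two views."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature                   # (batch, batch) similarities
    targets = torch.arange(a.size(0), device=a.device)
    # Symmetric cross-entropy: matching pairs lie on the diagonal.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```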
LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking
Yan, Faren, Yu, Peng, Chen, Xin
The use of LLMs for natural language processing has become a popular trend in the past two years, driven by their formidable capacity for context comprehension and learning, which has inspired a wave of research from academics and industry professionals. However, for certain NLP tasks, such as NER, the performance of LLMs still falls short of supervised learning methods. In our research, we developed an NER processing framework called LTNER that incorporates a novel Contextualized Entity Marking Gen Method. By leveraging the cost-effective GPT-3.5 coupled with in-context learning, which does not require additional training, we significantly improved the accuracy of LLMs on NER tasks. The F1 score on the CoNLL03 dataset increased from the initial 85.9% to 91.9%, approaching the performance of supervised fine-tuning. This outcome has led to a deeper understanding of the potential of LLMs.
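As a small illustration of the inline entity-marking idea, the snippet below parses entities back out of LLM output that uses a hypothetical "@@entity##TYPE" convention; the actual marker syntax and prompt format used by LTNER are not reproduced here.

```python
# Illustrative only: recover (entity, type) spans from inline-marked LLM output.
import re

def parse_marked_text(marked: str):
    """Extract (entity, type) pairs from text marked with a hypothetical
    '@@entity##TYPE' convention."""
    pattern = re.compile(r"@@(.+?)##([A-Z]+)")
    return [(m.group(1), m.group(2)) for m in pattern.finditer(marked)]

print(parse_marked_text("@@Barack Obama##PER visited @@Berlin##LOC yesterday."))
# [('Barack Obama', 'PER'), ('Berlin', 'LOC')]
```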
Leveraging Large Language Model for Automatic Evolving of Industrial Data-Centric R&D Cycle
Yang, Xu, Yang, Xiao, Liu, Weiqing, Li, Jinhui, Yu, Peng, Ye, Zeqi, Bian, Jiang
In the wake of relentless digital transformation, data-driven solutions are emerging as powerful tools to address multifarious industrial tasks such as forecasting, anomaly detection, planning, and even complex decision-making. Although data-centric R&D has been pivotal in harnessing these solutions, it often comes with significant costs in terms of human, computational, and time resources. This paper delves into the potential of large language models (LLMs) to expedite the evolution cycle of data-centric R&D. Assessing the foundational elements of data-centric R&D, including heterogeneous task-related data, multi-facet domain knowledge, and diverse computing-functional tools, we explore how well LLMs can understand domain-specific requirements, generate professional ideas, utilize domain-specific tools to conduct experiments, interpret results, and incorporate knowledge from past endeavors to tackle new challenges. We take quantitative investment research as a typical example of an industrial data-centric R&D scenario, verify our proposed framework on our full-stack, open-sourced quantitative research platform Qlib, and obtain promising results that shed light on our vision of automatically evolving the industrial data-centric R&D cycle.
Efficient Bayesian inference using physics-informed invertible neural networks for inverse problems
Guan, Xiaofei, Wang, Xintong, Wu, Hao, Yang, Zihao, Yu, Peng
In this paper, we introduce an innovative approach for addressing Bayesian inverse problems through the utilization of physics-informed invertible neural networks (PI-INN). The PI-INN framework encompasses two sub-networks: an invertible neural network (INN) and a neural basis network (NB-Net). The primary role of the NB-Net lies in modeling the spatial basis functions characterizing the solution to the forward problem dictated by the underlying partial differential equation. Simultaneously, the INN is designed to partition the parameter vector linked to the input physical field into two distinct components: the expansion coefficients representing the forward problem solution and the Gaussian latent noise. If the forward mapping is precisely estimated, and the statistical independence between expansion coefficients and latent noise is well-maintained, the PI-INN offers a precise and efficient generative model for Bayesian inverse problems, yielding tractable posterior density estimates. As a particular physics-informed deep learning model, the primary training challenge for PI-INN centers on enforcing the independence constraint, which we tackle by introducing a novel independence loss based on estimated density. We support the efficacy and precision of the proposed PI-INN through a series of numerical experiments, including inverse kinematics, 1-dimensional and 2-dimensional diffusion equations, and seismic traveltime tomography. Specifically, our experimental results showcase the superior performance of the proposed independence loss in comparison to the commonly used but computationally demanding kernel-based maximum mean discrepancy loss.
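For readers unfamiliar with invertible networks, the block below is a standard affine coupling layer, the usual building unit of an INN. It illustrates only the exact invertibility that PI-INN relies on, not the paper's full architecture, neural basis network, or estimated-density independence loss; the dimensions and the two-layer conditioner are illustrative choices.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """A standard affine coupling block: transforms the second half of the input
    conditioned on the first half, so the mapping is exactly invertible."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=-1)
        y2 = x2 * torch.exp(s) + t            # invertible affine transform of x2
        return torch.cat([x1, y2], dim=-1)

    def inverse(self, y: torch.Tensor) -> torch.Tensor:
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(y1).chunk(2, dim=-1)
        x2 = (y2 - t) * torch.exp(-s)         # exact inverse of the forward pass
        return torch.cat([y1, x2], dim=-1)
```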
Linear TreeShap
Yu, Peng, Xu, Chao, Bifet, Albert, Read, Jesse
Decision trees are well-known for their ease of interpretability. To improve accuracy, we need to grow deep trees or ensembles of trees. These are hard to interpret, offsetting their original benefits. Shapley values have recently become a popular way to explain the predictions of tree-based machine learning models. They provide a linear weighting of features that is independent of the tree structure. The rise in popularity is mainly due to TreeShap, which reduces a computation that is exponential in the general case to polynomial time. Following extensive adoption in industry, more efficient algorithms are required. This paper presents a more efficient and straightforward algorithm: Linear TreeShap. Like TreeShap, Linear TreeShap is exact and requires the same amount of memory.
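To clarify the quantity being computed, here is a brute-force reference that evaluates tree Shapley values by enumerating feature coalitions on a toy tree. This is not the Linear TreeShap algorithm; it is exponential in the number of features, which is exactly the cost that TreeShap and Linear TreeShap avoid, and the toy tree, its stored left-branch fractions, and the coalition-value definition are assumptions for illustration.

```python
# Brute-force reference for tree Shapley values (exponential in feature count).
from itertools import combinations
from math import factorial

# Toy tree: internal nodes hold a feature index, threshold, and the fraction of
# training samples going left; leaves hold a prediction value.
tree = {
    "feature": 0, "threshold": 0.5, "frac_left": 0.6,
    "left": {"value": 1.0},
    "right": {
        "feature": 1, "threshold": 2.0, "frac_left": 0.3,
        "left": {"value": 3.0},
        "right": {"value": 5.0},
    },
}

def expected_value(node, x, known):
    """Tree output when only the features in `known` are revealed; unknown
    splits are averaged using the stored training fractions."""
    if "value" in node:
        return node["value"]
    f = node["feature"]
    if f in known:
        child = node["left"] if x[f] <= node["threshold"] else node["right"]
        return expected_value(child, x, known)
    p = node["frac_left"]
    return (p * expected_value(node["left"], x, known)
            + (1 - p) * expected_value(node["right"], x, known))

def shapley_values(tree, x, n_features):
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for k in range(n_features):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n_features - k - 1) / factorial(n_features)
                gain = (expected_value(tree, x, set(S) | {i})
                        - expected_value(tree, x, set(S)))
                phi[i] += w * gain
    return phi

print(shapley_values(tree, x=[0.7, 1.0], n_features=2))
```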
Negative Log Likelihood Ratio Loss for Deep Neural Network Classification
Zhu, Donglai, Yao, Hengshuai, Jiang, Bei, Yu, Peng
In deep neural networks, the cross-entropy loss function is commonly used for classification. Minimizing cross-entropy is equivalent to maximizing likelihood under assumptions of uniform feature and class distributions. It belongs to the family of generative training criteria, which do not directly discriminate the correct class from competing classes. We propose a discriminative loss function based on the negative log likelihood ratio between the correct and competing classes. It significantly outperforms the cross-entropy loss on the CIFAR-10 image classification task.
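One plausible reading of the proposed loss, sketched in PyTorch under the assumption that the ratio is taken between the correct-class probability and the total probability of the competing classes; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def nllr_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Negative log likelihood ratio loss (hedged sketch):
    -log( p_correct / sum of p over competing classes ).
    logits: (batch, num_classes); target: (batch,) class indices."""
    log_p = F.log_softmax(logits, dim=-1)                           # (B, C)
    log_p_correct = log_p.gather(1, target.unsqueeze(1)).squeeze(1)
    p_correct = log_p.exp().gather(1, target.unsqueeze(1)).squeeze(1)
    p_competing = (1.0 - p_correct).clamp_min(1e-12)
    return -(log_p_correct - torch.log(p_competing)).mean()
```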
Resolving Over-Constrained Temporal Problems with Uncertainty through Conflict-Directed Relaxation
Yu, Peng, Williams, Brian, Fang, Cheng, Cui, Jing, Haslum, Patrik
Over-subscription, that is, being assigned too many things to do, is commonly encountered in temporal scheduling problems. As human beings, we often want to do more than we can actually do, and we underestimate how long it takes to perform each task. Decision makers can benefit from aids that identify when these failure situations are likely, the root causes of these failures, and resolutions to them. In this paper, we present a decision assistant that helps users resolve over-subscribed temporal problems. The system works like an experienced advisor that can quickly identify the causes of failure underlying temporal problems and compute resolutions. The core of the decision assistant is the Best-first Conflict-Directed Relaxation (BCDR) algorithm, which detects conflicting sets of constraints within temporal problems and computes continuous relaxations that weaken those constraints to the minimum extent, instead of removing them completely. BCDR is an extension of the Conflict-Directed A* algorithm, first developed in the model-based reasoning community to compute most likely system diagnoses or reconfigurations. It generalizes discrete conflicts and relaxations to hybrid conflicts and relaxations, which denote minimal inconsistencies in, and minimal relaxations of, both discrete and continuous relaxable constraints. In addition, BCDR is capable of handling temporal uncertainty, expressed as either set-bounded or probabilistic durations, and can compute preferred trade-offs between the risk of violating a schedule requirement and the loss of utility from weakening those requirements. BCDR has been applied to several decision support applications in different domains, including deep-sea exploration, urban travel planning, and transit system management. It has demonstrated its effectiveness in helping users resolve over-subscribed scheduling problems and evaluate the robustness of existing solutions. In our benchmark experiments, BCDR has also demonstrated its efficiency in solving large-scale scheduling problems in the aforementioned domains. Thanks to its conflict-driven approach to computing relaxations, BCDR achieves one to two orders of magnitude improvement in runtime performance compared to state-of-the-art numerical solvers.