AITopics | Yang, Yuxin

Collaborating Authors

Yang, Yuxin

Pseudo-Knowledge Graph: Meta-Path Guided Retrieval and In-Graph Text for RAG-Equipped LLM

Yang, Yuxin, Wu, Haoyang, Wang, Tao, Yang, Jia, Ma, Hao, Luo, Guojie

arXiv.org Artificial Intelligence, Feb-28-2025

The advent of Large Language Models (LLMs) has revolutionized natural language processing. However, these models face challenges in retrieving precise information from vast datasets. Retrieval-Augmented Generation (RAG) was developed to combine LLMs with external information retrieval systems to enhance the accuracy and context of responses. Despite improvements, RAG still struggles with comprehensive retrieval in high-volume, low-information-density databases and lacks relational awareness, leading to fragmented answers. To address this, this paper introduces the Pseudo-Knowledge Graph (PKG) framework, designed to overcome these limitations by integrating Meta-path Retrieval, In-graph Text, and Vector Retrieval into LLMs. By preserving natural language text and leveraging various retrieval techniques, the PKG offers a richer knowledge representation and improves accuracy in information retrieval. Extensive evaluations using Open Compass and MultiHop-RAG benchmarks demonstrate the framework's effectiveness in managing large volumes of data and complex relationships.

information retrieval, large language model, machine learning, (17 more...)

2503.00309

Country:

Asia (1.00)
Europe > United Kingdom > England (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Food & Agriculture > Agriculture (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data

Wang, Zifan, Chen, Ziqing, Chen, Junyu, Wang, Jilong, Yang, Yuxin, Liu, Yunze, Liu, Xueyi, Wang, He, Yi, Li

arXiv.org Artificial Intelligence, Jan-8-2025

This paper introduces MobileH2R, a framework for learning generalizable vision-based human-to-mobile-robot (H2MR) handover skills. Unlike traditional fixed-base handovers, this task requires a mobile robot to reliably receive objects in a large workspace enabled by its mobility. Our key insight is that generalizable handover skills can be developed in simulators using high-quality synthetic data, without the need for real-world demonstrations. To achieve this, we propose a scalable pipeline for generating diverse synthetic full-body human motion data, an automated method for creating safe and imitation-friendly demonstrations, and an efficient 4D imitation learning method for distilling large-scale demonstrations into closed-loop policies with base-arm coordination. Experimental evaluations in both simulators and the real world show significant improvements (at least +15% success rate) over baseline methods in all cases. Experiments also validate that large-scale and diverse synthetic data greatly enhances robot learning, highlighting our scalable framework.

artificial intelligence, demonstration, robot, (17 more...)

2501.04595

Country: Asia (0.14)

Genre: Research Report > Promising Solution (0.46)

Industry: Energy (0.48)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (0.82)

Towards Ideal Temporal Graph Neural Networks: Evaluations and Conclusions after 10,000 GPU Hours

Yang, Yuxin, Zhou, Hongkuan, Kannan, Rajgopal, Prasanna, Viktor

arXiv.org Artificial Intelligence, Dec-28-2024

Temporal Graph Neural Networks (TGNNs) have emerged as powerful tools for modeling dynamic interactions across various domains. The design space of TGNNs is notably complex, given the unique challenges in runtime efficiency and scalability raised by the evolving nature of temporal graphs. We contend that many of the existing works on TGNN modeling inadequately explore the design space, leading to suboptimal designs. Viewing TGNN models through a performance-focused lens often obstructs a deeper understanding of the advantages and disadvantages of each technique. Specifically, benchmarking efforts inherently evaluate models in their original designs and implementations, resulting in unclear accuracy comparisons and misleading runtime. To address these shortcomings, we propose a practical comparative evaluation framework that performs a design space search across well-known TGNN modules based on a unified, optimized code implementation. Using our framework, we make the first efforts towards addressing three critical questions in TGNN design, spending over 10,000 GPU hours: (1) investigating the efficiency of TGNN module designs, (2) analyzing how the effectiveness of these modules correlates with dataset patterns, and (3) exploring the interplay between multiple modules. Key outcomes of this directed investigative approach include demonstrating that most-recent neighbor sampling and the attention aggregator outperform uniform neighbor sampling and the MLP-Mixer aggregator; assessing static node memory as an effective node memory alternative; and showing that the choice between static and dynamic node memory should be based on the repetition patterns in the dataset. Our in-depth analysis of the interplay between TGNN modules and dataset patterns should provide a deeper insight into TGNN performance along with potential research directions for designing more general and effective TGNNs.
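The two neighbor sampling strategies named in the abstract can be contrasted with a minimal sketch. This is not the authors' implementation: the event list, the function names, and the `(source, destination, timestamp)` layout are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy interaction log: (source, destination, timestamp).
EVENTS = [("a", "b", 1), ("a", "c", 2), ("a", "d", 3), ("a", "e", 4), ("b", "a", 5)]

def past_neighbors(events, node, t):
    """Neighbors of `node` from interactions strictly before time t, oldest first."""
    hits = []
    for src, dst, ts in events:
        if ts < t:
            if src == node:
                hits.append((dst, ts))
            elif dst == node:
                hits.append((src, ts))
    return sorted(hits, key=lambda pair: pair[1])

def most_recent_sample(events, node, t, k):
    """Keep the k latest past neighbors (most-recent sampling)."""
    history = past_neighbors(events, node, t)
    return history[max(len(history) - k, 0):]

def uniform_sample(events, node, t, k, rng):
    """Draw up to k past neighbors uniformly at random, without replacement."""
    history = past_neighbors(events, node, t)
    return rng.sample(history, min(k, len(history)))

recent = most_recent_sample(EVENTS, "a", 5, 2)
uniform = uniform_sample(EVENTS, "a", 5, 2, random.Random(0))
```

Most-recent sampling is deterministic given the history, while uniform sampling can return older interactions; which behaves better on a given dataset is the empirical question the paper studies.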

artificial intelligence, data mining, machine learning, (16 more...)

2412.20256

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multi-view biomedical foundation models for molecule-target and property prediction

Suryanarayanan, Parthasarathy, Qiu, Yunguang, Sethi, Shreyans, Mahajan, Diwakar, Li, Hongyang, Yang, Yuxin, Eyigoz, Elif, Saenz, Aldo Guzman, Platt, Daniel E., Rumbell, Timothy H., Ng, Kenney, Dey, Sanjoy, Burch, Myson, Kwon, Bum Chul, Meyer, Pablo, Cheng, Feixiong, Hu, Jianying, Morrone, Joseph A.

arXiv.org Artificial Intelligence, Oct-25-2024

Drug discovery is a complex, multi-stage process. Lead identification and lead optimization remain costly with low success rates, and computational methods play an important role in accelerating these tasks [1-3]. The prediction of a broad range of chemical and biological properties of candidate molecules is an essential component of screening and assessing molecules, and data-driven machine learning approaches have long aided in this process [4-6]. Molecular representations form the basis of machine learning models [2, 7], facilitating algorithmic and scientific advances in the field. However, learning useful and generalized latent representations is a hard problem due to limited amounts of labeled data, wide ranges of downstream tasks, vast chemical space, and large heterogeneity in molecular structures. Learning latent representations using unsupervised techniques is vital for such models to scale. Large language models (LLMs) have revolutionized other fields [8], and similar sequence-based foundation models have shown promise in learning molecular representations and being trainable on many downstream property prediction tasks [9-11]. A key advantage is that the transformer-based architecture can learn in a self-supervised fashion to create a "pre-trained" molecular representation. The most direct application of LLM-like transformers is facilitated by a sequence, text-based representation (e.g.

large language model, machine learning, natural language, (18 more...)

2410.19704

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.48)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Corki: Enabling Real-time Embodied AI Robots via Algorithm-Architecture Co-Design

Huang, Yiyang, Hao, Yuhui, Yu, Bo, Yan, Feng, Yang, Yuxin, Min, Feng, Han, Yinhe, Ma, Lin, Liu, Shaoshan, Liu, Qiang, Gan, Yiming

arXiv.org Artificial Intelligence, Jul-5-2024

Embodied AI robots have the potential to fundamentally improve the way human beings live and manufacture. Continued progress in the burgeoning field of using large language models to control robots depends critically on an efficient computing substrate. In particular, today's computing systems for embodied AI robots are designed purely based on the interest of algorithm developers, where robot actions are divided into a discrete frame-basis. Such an execution pipeline creates high latency and energy consumption. This paper proposes Corki, an algorithm-architecture co-design framework for real-time embodied AI robot control. Our idea is to decouple LLM inference, robotic control, and data communication in the embodied AI robot's compute pipeline. Instead of predicting the action for one single frame, Corki predicts the trajectory for the near future to reduce the frequency of LLM inference. The algorithm is coupled with hardware that accelerates transforming the trajectory into the actual torque signals used to control robots and with an execution pipeline that overlaps data communication with computation. Corki reduces LLM inference frequency by up to 8.0x, resulting in up to 3.6x speedup. The success rate improvement can be up to 17.3%. Code is provided for re-implementation: https://github.com/hyy0613/Corki
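The arithmetic behind predicting a trajectory rather than a single frame can be sketched as follows. The function name, the 240-step episode, and the 8-step horizon are assumptions for illustration; the abstract does not state how Corki chooses its trajectory length.

```python
import math

def inference_calls(total_steps, steps_per_prediction):
    """Model calls needed when each call covers `steps_per_prediction` control steps."""
    if steps_per_prediction < 1:
        raise ValueError("steps_per_prediction must be at least 1")
    return math.ceil(total_steps / steps_per_prediction)

# Hypothetical episode of 240 control steps.
per_frame = inference_calls(240, 1)      # one prediction per frame
per_trajectory = inference_calls(240, 8) # one prediction per assumed 8-step trajectory
reduction = per_frame / per_trajectory
print(per_frame, per_trajectory, reduction)
```

Under these assumed numbers the call count drops by the horizon length; end-to-end speedup would be smaller because control and communication costs remain.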

large language model, machine learning, trajectory, (21 more...)

2407.04292

Country:

Asia > China (0.29)
North America > United States > Massachusetts (0.14)

Genre: Research Report (1.00)

Industry: Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

VerilogReader: LLM-Aided Hardware Test Generation

Ma, Ruiyang, Yang, Yuxin, Liu, Ziqian, Zhang, Jiaxi, Li, Min, Huang, Junhua, Luo, Guojie

arXiv.org Artificial Intelligence, Jun-3-2024

Test generation has been a critical and labor-intensive process in hardware design verification. Recently, the emergence of Large Language Models (LLMs), with their advanced understanding and inference capabilities, has introduced a novel approach. In this work, we investigate the integration of LLMs into the Coverage Directed Test Generation (CDG) process, where the LLM functions as a Verilog Reader. It accurately grasps the code logic, thereby generating stimuli that can reach unexplored code branches. We compare our framework with random testing, using our self-designed Verilog benchmark suite. Experiments demonstrate that our framework outperforms random testing on designs within the LLM's comprehension scope. Our work also proposes prompt engineering optimizations to augment the LLM's understanding scope and accuracy.

artificial intelligence, large language model, natural language, (15 more...)

2406.04373

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification

Song, Yiping, Zhang, Juhua, Tian, Zhiliang, Yang, Yuxin, Huang, Minlie, Li, Dongsheng

arXiv.org Artificial Intelligence, Feb-26-2024

As sufficient data are not always publicly accessible for model training, researchers exploit limited data with advanced learning algorithms or expand the dataset via data augmentation (DA). Conducting DA in private domains requires privacy protection approaches (i.e., anonymization and perturbation), but those methods cannot provide protection guarantees. Differential privacy (DP) learning methods theoretically bound the protection but are not skilled at generating pseudo text samples with large models. In this paper, we transform the DP-based pseudo-sample generation task into a DP-based generated-sample discrimination task, where we propose a DP-based DA method with an LLM and a DP-based discriminator for text classification on private domains. We construct a knowledge distillation model as the DP-based discriminator: teacher models, accessing private data, teach students how to select private samples with calibrated noise to achieve DP. To constrain the distribution of DA's generation, we propose a DP-based tutor that models the noised private distribution and controls samples' generation with a low privacy cost. We theoretically analyze our model's privacy protection and empirically verify our model.

large language model, machine learning, natural language, (16 more...)

2402.16515

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence

Wen, Zhihua, Tian, Zhiliang, Wu, Wei, Yang, Yuxin, Shi, Yanqi, Huang, Zhen, Li, Dongsheng

arXiv.org Artificial Intelligence, Oct-23-2023

Conditional story generation is significant in human-machine interaction, particularly in producing stories with complex plots. While large language models (LLMs) perform well on multiple NLP tasks, including story generation, it is challenging to generate stories with both complex and creative plots. Existing methods often rely on detailed prompts to guide LLMs to meet target conditions, which inadvertently restricts the creative potential of the generated stories. We argue that leveraging information from exemplary human-written stories facilitates generating more diverse plotlines. Delving deeper into story details helps build complex and credible plots. In this paper, we propose a retrieval-auGmented stoRy generation framework with a fOrest of eVidEnce (GROVE) to enhance stories' complexity. We build a retrieval repository for target conditions to produce few-shot examples to prompt LLMs. Additionally, we design an "asking-why" prompting scheme that extracts a forest of evidence, providing compensation for the ambiguities that may occur in the generated story. This iterative process uncovers underlying story backgrounds. Finally, we select the most fitting chains of evidence from the evidence forest and integrate them into the generated story, thereby enhancing the narrative's complexity and credibility. Experimental results and numerous examples verify the effectiveness of our method.

information, large language model, machine learning, (21 more...)

2310.05388

Country:

Europe (0.92)
Asia > China (0.28)
Asia > Middle East > UAE (0.14)
North America > United States > Louisiana (0.14)

Genre: Research Report (1.00)

Industry: Law (0.92)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Dadu-RBD: Robot Rigid Body Dynamics Accelerator with Multifunctional Pipelines

Yang, Yuxin, Chen, Xiaoming, Han, Yinhe

arXiv.org Artificial Intelligence, Sep-28-2023

Rigid body dynamics is a key technology in the robotics field. Trajectory optimization and model predictive control algorithms usually involve a large number of rigid body dynamics computing tasks. Using CPUs to process these tasks consumes a lot of time, which affects the real-time performance of robots. To this end, we propose a multifunctional robot rigid body dynamics accelerator, named RBDCore, to address the performance bottleneck. By analyzing different functions commonly used in robot dynamics calculations, we summarize their reuse relationship and optimize them according to the hardware. Based on this, RBDCore can fully reuse common hardware modules when processing different computing tasks. By dynamically switching the dataflow path, RBDCore can accelerate various dynamics functions without reconfiguring the hardware. We design Structure-Adaptive Pipelines for RBDCore, which can greatly improve the throughput of the accelerator. Robots with different structures and parameters can be optimized specifically. Compared with state-of-the-art CPU and GPU dynamics libraries and an FPGA accelerator, RBDCore significantly improves performance.

artificial intelligence, multifunctional pipeline, robot rigid body dynamic accelerator, (1 more...)

2307.02274

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

FSD: An Initial Chinese Dataset for Fake Song Detection

Xie, Yuankun, Zhou, Jingjing, Lu, Xiaolin, Jiang, Zhenghao, Yang, Yuxin, Cheng, Haonan, Ye, Long

arXiv.org Artificial Intelligence, Sep-6-2023

Singing voice synthesis and singing voice conversion have significantly advanced, revolutionizing musical experiences. However, the rise of "Deepfake Songs" generated by these technologies raises concerns about authenticity. Unlike Audio DeepFake Detection (ADD), the field of song deepfake detection lacks specialized datasets or methods for song authenticity verification. In this paper, we initially construct a Chinese Fake Song Detection (FSD) dataset to investigate the field of song deepfake detection. The fake songs in the FSD dataset are generated by five state-of-the-art singing voice synthesis and singing voice conversion methods. Our initial experiments on FSD revealed the ineffectiveness of existing speech-trained ADD models for the task of song deepfake detection. Thus, we employ the FSD dataset for the training of ADD models. We subsequently evaluate these models under two scenarios: one with the original songs and another with separated vocal tracks. Experimental results show that song-trained ADD models exhibit a 38.58% reduction in average equal error rate compared to speech-trained ADD models on the FSD test set.
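The equal error rate (EER) cited above is the operating point where the false acceptance rate and the false rejection rate coincide. A minimal sketch of how it can be approximated from detector scores follows; the function name, the toy scores, and the convention that a higher score means "more likely genuine" are assumptions, not details from the paper.

```python
def equal_error_rate(genuine_scores, fake_scores):
    """Approximate EER by scanning every observed score as a decision threshold.

    Assumes higher scores mean 'more likely genuine'.
    """
    thresholds = sorted(set(genuine_scores) | set(fake_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection: genuine items scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: fake items scored at or above the threshold.
        far = sum(s >= t for s in fake_scores) / len(fake_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Hypothetical detector scores, for illustration only.
overlapping = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
separated = equal_error_rate([0.9, 0.8], [0.1, 0.2])
```

With finitely many scores the two error rates may never be exactly equal, so this sketch reports their average at the threshold where they are closest.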

artificial intelligence, initial chinese dataset, machine learning, (2 more...)

2309.02232

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
