Zhang, Jiahui
VidSketch: Hand-drawn Sketch-Driven Video Generation with Diffusion Control
Jiang, Lifan, Chen, Shuang, Wu, Boxi, Guan, Xiaotong, Zhang, Jiahui
With the advancement of generative artificial intelligence, previous studies have achieved the generation of aesthetic images from hand-drawn sketches, meeting the public's demand for accessible drawing tools. However, these methods are limited to static images and cannot control video animation generation with hand-drawn sketches. To address this gap, we propose VidSketch, the first method capable of generating high-quality video animations directly from any number of hand-drawn sketches and simple text prompts, bridging the divide between ordinary users and professional artists. Specifically, our method introduces a Level-Based Sketch Control Strategy that automatically adjusts the guidance strength of sketches during generation, accommodating users with varying drawing skills. Furthermore, a TempSpatial Attention mechanism is designed to enhance the spatiotemporal consistency of generated video animations, significantly improving coherence across frames. More detailed examples are available on our official website.
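The abstract does not specify how TempSpatial Attention is implemented; a minimal NumPy sketch of the general idea, joint self-attention over all frame-patch tokens so every patch can attend across both space and time, is shown below. The function name, shapes, and the shared input projection are all illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def tempspatial_attention(x):
    """Joint space-time self-attention (schematic).

    x: (T, P, D) video latents -- T frames, P spatial patches,
    D channels. Flattening (T, P) into a single token axis lets
    every patch attend to every other patch in every frame, which
    is one generic way to encourage cross-frame coherence.
    """
    T, P, D = x.shape
    tokens = x.reshape(T * P, D)
    # For this sketch, queries, keys, and values all reuse the
    # input features instead of learned projections.
    attn = softmax(tokens @ tokens.T / np.sqrt(D))
    out = attn @ tokens
    return out.reshape(T, P, D)
```

In a real diffusion backbone this would use learned Q/K/V projections and multiple heads; the sketch only shows the token-flattening step that couples the temporal and spatial axes.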
Learning an Adaptive Fall Recovery Controller for Quadrupeds on Complex Terrains
Lu, Yidan, Dong, Yinzhao, Ma, Ji, Zhang, Jiahui, Lu, Peng
Legged robots have made significant strides in locomotion capabilities, demonstrating impressive performance in tasks such as dynamic walking, running, and even complex maneuvers like backflips [8], [2]. However, the ability to recover from falls, especially on challenging and unpredictable terrains, remains a critical challenge in the field of legged robotics. While substantial progress has been made in recovery strategies for flat or moderately uneven surfaces [7], [13], the problem of robust recovery on highly irregular terrains - such as rocky landscapes, steep inclines, or complex gaps - has received limited attention. In extreme or complex natural environments, robots still face the inevitability of falling. A major challenge in current research lies in developing adaptive controllers for robots to effectively recover from falls, allowing them to resume movement or efficiently complete tasks. However, model-based methods are often inadequate for these dynamic tasks. For example, Mordatch et al. [12] proposed a framework that optimizes automatic recovery through contact invariance, but the reliance on predefined potential contact points limits the exploration of flexible behaviors.
SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling
Zhang, Jesse, Pertsch, Karl, Zhang, Jiahui, Lim, Joseph J.
Pre-training robot policies with a rich set of skills can substantially accelerate the learning of downstream tasks. Prior works have defined pre-training tasks via natural language instructions, but doing so requires tedious human annotation of hundreds of thousands of instructions. Thus, we propose SPRINT, a scalable offline policy pre-training approach which substantially reduces the human effort needed for pre-training a diverse set of skills. Our method uses two core ideas to automatically expand a base set of pre-training tasks: instruction relabeling via large language models and cross-trajectory skill chaining through offline reinforcement learning. As a result, SPRINT pre-training equips robots with a much richer repertoire of skills. Experimental results in a household simulator and on a real robot kitchen manipulation task show that SPRINT leads to substantially faster learning of new long-horizon tasks than previous pre-training approaches. Website at https://clvrai.com/sprint.
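The abstract describes expanding a base task set via LLM instruction relabeling; a schematic Python sketch of that idea, merging consecutive language-annotated sub-trajectories into longer tasks with summarized instructions, is given below. The function names and the trivial string-joining stand-in for the LLM call are assumptions for illustration, not SPRINT's actual pipeline.

```python
def summarize_stub(instructions):
    # Stand-in for an LLM call that would merge sub-task
    # instructions (e.g., "pick up the mug" + "put the mug in the
    # sink") into one higher-level instruction. Here we just join.
    return " and then ".join(instructions)

def relabel_trajectory(segments, summarize=summarize_stub):
    """Aggregate adjacent language-annotated sub-trajectories.

    segments: list of (instruction, transitions) pairs in temporal
    order. Returns the original tasks plus every contiguous merge of
    two or more segments, each labeled with a summarized instruction,
    expanding the pre-training task set without extra human labels.
    """
    tasks = list(segments)
    n = len(segments)
    for i in range(n):
        for j in range(i + 2, n + 1):  # merge >= 2 consecutive segments
            instrs = [ins for ins, _ in segments[i:j]]
            transitions = [t for _, ts in segments[i:j] for t in ts]
            tasks.append((summarize(instrs), transitions))
    return tasks
```

The cross-trajectory skill-chaining component, which the abstract attributes to offline RL, is not covered by this sketch.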
Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance
Zhang, Jesse, Zhang, Jiahui, Pertsch, Karl, Liu, Ziyi, Ren, Xiang, Chang, Minsuk, Sun, Shao-Hua, Lim, Joseph J.
We propose BOSS, an approach that automatically learns to solve new long-horizon, complex, and meaningful tasks by growing a learned skill library with minimal supervision. Prior work in reinforcement learning requires expert supervision, in the form of demonstrations or rich reward functions, to learn long-horizon tasks. Instead, our approach BOSS (BOotStrapping your own Skills) learns to accomplish new tasks by performing "skill bootstrapping," where an agent with a set of primitive skills interacts with the environment to practice new skills without receiving reward feedback for tasks outside of the initial skill set. This bootstrapping phase is guided by large language models (LLMs) that inform the agent of meaningful skills to chain together. Through this process, BOSS builds a wide range of complex and useful behaviors from a basic set of primitive skills. We demonstrate through experiments in realistic household environments that agents trained with our LLM-guided bootstrapping procedure outperform those trained with naive bootstrapping as well as prior unsupervised skill acquisition methods on zero-shot execution of unseen, long-horizon tasks in new environments. Website at clvrai.com/boss.
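The skill-bootstrapping loop described above can be caricatured in a few lines: repeatedly pick a known skill, ask an LLM for a meaningful follow-up, attempt the chain in the environment, and add successful chains to the library as new longer-horizon skills. Everything here, the function names, the string encoding of chains, and the stubbed LLM and environment, is a hypothetical sketch, not the paper's implementation.

```python
import random

def bootstrap_skills(skill_library, propose_next, try_chain,
                     rounds=100, seed=0):
    """Grow a skill library by LLM-guided chaining (schematic).

    skill_library: names of skills the agent can already execute.
    propose_next(skill): stand-in for an LLM suggesting a meaningful
    follow-up skill, or None if nothing sensible comes next.
    try_chain(chain): environment-rollout stub returning True if the
    chained behavior succeeds when practiced.
    """
    rng = random.Random(seed)
    library = set(skill_library)
    for _ in range(rounds):
        first = rng.choice(sorted(library))
        nxt = propose_next(first)
        if nxt is None:
            continue
        chain = f"{first} -> {nxt}"
        # Successful chains become new, longer-horizon skills, so
        # later rounds can chain onto them in turn.
        if chain not in library and try_chain(chain):
            library.add(chain)
    return library
```

Note how no task reward appears anywhere in the loop: the only learning signal is whether a proposed chain can be executed, matching the abstract's claim of practicing without reward feedback.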
An Intelligent Self-driving Truck System For Highway Transportation
Wang, Dawei, Gao, Lingping, Lan, Ziquan, Li, Wei, Ren, Jiaping, Zhang, Jiahui, Zhang, Peng, Zhou, Pei, Wang, Shengao, Pan, Jia, Manocha, Dinesh, Yang, Ruigang
Recently, there have been many advances in the autonomous driving community, attracting much attention from academia and industry. However, existing works mainly focus on cars; extra development is still required for self-driving truck algorithms and models. In this paper, we introduce an intelligent self-driving truck system. Our presented system consists of three main components: 1) a realistic traffic simulation module for generating realistic traffic flow in testing scenarios; 2) a high-fidelity truck model, designed and evaluated to mimic real truck responses in real-world deployment; and 3) an intelligent planning module with a learning-based decision-making algorithm and a multi-mode trajectory planner that accounts for the truck's constraints, road slope changes, and the surrounding traffic flow. We provide quantitative evaluations for each component individually to demonstrate its fidelity and performance. We also deploy our proposed system on a real truck and conduct real-world experiments, which show our system's capability to mitigate the sim-to-real gap. Our code is available at https://github.com/InceptioResearch/IITS
Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing
Liu, Qian, Yang, Dejian, Zhang, Jiahui, Guo, Jiaqi, Zhou, Bin, Lou, Jian-Guang
In recent years, pretrained language models (PLMs) have achieved success on several downstream tasks, demonstrating their power in modeling language. To better understand and leverage what PLMs have learned, several techniques have emerged to probe the syntactic structures entailed by PLMs. However, few efforts have been made to explore the grounding capabilities of PLMs, which are also essential. In this paper, we highlight the ability of PLMs to discover which token should be grounded to which concept when combined with our proposed erasing-then-awakening approach. Empirical studies on four datasets demonstrate that our approach can awaken latent grounding that is understandable to human experts, even though it is not exposed to such labels during training. More importantly, our approach shows great potential to benefit downstream semantic parsing models. Taking text-to-SQL as a case study, we successfully couple our approach with two off-the-shelf parsers, obtaining an absolute improvement of up to 9.8%.
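The erasing step can be illustrated generically: erase one token at a time and measure how much a concept's relevance score drops, yielding a token-to-concept grounding matrix without any grounding labels. The sketch below uses a caller-supplied scoring function as a stand-in for PLM-derived concept scores; the function name and interface are assumptions for illustration, not the paper's exact formulation (which also includes an "awakening" training stage not shown here).

```python
def erasing_grounding(tokens, concepts, score):
    """Attribute tokens to concepts by erasure (schematic).

    score(tokens, concept): stand-in for a PLM-derived relevance
    score of a concept (e.g., a SQL column) given the utterance.
    The drop in score when a token is erased serves as that token's
    grounding strength for the concept.
    """
    grounding = {}
    for c in concepts:
        base = score(tokens, c)
        for i, tok in enumerate(tokens):
            erased = tokens[:i] + tokens[i + 1:]
            grounding[(tok, c)] = base - score(erased, c)
    return grounding
```

For a text-to-SQL utterance, a token whose erasure sharply lowers a column's score is a candidate grounding for that column, which is the kind of signal a downstream parser could consume.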