Yang, Gang
Advancing Single- and Multi-task Text Classification through Large Language Model Fine-tuning
Zhao, Hang, Chen, Qile P., Zhang, Yijing Barry, Yang, Gang
Both encoder-only models (e.g., BERT, RoBERTa) and large language models (LLMs, e.g., Llama3) have been widely used for text classification tasks. However, systematic studies comparing encoder-based models and LLMs on text classification, particularly with fine-tuning, are lacking. This study employed a diverse range of models and methods, varying in size and architecture, and including both fine-tuned and pre-trained approaches. We first assessed the performance of these LLMs on the 20 Newsgroups (20NG) and MASSIVE datasets, comparing them to encoder-only RoBERTa models. Additionally, we explored the multi-task capabilities of both model types by combining multiple classification tasks, including intent detection and slot filling, into a single model using data from both datasets. Our results indicate that fully fine-tuned Llama3-70B models outperform RoBERTa-large and other decoder LLMs across various classification tasks and datasets. Moreover, the consolidated multi-task fine-tuned LLMs matched the performance of dual-model setups on both tasks across both datasets. Overall, our study provides a comprehensive benchmark of encoder-only and LLM models on text classification tasks and demonstrates a method for consolidating two or more classification tasks into a single fully fine-tuned decoder LLM, reducing latency while maintaining equivalent performance.
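A minimal sketch of the task-consolidation idea described in this abstract, not the paper's released code: intent-detection and slot-filling examples are rendered into a shared instruction format and a decoder LLM is fine-tuned on the merged set. The checkpoint name, prompt template, and toy labels below are illustrative assumptions.

```python
# Sketch: consolidate two classification tasks into one instruction-tuning
# dataset and fine-tune a causal LM on it (assumed template and checkpoint).
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

INSTRUCTIONS = {
    "intent": "Classify the intent of the utterance.",
    "slots": "Label each token of the utterance with its slot type.",
}

raw = [  # toy multi-task examples in a MASSIVE-like style
    {"task": "intent", "utterance": "wake me up at 7 am", "target": "alarm_set"},
    {"task": "slots",  "utterance": "wake me up at 7 am", "target": "O O O O B-time I-time"},
]

def format_example(ex):
    return {"text": f"Task: {INSTRUCTIONS[ex['task']]}\nInput: {ex['utterance']}\nOutput: {ex['target']}"}

model_name = "meta-llama/Meta-Llama-3-8B"  # placeholder; any causal LM works for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

def tokenize(ex):
    return tokenizer(ex["text"], truncation=True, max_length=512)

dataset = (Dataset.from_list(raw)
           .map(format_example)
           .map(tokenize, remove_columns=["task", "utterance", "target", "text"]))

trainer = Trainer(
    model=AutoModelForCausalLM.from_pretrained(model_name),
    args=TrainingArguments(output_dir="multitask-llm", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```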
Stable Object Placement Under Geometric Uncertainty via Differentiable Contact Dynamics
Li, Linfeng, Yang, Gang, Shao, Lin, Hsu, David
From serving a cup of coffee to carefully rearranging delicate items, stable object placement is a crucial skill for future robots. This skill is challenging due to the required accuracy, which is difficult to achieve under geometric uncertainty. We leverage differentiable contact dynamics to develop a principled method for stable object placement under geometric uncertainty. We estimate the geometric uncertainty by minimizing the discrepancy between the force-torque sensor readings and the model predictions through gradient descent. We further keep track of a belief over multiple possible geometric parameters to mitigate the gradient-based method's sensitivity to the initialization. We verify our approach in the real world on various geometric uncertainties, including the in-hand pose uncertainty of the grasped object, the object's shape uncertainty, and the environment's shape uncertainty.
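A minimal sketch of the estimation idea summarized above, not the authors' implementation: a geometric parameter (here an assumed in-hand pose offset) is refined by gradient descent on the gap between measured and predicted wrenches, and a belief is kept over several initializations. The linear `predict_wrench` is a toy stand-in for the differentiable contact-dynamics model.

```python
import torch

# Toy stand-in for the differentiable contact model: a fixed linear map from a
# 3D pose offset to a 6D wrench. The real model is the differentiable simulator.
A = torch.randn(6, 3)

def predict_wrench(pose_offset):
    return A @ pose_offset

def refine(pose_init, measured_wrench, steps=200, lr=5e-2):
    # Gradient descent on the discrepancy between predicted and measured wrench.
    pose = pose_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = (predict_wrench(pose) - measured_wrench).pow(2).sum()
        loss.backward()
        opt.step()
    return pose.detach(), loss.item()

# Keep a belief over several initializations to reduce sensitivity to any one
# starting guess: refine each hypothesis and weight it by how well it fits.
measured = predict_wrench(torch.tensor([0.02, -0.01, 0.03]))
hypotheses = [torch.zeros(3), torch.tensor([0.1, 0.1, 0.1]), torch.tensor([-0.1, 0.0, 0.1])]
refined, residuals = zip(*(refine(h, measured) for h in hypotheses))
belief = torch.softmax(-torch.tensor(residuals) / 1e-4, dim=0)
```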
ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots
Xu, Zhixuan, Gao, Chongkai, Liu, Zixuan, Yang, Gang, Tie, Chenrui, Zheng, Haozhuo, Zhou, Haoyu, Peng, Weikun, Wang, Debang, Chen, Tianyi, Yu, Zhouliang, Shao, Lin
To substantially enhance robot intelligence, there is a pressing need to develop a large model that enables general-purpose robots to proficiently undertake a broad spectrum of manipulation tasks, akin to the versatile task-planning ability exhibited by LLMs. The vast diversity of objects, robots, and manipulation tasks presents significant challenges. Our work introduces a comprehensive framework for developing a foundation model for general robotic manipulation that formalizes a manipulation task as contact synthesis. Specifically, our model takes as input object and robot manipulator point clouds, object physical attributes, target motions, and manipulation region masks. It outputs contact points on the object and associated contact forces or post-contact motions for robots to achieve the desired manipulation task. We perform extensive experiments in both simulation and real-world settings, manipulating articulated rigid objects, rigid objects, and deformable objects that vary in dimensionality, ranging from one-dimensional objects like ropes to two-dimensional objects like cloth and extending to three-dimensional objects such as plasticine. Our model achieves average success rates of around 90%. Supplementary materials and videos are available on our project website at https://manifoundationmodel.github.io/.
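A schematic of the model interface as described in the abstract; the field names and shapes below are illustrative assumptions, not the released API.

```python
# Sketch of the contact-synthesis interface: point clouds, attributes, target
# motions, and region masks in; contact points and forces/motions out.
from dataclasses import dataclass
import numpy as np

@dataclass
class ManipulationQuery:
    object_points: np.ndarray    # (N, 3) object point cloud
    robot_points: np.ndarray     # (M, 3) manipulator point cloud
    physical_attrs: dict         # e.g. {"material": "rigid", "friction": 0.8}
    target_motion: np.ndarray    # desired object motion, e.g. (T, 6) twist trajectory
    region_mask: np.ndarray      # (N,) boolean mask of allowed contact regions

@dataclass
class ContactSynthesisOutput:
    contact_points: np.ndarray   # (K, 3) contact locations on the object surface
    contact_forces: np.ndarray   # (K, 3) contact forces, or post-contact motions
```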
SoftMAC: Differentiable Soft Body Simulation with Forecast-based Contact Model and Two-way Coupling with Articulated Rigid Bodies and Clothes
Liu, Min, Yang, Gang, Luo, Siyuan, Yu, Chen, Shao, Lin
Differentiable physics simulation provides an avenue for tackling previously intractable challenges through gradient-based optimization, thereby greatly improving the efficiency of solving robotics-related problems. To apply differentiable simulation in diverse robotic manipulation scenarios, a key challenge is to integrate various materials in a unified framework. We present SoftMAC, a differentiable simulation framework coupling soft bodies with articulated rigid bodies and clothes. SoftMAC simulates soft bodies with the continuum-mechanics-based Material Point Method (MPM). We provide a forecast-based contact model for MPM, which greatly reduces artifacts such as penetration and unnatural rebound. To couple MPM particles with deformable, non-volumetric cloth meshes, we also propose a penetration tracing algorithm that reconstructs the signed distance field in the local area. Based on the simulators for each modality and the contact model, we develop a differentiable coupling mechanism to simulate the interactions between soft bodies and the other two types of materials. Comprehensive experiments are conducted to validate the effectiveness and accuracy of the proposed differentiable pipeline in downstream robotic manipulation applications. Supplementary materials and videos are available on our project website at https://sites.google.com/view/softmac.
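An illustrative sketch of what a forecast-based contact check can look like, not SoftMAC's actual solver: each particle is advanced one step, and if its forecast position would penetrate an obstacle's signed distance field, the approaching normal velocity is removed so the particle neither penetrates nor rebounds unnaturally. The sphere SDF and step size are assumptions for the example.

```python
import numpy as np

def sphere_sdf(p, center=np.zeros(3), radius=0.5):
    # Signed distance to a sphere (negative inside).
    return np.linalg.norm(p - center) - radius

def sdf_normal(p, sdf, eps=1e-5):
    # Finite-difference gradient of the SDF, normalized to a unit normal.
    grad = np.array([sdf(p + eps * np.eye(3)[i]) - sdf(p - eps * np.eye(3)[i])
                     for i in range(3)]) / (2 * eps)
    return grad / (np.linalg.norm(grad) + 1e-12)

def forecast_contact(x, v, dt, sdf):
    x_forecast = x + dt * v                  # predicted particle position next step
    if sdf(x_forecast) < 0.0:                # forecast penetrates the obstacle
        n = sdf_normal(x_forecast, sdf)
        v_n = np.dot(v, n)
        if v_n < 0.0:
            v = v - v_n * n                  # cancel only the approaching normal part
    return v

v = forecast_contact(np.array([0.0, 0.6, 0.0]), np.array([0.0, -2.0, 0.0]), 0.1, sphere_sdf)
```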
Jade: A Differentiable Physics Engine for Articulated Rigid Bodies with Intersection-Free Frictional Contact
Yang, Gang, Luo, Siyuan, Shao, Lin
We present Jade, a differentiable physics engine for articulated rigid bodies with intersection-free frictional contact. Compared to existing differentiable simulations, Jade offers features including intersection-free collision simulation and stable LCP solutions for multiple frictional contacts. We use continuous collision detection to detect the time of impact and adopt a backtracking strategy to prevent intersection between bodies with complex geometric shapes. We derive the gradient calculation to ensure the whole simulation process is differentiable under the backtracking mechanism. We modify the popular Dantzig algorithm to obtain valid solutions under multiple frictional contacts. We conduct extensive experiments to demonstrate the effectiveness of our differentiable physics simulation on a variety of contact-rich tasks.
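A simplified sketch of the backtracking idea, not Jade's implementation: when a full step would cause interpenetration, the step is bisected to find the latest collision-free substep, so integration resumes from just before the time of impact. The `step` and `in_collision` callbacks are assumed user-supplied hooks.

```python
def backtrack_to_impact(state, step, in_collision, dt, iters=32):
    """state: current collision-free state; step(state, h) advances by h;
    in_collision(candidate) reports interpenetration for a candidate state."""
    candidate = step(state, dt)
    if not in_collision(candidate):
        return candidate, dt                 # the full step is already safe
    lo, hi = 0.0, dt                         # lo is collision-free, hi penetrates
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if in_collision(step(state, mid)):
            hi = mid
        else:
            lo = mid
    return step(state, lo), lo               # advance to just before impact
```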
Inverse design of nano-photonic wavelength demultiplexer with a deep neural network approach
Yuan, Mengwei, Yang, Gang, Song, Shijie, Zhou, Luping, Minasian, Robert, Yi, Xiaoke
In this paper, we propose a pre-trained-combined neural network (PTCN) as a comprehensive solution to the inverse design of an integrated photonic circuit. By jointly training the initially pre-trained inverse and forward models, our PTCN shows remarkable tolerance to the quantity and quality of the training data. As a proof-of-concept demonstration, the inverse design of a wavelength demultiplexer is used to verify the effectiveness of the PTCN model. The correlation coefficient of the predictions by the presented PTCN model remains greater than 0.974 even when the size of the training data is decreased to 17%. The experimental results show good agreement with the predictions and demonstrate a wavelength demultiplexer with an ultra-compact footprint, high transmission efficiency with a transmission loss of -2 dB, low reflection of -10 dB, and low crosstalk of around -7 dB simultaneously.
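A minimal sketch of the joint-training idea behind a pre-trained inverse/forward pair; the layer sizes, dimensions, and loss weights are illustrative assumptions, not the paper's settings. The inverse network proposes a design from a target spectrum, the forward network re-predicts the spectrum, and both are refined so the re-prediction matches the target.

```python
import torch
import torch.nn as nn

spectrum_dim, design_dim = 64, 20

# Both networks would first be pre-trained separately on simulated data.
forward_net = nn.Sequential(nn.Linear(design_dim, 128), nn.ReLU(), nn.Linear(128, spectrum_dim))
inverse_net = nn.Sequential(nn.Linear(spectrum_dim, 128), nn.ReLU(), nn.Linear(128, design_dim))

opt = torch.optim.Adam(list(forward_net.parameters()) + list(inverse_net.parameters()), lr=1e-3)
mse = nn.MSELoss()

def joint_step(designs, spectra):
    # Joint objective: spectrum reconstruction plus a weaker design-consistency term.
    opt.zero_grad()
    pred_design = inverse_net(spectra)            # inverse: spectrum -> design
    recon_spectrum = forward_net(pred_design)     # forward re-prediction of the spectrum
    loss = mse(recon_spectrum, spectra) + 0.1 * mse(pred_design, designs)
    loss.backward()
    opt.step()
    return loss.item()

loss = joint_step(torch.randn(8, design_dim), torch.randn(8, spectrum_dim))
```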
SEA: Sentence Encoder Assembly for Video Retrieval by Textual Queries
Li, Xirong, Zhou, Fangming, Xu, Chaoxi, Ji, Jiaqi, Yang, Gang
Retrieving unlabeled videos by textual queries, known as Ad-hoc Video Search (AVS), is a core theme in multimedia data management and retrieval. The success of AVS hinges on cross-modal representation learning that encodes both query sentences and videos into common spaces for semantic similarity computation. Inspired by the initial success of a few prior works in combining multiple sentence encoders, this paper takes a step forward by developing a new and general method for effectively exploiting diverse sentence encoders. The novelty of the proposed method, which we term Sentence Encoder Assembly (SEA), is two-fold. First, different from prior art that uses only a single common space, SEA supports text-video matching in multiple encoder-specific common spaces. This property prevents the matching from being dominated by a specific encoder that produces an encoding vector much longer than those of the other encoders. Second, to explore complementarities among the individual common spaces, we propose multi-space multi-loss learning. As extensive experiments on four benchmarks (MSR-VTT, TRECVID AVS 2016-2019, TGIF and MSVD) show, SEA surpasses the state-of-the-art. In addition, SEA is extremely easy to implement. All this makes SEA an appealing solution for AVS and promising for continuously advancing the task by harvesting new sentence encoders.
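A compact sketch of the multi-space idea, not SEA's exact configuration: each sentence encoder gets its own common space, the video feature is projected into every space, and training would sum a matching loss per space so no single encoder's space dominates. Dimensions and the fusion rule below are assumptions.

```python
import torch
import torch.nn as nn

class SEASketch(nn.Module):
    def __init__(self, video_dim, text_dims, common_dim=512):
        super().__init__()
        # One encoder-specific common space per sentence encoder.
        self.text_proj = nn.ModuleList([nn.Linear(d, common_dim) for d in text_dims])
        self.video_proj = nn.ModuleList([nn.Linear(video_dim, common_dim) for _ in text_dims])

    def similarities(self, video_feat, text_feats):
        sims = []
        for tp, vp, t in zip(self.text_proj, self.video_proj, text_feats):
            q = nn.functional.normalize(tp(t), dim=-1)
            v = nn.functional.normalize(vp(video_feat), dim=-1)
            sims.append((q * v).sum(-1))          # cosine similarity in this space
        return torch.stack(sims, dim=0)           # (num_spaces, batch)

    def forward(self, video_feat, text_feats):
        sims = self.similarities(video_feat, text_feats)
        # Fused retrieval score plus per-space scores; a multi-loss objective
        # would apply a ranking loss to each row of `sims` separately.
        return sims.mean(0), sims

model = SEASketch(video_dim=2048, text_dims=[300, 768, 1024])
score, per_space = model(torch.randn(4, 2048), [torch.randn(4, d) for d in (300, 768, 1024)])
```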