AITopics | Yang, Ge

Collaborating Authors

Yang, Ge

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Toward Efficient Deep Spiking Neuron Networks:A Survey On Compression

Xie, Hui, Yang, Ge, Gao, Wenjuan

arXiv.org Artificial IntelligenceJun-3-2024

With the rapid development of deep learning, Deep Spiking Neural Networks (DSNNs) have emerged as promising due to their unique spike event processing and asynchronous computation. When deployed on neuromorphic chips, DSNNs offer significant power advantages over Deep Artificial Neural Networks (DANNs) and eliminate time and energy consuming multiplications due to the binary nature of spikes (0 or 1). Additionally, DSNNs excel in processing temporal information, making them potentially superior for handling temporal data compared to DANNs. However, their deep network structure and numerous parameters result in high computational costs and energy consumption, limiting real-life deployment. To enhance DSNNs efficiency, researchers have adapted methods from DANNs, such as pruning, quantization, and knowledge distillation, and developed specific techniques like reducing spike firing and pruning time steps. While previous surveys have covered DSNNs algorithms, hardware deployment, and general overviews, focused research on DSNNs compression and efficiency has been lacking. This survey addresses this gap by concentrating on efficient DSNNs and their compression methods. It begins with an exploration of DSNNs' biological background and computational units, highlighting differences from DANNs. It then delves into various compression methods, including pruning, quantization, knowledge distillation, and reducing spike firing, and concludes with suggestions for future research directions.

artificial intelligence, machine learning, neural network, (14 more...)

arXiv.org Artificial Intelligence

2407.08744

Country: Asia > China (0.14)

Genre:

Research Report (0.50)
Overview (0.48)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Feature Splatting: Language-Driven Physics-Based Scene Synthesis and Editing

Qiu, Ri-Zhao, Yang, Ge, Zeng, Weijia, Wang, Xiaolong

arXiv.org Artificial IntelligenceApr-1-2024

Scene representations using 3D Gaussian primitives have produced excellent results in modeling the appearance of static and dynamic 3D scenes. Many graphics applications, however, demand the ability to manipulate both the appearance and the physical properties of objects. We introduce Feature Splatting, an approach that unifies physics-based dynamic scene synthesis with rich semantics from vision language foundation models that are grounded by natural language. Our first contribution is a way to distill high-quality, object-centric vision-language features into 3D Gaussians, that enables semi-automatic scene decomposition using text queries. Our second contribution is a way to synthesize physics-based dynamics from an otherwise static scene using a particle-based simulator, in which material properties are assigned automatically via text queries. We ablate key techniques used in this pipeline, to illustrate the challenge and opportunities in using feature-carrying 3D Gaussians as a unified format for appearance, geometry, material properties and semantics grounded on natural language. Project website: https://feature-splatting.github.io/

artificial intelligence, gaussian, natural language, (13 more...)

arXiv.org Artificial Intelligence

2404.01223

Country: North America > United States > California (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.82)

Add feedback

Learning Generalizable Feature Fields for Mobile Manipulation

Qiu, Ri-Zhao, Hu, Yafei, Yang, Ge, Song, Yuchen, Fu, Yang, Ye, Jianglong, Mu, Jiteng, Yang, Ruihan, Atanasov, Nikolay, Scherer, Sebastian, Wang, Xiaolong

arXiv.org Artificial IntelligenceMar-12-2024

An open problem in mobile manipulation is how to represent objects and scenes in a unified manner, so that robots can use it both for navigating in the environment and manipulating objects. The latter requires capturing intricate geometry while understanding fine-grained semantics, whereas the former involves capturing the complexity inherit to an expansive physical scale. In this work, we present GeFF (Generalizable Feature Fields), a scene-level generalizable neural feature field that acts as a unified representation for both navigation and manipulation that performs in real-time. To do so, we treat generative novel view synthesis as a pre-training task, and then align the resulting rich scene priors with natural language via CLIP feature distillation. We demonstrate the effectiveness of this approach by deploying GeFF on a quadrupedal robot equipped with a manipulator. We evaluate GeFF's ability to generalize to open-set objects as well as running time, when performing open-vocabulary mobile manipulation in dynamic scenes.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2403.07563

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Add feedback

Expressive Whole-Body Control for Humanoid Robots

Cheng, Xuxin, Ji, Yandong, Chen, Junming, Yang, Ruihan, Yang, Ge, Wang, Xiaolong

arXiv.org Artificial IntelligenceMar-5-2024

Can we enable humanoid robots to generate rich, diverse, and expressive motions in the real world? We propose to learn a whole-body control policy on a human-sized robot to mimic human motions as realistic as possible. To train such a policy, we leverage the large-scale human motion capture data from the graphics community in a Reinforcement Learning framework. However, directly performing imitation learning with the motion capture dataset would not work on the real humanoid robot, given the large gap in degrees of freedom and physical capabilities. Our method Expressive Whole-Body Control (Exbody) tackles this problem by encouraging the upper humanoid body to imitate a reference motion, while relaxing the imitation constraint on its two legs and only requiring them to follow a given velocity robustly. With training in simulation and Sim2Real transfer, our policy can control a humanoid robot to walk in different styles, shake hands with humans, and even dance with a human in the real world. We conduct extensive studies and comparisons on diverse motions in both simulation and the real world to show the effectiveness of our approach.

artificial intelligence, conference, robot, (12 more...)

arXiv.org Artificial Intelligence

2402.16796

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.69)

Add feedback

Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation

Shen, William, Yang, Ge, Yu, Alan, Wong, Jansen, Kaelbling, Leslie Pack, Isola, Phillip

arXiv.org Artificial IntelligenceDec-29-2023

Self-supervised and language-supervised image models contain rich knowledge of the world that is important for generalization. Many robotic tasks, however, require a detailed understanding of 3D geometry, which is often lacking in 2D image features. This work bridges this 2D-to-3D gap for robotic manipulation by leveraging distilled feature fields to combine accurate 3D geometry with rich semantics from 2D foundation models. We present a few-shot learning method for 6-DOF grasping and placing that harnesses these strong spatial and semantic priors to achieve in-the-wild generalization to unseen objects. Using features distilled from a vision-language model, CLIP, we present a way to designate novel objects for manipulation via free-text natural language, and demonstrate its ability to generalize to unseen expressions and novel categories of objects.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2308.07931

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Compositional Sculpting of Iterative Generative Processes

Garipov, Timur, De Peuter, Sebastiaan, Yang, Ge, Garg, Vikas, Kaski, Samuel, Jaakkola, Tommi

arXiv.org Artificial IntelligenceSep-27-2023

High training costs of generative models and the need to fine-tune them for specific tasks have created a strong interest in model reuse and composition. A key challenge in composing iterative generative processes, such as GFlowNets and diffusion models, is that to realize the desired target distribution, all steps of the generative process need to be coordinated, and satisfy delicate balance conditions. In this work, we propose Compositional Sculpting: a general approach for defining compositions of iterative generative processes. We then introduce a method for sampling from these compositions built on classifier guidance. We showcase ways to accomplish compositional sculpting in both GFlowNets and diffusion models. We highlight two binary operations $\unicode{x2014}$ the harmonic mean ($p_1 \otimes p_2$) and the contrast ($p_1 \unicode{x25D1}\,p_2$) between pairs, and the generalization of these operations to multiple component distributions. We offer empirical results on image and molecular generation tasks.

classifier, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2309.16115

Country:

Europe (0.28)
North America > United States (0.28)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Add feedback

Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning

Zhu, Kaijie, Wang, Jindong, Hu, Xixu, Xie, Xing, Yang, Ge

arXiv.org Artificial IntelligenceAug-1-2023

Deep neural networks are susceptible to adversarial examples, posing a significant security risk in critical applications. Adversarial Training (AT) is a well-established technique to enhance adversarial robustness, but it often comes at the cost of decreased generalization ability. This paper proposes Robustness Critical Fine-Tuning (RiFT), a novel approach to enhance generalization without compromising adversarial robustness. The core idea of RiFT is to exploit the redundant capacity for robustness by fine-tuning the adversarially trained model on its non-robust-critical module. To do so, we introduce module robust criticality (MRC), a measure that evaluates the significance of a given module to model robustness under worst-case weight perturbations. Using this measure, we identify the module with the lowest MRC value as the non-robust-critical module and fine-tune its weights to obtain fine-tuned weights. Subsequently, we linearly interpolate between the adversarially trained weights and fine-tuned weights to derive the optimal fine-tuned model weights. We demonstrate the efficacy of RiFT on ResNet18, ResNet34, and WideResNet34-10 models trained on CIFAR10, CIFAR100, and Tiny-ImageNet datasets. Our experiments show that \method can significantly improve both generalization and out-of-distribution robustness by around 1.5% while maintaining or even slightly enhancing adversarial robustness. Code is available at https://github.com/microsoft/robustlearn.

artificial intelligence, machine learning, robustness, (18 more...)

arXiv.org Artificial Intelligence

2308.02533

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Bilinear value networks

Hong, Zhang-Wei, Yang, Ge, Agrawal, Pulkit

arXiv.org Artificial IntelligenceJun-26-2023

The dominant framework for off-policy multi-goal reinforcement learning involves estimating goal conditioned Q-value function. When learning to achieve multiple goals, data efficiency is intimately connected with the generalization of the Q-function to new goals. The de-facto paradigm is to approximate Q(s, a, g) using monolithic neural networks. To improve the generalization of the Q-function, we propose a bilinear decomposition that represents the Q-value via a low-rank approximation in the form of a dot product between two vector fields. The first vector field, f(s, a), captures the environment's local dynamics at the state s; whereas the second component, {\phi}(s, g), captures the global relationship between the current state and the goal. We show that our bilinear decomposition scheme substantially improves data efficiency, and has superior transfer to out-of-distribution goals compared to prior methods. Empirical evidence is provided on the simulated Fetch robot task-suite and dexterous manipulation with a Shadow hand.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2204.13695

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Neural Volumetric Memory for Visual Locomotion Control

Yang, Ruihan, Yang, Ge, Wang, Xiaolong

arXiv.org Artificial IntelligenceApr-3-2023

The control of such behaviors requires tight coupling with perception because vision is needed to provide details of Legged robots have the potential to expand the reach of the terrain right beneath the robot and the 3D scene immediately autonomy beyond paved roads. In this work, we consider around it. This problem is also partially-observable. the difficult problem of locomotion on challenging terrains Immediately relevant terrain information is often occluded using a single forward-facing depth camera. Due to the from the robot's current frame of observation, forcing it to partial observability of the problem, the robot has to rely rely on past observations for control decisions. For this reason, on past observations to infer the terrain currently beneath while blind controllers that are learned in simulation it. To solve this problem, we follow the paradigm in computer using reinforcement learning have achieved impressive results vision that explicitly models the 3D geometry of the in agility and robustness [33, 36, 38], there are clear scene and propose Neural Volumetric Memory (NVM), a geometric limitations on how much they can do. How to incorporate memory architecture that explicitly accounts for the perception into the pipeline to produce an integrated visuomotor SE(3) equivariance of the 3D world. NVM aggregates feature controller thus remains an open problem.

artificial intelligence, robot, visual observation, (14 more...)

arXiv.org Artificial Intelligence

2304.01201

Country:

Europe (1.00)
North America > United States (0.68)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (0.89)

Add feedback

Overcoming the Spectral Bias of Neural Value Approximation

Yang, Ge, Ajay, Anurag, Agrawal, Pulkit

arXiv.org Machine LearningJun-9-2022

Value approximation using deep neural networks is at the heart of off-policy deep reinforcement learning, and is often the primary module that provides learning signals to the rest of the algorithm. While multi-layer perceptron networks are universal function approximators, recent works in neural kernel regression suggest the presence of a spectral bias, where fitting high-frequency components of the value function requires exponentially more gradient update steps than the low-frequency ones. In this work, we re-examine off-policy reinforcement learning through the lens of kernel regression and propose to overcome such bias via a composite neural tangent kernel. With just a single line-change, our approach, the Fourier feature networks (FFN) produce state-of-the-art performance on challenging continuous control domains with only a fraction of the compute. Faster convergence and better off-policy stability also make it possible to remove the target network without suffering catastrophic divergences, which further reduces TD}(0)'s estimation bias on a few tasks.

artificial intelligence, machine learning, neural value approximation, (2 more...)

arXiv.org Machine Learning

2206.04672

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.53)

Add feedback