AITopics | Lu, Peng

Collaborating Authors

Lu, Peng

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Resona: Improving Context Copying in Linear Recurrence Models with Retrieval

Wang, Xinyu, Ma, Linrui, Huang, Jerry, Lu, Peng, Parthasarathi, Prasanna, Chang, Xiao-Wen, Chen, Boxing, Cui, Yufei

arXiv.org Artificial IntelligenceMar-28-2025

Recent shifts in the space of large language model (LLM) research have shown an increasing focus on novel architectures to compete with prototypical Transformer-based models that have long dominated this space. Linear recurrent models have proven to be a viable competitor due to their computational efficiency. However, such models still demonstrate a sizable gap compared to Transformers in terms of in-context learning among other tasks that require recalling information from a context. In this work, we introduce __Resona__, a simple and scalable framework for augmenting linear recurrent models with retrieval. __Resona__~augments models with the ability to integrate retrieved information from the provided input context, enabling tailored behavior to diverse task requirements. Experiments on a variety of linear recurrent models demonstrate that __Resona__-augmented models observe significant performance gains on a variety of synthetic as well as real-world natural language tasks, highlighting its ability to act as a general purpose method to improve the in-context learning and language modeling abilities of linear recurrent LLMs.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.22913

Country:

North America > United States (1.00)
Asia (0.93)
Europe > Austria > Vienna (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Flying in Highly Dynamic Environments with End-to-end Learning Approach

Fan, Xiyu, Lu, Minghao, Xu, Bowen, Lu, Peng

arXiv.org Artificial IntelligenceMar-18-2025

Obstacle avoidance for unmanned aerial vehicles like quadrotors is a popular research topic. Most existing research focuses only on static environments, and obstacle avoidance in environments with multiple dynamic obstacles remains challenging. This paper proposes a novel deep-reinforcement learning-based approach for the quadrotors to navigate through highly dynamic environments. We propose a lidar data encoder to extract obstacle information from the massive point cloud data from the lidar. Multi frames of historical scans will be compressed into a 2-dimension obstacle map while maintaining the obstacle features required. An end-to-end deep neural network is trained to extract the kinematics of dynamic and static obstacles from the obstacle map, and it will generate acceleration commands to the quadrotor to control it to avoid these obstacles. Our approach contains perception and navigating functions in a single neural network, which can change from a navigating state into a hovering state without mode switching. We also present simulations and real-world experiments to show the effectiveness of our approach while navigating in highly dynamic cluttered environments.

dynamic obstacle, obstacle, quadrotor, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2025.3547306

2503.14352

Country: Asia > China (0.14)

Industry:

Transportation > Air (0.93)
Information Technology (0.88)
Aerospace & Defense > Aircraft (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

ZETA: Leveraging Z-order Curves for Efficient Top-k Attention

Zeng, Qiuhao, Huang, Jerry, Lu, Peng, Xu, Gezheng, Chen, Boxing, Ling, Charles, Wang, Boyu

arXiv.org Artificial IntelligenceFeb-12-2025

Over recent years, the Transformer has become a fundamental building block for sequence modeling architectures. Yet at its core is the use of self-attention, whose memory and computational cost grow quadratically with the sequence length $N$, rendering it prohibitively expensive for long sequences. A promising approach is top-$k$ attention, which selects only the $k$ most relevant tokens and achieves performance comparable to vanilla self-attention while significantly reducing space and computational demands. However, causal masks require the current query token to only attend to past tokens, preventing the existing top-$k$ attention method from efficiently searching for the most relevant tokens in parallel, thereby limiting training efficiency. In this work, we propose ZETA, leveraging \textbf{Z}-Order Curves for \textbf{E}fficient \textbf{T}op-$k$ \textbf{A}ttention, to enable parallel querying of past tokens for entire sequences. % in both space and time complexity of $\mathcal{O}(N \log N)$. We first theoretically show that the choice of key and query dimensions involves a trade-off between the curse of dimensionality and the preservation of relative distances after projection. In light of this insight, we propose reducing the dimensionality of keys and queries in contrast to values and further leverage $Z$-order curves to map low-dimensional keys and queries into \emph{one}-dimensional space, which permits parallel sorting, thereby largely improving the efficiency for top-$k$ token selection. Experimental results demonstrate that ZETA matches the performance of standard attention on the synthetic \textsc{Multi-Query Associative Recall} task and outperforms attention and its variants on \textsc{Long Range Arena} and \textsc{WikiText-103} language modeling.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2501.14577

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

ReGLA: Refining Gated Linear Attention

Lu, Peng, Kobyzev, Ivan, Rezagholizadeh, Mehdi, Chen, Boxing, Langlais, Philippe

arXiv.org Artificial IntelligenceFeb-5-2025

Recent advancements in Large Language Models (LLMs) have set themselves apart with their exceptional performance in complex language modelling tasks. However, these models are also known for their significant computational and storage requirements, primarily due to the quadratic computation complexity of softmax attention. To mitigate this issue, linear attention has been designed to reduce the quadratic space-time complexity that is inherent in standard transformers. In this work, we embarked on a comprehensive exploration of three key components that substantially impact the performance of the Gated Linear Attention module: feature maps, normalization, and the gating mechanism. We developed a feature mapping function to address some crucial issues that previous suggestions overlooked. Then we offered further rationale for the integration of normalization layers to stabilize the training process. Moreover, we explored the saturation phenomenon of the gating mechanism and augmented it with a refining module. We conducted extensive experiments and showed our architecture outperforms previous Gated Linear Attention mechanisms in extensive tasks including training from scratch and post-linearization with continual pre-training.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.01578

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Hawaii (0.14)
Europe > Austria > Vienna (0.14)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception

Wan, Zhaoliang, Ling, Yonggen, Yi, Senlin, Qi, Lu, Lee, Wangwei, Lu, Minglei, Yang, Sicheng, Teng, Xiao, Lu, Peng, Yang, Xu, Yang, Ming-Hsuan, Cheng, Hui

arXiv.org Artificial IntelligenceJan-6-2025

This paper addresses the scarcity of large-scale datasets for accurate object-in-hand pose estimation, which is crucial for robotic in-hand manipulation within the ``Perception-Planning-Control" paradigm. Specifically, we introduce VinT-6D, the first extensive multi-modal dataset integrating vision, touch, and proprioception, to enhance robotic manipulation. VinT-6D comprises 2 million VinT-Sim and 0.1 million VinT-Real splits, collected via simulations in MuJoCo and Blender and a custom-designed real-world platform. This dataset is tailored for robotic hands, offering models with whole-hand tactile perception and high-quality, well-aligned data. To the best of our knowledge, the VinT-Real is the largest considering the collection difficulties in the real-world environment so that it can bridge the gap of simulation to real compared to the previous works. Built upon VinT-6D, we present a benchmark method that shows significant improvements in performance by fusing multi-modal information. The project is available at https://VinT-6D.github.io/.

artificial intelligence, large-scale object-in-hand dataset, sensor, (14 more...)

arXiv.org Artificial Intelligence

2501.0051

Country:

North America > United States > California (0.14)
Europe > Austria > Vienna (0.14)
Asia > China > Guangdong Province (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)

Add feedback

Learning an Adaptive Fall Recovery Controller for Quadrupeds on Complex Terrains

Lu, Yidan, Dong, Yinzhao, Ma, Ji, Zhang, Jiahui, Lu, Peng

arXiv.org Artificial IntelligenceDec-22-2024

Legged robots have made significant strides in locomotion However, in extreme or complex natural environments, capabilities, demonstrating impressive performance in robots still face the inevitability of falling. A major challenge tasks such as dynamic walking, running, and even complex in current research lies in developing adaptive controllers maneuvers like backflips [8], [2]. However, the ability to for robots to effectively recover from falls, allowing them recover from falls, especially on challenging and unpredictable to resume movement or efficiently complete tasks. However, terrains, remains a critical challenge in the field of legged model-based methods are often inadequate for these dynamic robotics. While substantial progress has been made in recovery tasks. For example, Mordatch et al. [12] proposed a framework strategies for flat or moderately uneven surfaces [7], [13], that optimizes automatic recovery through contact invariance, the problem of robust recovery on highly irregular terrains - but the reliance on predefined potential contact points limits such as rocky landscapes, steep inclines, or complex gaps - the exploration of flexible behaviors. In addition, classical has received limited attention.

artificial intelligence, machine learning, robot, (10 more...)

arXiv.org Artificial Intelligence

2412.16924

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

DEIO: Deep Event Inertial Odometry

Guan, Weipeng, Lin, Fuling, Chen, Peiyu, Lu, Peng

arXiv.org Artificial IntelligenceNov-12-2024

Event cameras are bio-inspired, motion-activated sensors that demonstrate impressive potential in handling challenging situations, such as motion blur and high-dynamic range. Despite their promise, existing event-based simultaneous localization and mapping (SLAM) approaches exhibit limited performance in real-world applications. On the other hand, state-of-the-art SLAM approaches that incorporate deep neural networks for better robustness and applicability. However, these is a lack of research in fusing learning-based event SLAM methods with IMU, which could be indispensable to push the event-based SLAM to large-scale, low-texture or complex scenarios. In this paper, we propose DEIO, the first monocular deep event-inertial odometry framework that combines learning-based method with traditional nonlinear graph-based optimization. Specifically, we tightly integrate a trainable event-based differentiable bundle adjustment (e-DBA) with the IMU pre-integration in a factor graph which employs keyframe-based sliding window optimization. Numerical Experiments in nine public challenge datasets show that our method can achieve superior performance compared with the image-based and event-based benchmarks. The source code is available at: https://github.com/arclab-hku/DEIO.

artificial intelligence, machine learning, trajectory, (20 more...)

arXiv.org Artificial Intelligence

2411.03928

Country:

Europe > Switzerland (0.14)
Asia > China (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

LVI-GS: Tightly-coupled LiDAR-Visual-Inertial SLAM using 3D Gaussian Splatting

Zhao, Huibin, Guan, Weipeng, Lu, Peng

arXiv.org Artificial IntelligenceNov-4-2024

3D Gaussian Splatting (3DGS) has shown its ability in rapid rendering and high-fidelity mapping. In this paper, we introduce LVI-GS, a tightly-coupled LiDAR-Visual-Inertial mapping framework with 3DGS, which leverages the complementary characteristics of LiDAR and image sensors to capture both geometric structures and visual details of 3D scenes. To this end, the 3D Gaussians are initialized from colourized LiDAR points and optimized using differentiable rendering. In order to achieve high-fidelity mapping, we introduce a pyramid-based training approach to effectively learn multi-level features and incorporate depth loss derived from LiDAR measurements to improve geometric feature perception. Through well-designed strategies for Gaussian-Map expansion, keyframe selection, thread management, and custom CUDA acceleration, our framework achieves real-time photo-realistic mapping. Numerical experiments are performed to evaluate the superior performance of our method compared to state-of-the-art 3D reconstruction systems.

artificial intelligence, gaussian, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2411.02703

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity

Metel, Michael R., Lu, Peng, Chen, Boxing, Rezagholizadeh, Mehdi, Kobyzev, Ivan

arXiv.org Artificial IntelligenceOct-1-2024

We present a simple on the fly method for faster inference of large language models. Unlike other (self-)speculative decoding techniques, our method does not require fine-tuning or black-box optimization to generate a fixed draft model, relying instead on simple rules to generate varying draft models adapted to the input context. We show empirically that our light-weight algorithm is competitive with the current SOTA for self-speculative decoding, while being a truly plug-and-play method.

draft model, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.01028

Country: North America > Canada (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

SEGAN: semi-supervised learning approach for missing data imputation

Pan, Xiaohua, Wu, Weifeng, Liu, Peiran, Li, Zhen, Lu, Peng, Cao, Peijian, Zhang, Jianfeng, Qiu, Xianfei, Wu, YangYang

arXiv.org Artificial IntelligenceJun-12-2024

In many practical real-world applications, data missing is a very common phenomenon, making the development of data-driven artificial intelligence theory and technology increasingly difficult. Data completion is an important method for missing data preprocessing. Most existing miss-ing data completion models directly use the known information in the missing data set but ignore the impact of the data label information contained in the data set on the missing data completion model. To this end, this paper proposes a missing data completion model SEGAN based on semi-supervised learning, which mainly includes three important modules: generator, discriminator and classifier. In the SEGAN model, the classifier enables the generator to make more full use of known data and its label information when predicting missing data values. In addition, the SE-GAN model introduces a missing hint matrix to allow the discriminator to more effectively distinguish between known data and data filled by the generator. This paper theoretically proves that the SEGAN model that introduces a classifier and a missing hint matrix can learn the real known data distribution characteristics when reaching Nash equilibrium. Finally, a large number of experiments were conducted in this article, and the experimental results show that com-pared with the current state-of-the-art multivariate data completion method, the performance of the SEGAN model is improved by more than 3%.

artificial intelligence, data quality, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2405.13089

Country:

Asia > China (1.00)
North America > United States > Indiana > Tippecanoe County (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area (0.94)
Government > Regional Government > North America Government (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.86)
(2 more...)

Add feedback