Collaborating Authors

 Ai, Bo


Learning Adaptive Dexterous Grasping from Single Demonstrations

arXiv.org Artificial Intelligence

How can robots learn dexterous grasping skills efficiently and apply them adaptively based on user instructions? This work tackles two key challenges: efficient skill acquisition from limited human demonstrations and context-driven skill selection. We introduce AdaDexGrasp, a framework that learns a library of grasping skills from a single human demonstration per skill and selects the most suitable one using a vision-language model (VLM). To improve sample efficiency, we propose a trajectory following reward that guides reinforcement learning (RL) toward states close to a human demonstration while allowing flexibility in exploration. To learn beyond the single demonstration, we employ curriculum learning, progressively increasing object pose variations to enhance robustness. At deployment, a VLM retrieves the appropriate skill based on user instructions, bridging low-level learned skills with high-level intent. We evaluate AdaDexGrasp in both simulation and real-world settings, showing that our approach significantly improves RL efficiency and enables learning human-like grasp strategies across varied object configurations. Finally, we demonstrate zero-shot transfer of our learned policies to a real-world PSYONIC Ability Hand, with a 90% success rate across objects, significantly outperforming the baseline.
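The trajectory-following reward can be pictured as a soft penalty on the distance to the nearest demonstration state: the reward is high near the demonstrated trajectory but decays smoothly, leaving room for RL exploration. The sketch below is a minimal illustration under assumptions of my own (a Euclidean state metric and a Gaussian kernel with width `sigma`), not the paper's exact formulation.

```python
import numpy as np

def trajectory_following_reward(state, demo_traj, sigma=0.1):
    """Reward that peaks when the current state is close to the nearest
    demonstration state and decays smoothly with distance, so the RL
    agent is guided toward the demonstration without being pinned to it.

    state: (d,) current state; demo_traj: (T, d) single demonstration.
    """
    dists = np.linalg.norm(demo_traj - state, axis=1)
    return float(np.exp(-np.min(dists) ** 2 / (2 * sigma ** 2)))

# Toy usage: a straight-line demonstration in R^3.
demo = np.linspace([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], 50)
r_on = trajectory_following_reward(demo[10], demo)        # on the demo: ~1
r_off = trajectory_following_reward(np.array([5.0, 5.0, 5.0]), demo)  # far away: ~0
```

Because the kernel never saturates to a hard constraint, states slightly off the demonstration still receive informative gradient signal, which is what allows curriculum-driven pose variations to be absorbed.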


Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation

arXiv.org Artificial Intelligence

Our approach integrates state estimation and dynamics modeling under a consistent architecture and training paradigm. Our diffusion-based perception model generates cloth states from partial observations, and the diffusion-based dynamics model generates physically plausible future states conditioned on action sequences, enabling robust model-based control. Our work demonstrates the potential of diffusion models in state estimation and dynamics modeling for manipulation tasks involving partial observability and complex dynamics. Manipulating deformable objects like cloth is challenging due to their complex dynamics, near-infinite degrees of freedom, and frequent self-occlusions, which complicate state estimation and dynamics modeling. Prior work has struggled with robust cloth state estimation, while dynamics models, primarily based on Graph Neural Networks (GNNs), are limited by their locality. Inspired by recent advances in generative models, we hypothesize that these expressive models can effectively capture intricate cloth configurations and deformation patterns, and predict future states given the current state and robot actions. Leveraging a transformer-based diffusion model, our method achieves high-fidelity state reconstruction while reducing long-horizon dynamics prediction errors by an order of magnitude compared to GNN-based approaches. Integrated with model-predictive control (MPC), our framework successfully executes cloth folding on a real robotic system, demonstrating the potential of generative models for manipulation tasks with partial observability and complex dynamics.
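The model-based control loop described here, a learned dynamics model inside MPC, can be sketched with random-shooting planning: sample candidate action sequences, roll each through the dynamics model, and execute the first action of the lowest-cost rollout. The toy linear `dynamics` function below is a stand-in assumption for the paper's learned diffusion dynamics model, which would predict cloth particle states instead.

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamics(state, action):
    # Stand-in for the learned diffusion dynamics model: a toy linear
    # system. The real model predicts physically plausible cloth states.
    return state + 0.1 * action

def mpc_random_shooting(state, goal, horizon=10, n_samples=256):
    """Sample action sequences, roll each out through the dynamics model,
    and return the first action of the lowest-cost rollout."""
    best_cost, best_action = np.inf, None
    for _ in range(n_samples):
        s = state.copy()
        actions = rng.uniform(-1.0, 1.0, size=(horizon, state.shape[0]))
        for a in actions:
            s = dynamics(s, a)
        cost = np.linalg.norm(s - goal)  # terminal distance to the goal state
        if cost < best_cost:
            best_cost, best_action = cost, actions[0]
    return best_action, best_cost

# Toy usage: plan toward a 2-D goal from the origin.
a0, cost = mpc_random_shooting(np.zeros(2), np.array([0.5, -0.5]))
```

In practice the first action is executed, the state is re-estimated (here, by the diffusion-based perception model), and the plan is recomputed at every step.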


A CGAN-LSTM-Based Framework for Time-Varying Non-Stationary Channel Modeling

arXiv.org Artificial Intelligence

Time-varying non-stationary channels, with their complex dynamic variations and temporal evolution characteristics, pose significant challenges for channel modeling and communication system performance evaluation. Most existing time-varying channel modeling methods focus on predicting the channel state at a given moment or simulating short-term channel fluctuations, and are unable to capture the long-term evolution of the channel. This paper emphasizes the generation of long-term dynamic channels to fully capture the evolution of non-stationary channel properties. The generated channel not only reflects temporal dynamics but also ensures consistent stationarity. We propose a hybrid deep learning framework that combines a conditional generative adversarial network (CGAN) with long short-term memory (LSTM) networks. A stationarity-constrained approach is designed to ensure the temporal correlation of the generated time-series channel. This method can generate channels with the required temporal non-stationarity. The model is validated by comparing channel statistical features, and the results show that the generated channel is in good agreement with the raw channel and performs well in terms of non-stationarity.
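One way to quantify the (non-)stationarity the abstract refers to is the local stationarity interval: the span over which the channel's power-delay profile stays strongly correlated with a reference window. The sketch below is an illustrative metric under my own assumptions (window size, correlation threshold, PDP correlation as the similarity measure); the paper's stationarity constraint may be formulated differently.

```python
import numpy as np

def stationarity_interval(h, win=32, thresh=0.9):
    """Estimate the local stationarity interval of a time-varying channel.

    h: complex array of shape (T, L) -- T snapshots of an L-tap channel
    impulse response. Returns the number of consecutive snapshots whose
    windowed power-delay profiles remain correlated (>= thresh) with the
    first window's profile.
    """
    pdps = [np.abs(h[i:i + win]) ** 2
            for i in range(0, len(h) - win + 1, win)]
    pdps = [p.mean(axis=0) for p in pdps]   # average PDP per window
    ref, span = pdps[0], 1
    for p in pdps[1:]:
        if np.corrcoef(ref, p)[0, 1] < thresh:
            break
        span += 1
    return span * win

# Toy usage: a stationary channel vs. one whose PDP flips halfway through.
rng = np.random.default_rng(1)
prof = np.exp(-np.arange(8) / 2.0)          # exponential power-delay profile
h_stat = (rng.standard_normal((256, 8))
          + 1j * rng.standard_normal((256, 8))) * np.sqrt(prof / 2)
h_ns = h_stat.copy()
h_ns[128:] = h_stat[128:, ::-1]             # abrupt PDP reversal at t = 128
```

A generator trained with a stationarity constraint should reproduce the interval statistics of the measured channel, not just its instantaneous distributions.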


COST CA20120 INTERACT Framework of Artificial Intelligence Based Channel Modeling

arXiv.org Artificial Intelligence

Accurate channel models are the prerequisite for communication-theoretic investigations as well as system design. Channel modeling generally relies on statistical and deterministic approaches. However, traditional modeling methods still face significant limitations in terms of accuracy, generalization ability, and computational complexity. The fundamental reason is that establishing a quantified and accurate mapping between the physical environment and channel characteristics becomes increasingly challenging for modern communication systems. Here, in the context of the COST CA20120 Action, we evaluate and discuss the feasibility and implementation of using artificial intelligence (AI) for channel modeling, and explore where the future of this field lies. Firstly, we present a framework of AI-based channel modeling to characterize complex wireless channels. Then, we highlight in detail some major challenges and present possible solutions: i) estimating the uncertainty of AI-based channel predictions, ii) integrating prior knowledge of propagation to improve generalization capabilities, and iii) interpretable AI for channel modeling. We present and discuss illustrative numerical results to showcase the capabilities of AI-based channel modeling.


AI-Based Beam-Level and Cell-Level Mobility Management for High Speed Railway Communications

arXiv.org Artificial Intelligence

High-speed railway (HSR) communications are pivotal for ensuring rail safety, operations, maintenance, and delivering passenger information services. The high speed of trains creates rapidly time-varying wireless channels, increases the signaling overhead, and reduces the system throughput, making it difficult to meet the growing and stringent needs of HSR applications. In this article, we explore artificial intelligence (AI)-based beam-level and cell-level mobility management suitable for HSR communications, including the use cases, inputs, outputs, and key performance indicators (KPIs) of AI models. Particularly, in comparison to traditional down-sampled spatial beam measurements, we show that compressed spatial multi-beam measurements via compressive sensing lead to improved spatial-temporal beam prediction. Moreover, we demonstrate the performance gains of AI-assisted cell handover over traditional handover mechanisms. In addition, we observe that the proposed approaches to reduce the measurement overhead achieve radio link failure performance comparable to the traditional approach that requires all the beam measurements of all cells, while saving 50% of the beam measurement overhead.
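Compressive sensing of beam measurements exploits the fact that only a few beams carry significant power, so the full beam-power vector can be recovered from far fewer measurements. The abstract does not name a recovery algorithm, so the sketch below uses orthogonal matching pursuit (OMP), a standard choice, as an assumed illustration; the beam counts and measurement matrix are also my own toy values.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: recover a k-sparse beam-power vector
    x from m << n compressed measurements y = A @ x."""
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))  # most correlated beam
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef          # re-fit, update residual
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

# Toy usage: 64 beams, 3 dominant, recovered from 32 random projections.
rng = np.random.default_rng(2)
n_beams, m, k = 64, 32, 3
A = rng.standard_normal((m, n_beams)) / np.sqrt(m)   # measurement matrix
x = np.zeros(n_beams)
x[[5, 20, 41]] = [3.0, 2.0, 1.5]                     # sparse beam powers
x_hat = omp(A, A @ x, k)
```

Here m = 32 measurements stand in for n = 64 per-beam sweeps, which is the kind of overhead saving the article reports.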


IntentionNet: Map-Lite Visual Navigation at the Kilometre Scale

arXiv.org Artificial Intelligence

How can a robot navigate through diverse environments to distant goals? This remains an open challenge due to the complexity and difficulty of designing a robot that can generalise over environments, tolerate significant mapping and positioning inaccuracies, and recover from inevitable navigation errors. While many works tackle robot navigation, few systems capable of long-range, kilometre-scale navigation exist. Classical robot systems capable of long-range navigation like Montemerlo et al. (2008); Kümmerle et al. (2013) use explicit maps and find paths over them using classical planning algorithms (Siegwart et al. 2011), allowing them to reach arbitrarily distant goals in principle. Inspired by modern data-driven approaches, the lower level of our system design is a neural network-based controller that maps observations directly to velocity commands, and which is learned end-to-end from real-world experience. Neural networks have the flexibility to accept a wide variety of input types, and we find that the design space for the signals used by the system's upper level to guide the lower level is large. We exploit this property to design several different types of guidance signals, which we call intentions. We find that designing the appropriate intention imbues the navigation system with specific abilities, such as the ability to tolerate significant mapping and positioning inaccuracies.


RoboPack: Learning Tactile-Informed Dynamics Models for Dense Packing

arXiv.org Artificial Intelligence

Tactile feedback is critical for understanding the dynamics of both rigid and deformable objects in many manipulation tasks, such as non-prehensile manipulation and dense packing. We introduce an approach that combines visual and tactile sensing for robotic manipulation by learning a neural, tactile-informed dynamics model. Our proposed framework, RoboPack, employs a recurrent graph neural network to estimate object states, including particles and object-level latent physics information, from historical visuo-tactile observations and to perform future state predictions. Our tactile-informed dynamics model, learned from real-world data, can solve downstream robotics tasks with model-predictive control. We demonstrate our approach on a real robot equipped with a compliant Soft-Bubble tactile sensor on non-prehensile manipulation and dense packing tasks, where the robot must infer the physics properties of objects from direct and indirect interactions. Trained on only an average of 30 minutes of real-world interaction data per task, our model can perform online adaptation and make touch-informed predictions. Through extensive evaluations in both long-horizon dynamics prediction and real-world manipulation, our method demonstrates superior effectiveness compared to previous learning-based and physics-based simulation systems.


VideoQA-SC: Adaptive Semantic Communication for Video Question Answering

arXiv.org Artificial Intelligence

Although semantic communication (SC) has shown its potential in efficiently transmitting multi-modal data such as text, speech, and images, SC for videos has focused primarily on pixel-level reconstruction. However, these SC systems may be suboptimal for downstream intelligent tasks. Moreover, SC systems without pixel-level video reconstruction offer advantages in bandwidth efficiency and real-time performance for various intelligent tasks. The difficulty in such system design lies in the extraction of task-related compact semantic representations and their accurate delivery over noisy channels. In this paper, we propose an end-to-end SC system for video question answering (VideoQA) tasks called VideoQA-SC. Our goal is to accomplish VideoQA tasks directly based on video semantics over noisy or fading wireless channels, bypassing the need for video reconstruction at the receiver. To this end, we develop a spatiotemporal semantic encoder for effective video semantic extraction, and a learning-based bandwidth-adaptive deep joint source-channel coding (DJSCC) scheme for efficient and robust video semantic transmission. Experiments demonstrate that VideoQA-SC outperforms traditional and advanced DJSCC-based SC systems that rely on video reconstruction at the receiver under a wide range of channel conditions and bandwidth constraints. In particular, when the signal-to-noise ratio is low, VideoQA-SC improves answer accuracy by 5.17% while saving almost 99.5% of the bandwidth, compared with the advanced DJSCC-based SC system. Our results show the great potential of task-oriented SC system design for video applications.
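In DJSCC-style systems, the encoder's semantic symbols are power-normalized and passed through a differentiable channel model during training, most simply additive white Gaussian noise (AWGN) at a target SNR. The sketch below shows that standard channel layer; it is a generic illustration, not VideoQA-SC's specific bandwidth-adaptive scheme.

```python
import numpy as np

def awgn_channel(z, snr_db, rng):
    """Power-normalize the semantic symbol vector to unit average power,
    then add white Gaussian noise at the given SNR (in dB)."""
    z = z / np.sqrt(np.mean(z ** 2))       # unit average transmit power
    noise_std = 10.0 ** (-snr_db / 20.0)   # noise power = 10^(-SNR/10)
    return z + rng.standard_normal(z.shape) * noise_std

# Toy usage: transmit 1024 real-valued semantic symbols at 10 dB SNR.
rng = np.random.default_rng(3)
z = rng.standard_normal(1024)
received = awgn_channel(z, snr_db=10.0, rng=rng)
```

Because the layer is differentiable, the semantic encoder and the task head (here, the question-answering decoder) can be trained end-to-end through the channel, which is what makes the system robust at low SNR.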


Generative AI Agent for Next-Generation MIMO Design: Fundamentals, Challenges, and Vision

arXiv.org Artificial Intelligence

Next-generation multiple input multiple output (MIMO) is expected to be intelligent and scalable. In this paper, we study generative artificial intelligence (AI) agent-enabled next-generation MIMO design. Firstly, we provide an overview of the development, fundamentals, and challenges of next-generation MIMO. Then, we propose the concept of the generative AI agent, which is capable of generating tailored and specialized content with the aid of large language models (LLM) and retrieval-augmented generation (RAG). Next, we comprehensively discuss the features and advantages of the generative AI agent framework. More importantly, to tackle existing challenges of next-generation MIMO, we discuss generative AI agent-enabled next-generation MIMO design from the perspectives of performance analysis, signal processing, and resource allocation. Furthermore, we present two compelling case studies that demonstrate the effectiveness of leveraging the generative AI agent for performance analysis in complex configuration scenarios. These examples highlight how the integration of generative AI agents can significantly enhance the analysis and design of next-generation MIMO systems. Finally, we discuss important future research directions.


Invariance is Key to Generalization: Examining the Role of Representation in Sim-to-Real Transfer for Visual Navigation

arXiv.org Artificial Intelligence

The data-driven approach to robot control has been gathering pace rapidly, yet generalization to unseen task domains remains a critical challenge. We argue that the key to generalization is representations that are (i) rich enough to capture all task-relevant information and (ii) invariant to superfluous variability between the training and the test domains. We experimentally study such a representation -- containing both depth and semantic information -- for visual navigation and show that it enables a control policy trained entirely in simulated indoor scenes to generalize to diverse real-world environments, both indoors and outdoors. Further, we show that our representation reduces the A-distance between the training and test domains, improving the generalization error bound as a result. Our proposed approach is scalable: the learned policy improves continuously, as the foundation models that it exploits absorb more diverse data during pre-training.
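The A-distance between training and test domains can be estimated by how well a classifier separates samples from the two domains: if no classifier beats chance, the representations are indistinguishable and the distance is near zero. The sketch below uses a leave-one-out 1-NN domain classifier as the proxy, one common estimator in the style of Ben-David et al.; the paper's own estimator may differ, and the data here is synthetic.

```python
import numpy as np

def proxy_a_distance(src, tgt):
    """Proxy A-distance 2*(1 - 2*err), where err is the leave-one-out
    error of a 1-NN classifier separating source from target features.
    Near 0 when the domains are indistinguishable in feature space,
    near 2 when a classifier separates them perfectly."""
    X = np.vstack([src, tgt])
    y = np.array([0] * len(src) + [1] * len(tgt))
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # pairwise distances
    np.fill_diagonal(d, np.inf)                          # exclude self-matches
    pred = y[np.argmin(d, axis=1)]                       # 1-NN domain label
    err = np.mean(pred != y)
    return 2.0 * (1.0 - 2.0 * err)

# Toy usage: overlapping domains vs. a clearly shifted domain.
rng = np.random.default_rng(4)
sim_feats = rng.standard_normal((100, 5))       # e.g. simulated-scene features
real_feats = rng.standard_normal((100, 5))      # invariant features: same dist.
shifted_feats = rng.standard_normal((100, 5)) + 10.0  # domain-specific features
```

An invariant representation is precisely one that drives this proxy toward zero between simulation and the real world, which is how the paper connects invariance to a tighter generalization bound.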