Plotting

 Miao, Yang


Holistic Understanding of 3D Scenes as Universal Scene Description

arXiv.org Artificial Intelligence

3D scene understanding is a long-standing challenge in computer vision and a key component in enabling mixed reality, wearable computing, and embodied AI. Providing a solution to these applications requires a multifaceted approach that covers scene-centric, object-centric, as well as interaction-centric capabilities. While there exist numerous datasets approaching the former two problems, the task of understanding interactable and articulated objects is underrepresented and only partly covered by current works. In this work, we address this shortcoming and introduce (1) an expertly curated dataset in the Universal Scene Description (USD) format, featuring high-quality manual annotations, for instance, segmentation and articulation on 280 indoor scenes; (2) a learning-based model together with a novel baseline capable of predicting part segmentation along with a full specification of motion attributes, including motion type, articulated and interactable parts, and motion parameters; (3) a benchmark serving to compare upcoming methods for the task at hand. Overall, our dataset provides 8 types of annotations - object and part segmentations, motion types, movable and interactable parts, motion parameters, connectivity, and object mass annotations. With its broad and high-quality annotations, the data provides the basis for holistic 3D scene understanding models. All data is provided in the USD format, allowing interoperability and easy integration with downstream tasks. We provide open access to our dataset, benchmark, and method's source code.


COST CA20120 INTERACT Framework of Artificial Intelligence Based Channel Modeling

arXiv.org Artificial Intelligence

Accurate channel models are the prerequisite for communication-theoretic investigations as well as system design. Channel modeling generally relies on statistical and deterministic approaches. However, there are still significant limits for the traditional modeling methods in terms of accuracy, generalization ability, and computational complexity. The fundamental reason is that establishing a quantified and accurate mapping between physical environment and channel characteristics becomes increasing challenging for modern communication systems. Here, in the context of COST CA20120 Action, we evaluate and discuss the feasibility and implementation of using artificial intelligence (AI) for channel modeling, and explore where the future of this field lies. Firstly, we present a framework of AI-based channel modeling to characterize complex wireless channels. Then, we highlight in detail some major challenges and present the possible solutions: i) estimating the uncertainty of AI-based channel predictions, ii) integrating prior knowledge of propagation to improve generalization capabilities, and iii) interpretable AI for channel modeling. We present and discuss illustrative numerical results to showcase the capabilities of AI-based channel modeling.


Volumetric Semantically Consistent 3D Panoptic Mapping

arXiv.org Artificial Intelligence

We introduce an online 2D-to-3D semantic instance mapping algorithm aimed at generating comprehensive, accurate, and efficient semantic 3D maps suitable for autonomous agents in unstructured environments. The proposed approach is based on a Voxel-TSDF representation used in recent algorithms. It introduces novel ways of integrating semantic prediction confidence during mapping, producing semantic and instance-consistent 3D regions. Further improvements are achieved by graph optimization-based semantic labeling and instance refinement. The proposed method achieves accuracy superior to the state of the art on public large-scale datasets, improving on a number of widely used metrics. We also highlight a downfall in the evaluation of recent studies: using the ground truth trajectory as input instead of a SLAM-estimated one substantially affects the accuracy, creating a large gap between the reported results and the actual performance on real-world data.