Hou, Yuenan
HexPlane Representation for 3D Semantic Scene Understanding
Chen, Zeren, Hou, Yuenan, Chen, Yulin, Liu, Li, Sun, Xiao, Sheng, Lu
In this paper, we introduce the HexPlane representation for 3D semantic scene understanding. Specifically, we first design the View Projection Module (VPM) to project the 3D point cloud onto six planes so as to maximally retain the original spatial information. Features of the six planes are extracted by a 2D encoder and sent to the HexPlane Association Module (HAM) to adaptively fuse the most informative features for each point. The fused point features are further fed to the task head to yield the ultimate predictions. Compared to the popular point and voxel representations, the HexPlane representation is efficient and can utilize highly optimized 2D operations to process sparse and unordered 3D point clouds. It can also leverage off-the-shelf 2D models, network weights, and training recipes to achieve accurate scene understanding in 3D space. On the ScanNet and SemanticKITTI benchmarks, our algorithm, dubbed HexNet3D, achieves competitive performance with previous algorithms. In particular, on the ScanNet 3D segmentation task, our method obtains 77.0 mIoU on the validation set, surpassing Point Transformer V2 by 1.6 mIoU. We also observe encouraging results on indoor 3D detection tasks. Note that our method can be seamlessly integrated into existing voxel-based, point-based, and range-based approaches and brings considerable gains without bells and whistles. The code will be made available upon publication.
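A minimal sketch of the plane-projection idea the abstract describes, assuming normalized point coordinates and simple sum pooling: points are rasterized onto axis-aligned 2D grids, a 2D encoder would process each plane image, and the plane features are gathered back per point. The helper names, the choice of three planes (rather than the paper's six), and the concatenation used in place of the HexPlane Association Module are illustrative assumptions, not HexNet3D's actual design.

```python
# Hypothetical sketch of the plane-projection idea behind a HexPlane-style
# representation: scatter point features onto 2D grids, run a 2D encoder per
# plane, and gather plane features back to each point. Plane definitions, the
# encoder, and the fusion module are assumptions, not the paper's exact design.
import torch

def project_to_plane(points, feats, axes, grid=128):
    """Rasterize point features onto a 2D plane spanned by `axes` (e.g. (0, 1) for xy)."""
    coords = points[:, list(axes)]                       # (N, 2) coordinates in [0, 1]
    idx = (coords.clamp(0, 1 - 1e-6) * grid).long()      # (N, 2) grid cell indices
    flat = idx[:, 0] * grid + idx[:, 1]                  # (N,) flattened cell id per point
    canvas = torch.zeros(grid * grid, feats.shape[1])
    canvas.index_add_(0, flat, feats)                    # simple sum pooling per cell
    return canvas.view(grid, grid, -1).permute(2, 0, 1), flat  # (C, H, W), per-point cell id

def gather_from_plane(plane_feats, flat_idx):
    """Fetch the 2D feature of the cell each point falls into."""
    C, H, W = plane_feats.shape
    return plane_feats.reshape(C, H * W)[:, flat_idx].t()  # (N, C)

# Toy usage: three axis-aligned planes (xy, xz, yz); the paper uses six.
points = torch.rand(1024, 3)          # normalized point coordinates
feats = torch.rand(1024, 16)          # per-point input features
per_point = []
for axes in [(0, 1), (0, 2), (1, 2)]:
    plane, flat = project_to_plane(points, feats, axes)
    # A 2D encoder (e.g. a ResNet) would process `plane` here; identity for brevity.
    per_point.append(gather_from_plane(plane, flat))
fused = torch.cat(per_point, dim=1)   # a learned fusion (HAM) would replace this concat
```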
A Comprehensive Survey on 3D Content Generation
Liu, Jian, Huang, Xiaoshui, Huang, Tianyu, Chen, Lu, Hou, Yuenan, Tang, Shixiang, Liu, Ziwei, Ouyang, Wanli, Zuo, Wangmeng, Jiang, Junjun, Liu, Xianming
Recent years have witnessed remarkable advances in artificial intelligence generated content (AIGC), covering diverse input modalities, e.g., text, image, video, audio, and 3D. 3D is the visual modality closest to the real-world 3D environment and carries enormous knowledge. 3D content generation shows both academic and practical value while also presenting formidable technical challenges. This review aims to consolidate developments within the burgeoning domain of 3D content generation. Specifically, a new taxonomy is proposed that categorizes existing approaches into three types: 3D native generative methods, 2D prior-based 3D generative methods, and hybrid 3D generative methods. The survey covers approximately 60 papers spanning the major techniques. Besides, we discuss the limitations of current 3D content generation techniques and point out open challenges as well as promising directions for future work. Accompanying this survey, we have established a project website where resources on 3D content generation research are provided. The project page is available at https://github.com/hitcslj/Awesome-AIGC-3D.
Uni3D-LLM: Unifying Point Cloud Perception, Generation and Editing with Large Language Models
Liu, Dingning, Huang, Xiaoshui, Hou, Yuenan, Wang, Zhihui, Yin, Zhenfei, Gong, Yongshun, Gao, Peng, Ouyang, Wanli
In this paper, we introduce Uni3D-LLM, a unified framework that leverages a Large Language Model (LLM) to integrate the tasks of 3D perception, generation, and editing within point cloud scenes. This framework empowers users to effortlessly generate and modify objects at specified locations within a scene, guided by the versatility of natural language descriptions. Uni3D-LLM harnesses the expressive power of natural language to allow precise command over the generation and editing of 3D objects, thereby significantly enhancing operational flexibility and controllability. By mapping point clouds into a unified representation space, Uni3D-LLM achieves cross-application functionality, enabling the seamless execution of a wide array of tasks, ranging from the accurate instantiation of 3D objects to the diverse requirements of interactive design. Through a comprehensive suite of rigorous experiments, the efficacy of Uni3D-LLM in the comprehension, generation, and editing of point clouds has been validated. Additionally, we have assessed the impact of integrating a point cloud perception module on the generation and editing processes, confirming the substantial potential of our approach for practical applications.
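One common way such a point-cloud/LLM coupling is realized in practice is to encode the point cloud into a fixed number of tokens and project them into the language model's embedding space. The sketch below illustrates that general pattern under assumed module names and dimensions; it is not Uni3D-LLM's actual architecture.

```python
# Highly schematic sketch of coupling a point-cloud encoder with an LLM:
# encode points into a fixed set of tokens, project them into the language
# model's embedding space, and prepend them to text embeddings. All module
# names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class PointToLLMAdapter(nn.Module):
    def __init__(self, point_dim=384, llm_dim=4096, num_tokens=32):
        super().__init__()
        # Placeholder point encoder; a pretrained 3D backbone would be used in practice.
        self.point_encoder = nn.Sequential(nn.Linear(3, point_dim), nn.ReLU(),
                                           nn.Linear(point_dim, point_dim))
        self.query = nn.Parameter(torch.randn(num_tokens, point_dim))
        self.attn = nn.MultiheadAttention(point_dim, num_heads=8, batch_first=True)
        self.proj = nn.Linear(point_dim, llm_dim)            # map into the LLM token space

    def forward(self, points):                               # points: (B, N, 3)
        feats = self.point_encoder(points)                   # (B, N, point_dim)
        q = self.query.expand(points.shape[0], -1, -1)       # (B, num_tokens, point_dim)
        tokens, _ = self.attn(q, feats, feats)               # pool N points into fixed tokens
        return self.proj(tokens)                             # (B, num_tokens, llm_dim)

# These "3D tokens" would be concatenated with text token embeddings and fed to
# the language model, which then drives perception, generation, or editing.
```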
Rethinking Range View Representation for LiDAR Segmentation
Kong, Lingdong, Liu, Youquan, Chen, Runnan, Ma, Yuexin, Zhu, Xinge, Li, Yikang, Hou, Yuenan, Qiao, Yu, Liu, Ziwei
LiDAR segmentation is crucial for autonomous driving perception. Recent trends favor point- or voxel-based methods as they often yield better performance than the traditional range view representation. In this work, we unveil several key factors in building powerful range view models. We observe that the "many-to-one" mapping, semantic incoherence, and shape deformation are possible impediments to effective learning from range view projections. We present RangeFormer -- a full-cycle framework comprising novel designs across network architecture, data augmentation, and post-processing -- that better handles the learning and processing of LiDAR point clouds from the range view. We further introduce a Scalable Training from Range view (STR) strategy that trains on arbitrarily low-resolution 2D range images, while still maintaining satisfactory 3D segmentation accuracy. We show that, for the first time, a range view method is able to surpass its point, voxel, and multi-view fusion counterparts on competitive LiDAR semantic and panoptic segmentation benchmarks, i.e., SemanticKITTI, nuScenes, and ScribbleKITTI.
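For context, below is a minimal sketch of the standard spherical range-view projection that such methods build on, including the "many-to-one" mapping the abstract mentions (several points landing in one pixel). The field-of-view bounds and image resolution are SemanticKITTI-style assumptions, not RangeFormer's exact settings.

```python
# Minimal sketch of a standard spherical range-view projection for LiDAR
# point clouds, the representation range view methods operate on. FOV bounds
# and resolution are SemanticKITTI-style assumptions, not the paper's config.
import numpy as np

def range_projection(points, H=64, W=2048, fov_up=3.0, fov_down=-25.0):
    """Project (N, 3) LiDAR points to an (H, W) range image; returns the depth map and pixel indices."""
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    depth = np.linalg.norm(points, axis=1)
    yaw = -np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(points[:, 2] / np.maximum(depth, 1e-8))

    u = 0.5 * (yaw / np.pi + 1.0) * W                         # horizontal angle -> column
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * H  # elevation -> row
    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    # "Many-to-one": several points can fall into the same pixel; keeping the
    # closest one is a common convention and a source of information loss.
    order = np.argsort(depth)[::-1]                           # far to near, so near points overwrite far ones
    range_image = np.full((H, W), -1.0, dtype=np.float32)
    range_image[v[order], u[order]] = depth[order]
    return range_image, (v, u)
```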
Agnostic Lane Detection
Hou, Yuenan
Lane detection is an important yet challenging task in autonomous driving, affected by many factors, e.g., lighting conditions, occlusions caused by other vehicles, irrelevant markings on the road, and the inherently long and thin shape of lanes. Conventional methods typically treat lane detection as a semantic segmentation task, which assigns a class label to each pixel of the image. This formulation heavily depends on the assumption that the number of lanes is pre-defined and fixed and that no lane changing occurs, which does not always hold. To make the lane detection model applicable to an arbitrary number of lanes and to lane-changing scenarios, we adopt an instance segmentation approach, which first differentiates lanes from the background and then assigns each lane pixel to a lane instance. In addition, a multi-task learning paradigm is utilized to better exploit the structural information, and a feature pyramid architecture is used to detect extremely thin lanes. Three popular lane detection benchmarks, i.e., TuSimple, CULane and BDD100K, are used to validate the effectiveness of the proposed algorithm.
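A minimal sketch of the instance-grouping step this formulation implies, assuming the network outputs a binary lane mask and a per-pixel embedding map: lane pixels are clustered in embedding space, so the number of lane instances need not be fixed in advance. The greedy clustering and the distance threshold are illustrative assumptions, not the paper's exact procedure.

```python
# Hypothetical sketch of instance-level lane grouping: given a binary lane
# mask and a per-pixel embedding map (both produced by a segmentation network),
# cluster lane pixels into individual lane instances. The greedy clustering and
# the distance threshold are illustrative, not the paper's exact procedure.
import torch

def cluster_lane_pixels(binary_mask, embeddings, dist_thresh=0.5):
    """binary_mask: (H, W) bool; embeddings: (D, H, W). Returns (H, W) instance ids (0 = background)."""
    D, H, W = embeddings.shape
    instance_map = torch.zeros(H, W, dtype=torch.long)
    lane_emb = embeddings.permute(1, 2, 0)[binary_mask]       # (M, D) embeddings of lane pixels
    coords = binary_mask.nonzero(as_tuple=False)              # (M, 2) their (row, col) positions

    unassigned = torch.ones(lane_emb.shape[0], dtype=torch.bool)
    next_id = 1
    while unassigned.any():
        seed = lane_emb[unassigned][0]                        # pick an arbitrary unassigned seed pixel
        dists = (lane_emb - seed).norm(dim=1)
        members = (dists < dist_thresh) & unassigned          # pixels close to the seed in embedding space
        instance_map[coords[members, 0], coords[members, 1]] = next_id
        unassigned &= ~members
        next_id += 1
    return instance_map

# Toy usage with random tensors standing in for network outputs.
mask = torch.rand(64, 128) > 0.9
emb = torch.randn(4, 64, 128)
lanes = cluster_lane_pixels(mask, emb)
```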