Qiu, Di
CHOSEN: Contrastive Hypothesis Selection for Multi-View Depth Refinement
Qiu, Di, Zhang, Yinda, Beeler, Thabo, Tankovich, Vladimir, Häne, Christian, Fanello, Sean, Rhemann, Christoph, Escolano, Sergio Orts
We propose CHOSEN, a simple yet flexible, robust, and effective multi-view depth refinement framework. It can be employed in any existing multi-view stereo pipeline and generalizes readily to different multi-view capture systems, e.g., different relative camera placements and lenses. Given an initial depth estimate, CHOSEN iteratively re-samples and selects the best hypotheses, automatically adapting to the metric and intrinsic scales determined by the capture system. The key to our approach is the application of contrastive learning in an appropriate solution space, together with a carefully designed hypothesis feature based on which positive and negative hypotheses can be effectively distinguished. Integrated into a simple baseline multi-view stereo pipeline, CHOSEN delivers impressive depth and normal accuracy compared to many current deep-learning-based multi-view stereo pipelines.
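The abstract describes an iterative re-sample-and-select loop trained with a contrastive objective over depth hypotheses. The following is a minimal sketch of that idea, not the authors' implementation: the scorer architecture, the hand-designed hypothesis feature, and the Gaussian re-sampling scheme are all illustrative assumptions.

```python
# Sketch: contrastive hypothesis selection for depth refinement (assumed design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class HypothesisScorer(nn.Module):
    """Scores per-pixel depth hypotheses from a hypothesis feature vector."""
    def __init__(self, feat_dim: int = 16, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, K, feat_dim) -> logits: (B, K), one score per hypothesis
        return self.mlp(feats).squeeze(-1)

def contrastive_selection_loss(logits, hyps, gt_depth):
    """InfoNCE-style loss: the hypothesis nearest the ground-truth depth is
    the positive; the remaining sampled hypotheses act as negatives."""
    pos_idx = (hyps - gt_depth.unsqueeze(-1)).abs().argmin(dim=-1)  # (B,)
    return F.cross_entropy(logits, pos_idx)

def refine_depth(scorer, feats, hyps, noise_scale=0.05, iters=3):
    """Iteratively keep the best-scoring hypothesis and re-sample around it.
    A real pipeline would recompute hypothesis features from the multi-view
    cost at every iteration; here feats is kept fixed for brevity."""
    best = scorer(feats).argmax(-1, keepdim=True)            # (B, 1)
    depth = hyps.gather(-1, best).squeeze(-1)                # (B,)
    for _ in range(iters - 1):
        hyps = depth.unsqueeze(-1) + noise_scale * torch.randn_like(hyps)
        best = scorer(feats).argmax(-1, keepdim=True)
        depth = hyps.gather(-1, best).squeeze(-1)
    return depth
```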
SWBT: Similarity Weighted Behavior Transformer with the Imperfect Demonstration for Robotic Manipulation
Wu, Kun, Liu, Ning, Zhao, Zhen, Qiu, Di, Li, Jinming, Che, Zhengping, Xu, Zhiyuan, Qiu, Qinru, Tang, Jian
Imitation learning (IL), which aims to learn optimal control policies from expert demonstrations, has been an effective method for robot manipulation tasks. However, previous IL methods either rely solely on expensive expert demonstrations while discarding imperfect ones, or require interaction with the environment to learn from online experience. In the context of robotic manipulation, we address both challenges and propose a novel framework named Similarity Weighted Behavior Transformer (SWBT). SWBT learns effectively from both expert and imperfect demonstrations without any interaction with the environment. We show that easy-to-obtain imperfect demonstrations, exploited through auxiliary objectives such as forward and inverse dynamics prediction, provide fruitful information that significantly enhances the network. To the best of our knowledge, we are the first to integrate imperfect demonstrations into the offline imitation learning setting for robot manipulation tasks. Extensive experiments on the ManiSkill2 benchmark, built on the high-fidelity SAPIEN simulator, and on real-world robotic manipulation tasks demonstrate that the proposed method extracts better features and improves success rates across all tasks. Our code will be released upon acceptance of the paper.
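As a rough illustration of similarity-weighted behavior cloning, here is a minimal sketch: imperfect transitions are down-weighted by their embedding similarity to expert data before entering the BC loss. The encoder, the max-similarity weighting rule, and the temperature are assumptions for illustration; the paper's transformer policy and auxiliary dynamics tasks are omitted.

```python
# Sketch: similarity-weighted behavior cloning (assumed weighting scheme).
import torch
import torch.nn.functional as F

def similarity_weights(imp_emb, exp_emb, temperature=0.1):
    """Weight each imperfect sample by its maximum cosine similarity to any
    expert embedding, squashed to (0, 1)."""
    sim = F.cosine_similarity(imp_emb.unsqueeze(1), exp_emb.unsqueeze(0), dim=-1)
    return torch.sigmoid(sim.max(dim=1).values / temperature)   # (N_imp,)

def weighted_bc_loss(policy, exp_obs, exp_act, imp_obs, imp_act, w):
    """Standard BC on expert data plus similarity-weighted BC on imperfect data."""
    exp_loss = F.mse_loss(policy(exp_obs), exp_act)
    imp_err = ((policy(imp_obs) - imp_act) ** 2).mean(dim=-1)   # per-sample error
    return exp_loss + (w * imp_err).mean()
```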
Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing
Lan, Yushi, Tan, Feitong, Qiu, Di, Xu, Qiangeng, Genova, Kyle, Huang, Zeng, Fanello, Sean, Pandey, Rohit, Funkhouser, Thomas, Loy, Chen Change, Zhang, Yinda
We present a novel framework for generating photorealistic 3D human heads and subsequently manipulating and reposing them with remarkable flexibility. The proposed approach leverages an implicit function representation of 3D human heads, employing 3D Gaussians anchored on a parametric face model. To enhance representational capabilities and encode spatial information, we embed a lightweight tri-plane payload within each Gaussian.
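To make the representation concrete, here is a minimal sketch of 3D Gaussians that each carry a small tri-plane payload queried at local coordinates. The plane resolution, channel count, and the RGB-plus-density decoder are illustrative assumptions, not the paper's implementation, and the anchoring to the parametric face mesh is left out.

```python
# Sketch: per-Gaussian tri-plane payload (assumed sizes and decoder).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianTriplane(nn.Module):
    def __init__(self, num_gaussians: int, res: int = 8, ch: int = 4):
        super().__init__()
        # Per-Gaussian geometry: offset from the mesh anchor and log-scale.
        self.offset = nn.Parameter(torch.zeros(num_gaussians, 3))
        self.log_scale = nn.Parameter(torch.zeros(num_gaussians, 3))
        # Per-Gaussian payload: three axis-aligned feature planes (XY, XZ, YZ).
        self.planes = nn.Parameter(torch.randn(num_gaussians, 3, ch, res, res) * 0.01)
        self.decoder = nn.Linear(3 * ch, 4)  # -> RGB + density

    def query(self, gauss_idx: torch.Tensor, local_xyz: torch.Tensor):
        """Sample the three planes of each selected Gaussian at local
        coordinates in [-1, 1]^3 and decode to color and density."""
        planes = self.planes[gauss_idx]                  # (N, 3, ch, res, res)
        x, y, z = local_xyz.unbind(-1)
        uv = torch.stack([torch.stack([x, y], -1),       # XY plane
                          torch.stack([x, z], -1),       # XZ plane
                          torch.stack([y, z], -1)], 1)   # (N, 3, 2)
        feats = F.grid_sample(
            planes.flatten(0, 1),                        # (N*3, ch, res, res)
            uv.flatten(0, 1).view(-1, 1, 1, 2),          # (N*3, 1, 1, 2)
            align_corners=True,
        ).view(-1, 3 * planes.shape[2])                  # (N, 3*ch)
        return self.decoder(feats)                       # (N, 4): RGB + density
```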
Modal Uncertainty Estimation via Discrete Latent Representation
Qiu, Di, Lui, Lok Ming
Many important problems in the real world do not have unique solutions. It is thus important for machine learning models to be capable of proposing different plausible solutions with meaningful probability measures. In this work we introduce such a deep learning framework, one that learns the one-to-many mappings between inputs and outputs together with faithful uncertainty measures. We call our framework modal uncertainty estimation, since we model the one-to-many mappings as being generated through a set of discrete latent variables, each representing a latent mode hypothesis that explains the corresponding type of input-output relationship. The discrete nature of the latent representation allows us to estimate, for any input, the conditional probability distribution over the outputs very effectively. Both the discrete latent space and its uncertainty estimation are jointly learned during training. We motivate our use of a discrete latent space through the multi-modal posterior collapse problem in current conditional generative models, develop the theoretical background, and extensively validate our method on both synthetic and realistic tasks. Our framework demonstrates significantly more accurate uncertainty estimation than current state-of-the-art methods, and is informative and convenient for practical use.
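As a concrete reading of the idea above, here is a minimal inference-time sketch: each entry of a learned codebook represents one latent mode, a conditional prior over codes gives the probability of each mode for a given input, and a decoder maps each code to one output hypothesis. All network shapes are illustrative assumptions, and the training procedure (learning the codebook and matching the prior to the posterior) is omitted.

```python
# Sketch: modal uncertainty estimation with a discrete latent space (assumed shapes).
import torch
import torch.nn as nn

class ModalUncertaintyModel(nn.Module):
    def __init__(self, in_dim=32, out_dim=32, num_codes=16, code_dim=8):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.prior = nn.Sequential(      # p(c | x): how plausible is each mode
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, num_codes))
        self.decoder = nn.Sequential(    # p(y | x, c): one output per mode
            nn.Linear(in_dim + code_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

    @torch.no_grad()
    def predict_modes(self, x, top_k=3):
        """Return the top-k output hypotheses and their probabilities for x."""
        probs = self.prior(x).softmax(-1)                    # (B, num_codes)
        p, idx = probs.topk(top_k, dim=-1)                   # (B, k)
        codes = self.codebook(idx)                           # (B, k, code_dim)
        x_rep = x.unsqueeze(1).expand(-1, top_k, -1)         # (B, k, in_dim)
        preds = self.decoder(torch.cat([x_rep, codes], -1))  # (B, k, out_dim)
        return preds, p
```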