
 Van Gool, Luc


MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation

arXiv.org Artificial Intelligence

Estimating 3D human poses from monocular videos is a challenging task due to depth ambiguity and self-occlusion. Most existing works attempt to solve both issues by exploiting spatial and temporal relationships. However, these works ignore the fact that the task is an inverse problem in which multiple feasible solutions (i.e., hypotheses) exist. To address this limitation, we propose a Multi-Hypothesis Transformer (MHFormer) that learns spatio-temporal representations of multiple plausible pose hypotheses. To effectively model multi-hypothesis dependencies and build strong relationships across hypothesis features, the task is decomposed into three stages: (i) generate multiple initial hypothesis representations; (ii) model self-hypothesis communication, merging the hypotheses into a single converged representation and then partitioning it into several diverged hypotheses; (iii) learn cross-hypothesis communication and aggregate the multi-hypothesis features to synthesize the final 3D pose. Through these stages, the final representation is enriched and the synthesized pose is considerably more accurate. Extensive experiments show that MHFormer achieves state-of-the-art results on two challenging datasets: Human3.6M and MPI-INF-3DHP. Without bells and whistles, it surpasses the previous best result on Human3.6M by a large margin of 3%. Code and models are available at https://github.com/Vegetebird/MHFormer.
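A minimal sketch of the three-stage idea described in the abstract, not the authors' implementation (see the linked repo). The layer sizes, the use of plain nn.TransformerEncoder blocks, and the simple mean-based merge are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class MultiHypothesisSketch(nn.Module):
    def __init__(self, num_joints=17, dim=64, num_hyp=3):
        super().__init__()
        self.num_hyp = num_hyp
        # Stage (i): one embedding per hypothesis from the same 2D input.
        self.generators = nn.ModuleList(
            [nn.Linear(num_joints * 2, dim) for _ in range(num_hyp)])
        enc = lambda: nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        # Stage (ii): self-hypothesis communication (merge, then re-diverge).
        self.self_comm = nn.TransformerEncoder(enc(), num_layers=1)
        self.diverge = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_hyp)])
        # Stage (iii): cross-hypothesis communication and aggregation.
        self.cross_comm = nn.TransformerEncoder(enc(), num_layers=1)
        self.head = nn.Linear(num_hyp * dim, num_joints * 3)

    def forward(self, pose2d):                        # (B, T, J*2)
        B, T, _ = pose2d.shape
        hyps = [g(pose2d) for g in self.generators]   # num_hyp x (B, T, D)
        # (ii) Merge into one converged representation, refine over time,
        # then partition back into several diverged hypotheses.
        merged = self.self_comm(torch.stack(hyps, 0).mean(0))
        hyps = torch.stack([d(merged) for d in self.diverge], 2)  # (B,T,H,D)
        # (iii) Attend across hypotheses per frame, then aggregate.
        mixed = self.cross_comm(hyps.view(B * T, self.num_hyp, -1))
        return self.head(mixed.reshape(B, T, -1))     # (B, T, J*3)

model = MultiHypothesisSketch()
out = model(torch.randn(2, 27, 17 * 2))               # -> (2, 27, 51)
```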


MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation

arXiv.org Artificial Intelligence

Monocular 3D object detection has recently shown promising results, but challenging problems remain. One of these is the lack of invariance to different camera intrinsic parameters, which can be observed across different 3D object datasets. Little effort has been made to exploit the combination of heterogeneous 3D object datasets. Contrary to general intuition, we show that more data does not automatically guarantee better performance; rather, methods need a degree of 'camera independence' in order to benefit from large and heterogeneous training data. In this paper we propose a category-level pose estimation method based on instance segmentation, using camera-independent geometric reasoning to cope with the varying camera viewpoints and intrinsics of different datasets. Every pixel of an instance predicts the object dimensions, the 3D object reference points projected into 2D image space and, optionally, the local viewing angle. Camera intrinsics are used only outside of the learned network, to lift the predicted 2D reference points to 3D. We surpass camera-independent methods on the challenging KITTI3D benchmark and show the key benefits compared to camera-dependent methods.
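The abstract's key design point is that the network never sees the intrinsics; they are applied afterwards to lift predicted 2D reference points into 3D. A minimal NumPy sketch of that lifting step; the per-point depths and the KITTI-like intrinsics below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def lift_to_3d(points_2d, depths, K):
    """Back-project pixel coordinates to camera-frame 3D points.

    points_2d : (N, 2) pixel coordinates predicted by the network
    depths    : (N,)   per-point depth along the optical axis
    K         : (3, 3) camera intrinsics, used only here, outside the network
    """
    ones = np.ones((points_2d.shape[0], 1))
    rays = np.linalg.inv(K) @ np.concatenate([points_2d, ones], axis=1).T
    return (rays * depths).T        # scale each unit-depth ray by its depth

K = np.array([[721.5, 0.0, 609.6],  # KITTI-like intrinsics (illustrative)
              [0.0, 721.5, 172.9],
              [0.0, 0.0, 1.0]])
pts = np.array([[610.0, 180.0], [650.0, 190.0]])
print(lift_to_3d(pts, np.array([12.0, 15.0]), K))
```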


Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

arXiv.org Artificial Intelligence

Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks. For this and other video understanding tasks, supervised approaches have achieved encouraging performance but require a high volume of detailed frame-level annotations. We present a fully automatic and unsupervised approach for segmenting actions in a video that does not require any training. Our proposal is an effective temporally-weighted hierarchical clustering algorithm that can group semantically consistent frames of the video. Our main finding is that representing a video with a 1-nearest-neighbor graph that takes the time progression into account is sufficient to form semantically and temporally consistent clusters of frames, where each cluster may represent some action in the video. Additionally, we establish strong unsupervised baselines for action segmentation and show significant performance improvements over published unsupervised methods on five challenging action segmentation datasets. Our approach also outperforms weakly-supervised methods by large margins on four of these datasets. Interestingly, we also achieve better results than many fully-supervised methods that have reported results on these datasets. Our code is available at https://github.com/ssarfraz/FINCH-Clustering/tree/master/TW-FINCH
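A sketch of the core idea: a 1-nearest-neighbor graph whose distances are weighted by temporal proximity, with connected components as clusters. This illustrates the abstract rather than reproducing the TW-FINCH implementation (see the linked repo); the multiplicative temporal weighting is an assumption.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def temporally_weighted_1nn_clusters(features):
    n = len(features)
    t = np.arange(n, dtype=float)
    feat_dist = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    time_dist = np.abs(t[:, None] - t[None, :])
    dist = feat_dist * (1.0 + time_dist)   # penalise temporally distant frames
    np.fill_diagonal(dist, np.inf)         # a frame is not its own neighbour
    nn = dist.argmin(axis=1)               # 1-NN per frame
    adj = csr_matrix((np.ones(n), (np.arange(n), nn)), shape=(n, n))
    _, labels = connected_components(adj, directed=False)
    return labels

# Two synthetic "actions" of 20 frames each, well separated in feature space.
frames = np.concatenate([np.random.randn(20, 8), 5 + np.random.randn(20, 8)])
print(temporally_weighted_1nn_clusters(frames))
```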


Spectral Tensor Train Parameterization of Deep Learning Layers

arXiv.org Machine Learning

We study low-rank parameterizations of weight matrices with embedded spectral properties in the deep learning context. The low-rank property leads to parameter efficiency and permits computational shortcuts when computing mappings. Spectral properties are often subject to constraints in optimization problems, leading to better models and more stable optimization. We start from the compact SVD parameterization of weight matrices and identify sources of redundancy in the parameterization. We then apply the Tensor Train (TT) decomposition to the compact SVD components and propose a non-redundant differentiable parameterization of fixed TT-rank tensor manifolds, termed the Spectral Tensor Train Parameterization (STTP). We demonstrate the effects of neural network compression in the image classification setting, and both compression and improved training stability in the generative adversarial training setting.
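A sketch of the paper's starting point: a compact SVD parameterization W = U diag(s) V^T with semi-orthogonal factors, which makes the singular values (and hence the spectral norm) explicit trainable parameters. The TT step of STTP is omitted here; the rank, sizes, and the log-parameterization of the spectrum are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

class CompactSVDLinear(nn.Module):
    def __init__(self, in_f, out_f, rank):
        super().__init__()
        # Semi-orthogonal factors with orthonormal columns.
        self.U = orthogonal(nn.Linear(rank, out_f, bias=False))  # (out_f, rank)
        self.V = orthogonal(nn.Linear(rank, in_f, bias=False))   # (in_f, rank)
        self.log_s = nn.Parameter(torch.zeros(rank))  # positive singular values

    def forward(self, x):
        s = torch.exp(self.log_s)          # the spectrum is explicit here
        W = self.U.weight @ torch.diag(s) @ self.V.weight.T
        return x @ W.T

layer = CompactSVDLinear(128, 64, rank=16)
y = layer(torch.randn(4, 128))             # (4, 64); spectral norm == s.max()
```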


Quantifying Aleatoric and Epistemic Uncertainty Using Density Estimation in Latent Space

arXiv.org Machine Learning

The distribution of a neural network's latent representations has been successfully used to detect Out-of-Distribution (OOD) data. Since OOD detection is a popular benchmark for epistemic uncertainty estimates, this raises the question of a deeper correlation. This work investigates whether the distribution of latent representations indeed contains information about the uncertainty associated with the predictions of a neural network. Prior work identifies epistemic uncertainty with the surprise, i.e., the negative log-likelihood, of observing a particular latent representation, which we verify empirically. Moreover, we demonstrate that the output-conditional distribution of hidden representations allows quantifying aleatoric uncertainty via the entropy of the predictive distribution. We analyze epistemic and aleatoric uncertainty inferred from the representations of different layers and conclude with the exciting finding that the hidden representations of a deterministic neural network indeed contain information about its uncertainty. We verify our findings on both classification and regression models.
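A sketch of the two estimates described in the abstract: epistemic uncertainty as the negative log-likelihood of a latent representation under a density fitted on training latents, and aleatoric uncertainty as the entropy of the predictive distribution. The single-Gaussian density model is an assumption; the paper uses output-conditional densities of hidden representations.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_latent_density(train_latents):
    mu = train_latents.mean(axis=0)
    cov = np.cov(train_latents, rowvar=False) \
        + 1e-6 * np.eye(train_latents.shape[1])   # regularise for stability
    return multivariate_normal(mu, cov)

def epistemic_uncertainty(density, latent):
    return -density.logpdf(latent)          # surprise of the representation

def aleatoric_uncertainty(probs):
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)  # predictive entropy

density = fit_latent_density(np.random.randn(1000, 8))
print(epistemic_uncertainty(density, np.random.randn(8)))       # in-distribution
print(epistemic_uncertainty(density, 10 + np.random.randn(8)))  # OOD -> larger
print(aleatoric_uncertainty(np.array([0.5, 0.5])))              # max entropy
```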


Understanding Bird's-Eye View Semantic HD-Maps Using an Onboard Monocular Camera

arXiv.org Artificial Intelligence

Autonomous navigation requires scene understanding of the action space in order to move or anticipate events. For planner agents moving on the ground plane, such as autonomous vehicles, this translates to scene understanding in the bird's-eye view. However, the onboard cameras of autonomous cars are customarily mounted horizontally for a better view of the surroundings. In this work, we study scene understanding in the form of online estimation of semantic bird's-eye-view HD-maps using the video input from a single onboard camera. We study three key aspects of this task: image-level understanding, BEV-level understanding, and the aggregation of temporal information. Based on these three pillars, we propose a novel architecture that combines all three aspects. In our extensive experiments, we demonstrate that the considered aspects are complementary to each other for HD-map understanding. Furthermore, the proposed architecture significantly surpasses the current state of the art.
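The architecture itself is not spelled out in the abstract, but a common building block for relating image-level and BEV-level understanding is ground-plane back-projection: assuming a pixel lies on the flat road plane, the intrinsics and camera height map it to bird's-eye-view coordinates. A hedged geometric sketch; the camera height and intrinsics below are illustrative, not values from the paper.

```python
import numpy as np

def pixel_to_bev(u, v, K, cam_height):
    """Back-project a pixel assumed to lie on the flat ground plane.

    Camera looks along +z, y points down; the ground plane is y = cam_height.
    Returns (x, z): lateral offset and forward distance in metres.
    """
    x_n, y_n, _ = np.linalg.inv(K) @ np.array([u, v, 1.0])  # unit-depth ray
    if y_n <= 0:
        raise ValueError("pixel is above the horizon; no ground intersection")
    z = cam_height / y_n              # depth where the ray meets the ground
    return x_n * z, z

K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0, 0.0, 1.0]])
print(pixel_to_bev(700.0, 500.0, K, cam_height=1.5))  # point on road ahead
```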


Learning from Simulation, Racing in Reality

arXiv.org Artificial Intelligence

We present a reinforcement learning-based solution for autonomous racing on a miniature race car platform. We show that a policy trained purely in simulation, using a relatively simple vehicle model with model randomization, can be successfully transferred to the real robotic setup. We achieve this with a novel policy output regularization approach and a lifted action space, which enable smooth actions while still allowing aggressive race car driving. We show that this regularized policy outperforms the Soft Actor-Critic (SAC) baseline, both in simulation and on the real car, but is still outperformed by a state-of-the-art Model Predictive Control (MPC) method. Refining the policy with three hours of real-world interaction data allows the reinforcement learning policy to achieve lap times similar to the MPC controller while reducing track constraint violations by 50%.
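A sketch of the two ingredients named in the abstract: a lifted action space (the policy outputs action changes, which are integrated into the actuator command, keeping it smooth) and a regularization term on the policy output. The quadratic penalty and the clipping bounds are assumptions, not the paper's exact formulation.

```python
import numpy as np

class LiftedActionSpace:
    def __init__(self, act_dim, max_delta=0.1, low=-1.0, high=1.0):
        self.cmd = np.zeros(act_dim)      # the command becomes part of the state
        self.max_delta, self.low, self.high = max_delta, low, high

    def step(self, delta):
        """Integrate the policy's delta into the actual actuator command."""
        delta = np.clip(delta, -self.max_delta, self.max_delta)
        self.cmd = np.clip(self.cmd + delta, self.low, self.high)
        return self.cmd

def output_regularization(delta, coef=0.1):
    """Penalty added to the RL objective to discourage aggressive output jumps."""
    return coef * float(np.sum(delta ** 2))

actions = LiftedActionSpace(act_dim=2)    # e.g. [steering, throttle]
for delta in np.random.uniform(-0.2, 0.2, size=(5, 2)):
    print(actions.step(delta), output_regularization(delta))
```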


Self-Supervised Shadow Removal

arXiv.org Artificial Intelligence

Shadow removal is an important computer vision task aiming at the detection and successful removal of the shadow produced by an occluded light source, together with a photo-realistic restoration of the image contents. Decades of research have produced a multitude of hand-crafted restoration techniques and, more recently, learned solutions trained on pairs of shadowed and shadow-free images. In this work, we propose an unsupervised single-image shadow removal solution via self-supervised learning using a conditioned mask. In contrast to the existing literature, we do not require paired shadowed and shadow-free images; instead, we rely on self-supervision and jointly learn deep models to remove and add shadows to images. We validate our approach on the recently introduced ISTD and USR datasets. We improve quantitatively and qualitatively over the compared methods by a large margin and set a new state-of-the-art performance in single-image shadow removal.
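A sketch of the self-supervision loop suggested by the abstract: a removal network and an addition network, each conditioned on a shadow mask, trained so that adding a shadow back to the de-shadowed image reconstructs the input. The tiny network bodies and the plain L1 cycle loss are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TinyConvNet(nn.Module):            # stand-in; the real models are deeper
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 3, 3, padding=1))

    def forward(self, img, mask):        # condition on the shadow mask
        return self.net(torch.cat([img, mask], dim=1))

remove, add = TinyConvNet(), TinyConvNet()
img = torch.rand(2, 3, 64, 64)           # shadowed input images
mask = (torch.rand(2, 1, 64, 64) > 0.7).float()  # shadow region mask

shadow_free = remove(img, mask)          # jointly learned removal...
reshadowed = add(shadow_free, mask)      # ...and addition of shadows
cycle_loss = nn.functional.l1_loss(reshadowed, img)  # self-supervised signal
cycle_loss.backward()
```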


Commands 4 Autonomous Vehicles (C4AV) Workshop Summary

arXiv.org Artificial Intelligence

The task of visual grounding requires locating the most relevant region or object in an image, given a natural language query. So far, progress on this task has mostly been measured on curated datasets, which are not always representative of human spoken language. In this work, we deviate from recent, popular task settings and consider the problem in an autonomous vehicle scenario. In particular, we consider a setting in which passengers can give free-form natural language commands to a vehicle, each of which can be associated with an object in the street scene. To stimulate research on this topic, we organized the Commands for Autonomous Vehicles (C4AV) challenge based on the recent Talk2Car dataset. This paper presents the results of the challenge. First, we compare the benchmark against existing datasets for visual grounding. Second, we identify the aspects that render top-performing models successful and relate them to existing state-of-the-art models for visual grounding, in addition to detecting potential failure cases by evaluating on carefully selected subsets. Finally, we discuss several possibilities for future work.


Neural Architecture Search as Sparse Supernet

arXiv.org Machine Learning

This paper aims to broaden the problem of Neural Architecture Search from single-path and multi-path search to automated mixed-path search. In particular, we model the new problem as a sparse supernet with a novel continuous architecture representation that uses a mixture of sparsity constraints, i.e., the Sparse Group Lasso. The sparse supernet is expected to automatically discover sparsely mixed paths over a compact set of nodes. To optimize the proposed sparse supernet, we exploit a hierarchical accelerated proximal gradient algorithm within a bi-level optimization framework. Extensive experiments on CIFAR-10, CIFAR-100, Tiny ImageNet and ImageNet demonstrate that the proposed method is capable of searching for compact, general and powerful neural architectures.
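A sketch of the Sparse Group Lasso proximal step that drives this kind of sparsity: soft-threshold individual entries (lasso) and then shrink whole groups (group lasso), which can zero out entire candidate paths. This is the standard closed-form prox, not the paper's hierarchical accelerated algorithm; the group layout and step sizes below are illustrative assumptions.

```python
import numpy as np

def prox_sparse_group_lasso(w, lam1, lam2, step):
    """Proximal update for step * (lam1 * ||w||_1 + lam2 * ||w||_2) per group."""
    # Elementwise soft-thresholding (the lasso part).
    w = np.sign(w) * np.maximum(np.abs(w) - step * lam1, 0.0)
    # Group soft-thresholding (the group lasso part).
    norm = np.linalg.norm(w)
    if norm <= step * lam2:
        return np.zeros_like(w)          # the whole group (path) is pruned
    return w * (1.0 - step * lam2 / norm)

# Architecture weights for candidate operations on two edges (hypothetical).
groups = [np.array([0.9, -0.8, 0.7]), np.array([0.05, -0.02, 0.01])]
for g in groups:
    print(prox_sparse_group_lasso(g, lam1=0.01, lam2=0.1, step=0.5))
# The strong group survives (slightly shrunk); the weak group is zeroed out.
```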