Learning Spatial and Spatio-Temporal Pixel Aggregations for Image and Video Denoising
Xu, Xiangyu, Li, Muchen, Sun, Wenxiu, Yang, Ming-Hsuan
Existing denoising methods typically restore clean results by aggregating pixels from the noisy input. Instead of relying on hand-crafted aggregation schemes, we propose to explicitly learn this process with deep neural networks. We present a spatial pixel aggregation network that learns the pixel sampling and averaging strategies for image denoising. The proposed model naturally adapts to image structures and effectively improves the denoised results. Furthermore, we develop a spatio-temporal pixel aggregation network for video denoising that efficiently samples pixels across the spatio-temporal space, mitigating the misalignment issues caused by large motion in dynamic scenes. In addition, we introduce a new regularization term for effectively training the proposed video denoising model. We present extensive analysis of the proposed method and demonstrate that our model performs favorably against state-of-the-art image and video denoising approaches on both synthetic and real-world data.
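To make the aggregation idea concrete, here is a minimal PyTorch sketch of learned pixel averaging: a small network predicts per-pixel weights over a regular k x k neighborhood, and each output pixel is the weighted average of its neighbors. This is an illustration rather than the authors' implementation; the paper additionally learns where to sample, whereas the sketch fixes the sampling pattern, and `weight_net` and the kernel size are hypothetical choices.

```python
import torch
import torch.nn.functional as F

def aggregate(noisy, weights, k=5):
    """Weighted average of each pixel's k x k neighborhood.

    noisy:   (B, C, H, W) noisy input image
    weights: (B, k*k, H, W) predicted per-pixel aggregation weights
    """
    B, C, H, W = noisy.shape
    # Extract the k*k neighbors of every spatial location.
    patches = F.unfold(noisy, kernel_size=k, padding=k // 2)   # (B, C*k*k, H*W)
    patches = patches.view(B, C, k * k, H, W)
    # Normalize the weights so each output pixel is a convex combination.
    w = torch.softmax(weights, dim=1).unsqueeze(1)             # (B, 1, k*k, H, W)
    return (patches * w).sum(dim=2)                            # (B, C, H, W)

# A tiny stand-in for the weight-prediction branch (hypothetical architecture).
weight_net = torch.nn.Sequential(
    torch.nn.Conv2d(3, 32, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(32, 25, 3, padding=1),                     # 25 = 5*5 weights per pixel
)

noisy = torch.rand(1, 3, 64, 64)
denoised = aggregate(noisy, weight_net(noisy), k=5)
```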
Multi-path Neural Networks for On-device Multi-domain Visual Classification
Wang, Qifei, Ke, Junjie, Greaves, Joshua, Chu, Grace, Bender, Gabriel, Sbaiz, Luciano, Go, Alec, Howard, Andrew, Yang, Feng, Yang, Ming-Hsuan, Gilbert, Jeff, Milanfar, Peyman
Learning multiple domains/tasks with a single model is important for improving data efficiency and lowering inference cost for numerous vision tasks, especially on resource-constrained mobile devices. However, hand-crafting a multi-domain/task model can be both tedious and challenging. This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices. The proposed multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space. An adaptive balanced domain prioritization algorithm is proposed to balance the optimization of the joint model across multiple domains trained simultaneously. The determined multi-path model selectively shares parameters across domains in shared nodes while keeping domain-specific parameters within non-shared nodes in individual domain paths. This approach effectively reduces the total number of parameters and FLOPS, encouraging positive knowledge transfer while mitigating negative interference across domains. Extensive evaluations on the Visual Decathlon dataset demonstrate that the proposed multi-path model achieves state-of-the-art performance in terms of accuracy, model size, and FLOPS against other approaches using MobileNetV3-like architectures. Furthermore, the proposed method improves average accuracy over learning single-domain models individually, and reduces the total number of parameters and FLOPS by 78% and 32%, respectively, compared to the approach that simply bundles single-domain models for multi-domain learning.
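A simplified sketch of the path-selection mechanism, not the actual MobileNetV3-like search space: each domain's reinforcement-learning controller samples one candidate block per layer of a shared super-network and is updated with a REINFORCE-style policy gradient. The tiny convolutional blocks, layer counts, and reward value below are all illustrative.

```python
import torch
import torch.nn as nn

# Super-network sketch: every layer offers several candidate blocks; a "path" is
# one block index per layer. Domains that pick the same block at a layer share
# its parameters; other blocks stay domain-specific. All sizes are illustrative.
class SuperNet(nn.Module):
    def __init__(self, num_layers=4, num_choices=3, width=32):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.ModuleList([nn.Conv2d(width, width, 3, padding=1)
                           for _ in range(num_choices)])
            for _ in range(num_layers)])

    def forward(self, x, path):
        for layer, choice in zip(self.layers, path):
            x = torch.relu(layer[choice](x))
        return x

# One REINFORCE controller per domain: a categorical policy over block choices.
class Controller(nn.Module):
    def __init__(self, num_layers=4, num_choices=3):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_layers, num_choices))

    def sample(self):
        dist = torch.distributions.Categorical(logits=self.logits)
        path = dist.sample()                      # one block choice per layer
        return path.tolist(), dist.log_prob(path).sum()

supernet, controller = SuperNet(), Controller()
path, log_prob = controller.sample()
out = supernet(torch.randn(1, 32, 8, 8), path)
reward = 0.8                                      # e.g., validation accuracy of this path
(-reward * log_prob).backward()                   # policy-gradient controller update
```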
Semi-Supervised Learning with Meta-Gradient
Zhang, Xin-Yu, Jia, Hao-Lin, Xiao, Taihong, Cheng, Ming-Ming, Yang, Ming-Hsuan
In this work, we propose a simple yet effective meta-learning algorithm for the semi-supervised setting. We observe that existing consistency-based approaches largely ignore the essential role of label information in consistency regularization. To alleviate this issue, we connect the consistency loss to the label information by unfolding and differentiating through one optimization step. Specifically, we update the pseudo labels of the unlabeled examples using the meta-gradients of the labeled data loss so that the model generalizes well on the labeled examples. In addition, we introduce a simple first-order approximation to avoid computing higher-order derivatives and to guarantee scalability. Extensive evaluations on the SVHN, CIFAR, and ImageNet datasets demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
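A toy sketch of the unrolled meta-step on a linear model: one SGD update on the pseudo-labeled loss is differentiated through, so the labeled loss at the updated weights yields a meta-gradient for the pseudo labels. For clarity the sketch computes the exact higher-order gradient; the paper's first-order approximation avoids this cost. All shapes and step sizes are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
w = torch.randn(10, 3, requires_grad=True)             # toy model parameters
x_u, x_l = torch.randn(8, 10), torch.randn(4, 10)      # unlabeled / labeled batches
y_l = torch.randint(0, 3, (4,))
pseudo = torch.zeros(8, 3, requires_grad=True)         # learnable soft pseudo labels

# Inner step: one SGD update on the pseudo-labeled loss, keeping the graph so
# the updated weights remain a differentiable function of the pseudo labels.
probs = torch.softmax(pseudo, dim=1)
inner_loss = -(probs * F.log_softmax(x_u @ w, dim=1)).sum(dim=1).mean()
g, = torch.autograd.grad(inner_loss, w, create_graph=True)
w_new = w - 0.1 * g

# Outer step: the labeled loss at the updated weights guides the pseudo labels.
outer_loss = F.cross_entropy(x_l @ w_new, y_l)
meta_grad, = torch.autograd.grad(outer_loss, pseudo)
pseudo = (pseudo - 1.0 * meta_grad).detach()           # refined pseudo labels
```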
Joint-task Self-supervised Learning for Temporal Correspondence
Li, Xueting, Liu, Sifei, Mello, Shalini De, Wang, Xiaolong, Kautz, Jan, Yang, Ming-Hsuan
This paper proposes to learn reliable dense correspondence from videos in a self-supervised manner. Our learning process integrates two highly related tasks: tracking large image regions and establishing fine-grained pixel-level associations between consecutive video frames. We exploit the synergy between both tasks through a shared inter-frame affinity matrix, which simultaneously models transitions between video frames at both the region and pixel levels. Region-level localization helps reduce ambiguities in fine-grained matching by narrowing down the search regions, while fine-grained matching provides bottom-up features that facilitate region-level localization. Our method outperforms the state-of-the-art self-supervised methods on a variety of visual correspondence tasks, including video-object and part-segmentation propagation, keypoint tracking, and object tracking.
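The shared affinity can be sketched in a few lines: dot products between L2-normalized per-pixel features of two frames, normalized with a softmax, give a transition matrix that transports labels (or region masks) between frames. The temperature and number of classes below are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def inter_frame_affinity(f1, f2, tau=0.07):
    """Dense affinity between all pixels of two frames.

    f1, f2: (C, H, W) L2-normalized features of consecutive frames.
    Returns A of shape (N1, N2), normalized over the frame-1 pixels.
    """
    C = f1.size(0)
    a = f1.reshape(C, -1).t() @ f2.reshape(C, -1)      # (H*W, H*W) similarities
    return torch.softmax(a / tau, dim=0)               # tau: illustrative temperature

C, H, W = 64, 32, 32
f1 = F.normalize(torch.randn(C, H, W), dim=0)
f2 = F.normalize(torch.randn(C, H, W), dim=0)
A = inter_frame_affinity(f1, f2)

# Pixel-level use: transport per-pixel soft labels from frame 1 to frame 2.
labels1 = torch.rand(H * W, 5)                         # 5 illustrative classes
labels2 = A.t() @ labels1                              # (H*W, 5) labels in frame 2
```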
Self-supervised Audio Spatialization with Correspondence Classifier
Lu, Yu-Ding, Lee, Hsin-Ying, Tseng, Hung-Yu, Yang, Ming-Hsuan
Spatial audio is an essential medium for delivering a 3D visual and auditory experience to audiences. However, the recording devices and techniques are expensive or inaccessible to the general public. In this work, we propose a self-supervised audio spatialization network that can generate spatial audio given the corresponding video and monaural audio. To enhance spatialization performance, we use an auxiliary classifier to distinguish ground-truth videos from those whose audio has the left and right channels swapped. We collect a large-scale video dataset with spatial audio to validate the proposed method. Experimental results demonstrate the effectiveness of the proposed model on the audio spatialization task.
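The auxiliary classifier can be sketched as a binary discriminator: real video/stereo pairs are positives, and pairs whose audio has the left and right channels flipped are negatives. The encoders and feature dimensions below are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Correspondence classifier: tells real stereo audio apart from channel-swapped
# audio, conditioned on the accompanying video features. Dims are illustrative.
class CorrespondenceClassifier(nn.Module):
    def __init__(self, audio_dim=128, video_dim=128):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(audio_dim + video_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, audio_feat, video_feat):
        return self.head(torch.cat([audio_feat, video_feat], dim=1))

def swap_channels(stereo):
    """Negative example: flip the left and right channels. stereo: (B, 2, T)"""
    return stereo.flip(dims=[1])

clf = CorrespondenceClassifier()
bce = nn.BCEWithLogitsLoss()
stereo = torch.randn(4, 2, 16000)
audio_enc = nn.Sequential(nn.Flatten(), nn.Linear(2 * 16000, 128))  # stand-in encoder
video_feat = torch.randn(4, 128)                                    # stand-in video feature

real_logit = clf(audio_enc(stereo), video_feat)
fake_logit = clf(audio_enc(swap_channels(stereo)), video_feat)
loss = bce(real_logit, torch.ones_like(real_logit)) + \
       bce(fake_logit, torch.zeros_like(fake_logit))
```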
Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation
Ren, Wenqi, Zhang, Jiawei, Ma, Lin, Pan, Jinshan, Cao, Xiaochun, Zuo, Wangmeng, Liu, Wei, Yang, Ming-Hsuan
In this paper, we present a deep convolutional neural network to capture the inherent properties of image degradation, which can handle different kernels and saturated pixels in a unified framework. The proposed neural network is motivated by the low-rank property of pseudo-inverse kernels. Specifically, we first compute a generalized low-rank approximation to a large number of blur kernels, and then use separable filters to initialize the convolutional parameters in the network. Our analysis shows that the estimated decomposed matrices contain the most essential information of the input kernel, which enables the proposed network to handle various blurs in a unified framework and generate high-quality deblurring results. Experimental results on benchmark datasets with noisy and saturated pixels demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
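The separable-filter initialization can be illustrated with a per-kernel SVD: truncating the decomposition gives 1D vertical and horizontal filters whose outer products reconstruct the kernel, which is how separable convolution layers can be initialized. Note the paper computes a generalized low-rank approximation jointly over many kernels rather than the independent per-kernel SVD sketched here; the rank and kernel sizes are illustrative.

```python
import numpy as np

def separable_init(kernels, rank=4):
    """Low-rank approximation of a stack of (pseudo-inverse) blur kernels.

    kernels: (N, k, k) array of 2D kernels.
    Returns, per kernel, 1D column/row filters whose outer products reconstruct
    the kernel, usable to initialize separable convolution layers.
    """
    filters = []
    for K in kernels:
        U, s, Vt = np.linalg.svd(K)                      # K = U @ diag(s) @ Vt
        cols = U[:, :rank] * np.sqrt(s[:rank])           # (k, rank) vertical filters
        rows = (Vt[:rank, :].T * np.sqrt(s[:rank])).T    # (rank, k) horizontal filters
        filters.append((cols, rows))
    return filters

# Toy check: rank-4 reconstruction of a random 15x15 kernel.
kernels = np.random.rand(2, 15, 15)
(cols, rows), _ = separable_init(kernels)
approx = cols @ rows                                     # sum of rank-1 outer products
err = np.linalg.norm(kernels[0] - approx) / np.linalg.norm(kernels[0])
```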
Context-aware Synthesis and Placement of Object Instances
Lee, Donghoon, Liu, Sifei, Gu, Jinwei, Liu, Ming-Yu, Yang, Ming-Hsuan, Kautz, Jan
Learning to insert an object instance into an image in a semantically coherent manner is a challenging and interesting problem. Solving it requires (a) determining a location to place an object in the scene and (b) determining its appearance at the location. Such an object insertion model can potentially facilitate numerous image editing and scene parsing applications. In this paper, we propose an end-to-end trainable neural network for the task of inserting an object instance mask of a specified class into the semantic label map of an image. Our network consists of two generative modules where one determines where the inserted object mask should be (i.e., location and scale) and the other determines what the object mask shape (and pose) should look like. The two modules are connected together via a spatial transformation network and jointly trained. We devise a learning procedure that leverages both supervised and unsupervised data and show that our model can insert an object at diverse locations with various appearances. We conduct extensive experimental validations with comparisons to strong baselines to verify the effectiveness of the proposed network. Code is available at https://github.com/NVlabs/Instance_Insertion.
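The where/what decomposition can be sketched with a spatial transformer: a predicted affine transform (location and scale, the "where") warps a generated unit mask (the "what") into the full-resolution semantic map. The hand-set `scale`, `tx`, and `ty` values below stand in for the outputs of the two generative modules; this is an illustration, not the released implementation.

```python
import torch
import torch.nn.functional as F

def place_mask(mask, theta, out_hw):
    """mask: (B, 1, h, w) generated object mask; theta: (B, 2, 3) affine params."""
    B = mask.size(0)
    grid = F.affine_grid(theta, size=(B, 1, *out_hw), align_corners=False)
    return F.grid_sample(mask, grid, align_corners=False)  # zeros outside the mask

B, H, W = 1, 128, 128
mask = (torch.rand(B, 1, 32, 32) > 0.5).float()          # "what": generated mask
scale, tx, ty = 0.25, 0.3, -0.2                          # "where": predicted placement
# affine_grid maps output coords to input coords: p_in = (p_out - t) / scale.
theta = torch.tensor([[[1 / scale, 0.0, -tx / scale],
                       [0.0, 1 / scale, -ty / scale]]])
placed = place_mask(mask, theta, (H, W))                 # mask warped into the scene
semantic_map = torch.zeros(B, 1, H, W)
composited = torch.clamp(semantic_map + placed, 0, 1)
```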
Deep Attentive Tracking via Reciprocative Learning
Pu, Shi, Song, Yibing, Ma, Chao, Zhang, Honggang, Yang, Ming-Hsuan
Visual attention, a concept derived from cognitive neuroscience, facilitates human perception by focusing on the most pertinent subset of the sensory data. Recently, significant efforts have been made to exploit attention schemes to advance computer vision systems. For visual tracking, it is often challenging to track target objects undergoing large appearance changes. Attention maps facilitate visual tracking by selectively paying attention to temporally robust features. Existing tracking-by-detection approaches mainly use additional attention modules to generate feature weights, as the classifiers are not equipped with such mechanisms. In this paper, we propose a reciprocative learning algorithm to exploit visual attention for training deep classifiers. The proposed algorithm consists of feed-forward and backward operations to generate attention maps, which serve as regularization terms coupled with the original classification loss function for training. The deep classifier learns to attend to the regions of target objects robust to appearance changes. Extensive experiments on large-scale benchmark datasets show that the proposed attentive tracking method performs favorably against the state-of-the-art approaches.
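The following sketch illustrates the reciprocative mechanism: a forward pass produces classification scores, a backward pass of the true-class score onto the input features yields attention maps, and a statistic of those maps enters the training loss as a regularizer. The toy classifier and the particular mean-to-deviation regularizer are illustrative stand-ins, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

# Toy stand-in for the tracking-by-detection classifier.
classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
features = torch.rand(4, 3, 32, 32, requires_grad=True)
labels = torch.tensor([1, 0, 1, 1])

# Forward pass: classification scores for the true classes.
logits = classifier(features)
score = logits.gather(1, labels.view(-1, 1)).sum()

# Backward pass onto the input yields per-pixel attention; keep the graph so the
# attention-based regularizer can itself be trained through.
attn, = torch.autograd.grad(score, features, create_graph=True)
attn = attn.relu().mean(dim=1).flatten(1)                # (B, H*W) attention maps

ce = nn.functional.cross_entropy(logits, labels)
# Illustrative regularizer: reward high, consistent attention responses via a
# large mean-to-deviation ratio of each sample's attention map.
reg = (attn.mean(dim=1) / (attn.std(dim=1) + 1e-8)).mean()
loss = ce - 0.1 * reg
loss.backward()
```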