 filter


Pruning Filter in Filter

Neural Information Processing Systems

Pruning has become a very powerful and effective technique to compress and accelerate modern neural networks. Existing pruning methods can be grouped into two categories: filter pruning (FP) and weight pruning (WP). FP wins at hardware compatibility but loses at compression ratio compared with WP. To combine the strengths of both methods, we propose to prune the filter in the filter. Specifically, we treat a filter F of size C × K × K as K × K stripes (i.e., 1 × 1 filters); then, by pruning stripes instead of whole filters, we can achieve finer granularity than traditional FP while remaining hardware friendly.
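A minimal sketch of the stripe granularity described above, assuming a layer stored as an array of shape (N, C, K, K). The abstract does not say how stripes are selected, so the L1-magnitude criterion, the function name `stripe_mask`, and the `keep_ratio` parameter are illustrative assumptions, not the authors' method.

```python
import numpy as np

def stripe_mask(weights, keep_ratio):
    """Return a boolean mask of shape (N, K, K) marking the stripes to keep.

    weights: array of shape (N, C, K, K), i.e. N filters with C channels each.
    A stripe is the length-C vector at one spatial position of one filter.
    Selection by L1 magnitude is an assumption made for this sketch.
    """
    # L1 norm of every stripe: sum |w| over the channel axis -> (N, K, K)
    norms = np.abs(weights).sum(axis=1)
    n_keep = max(1, int(round(keep_ratio * norms.size)))
    # Keep every stripe whose norm reaches the n_keep-th largest value
    threshold = np.sort(norms, axis=None)[-n_keep]
    return norms >= threshold

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3, 3, 3))        # 4 filters, 3 channels, 3x3 kernels
mask = stripe_mask(w, keep_ratio=0.5)    # one decision per stripe
pruned = w * mask[:, None, :, :]         # zero out the pruned stripes
```

Filter pruning would make one keep/drop decision per filter (4 here), whereas the stripe mask makes one per spatial position of each filter (36 here), which is the finer granularity the abstract refers to.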


No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models

Neural Information Processing Systems

We study cultural and socioeconomic diversity in contrastive vision-language models (VLMs). Using a broad range of benchmark datasets and evaluation metrics, we bring to attention several important findings. Notably, this performance gap is not captured by, and is even at odds with, the currently popular evaluation metrics derived from the Western-centric ImageNet and COCO datasets. Second, pretraining with global, unfiltered data before fine-tuning on English content can improve cultural understanding without sacrificing performance on said popular benchmarks. Third, we introduce the task of geo-localization as a novel evaluation metric to assess cultural diversity in VLMs.


Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

Huang, Shengyi, Gallouédec, Quentin, Felten, Florian, Raffin, Antonin, Dossa, Rousslan Fernand Julien, Zhao, Yanxiao, Sullivan, Ryan, Makoviychuk, Viktor, Makoviichuk, Denys, Danesh, Mohamad H., Roumégous, Cyril, Weng, Jiayi, Chen, Chufan, Rahman, Md Masudur, Araújo, João G. M., Quan, Guorui, Tan, Daniel, Klein, Timo, Charakorn, Rujikorn, Towers, Mark, Berthelot, Yann, Mehta, Kinal, Chakraborty, Dipam, KG, Arjun, Charraut, Valentin, Ye, Chang, Liu, Zichen, Alegre, Lucas N., Nikulin, Alexander, Hu, Xiao, Liu, Tianlin, Choi, Jongwook, Yi, Brent

arXiv.org Artificial Intelligence

In many Reinforcement Learning (RL) papers, learning curves are useful indicators to measure the effectiveness of RL algorithms. However, the complete raw data of the learning curves are rarely available. As a result, it is usually necessary to reproduce the experiments from scratch, which can be time-consuming and error-prone. We present Open RL Benchmark, a set of fully tracked RL experiments, including not only the usual data such as episodic return, but also all algorithm-specific and system metrics. Open RL Benchmark is community-driven: anyone can download, use, and contribute to the data. At the time of writing, more than 25,000 runs have been tracked, for a cumulative duration of more than 8 years. Open RL Benchmark covers a wide range of RL libraries and reference implementations. Special care is taken to ensure that each experiment is precisely reproducible by providing not only the full parameters, but also the versions of the dependencies used to generate it. In addition, Open RL Benchmark comes with a command-line interface (CLI) for easy fetching and generating figures to present the results. In this document, we include two case studies to demonstrate the usefulness of Open RL Benchmark in practice. To the best of our knowledge, Open RL Benchmark is the first RL benchmark of its kind, and the authors hope that it will improve and facilitate the work of researchers in the field.


On original and latent space connectivity in deep neural networks

Gu, Boyang, Borovykh, Anastasia

arXiv.org Artificial Intelligence

The manifold hypothesis states that high-dimensional real-world data typically lies in a lower-dimensional submanifold, the axes of this dimensionality-reduced space representing factors of variation [26, 5]. Relatedly, the flattening hypothesis [2] and work in disentanglement [22] state that throughout learning, subsequent layers in a deep neural network (DNN) disentangle the data in such a way that finally a linear model can separate the classes. Understanding how a DNN itself views its input space can be related to explainability (e.g.


Discovering Object-Centric Generalized Value Functions From Pixels

Nath, Somjit, Subbaraj, Gopeshh Raaj, Khetarpal, Khimya, Kahou, Samira Ebrahimi

arXiv.org Artificial Intelligence

Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs, albeit using hand-crafted auxiliary tasks and pseudo rewards. Automatically learning such representations in an object-centric manner geared towards control and fast adaptation remains an open research problem. In this paper, we introduce a method that tries to discover meaningful features from objects, translating them to temporally coherent "question" functions and leveraging the subsequent learned general value functions for control. We compare our approach with state-of-the-art techniques alongside other ablations and show competitive performance in both stationary and non-stationary settings. Finally, we also investigate the discovered general value functions and through qualitative analysis show that the learned representations are not only interpretable but also centered around objects that are invariant to changes across tasks, facilitating fast adaptation.


Bayesian MRI Reconstruction with Joint Uncertainty Estimation using Diffusion Models

Luo, Guanxiong, Blumenthal, Moritz, Heide, Martin, Uecker, Martin

arXiv.org Artificial Intelligence

We introduce a framework that enables efficient sampling from learned probability distributions for MRI reconstruction. Different from conventional deep learning-based MRI reconstruction techniques, samples are drawn from the posterior distribution given the measured k-space using the Markov chain Monte Carlo (MCMC) method. In addition to the maximum a posteriori (MAP) estimate for the image, which can be obtained with conventional methods, the minimum mean square error (MMSE) estimate and uncertainty maps can also be computed. The data-driven Markov chains are constructed from the generative model learned from a given image database and are independent of the forward operator that is used to model the k-space measurement. This provides flexibility because the method can be applied to k-space acquired with different sampling schemes or receive coils using the same pre-trained models. Furthermore, we use a framework based on a reverse diffusion process to be able to utilize advanced generative models. The performance of the method is evaluated on an open dataset using 10-fold undersampling in k-space.
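As a rough illustration of how the MMSE estimate and an uncertainty map can be computed once posterior samples are available: the MMSE estimate is the posterior mean, which can be approximated by the per-pixel sample mean, and the per-pixel sample variance is one common choice of uncertainty map. The function name, array layout, and toy values below are assumptions for this sketch. The sampling step itself (MCMC with a learned prior) is not shown.

```python
import numpy as np

def summarize_posterior_samples(samples):
    """Summarize posterior image samples.

    samples: array of shape (S, H, W) holding S images drawn from the posterior.
    Returns (mmse, uncertainty), each of shape (H, W).
    """
    # The sample mean approximates the posterior mean, i.e. the MMSE estimate
    mmse = samples.mean(axis=0)
    # Per-pixel sample variance; for complex input np.var uses |x - mean|^2,
    # so the map is real and non-negative
    uncertainty = samples.var(axis=0)
    return mmse, uncertainty

# Toy input: two 1x2 "images" standing in for real posterior samples
samples = np.array([[[1.0, 2.0]],
                    [[3.0, 2.0]]])
mmse, uncertainty = summarize_posterior_samples(samples)
```

In the toy input the samples disagree at the first pixel and agree at the second, so the variance map is nonzero only at the first pixel.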


Disentanglement by Cyclic Reconstruction

Bertoin, David, Rachelson, Emmanuel

arXiv.org Artificial Intelligence

Deep neural networks have demonstrated their ability to automatically extract meaningful features from data. However, in supervised learning, information specific to the dataset used for training, but irrelevant to the task at hand, may remain encoded in the extracted representations. This remaining information introduces a domain-specific bias, weakening the generalization performance. In this work, we propose splitting the information into a task-related representation and its complementary context representation. We propose an original method, combining adversarial feature predictors and cyclic reconstruction, to disentangle these two representations in the single-domain supervised case. We then adapt this method to the unsupervised domain adaptation problem, consisting of training a model capable of performing on both a source and a target domain. In particular, our method promotes disentanglement in the target domain, despite the absence of training labels. This enables the isolation of task-specific information from both domains and a projection into a common representation. The task-specific representation allows efficient transfer of knowledge acquired from the source domain to the target domain. In the single-domain case, we demonstrate the quality of our representations on information retrieval tasks and the generalization benefits induced by sharpened task-specific representations. We then validate the proposed method on several classical domain adaptation benchmarks and illustrate the benefits of disentanglement for domain adaptation.


Do We Really 'Lose Our Filter' as We Age? – Neuroscience News

#artificialintelligence

From age-related brain shrinkage that may affect our social cognition, to feeling more confident in our own skin, researchers investigate why …


iphone-8-iphone-8-plus-announcement-release-date

Engadget

The iPhone 8 will pack a new 4.7-inch Retina HD display, while the iPhone 8 Plus has a 5.5-inch Retina HD display -- what's new here is the addition of Apple's True Tone tech. Apple has embedded Qi inductive wireless charging, meaning that both phones will charge on compatible pucks and surfaces when they launch. Qi charging surfaces are already on sale pretty much everywhere.


#artificialintelligence

As described in the report Future Work Skills 2020 (Institute for the Future), working with data is becoming very important. For example, skills such as cognitive load management and computational thinking are essential to analyze, filter, and manipulate data. We need computational thinking skills to translate large amounts of data into abstract concepts and to make data-based decisions. As the amount of data increases, it becomes important to filter out the right information (cognitive load management).