Pruning Filter in Filter

Neural Information Processing Systems

Pruning has become a very powerful and effective technique to compress and accelerate modern neural networks. Existing pruning methods can be grouped into two categories: filter pruning (FP) and weight pruning (WP). FP wins at hardware compatibility but loses at the compression ratio compared with WP. To combine the strengths of both methods, we propose to prune the filter in the filter. Specifically, we treat a filter F, whose size is C × K × K, as K × K stripes (i.e., 1 × 1 filters of length C); then, by pruning the stripes instead of the whole filter, we achieve finer granularity than traditional FP while remaining hardware friendly.
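The stripe view described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the filter shape, the L1-norm ranking criterion, and the keep-half-the-stripes threshold are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
C, K = 4, 3                      # channels, kernel size
f = rng.normal(size=(C, K, K))   # one filter of shape C x K x K

# View the filter as K*K "stripes": the length-C vector at each
# spatial position (i, j). Rank stripes by their L1 norm.
stripe_l1 = np.abs(f).sum(axis=0)          # shape (K, K)

# Keep the stripes at or above the median norm and zero the rest.
# This is finer-grained than dropping the whole filter, yet each
# surviving stripe is still a dense 1x1 convolution.
keep = stripe_l1 >= np.median(stripe_l1)
pruned = f * keep[None, :, :]

print("stripes kept:", int(keep.sum()), "of", K * K)
```

In a real network the pruned stripes would then be removed entirely (not just zeroed), shrinking the actual compute.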


No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models

Neural Information Processing Systems

We study cultural and socioeconomic diversity in contrastive vision-language models (VLMs). Using a broad range of benchmark datasets and evaluation metrics, we bring to attention several important findings. First, we identify a gap in cultural understanding; notably, this performance gap is not captured by - and is even at odds with - the currently popular evaluation metrics derived from the Western-centric ImageNet and COCO datasets. Second, pretraining with global, unfiltered data before fine-tuning on English content can improve cultural understanding without sacrificing performance on said popular benchmarks. Third, we introduce the task of geo-localization as a novel evaluation metric to assess cultural diversity in VLMs.


Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

Huang, Shengyi, Gallouédec, Quentin, Felten, Florian, Raffin, Antonin, Dossa, Rousslan Fernand Julien, Zhao, Yanxiao, Sullivan, Ryan, Makoviychuk, Viktor, Makoviichuk, Denys, Danesh, Mohamad H., Roumégous, Cyril, Weng, Jiayi, Chen, Chufan, Rahman, Md Masudur, Araújo, João G. M., Quan, Guorui, Tan, Daniel, Klein, Timo, Charakorn, Rujikorn, Towers, Mark, Berthelot, Yann, Mehta, Kinal, Chakraborty, Dipam, KG, Arjun, Charraut, Valentin, Ye, Chang, Liu, Zichen, Alegre, Lucas N., Nikulin, Alexander, Hu, Xiao, Liu, Tianlin, Choi, Jongwook, Yi, Brent

arXiv.org Artificial Intelligence

In many Reinforcement Learning (RL) papers, learning curves are useful indicators to measure the effectiveness of RL algorithms. However, the complete raw data of the learning curves are rarely available. As a result, it is usually necessary to reproduce the experiments from scratch, which can be time-consuming and error-prone. We present Open RL Benchmark, a set of fully tracked RL experiments, including not only the usual data such as episodic return, but also all algorithm-specific and system metrics. Open RL Benchmark is community-driven: anyone can download, use, and contribute to the data. At the time of writing, more than 25,000 runs have been tracked, for a cumulative duration of more than 8 years. Open RL Benchmark covers a wide range of RL libraries and reference implementations. Special care is taken to ensure that each experiment is precisely reproducible by providing not only the full parameters, but also the versions of the dependencies used to generate it. In addition, Open RL Benchmark comes with a command-line interface (CLI) for easily fetching data and generating figures to present the results. In this document, we include two case studies to demonstrate the usefulness of Open RL Benchmark in practice. To the best of our knowledge, Open RL Benchmark is the first RL benchmark of its kind, and the authors hope that it will improve and facilitate the work of researchers in the field.


On original and latent space connectivity in deep neural networks

Gu, Boyang, Borovykh, Anastasia

arXiv.org Artificial Intelligence

The manifold hypothesis states that high-dimensional real-world data typically lies in a lower-dimensional submanifold, the axes of this dimensionality-reduced space representing factors of variation [26, 5]. Relatedly, the flattening hypothesis [2] and work in disentanglement [22] state that throughout learning, subsequent layers in a deep neural network (DNN) disentangle the data in such a way that finally a linear model can separate the classes. Understanding how a DNN itself views its input space can be related to explainability (e.g.


Discovering Object-Centric Generalized Value Functions From Pixels

Nath, Somjit, Subbaraj, Gopeshh Raaj, Khetarpal, Khimya, Kahou, Samira Ebrahimi

arXiv.org Artificial Intelligence

Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs, albeit using hand-crafted auxiliary tasks and pseudo-rewards. Automatically learning such representations in an object-centric manner, geared towards control and fast adaptation, remains an open research problem. In this paper, we introduce a method that discovers meaningful features from objects, translating them to temporally coherent "question" functions and leveraging the subsequently learned general value functions for control. We compare our approach with state-of-the-art techniques alongside other ablations and show competitive performance in both stationary and non-stationary settings. Finally, we also investigate the discovered general value functions and, through qualitative analysis, show that the learned representations are not only interpretable but also centered around objects that are invariant to changes across tasks, facilitating fast adaptation.


Bayesian MRI Reconstruction with Joint Uncertainty Estimation using Diffusion Models

Luo, Guanxiong, Blumenthal, Moritz, Heide, Martin, Uecker, Martin

arXiv.org Artificial Intelligence

We introduce a framework that enables efficient sampling from learned probability distributions for MRI reconstruction. Different from conventional deep learning-based MRI reconstruction techniques, samples are drawn from the posterior distribution given the measured k-space using the Markov chain Monte Carlo (MCMC) method. In addition to the maximum a posteriori (MAP) estimate for the image, which can be obtained with conventional methods, the minimum mean square error (MMSE) estimate and uncertainty maps can also be computed. The data-driven Markov chains are constructed from the generative model learned from a given image database and are independent of the forward operator that is used to model the k-space measurement. This provides flexibility because the method can be applied to k-space acquired with different sampling schemes or receive coils using the same pre-trained models. Furthermore, we use a framework based on a reverse diffusion process to be able to utilize advanced generative models. The performance of the method is evaluated on an open dataset using 10-fold undersampling in k-space.


What are you wearing? Building a CNN model to predict articles of clothing.

#artificialintelligence

In this article, we will explore the architecture of a CNN, talk about what convolution does, and build a functioning model that predicts Fashion MNIST data with 90% accuracy across 10 classes. So what is this special process that CNNs get their name from? When we convolve over an image, we slide a "kernel" matrix across it and take an element-wise product-and-sum at each position; the kernel's weights are learned during the training process. In the example above we have a 5 × 5 image and a 3 × 3 kernel. Stride is how far we move the kernel each step; by default the stride is 1, so we move it only one column or row at a time.
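The sliding-window operation described above can be written out directly. This is a minimal NumPy sketch for illustration (real frameworks use optimized kernels); the image and kernel values are made up, and, like most deep-learning libraries, it skips the kernel flip of textbook convolution.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image (no padding) and take the
    element-wise product-and-sum at each position."""
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1   # output height
    ow = (image.shape[1] - kw) // stride + 1   # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
kernel = np.ones((3, 3)) / 9.0                     # 3x3 averaging kernel

print(conv2d(image, kernel).shape)            # 5x5 input, 3x3 kernel, stride 1 -> 3x3
print(conv2d(image, kernel, stride=2).shape)  # stride 2 -> 2x2
```

Note how the output shrinks as the stride grows: the output side length is `(input - kernel) // stride + 1`.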


A Brief Introduction to Kalman Filters - KDnuggets

#artificialintelligence

Can you measure the temperature inside the core of a nuclear reactor to make sure the nuclear reaction is controlled? It is certainly too hot for any thermometer manufactured to date. The closest one can get is to measure the temperature of a surface near the core and estimate the temperature inside it. Let us consider another example, where direct measurement of a phenomenon is likewise not possible, to internalize this concept: can you measure the exact position of a flying object using radar, given variable air density, wind direction, and wind speed? What if the wind changed direction?
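A Kalman filter handles exactly this estimate-from-noisy-readings problem by blending a prediction with each new measurement. Below is a minimal one-dimensional sketch for the reactor example; the temperature, noise levels, and process-noise value are all invented for illustration.

```python
import numpy as np

# Estimate a hidden, nearly constant quantity (say, a core-adjacent
# temperature) from repeated noisy sensor readings.
rng = np.random.default_rng(1)
true_temp = 350.0
meas_noise = 5.0                                        # sensor std dev
readings = true_temp + rng.normal(0, meas_noise, size=50)

x = readings[0]        # state estimate, initialised from first reading
p = meas_noise ** 2    # variance of the estimate
q = 1e-4               # process noise: the temperature barely drifts
r = meas_noise ** 2    # measurement noise variance

for z in readings[1:]:
    p = p + q                    # predict: uncertainty grows slightly
    k = p / (p + r)              # Kalman gain: how much to trust z
    x = x + k * (z - x)          # update estimate toward the reading
    p = (1 - k) * p              # updated (reduced) uncertainty

print(x)   # should land near the true 350.0, closer than any single reading
```

Each pass shrinks the estimate's variance `p`, so later readings nudge `x` less and less, which is exactly the averaging behaviour you want for a nearly constant signal.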


Disentanglement by Cyclic Reconstruction

Bertoin, David, Rachelson, Emmanuel

arXiv.org Artificial Intelligence

Deep neural networks have demonstrated their ability to automatically extract meaningful features from data. However, in supervised learning, information specific to the dataset used for training, but irrelevant to the task at hand, may remain encoded in the extracted representations. This remaining information introduces a domain-specific bias, weakening the generalization performance. In this work, we propose splitting the information into a task-related representation and its complementary context representation. We propose an original method, combining adversarial feature predictors and cyclic reconstruction, to disentangle these two representations in the single-domain supervised case. We then adapt this method to the unsupervised domain adaptation problem, consisting of training a model capable of performing on both a source and a target domain. In particular, our method promotes disentanglement in the target domain, despite the absence of training labels. This enables the isolation of task-specific information from both domains and a projection into a common representation. The task-specific representation allows efficient transfer of knowledge acquired from the source domain to the target domain. In the single-domain case, we demonstrate the quality of our representations on information retrieval tasks and the generalization benefits induced by sharpened task-specific representations. We then validate the proposed method on several classical domain adaptation benchmarks and illustrate the benefits of disentanglement for domain adaptation.


AI Tool Hub : Supercharge your content creation - Prelaunch Sale 🤟

#artificialintelligence

Want to supercharge your content creation process? 🔥 Pro Tip: Use AI Tool Hub to build a tool stack that works for you.

🔥 Does content creation intimidate you? 😥 Do you want to create kickass content but end up wasting time on mundane, repetitive tasks instead? 😫 Are you wasting your precious time finding ideas, checking grammar, and paraphrasing? 🙄 Have you been dreaming about starting that podcast, YouTube channel, or video course, scared to even think about editing? 😵 Want to create content for driving growth but don't know where to even begin? 😤 Are you still breaking your head against "writer's block" in 2022? 🤦‍♀️

I know the struggle. The list of challenges a digital content creator faces every day is endless, whether you are a writer, an editor, a marketer, a podcaster, a vlogger, or even an aspiring creator. But this is 2022, AI is here, and it doesn't have to be like that anymore. No matter what your content creation needs are, there is an AI tool for you 🎉🥂🎶 Don't waste your time and creative energy on time-guzzling tasks that an AI tool can do for you.

Wait, but then, finding the right AI tools is a rabbit hole! 🤯 You leave aside your project and drown yourself in finding the right AI tools. And before you know it, you've wasted hours, sometimes even days. I know, I've been there myself, for years. The pain is real and it sucks. That's why I built AI Tool Hub, for myself, and for you 👇

🚀 AI Tool Hub: the most comprehensive directory of AI Tools for writers, marketers & creators 🚀

I spent 250+ hours building AI Tool Hub, so that you don't have to. It's a directory of 80 AI Tools, specifically curated for writers, marketers, and digital creators. You will never again waste hours doing tasks that a tool can do in minutes, or searching for the right tools instead of creating awesome content. Folks tell me that this is the most comprehensive curated directory of AI tools, and there's nothing like it out there 😎

In AI Tool Hub you will find:
✔ Curated list of 80 AI Tools
✔ 15+ Categories
✔ 25+ Use-cases
✔ Who could use it: 15+ User Personas
✔ Filter by Pricing Plans
✔ Filter by Ease of Use
✔ Standout USPs
✔ Red Flags
✔ and more...

What will you get?
🚀 List of 80 hand-picked AI tools for writers, marketers & creators
🚀 Airtable directory with 10 properties & 3 views to filter and sort
🚀 Lifetime access & regular updates
🔥 BONUS: A sweet surprise. Stay tuned!

Focus on creating more meaningful content, not searching for the right tools. Reach out to me on Twitter if you have any questions about this, if you want to become an affiliate, or just simply to hang out 🍻