Practical and Consistent Estimation of f-Divergences
Paul Rubenstein, Olivier Bousquet, Josip Djolonga, Carlos Riquelme, Ilya O. Tolstikhin
The estimation of an f-divergence between two probability distributions based on samples is a fundamental problem in statistics and machine learning. Most works study this problem under very weak assumptions, in which case it is provably hard. We consider the case of stronger structural assumptions that are commonly satisfied in modern machine learning, including representation learning and generative modelling with autoencoder architectures. Under these assumptions we propose and study an estimator that can be easily implemented, works well in high dimensions, and enjoys faster rates of convergence. We verify the behavior of our estimator empirically in both synthetic and real-data experiments, and discuss its direct implications for total correlation, entropy, and mutual information estimation.
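For reference, the quantity being estimated is the standard f-divergence; this definition is textbook material and not specific to the paper's estimator. For a convex function f with f(1) = 0 and distributions P, Q with P absolutely continuous with respect to Q:

```latex
D_f(P \,\|\, Q) \;=\; \int f\!\left(\frac{\mathrm{d}P}{\mathrm{d}Q}\right)\mathrm{d}Q
\;=\; \mathbb{E}_{x \sim Q}\!\left[ f\!\left(\frac{\mathrm{d}P}{\mathrm{d}Q}(x)\right)\right]
```

The KL divergence is recovered by f(t) = t log t and the total variation distance by f(t) = |t - 1| / 2.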
We thank all the reviewers for their valuable feedback. To summarize, we received three "Good Paper; accept" ratings and one "Marginally above the acceptance threshold" rating, which confirm the significance of our work for the community. R1.1: "The experiment is only done for one method..." Authors: We have so far only tested aLRP Loss on RetinaNet. However, these results are not final, because we used RetinaNet's optimal learning rate and schedule; our experiments are still in progress for two-stage detectors and other one-stage detectors. R1.2: "...I wonder how these label-assign strategies work with the aLRP loss. Can they be further improved?"
Exact inference in structured prediction
Structured prediction can be thought of as a simultaneous prediction of multiple labels. This is often done by maximizing a score function on the space of labels, which decomposes as a sum of pairwise and unary potentials. The above is naturally modeled with a graph, where edges and vertices are related to pairwise and unary potentials, respectively. We consider the generative process proposed by Globerson et al. (2015) and apply it to general connected graphs. We analyze the structural conditions of the graph that allow for the exact recovery of the labels. Our results show that exact recovery is possible and achievable in polynomial time for a large class of graphs. In particular, we show that graphs that are bad expanders can be exactly recovered by adding small edge perturbations coming from the Erdős-Rényi model. Finally, as a byproduct of our analysis, we provide an extension of Cheeger's inequality.
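To make the setup concrete (notation ours, not quoted from the paper): for a graph G = (V, E) and a candidate labeling y, the decomposed score and the inference problem read:

```latex
S(y) \;=\; \sum_{v \in V} \theta_v(y_v) \;+\; \sum_{(u,v) \in E} \theta_{uv}(y_u, y_v),
\qquad
\hat{y} \;=\; \operatorname*{arg\,max}_{y} \, S(y)
```

where the θ_v are the unary potentials on vertices and the θ_{uv} the pairwise potentials on edges; exact recovery asks when ŷ coincides with the ground-truth labeling produced by the generative model.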
Monoculture in Matching Markets
Algorithmic monoculture arises when many decision-makers rely on the same algorithm to evaluate applicants. An emerging body of work investigates possible harms of such homogeneity, but has been limited by the challenge of incorporating market effects in which the preferences of many applicants and decision-makers jointly interact to determine outcomes. Addressing this challenge, we introduce a tractable theoretical model of algorithmic monoculture in a two-sided matching market with many participants. We use the model to analyze outcomes under monoculture (when decision-makers all evaluate applicants using a common algorithm) and under polyculture (when decision-makers evaluate applicants independently). All else equal, monoculture (1) selects less-preferred applicants when noise is well-behaved, (2) matches more applicants to their top choice, though individual applicants may be worse off and face higher-variance outcomes depending on their value to decision-makers, and (3) is more robust to disparities in the number of applications submitted. Overall, our approach strengthens, challenges, and broadens the scope of the existing monoculture literature.
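As a rough intuition pump for finding (1), here is a minimal toy simulation, not the paper's model: firms hire sequentially from a shared pool, scoring each applicant as true quality plus Gaussian noise that is either shared (monoculture) or drawn independently per firm (polyculture). The noise model, the sequential-hiring rule, and all parameter values are our own simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_applicants, n_firms, n_trials = 200, 10, 500

def avg_selected_quality(monoculture: bool) -> float:
    """Mean true quality of hires when firms pick sequentially, each
    taking the best remaining applicant according to its noisy scores."""
    total = 0.0
    for _ in range(n_trials):
        quality = rng.normal(size=n_applicants)   # true applicant values
        shared = rng.normal(size=n_applicants)    # one common algorithmic score
        taken = np.zeros(n_applicants, dtype=bool)
        for _firm in range(n_firms):
            noise = shared if monoculture else rng.normal(size=n_applicants)
            scores = np.where(taken, -np.inf, quality + noise)
            pick = int(scores.argmax())
            taken[pick] = True
            total += quality[pick]
    return total / (n_trials * n_firms)

print("monoculture:", avg_selected_quality(True))   # lower on average
print("polyculture:", avg_selected_quality(False))  # higher on average
```

Under shared noise the same scoring errors propagate to every firm, so the hired set tracks one noisy ranking; independent noise averages out across firms, which is the intuition behind monoculture selecting less-preferred applicants.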
Probabilistic Federated Prompt-Tuning with Non-IID and Imbalanced Data
Fine-tuning pre-trained models is a popular approach in machine learning for solving complex tasks with moderate data. However, fine-tuning the entire pre-trained model is ineffective in federated data scenarios where local data distributions are diversely skewed. To address this, we explore integrating federated learning with a more effective prompt-tuning method, optimizing a small set of input prefixes to reprogram the pre-trained model's behavior. Our approach transforms federated learning into a distributed set-modeling task, aggregating diverse sets of prompts to globally fine-tune the pre-trained model. We benchmark various baselines based on direct adaptations of existing federated model aggregation techniques and introduce a new probabilistic prompt aggregation method that substantially outperforms them. Our results on a variety of computer vision datasets confirm that the proposed method is the most effective at combating extreme data heterogeneity in federated learning.
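A minimal sketch of the federated prompt-tuning loop described above, assuming a FedAvg-style weighted average as the aggregator; this corresponds to one of the direct-adaptation baselines the abstract mentions, not the paper's probabilistic method, and every name and shape below is our own illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clients, n_prompts, dim = 8, 16, 512

# Stand-ins for client datasets; only their sizes matter to this aggregator.
client_sizes = rng.integers(50, 500, size=n_clients)

def local_prompt_tuning(global_prompts: np.ndarray, client_id: int) -> np.ndarray:
    """Placeholder for client-side optimization: only the small prompt set
    (input prefixes) is trained while the pre-trained backbone stays frozen.
    A random perturbation stands in for gradient steps on local data."""
    return global_prompts + 0.01 * rng.normal(size=global_prompts.shape)

def aggregate(client_prompts: list, sizes: np.ndarray) -> np.ndarray:
    """FedAvg-style baseline: data-size-weighted average of prompt sets."""
    w = sizes / sizes.sum()
    return np.tensordot(w, np.stack(client_prompts), axes=1)

global_prompts = 0.02 * rng.normal(size=(n_prompts, dim))
for _round in range(10):
    updates = [local_prompt_tuning(global_prompts, i) for i in range(n_clients)]
    global_prompts = aggregate(updates, client_sizes)
```

Coordinate-wise averaging implicitly assumes the prompt sets are aligned across clients; the abstract's framing of aggregation as a set-modeling task suggests the probabilistic method is designed to avoid exactly that assumption.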
Meta-Inverse Reinforcement Learning with Probabilistic Context Variables
Providing a suitable reward function to reinforcement learning can be difficult in many real-world applications. While inverse reinforcement learning (IRL) holds promise for automatically learning reward functions from demonstrations, several major challenges remain. First, existing IRL methods learn reward functions from scratch, requiring large numbers of demonstrations to correctly infer the reward for each task the agent may need to perform. Second, existing methods typically assume homogeneous demonstrations of a single behavior or task, while in practice it might be easier to collect datasets of heterogeneous but related behaviors. To address these challenges, we propose a deep latent variable model that is capable of learning rewards from demonstrations of distinct but related tasks in an unsupervised way.
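A schematic of the kind of architecture the abstract describes, in which a probabilistic context variable inferred from a demonstration conditions the learned reward; the module names, sizes, and the Gaussian reparameterization below are our assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class LatentContextReward(nn.Module):
    """Schematic latent-variable reward model: a demonstration is encoded
    into a probabilistic context m, and the reward is conditioned on m."""
    def __init__(self, obs_dim, act_dim, ctx_dim=8, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(            # q(m | demonstration)
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * ctx_dim),      # mean and log-variance
        )
        self.reward = nn.Sequential(             # r(s, a, m)
            nn.Linear(obs_dim + act_dim + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, demo_sa, query_sa):
        # Average per-step encodings to summarize the demonstration.
        mu, logvar = self.encoder(demo_sa).mean(dim=0).chunk(2)
        m = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        m_rep = m.expand(query_sa.shape[0], -1)
        return self.reward(torch.cat([query_sa, m_rep], dim=-1))

model = LatentContextReward(obs_dim=4, act_dim=2)
demo = torch.randn(20, 6)        # 20 (state, action) steps from one task
query = torch.randn(5, 6)        # (state, action) pairs to score
print(model(demo, query).shape)  # torch.Size([5, 1])
```

Because the context is inferred from the demonstration itself, rewards for distinct but related tasks can share one network, which is what lets the model learn from heterogeneous demonstrations without per-task labels.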
realSEUDO for real-time calcium imaging analysis
Closed-loop neuroscience experimentation, where recorded neural activity is used to modify the experiment on-the-fly, is critical for deducing causal connections and optimizing experimental time. A critical step in creating a closed-loop experiment is real-time inference of neural activity from streaming recordings. One challenging modality for real-time processing is multi-photon calcium imaging (CI). CI enables the recording of activity in large populations of neurons; however, it often requires batch processing of the video data to extract single-neuron activity from the fluorescence videos. We use the recently proposed robust time-trace estimator, the Sparse Emulation of Unused Dictionary Objects (SEUDO) algorithm, as the basis for a new on-line processing algorithm that simultaneously identifies neurons in the fluorescence video and infers their time traces in a way that is robust to as-yet unidentified neurons. To achieve real-time SEUDO (realSEUDO), we optimize the core estimator via both algorithmic improvements and a fast C-based implementation, and create a new cell-finding loop to enable realSEUDO to also identify new cells. We demonstrate comparable performance to offline algorithms (e.g., CNMF) and improved performance over the current on-line approach (OnACID), at speeds of 120 Hz on average.
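Schematically, the per-frame SEUDO estimate solves a robust sparse regression of roughly the following form (our paraphrase of the general SEUDO formulation; the symbols and the exact penalty are simplified):

```latex
\min_{x \,\ge\, 0,\; w \,\ge\, 0}\;
\bigl\| \, y - A x - D w \, \bigr\|_2^2 \;+\; \lambda \, \| w \|_1
```

Here y is the current fluorescence frame, the columns of A are the spatial profiles of known cells with activity x, and D is a dictionary of small generic blobs whose sparse weights w absorb fluorescence from as-yet unidentified neurons, keeping the estimated traces x robust to contamination.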
Fox News AI Newsletter: Scammers can exploit your data from just 1 ChatGPT search
Welcome to Fox News' Artificial Intelligence newsletter with the latest AI technology advancements.

IN TODAY'S NEWSLETTER:
- Scammers can exploit your data from just one ChatGPT search
- Business Insider embraces AI while laying off 21% of workforce
- Nvidia, Dell partner with Trump admin to make next-gen supercomputer

GUARD YOUR DATA: ChatGPT and other large language models (LLMs) have become amazing helpers for everyday tasks. Whether it's summarizing complex ideas, designing a birthday card or even planning your apartment's layout, you can get impressive results with just a simple prompt.

NEWS BREAK: Business Insider announced Thursday that the company will be shrinking the size of its newsroom and making layoffs, impacting over a fifth of its staff. Business Insider CEO Barbara Peng said in an internal memo obtained by Fox News Digital that the company is "fully embracing AI," as 70% of the company's staff currently uses Enterprise ChatGPT, with a goal of 100%.