We thank the reviewers for their insightful and constructive feedback. Our main contributions are (1) a principled model and learning-based method for building deformable templates, and (2) extending this to learn conditional templates. Regarding Reviewers 1 and 3's note about code release: we will release code, model weights, and atlases. We agree with Reviewers 1 and 2 about the importance of the neuroimaging experiments; we will expand the description of these experiments in the supplementary material and add detail to the main text. We will release code; please see our explanation above.
Paradoxes in Fair Machine Learning
Paul Goelz, Anson Kahng, Ariel D. Procaccia
Equalized odds is a statistical notion of fairness in machine learning that ensures that classification algorithms do not discriminate against protected groups. We extend equalized odds to the setting of cardinality-constrained fair classification, where we have a bounded amount of a resource to distribute. This setting coincides with classic fair division problems, which allows us to apply concepts from that literature in parallel to equalized odds. In particular, we consider the axioms of resource monotonicity, consistency, and population monotonicity, all three of which relate different allocation instances to prevent paradoxes. Using a geometric characterization of equalized odds, we examine the compatibility of equalized odds with these axioms. We empirically evaluate the cost of allocation rules that satisfy both equalized odds and axioms of fair division on a dataset of FICO credit scores.
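To make the equalized odds criterion concrete: a classifier satisfies it when its true positive rate and false positive rate are equal across protected groups. Below is a minimal Python sketch of that check; the data, threshold, and group encoding are hypothetical and unrelated to the FICO experiments in the paper.

import numpy as np

def equalized_odds_gaps(y_true, y_pred, group):
    # Equalized odds requires equal true positive and false positive rates
    # across groups; this returns the largest TPR and FPR gaps observed.
    gaps = []
    for label in (1, 0):  # label = 1 gives the TPR gap, label = 0 the FPR gap
        rates = []
        for g in np.unique(group):
            mask = (group == g) & (y_true == label)
            rates.append(y_pred[mask].mean())  # P(prediction = 1 | Y = label, group = g)
        gaps.append(max(rates) - min(rates))
    return gaps  # [TPR gap, FPR gap]

# hypothetical data: thresholded scores as accept/reject decisions
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)
y_true = rng.integers(0, 2, size=1000)
y_pred = (rng.random(1000) > 0.5).astype(int)
print(equalized_odds_gaps(y_true, y_pred, group))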
Mitigating Biases in Blackbox Feature Extractors for Image Classification Tasks
In image classification, it is common to use a pretrained model to extract meaningful features from input images and then train a classifier on top of it for a downstream task. Trained on enormous amounts of data, these pretrained models have been shown to contain harmful biases that can hurt their performance when adapted to a downstream classification task. Moreover, they are often blackbox, either because of their scale or because their weights and architecture are unavailable. Thus, during a downstream task, we cannot debias such models by updating the weights of the feature encoder; only the classifier can be fine-tuned. We investigate the suitability of several existing debiasing techniques in this setting and thereby motivate the need for more focused research on this problem. We then propose a simple method: a clustering-based adaptive margin loss applied on top of a blackbox feature encoder, requiring no knowledge of the bias attribute. Our experiments demonstrate the effectiveness of our method across multiple benchmarks.
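The abstract does not spell out the loss, so the following is only a rough sketch of the general idea it describes: cluster the frozen blackbox features without bias labels, then apply a larger classification margin to samples from under-represented clusters. The cluster count, margin schedule, and helper names (cluster_margins, adaptive_margin_loss) are our own hypothetical choices, not the authors' implementation.

import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def cluster_margins(features, n_clusters=8, base_margin=0.1):
    # Cluster frozen blackbox features; rarer clusters get larger margins.
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(features.numpy())
    labels = torch.as_tensor(km.labels_, dtype=torch.long)
    counts = torch.bincount(labels, minlength=n_clusters).float()
    margins = base_margin * counts.max() / counts
    return labels, margins

def adaptive_margin_loss(logits, labels, cluster_ids, margins):
    # Cross-entropy with a per-sample additive margin on the true-class logit.
    m = margins[cluster_ids]
    adjusted = logits.clone()
    adjusted[torch.arange(len(labels)), labels] -= m  # make the true class harder
    return F.cross_entropy(adjusted, labels)

# usage with hypothetical frozen features and a small linear classifier
feats = torch.randn(256, 512)              # blackbox encoder outputs (frozen)
labels = torch.randint(0, 10, (256,))
cluster_ids, margins = cluster_margins(feats)
classifier = torch.nn.Linear(512, 10)
loss = adaptive_margin_loss(classifier(feats), labels, cluster_ids, margins)
loss.backward()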
Visual Data Diagnosis and Debiasing with Concept Graphs
The widespread success of deep learning models today is owed to the curation of extensive datasets of significant size and complexity. However, such models frequently pick up inherent biases in the data during training, leading to unreliable predictions. Diagnosing and debiasing datasets is thus a necessity for ensuring reliable model performance.
Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning
Wenjie Shi, Shiji Song, Hui Wu, Ya-Chu Hsu, Cheng Wu, Gao Huang
Model-free deep reinforcement learning (RL) algorithms have been widely used for a range of complex control tasks. However, slow convergence and sample inefficiency remain challenging problems in RL, especially when handling continuous, high-dimensional state spaces. To tackle this problem, we propose a general acceleration method for model-free, off-policy deep RL algorithms that draws on regularized Anderson acceleration (RAA), an effective approach to accelerating the solution of perturbed fixed-point problems. Specifically, we first explain how Anderson acceleration can be applied directly to policy iteration. We then extend RAA to deep RL by introducing a regularization term that controls the impact of the perturbation induced by function approximation errors. We further propose two strategies, progressive update and adaptive restart, to enhance performance. We evaluate our method on a variety of benchmark tasks, including Atari 2600 and MuJoCo. Experimental results show that our approach substantially improves both the learning speed and the final performance of state-of-the-art deep RL algorithms. The code and models are available at: https://github.com/shiwj16/raa-drl.
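As background, regularized Anderson acceleration combines the last m iterates of a fixed-point map using weights that minimize the combined residual plus a Tikhonov penalty, which damps the effect of noisy (perturbed) updates. The sketch below applies this to a generic fixed-point problem in NumPy; the memory size, regularization weight, and toy map are arbitrary illustrative choices, not the paper's deep RL implementation.

import numpy as np

def raa_step(xs, gxs, lam=1e-3):
    # One regularized Anderson acceleration step from the last m iterates x_i
    # and their images g(x_i): choose weights alpha (summing to 1) minimizing
    # ||F alpha||^2 + lam * ||alpha||^2, where F holds the residuals g(x_i) - x_i.
    F = np.stack([g - x for x, g in zip(xs, gxs)], axis=1)   # (d, m) residual matrix
    m = F.shape[1]
    M = F.T @ F + lam * np.eye(m)                            # regularized Gram matrix
    w = np.linalg.solve(M, np.ones(m))
    alpha = w / w.sum()                                      # enforce sum(alpha) = 1
    return np.stack(gxs, axis=1) @ alpha                     # accelerated next iterate

# toy usage: accelerate the fixed-point iteration x <- cos(x)
x = np.zeros(3)
hist_x, hist_g = [], []
for _ in range(25):
    gx = np.cos(x)
    hist_x, hist_g = (hist_x + [x])[-5:], (hist_g + [gx])[-5:]  # memory m = 5
    x = raa_step(hist_x, hist_g)
print(x)  # approaches ~0.739, the fixed point of cos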
Checklist
For all authors...
(a) Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope?
If you ran experiments (e.g., for benchmarks)...
(a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See the data website, the "vocalator" GitHub repo for the DNNs, and the supplement.
If you used crowdsourcing or conducted research with human subjects...
(a) Did you include the full text of instructions given to participants and screenshots, if applicable? [N/A]
(b) Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable? [N/A]
(c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation?

A.) Performance of models trained on single stimuli from the Speaker-4M-E1 dataset and evaluated on all other stimulus types.

Additionally, we wish to use the tool for long-term recordings in which the types of vocalizations encountered may change over time as the animals enter new stages of life. We therefore have a significant interest in the model's ability to generalize to unfamiliar vocal calls. To explore this, we tested the ability of deep networks to generalize to new vocal calls with different acoustic features. We partitioned the Speaker-4M-E1 dataset according to stimulus type (Supplementary Figure 2A), trained a deep neural network on each subset, and measured its performance on every stimulus type individually (Supplementary Figure 2B). We found that while many models could generalize to new stimuli with above-chance performance, their cross-stimulus performance was greatly overshadowed by their performance on their own subsets. Models trained on a single stimulus type generalized well to the same stimulus at different volumes.
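Concretely, this analysis is a train-on-one, test-on-all protocol over stimulus subsets. The sketch below shows the shape of that evaluation loop; dataset, train_model, and evaluate are hypothetical placeholders, not the released vocalator code.

import numpy as np

def cross_stimulus_matrix(dataset, stimulus_types, train_model, evaluate):
    # dataset: mapping from stimulus type to its subset of recordings.
    # Returns M where M[i, j] is the localization error of the model trained
    # on stimulus_types[i] and evaluated on stimulus_types[j].
    n = len(stimulus_types)
    errors = np.zeros((n, n))
    for i, train_stim in enumerate(stimulus_types):
        model = train_model(dataset[train_stim])          # fit on a single stimulus type
        for j, test_stim in enumerate(stimulus_types):
            errors[i, j] = evaluate(model, dataset[test_stim])
    return errors                                         # diagonal: in-subset performance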
Vocal Call Locator Benchmark (VCL) for localizing rodent vocalizations from multi-channel audio
Understanding the behavioral and neural dynamics of social interactions is a goal of contemporary neuroscience. Many machine learning methods have emerged in recent years to make sense of complex video and neurophysiological data that result from these experiments. Less focus has been placed on understanding how animals process acoustic information, including social vocalizations. A critical step to bridge this gap is determining the senders and receivers of acoustic information in social interactions. While sound source localization (SSL) is a classic problem in signal processing, existing approaches are limited in their ability to localize animal-generated sounds in standard laboratory environments.
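For context, the classic signal-processing approach referenced here estimates the time difference of arrival between microphone pairs, for example with GCC-PHAT cross-correlation, and triangulates the source from those delays. The sketch below shows only that textbook delay estimate on simulated data; it is not the benchmark's method, and the sampling rate and delay are made up.

import numpy as np

def gcc_phat_delay(sig, ref, fs):
    # Classic GCC-PHAT: whiten the cross-spectrum so only phase (i.e., delay)
    # information remains, then pick the lag with the strongest correlation.
    n = len(sig) + len(ref)
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)
    cc = np.concatenate((cc[-(n // 2):], cc[:n // 2 + 1]))  # center the zero lag
    return (np.argmax(np.abs(cc)) - n // 2) / fs            # delay in seconds

# simulated two-microphone recording: mic2 hears the source 20 samples later
fs = 192_000
rng = np.random.default_rng(0)
mic1 = rng.standard_normal(fs)
mic2 = np.concatenate((np.zeros(20), mic1[:-20]))
print(gcc_phat_delay(mic2, mic1, fs))   # ~20 / fs, about 0.0001 s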
UniAR: A Unified model for predicting human Attention and Responses on visual content
Gang Li
Progress in human behavior modeling involves understanding both implicit, early-stage perceptual behavior, such as human attention, and explicit, later-stage behavior, such as subjective preferences or likes. Yet most prior research has modeled implicit and explicit human behavior in isolation, and often only for a specific type of visual content. We propose UniAR, a unified model of human attention and preference behavior across diverse visual content. UniAR leverages a multimodal transformer to predict subjective feedback, such as satisfaction or aesthetic quality, along with the underlying human attention or interaction heatmaps and viewing order. We train UniAR on diverse public datasets spanning natural images, webpages, and graphic designs, and achieve state-of-the-art performance on multiple benchmarks across various image domains and behavior modeling tasks. Potential applications include providing instant feedback on the effectiveness of UIs and visual content, and enabling designers and content-creation models to optimize their creations for human-centric improvements.
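The abstract frames this as predicting several behavioral signals from one shared representation. As a generic illustration of that multi-task pattern (and only that: the encoder, heads, and shapes below are our own simplified stand-ins, not UniAR's multimodal transformer), a shared encoder can feed both a dense heatmap head and a scalar preference head.

import torch
import torch.nn as nn

class MultiTaskBehaviorModel(nn.Module):
    # Shared encoder with separate heads for an attention heatmap and a
    # scalar preference score (an illustrative stand-in, not UniAR itself).
    def __init__(self, dim=256):
        super().__init__()
        self.encoder = nn.Sequential(                       # stand-in for a multimodal transformer
            nn.Conv2d(3, dim, kernel_size=16, stride=16), nn.GELU())
        self.heatmap_head = nn.Conv2d(dim, 1, kernel_size=1)  # dense attention/interaction map
        self.score_head = nn.Linear(dim, 1)                   # subjective rating / preference

    def forward(self, images):
        feats = self.encoder(images)                        # (B, dim, H/16, W/16)
        heatmap = torch.sigmoid(self.heatmap_head(feats))   # per-location attention
        score = self.score_head(feats.mean(dim=(2, 3)))     # pooled features -> scalar
        return heatmap, score

model = MultiTaskBehaviorModel()
heatmap, score = model(torch.randn(2, 3, 224, 224))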
bb04af0f7ecaee4aae62035497da1387-AuthorFeedback.pdf
We thank the reviewers for their comments and suggestions. We will incorporate the suggestions in our revised version. Below, we address the main concerns raised in the reviews. This extends the 2-order result discussed in Maron et al. to any k. The work of Morris et al. [2019] was one of our main inspirations.