
Neural Information Processing Systems

We thank all reviewers for their comments and will incorporate their suggestions in the final version. We compare the proposed algorithms with the baseline algorithms on the U.S. 2000 Census data. All algorithms are implemented in Python 3.7. We also compute the optimal solution to verify the approximation ratio; see Table 1 in our submission for definitions. On both datasets, our algorithm outperforms both baselines by a significant margin and achieves the best accuracy in almost all settings.


A Data-Centric Perspective on Evaluating Machine Learning Models for Tabular Data

Neural Information Processing Systems

Tabular data is prevalent in real-world machine learning applications, and new models for supervised learning on tabular data are frequently proposed. Comparative studies assessing the performance of such models typically consist of model-centric evaluation setups with overly standardized data preprocessing. This paper demonstrates that such model-centric evaluations are biased, as real-world modeling pipelines often require dataset-specific preprocessing, which includes feature engineering. Therefore, we propose a data-centric evaluation framework. We select 10 relevant datasets from Kaggle competitions and implement expert-level preprocessing pipelines for each dataset. We conduct experiments with different preprocessing pipelines and hyperparameter optimization (HPO) regimes to quantify the impact of model selection, HPO, feature engineering, and test-time adaptation. Our main findings are: (1) after dataset-specific feature engineering, model rankings change considerably, performance differences decrease, and the importance of model selection diminishes.
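To make the contrast concrete, here is a minimal, hypothetical sketch of the kind of comparison the framework describes: the same models evaluated under a standardized pipeline versus dataset-specific feature engineering. The dataset, the engineered "ratio" features, and the model choices below are illustrative stand-ins, not the paper's benchmark.

# Hypothetical sketch: compare model rankings under a standardized pipeline
# vs. dataset-specific feature engineering (illustrative, not the paper's code).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

def standardized(X):
    # Model-centric baseline: no dataset-specific work at all.
    return X

def expert_features(X):
    # Stand-in for dataset-specific feature engineering: add pairwise
    # ratios of the first few columns as illustrative new features.
    ratios = X[:, :3] / (X[:, 3:6] + 1e-9)
    return np.hstack([X, ratios])

models = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)),
    "rf": RandomForestClassifier(n_estimators=300, random_state=0),
}
for prep_name, prep in [("standardized", standardized), ("expert", expert_features)]:
    for model_name, model in models.items():
        score = cross_val_score(model, prep(X), y, cv=5).mean()
        print(f"{prep_name:12s} {model_name:8s} acc={score:.3f}")

Under a data-centric evaluation, the interesting quantity is how much the gap between the two model rows shrinks once the "expert" preprocessing is applied.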


7e9e346dc5fd268b49bf418523af8679-AuthorFeedback.pdf

Neural Information Processing Systems

Comments on presentation: Thank you for the helpful suggestions. We will move some of the "drier" portions of [...]. We will refine and improve these diagrams for the final version of the paper. We will make the figures bigger in the final paper (for now, please zoom in). We will add a discussion of this as possible future work in our revision. Reviewer 2: We would like to clarify that our claims of a new SOTA were only for the neural LFP task; we did not intend to [...]. Regardless, pushing a new SOTA was not our primary objective.


WoodFisher: Efficient Second-Order Approximation for Neural Network Compression

Neural Information Processing Systems

Second-order information, in the form of Hessian- or inverse-Hessian-vector products, is a fundamental tool for solving optimization problems. Recently, there has been significant interest in utilizing this information in the context of deep neural networks; however, relatively little is known about the quality of existing approximations in this context. Our work examines this question, identifies issues with existing approaches, and proposes a method called WoodFisher to compute a faithful and efficient estimate of the inverse Hessian. Our main application is neural network compression, where we build on the classic Optimal Brain Damage/Surgeon framework. We demonstrate that WoodFisher significantly outperforms popular state-of-the-art methods for one-shot pruning. Further, even when iterative, gradual pruning is allowed, our method results in a gain in test accuracy over state-of-the-art approaches on standard image classification datasets such as ImageNet ILSVRC. We examine how our method can be extended to take first-order information into account, and illustrate its ability to automatically set layer-wise pruning thresholds and perform compression in the limited-data regime.
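The two ingredients the abstract combines, a Woodbury-style estimate of the inverse empirical Fisher and the Optimal Brain Surgeon saliency score, can be sketched in a few lines of NumPy. This is our reading of the general recipe suggested by the method's name and the OBD/OBS framework, not the paper's implementation; the damping value and the toy gradients are assumptions.

# Sketch (our reading, not the paper's code): maintain an inverse empirical
# Fisher via rank-one Sherman-Morrison/Woodbury updates, then score weights
# with the classic Optimal Brain Surgeon saliency w_q^2 / (2 [F^{-1}]_{qq}).
import numpy as np

def inverse_empirical_fisher(grads, damping=1e-4):
    """grads: (N, d) array of per-sample gradients."""
    N, d = grads.shape
    F_inv = np.eye(d) / damping  # inverse of the initial damping term (lambda * I)
    for g in grads:
        Fg = F_inv @ g
        F_inv -= np.outer(Fg, Fg) / (N + g @ Fg)  # rank-one Woodbury update
    return F_inv

def obs_saliency(w, F_inv):
    # Lower saliency -> cheaper to prune under the local quadratic model.
    return w**2 / (2.0 * np.diag(F_inv))

rng = np.random.default_rng(0)
grads = rng.normal(size=(64, 10))  # toy per-sample gradients
w = rng.normal(size=10)            # toy weights
F_inv = inverse_empirical_fisher(grads)
print("weight to prune first:", np.argmin(obs_saliency(w, F_inv)))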


Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse

Neural Information Processing Systems

Posterior collapse in Variational Autoencoders (VAEs) arises when the variational posterior distribution closely matches the prior for a subset of latent variables. This paper presents a simple and intuitive explanation for posterior collapse through the analysis of linear VAEs and their direct correspondence with Probabilistic PCA (pPCA). We explain how posterior collapse may occur in pPCA due to local maxima in the log marginal likelihood. Unexpectedly, we prove that the ELBO objective for the linear VAE does not introduce additional spurious local maxima relative to the log marginal likelihood. We further show that training a linear VAE with exact variational inference recovers an identifiable global maximum corresponding to the principal component directions. Empirically, we find that our linear analysis is predictive even for high-capacity, non-linear VAEs and helps explain the relationship between observation noise, local maxima, and posterior collapse in deep Gaussian VAEs.
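For readers unfamiliar with the correspondence, the pPCA model the analysis is matched against is the standard one of Tipping and Bishop (1999); the following is textbook background rather than the paper's own notation:

% pPCA: z ~ N(0, I),  x | z ~ N(W z + mu, sigma^2 I), hence the marginal
% likelihood is Gaussian and its stationary points in W lie along principal
% directions -- the starting point of the collapse analysis.
\begin{align}
  p(\mathbf{x}) &= \mathcal{N}\!\big(\mathbf{x};\, \boldsymbol{\mu},\; W W^{\top} + \sigma^{2} I\big), \\
  \log p(X) &= -\frac{N}{2}\Big( d \log 2\pi + \log\det C + \operatorname{tr}\!\big(C^{-1} S\big) \Big),
  \qquad C = W W^{\top} + \sigma^{2} I,
\end{align}

where S is the sample covariance of the N observations in d dimensions.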


Randomized algorithms and PAC bounds for inverse reinforcement learning in continuous spaces

Neural Information Processing Systems

This work studies discrete-time discounted Markov decision processes with continuous state and action spaces and addresses the inverse problem of inferring a cost function from observed optimal behavior. We first consider the case in which we have access to the entire expert policy and characterize the set of solutions to the inverse problem by using occupation measures, linear duality, and complementary slackness conditions. To avoid trivial solutions and ill-posedness, we introduce a natural linear normalization constraint.
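As background for the duality argument, the textbook linear program over discounted occupation measures (our summary of standard material, not necessarily the paper's exact formulation) reads:

% Forward LP: minimize expected discounted cost over occupation measures mu,
% for initial distribution nu_0 and discount gamma. The inverse problem asks
% for a cost c (suitably normalized to rule out c = 0) under which the
% expert's occupation measure is optimal; complementary slackness then
% characterizes the solution set.
\begin{align}
  \min_{\mu \ge 0}\;& \int_{S \times A} c(s,a)\, \mu(\mathrm{d}s, \mathrm{d}a) \\
  \text{s.t.}\;& \int_{A} \mu(\mathrm{d}s', \mathrm{d}a')
    = (1-\gamma)\, \nu_0(\mathrm{d}s')
    + \gamma \int_{S \times A} P(\mathrm{d}s' \mid s, a)\, \mu(\mathrm{d}s, \mathrm{d}a)
    \quad \forall\, s' \in S .
\end{align}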


MultiOrg: A Multi-rater Organoid-detection Dataset

Neural Information Processing Systems

High-throughput image analysis in the biomedical domain has gained significant attention in recent years, driving advancements in drug discovery, disease prediction, and personalized medicine. Organoids, specifically, are an active area of research, providing excellent models for human organs and their functions. Automating the quantification of organoids in microscopy images would provide an effective solution to substantial manual quantification bottlenecks, particularly in high-throughput image analysis. However, in contrast to other domains such as autonomous driving, there is a notable lack of open biomedical datasets, and only a few of them have attempted to quantify annotation uncertainty. In this work, we present MultiOrg, a comprehensive organoid dataset tailored for object detection tasks with uncertainty quantification. This dataset comprises more than 400 high-resolution 2D microscopy images and curated annotations of more than 60,000 organoids. Most importantly, it includes three label sets for the test data, independently annotated by two experts at distinct time points. We additionally provide a benchmark for organoid detection and make the best model available, through an easily installable, interactive plugin for the popular image visualization tool Napari, for performing organoid quantification.
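One natural use of the multi-rater test labels is quantifying inter-annotator agreement on the detected organoids. Below is a hypothetical sketch using greedy IoU matching with an illustrative 0.5 threshold; this is a generic illustration, not the dataset's evaluation protocol.

# Hypothetical sketch: fraction of organoid boxes on which two raters agree
# at a given IoU threshold (greedy matching, fine for illustration only).
def iou(a, b):
    # Boxes as (x1, y1, x2, y2).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / union if union else 0.0

def agreement(boxes_r1, boxes_r2, thr=0.5):
    matched, used = 0, set()
    for a in boxes_r1:
        best = max(((iou(a, b), j) for j, b in enumerate(boxes_r2) if j not in used),
                   default=(0.0, None))
        if best[0] >= thr:
            matched += 1
            used.add(best[1])
    return matched / max(1, max(len(boxes_r1), len(boxes_r2)))

r1 = [(0, 0, 10, 10), (20, 20, 30, 30)]
r2 = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(agreement(r1, r2))  # 0.5: one of two boxes matches at IoU >= 0.5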


We agree with the reviewer and will address this properly in the revision

Neural Information Processing Systems

We thank all the reviewers for their insightful comments. A more detailed explanation follows. First, we agree that Equation 3 is not a concrete optimization procedure. We have also tested the hinge loss as done in, e.g., BigGAN, which works equally well w.r.t. the [...]. We will add proper clarifications and discussions. We will make this clear in the paper.
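For reference, the hinge loss mentioned above (as used in, e.g., SAGAN and BigGAN) is the standard GAN hinge objective:

% Standard GAN hinge loss: discriminator D, generator G, noise prior p_z.
\begin{align}
  L_D &= \mathbb{E}_{x \sim p_{\text{data}}}\!\big[\max(0,\, 1 - D(x))\big]
       + \mathbb{E}_{z \sim p_z}\!\big[\max(0,\, 1 + D(G(z)))\big], \\
  L_G &= -\,\mathbb{E}_{z \sim p_z}\!\big[D(G(z))\big].
\end{align}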


Supplement: Hybrid Models for Learning to Branch

Neural Information Processing Systems

In this section, we argue that the GNN architecture loses its advantages when multiple MILPs are solved at the same time. In applications like multi-objective optimization [4], where multiple MILPs are solved in parallel, a GNN must be initialized on the GPU for each MILP, because the MILPs are solved asynchronously. Not only is there a limit to how many such GNNs can fit on a single GPU because of memory constraints, but hosting several GNNs on a single GPU also results in inefficient GPU utilization. One could, for instance, try to synchronize the MILPs so that a single batched forward evaluation on the GPU suffices, but to our knowledge this has not been done, and it would cause frequent interruptions to the solving procedure. An alternative, much simpler method is to pack multiple GNNs onto a single GPU such that each GNN is dedicated to solving one MILP; a rough sketch of the resulting memory constraint follows.
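A hypothetical back-of-the-envelope calculation of the constraint described above; all numbers are illustrative assumptions, not measurements from the paper.

# Sketch: one resident GNN copy (plus per-instance activations) per
# concurrently solved MILP quickly exhausts GPU memory.
def max_parallel_milps(gpu_mem_gb, gnn_mem_gb, activation_mem_gb):
    per_instance = gnn_mem_gb + activation_mem_gb
    return int(gpu_mem_gb // per_instance)

# Illustrative numbers only: a 16 GB GPU caps out at 8 concurrent MILPs here.
print(max_parallel_milps(gpu_mem_gb=16, gnn_mem_gb=0.5, activation_mem_gb=1.5))  # -> 8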