Mixed Linear Regression with Multiple Components

Neural Information Processing Systems

In this paper, we study the mixed linear regression (MLR) problem, where the goal is to recover multiple underlying linear models from their unlabeled linear measurements. We propose a non-convex objective function which we show is locally strongly convex in the neighborhood of the ground truth. We use a tensor method for initialization so that the initial models are in the local strong convexity region. We then employ general convex optimization algorithms to minimize the objective function. To the best of our knowledge, our approach provides the first exact recovery guarantees for the MLR problem with K ≥ 2 components.
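As a rough illustration of the MLR setup, the sketch below generates unlabeled measurements from K = 2 linear models and evaluates one possible non-convex objective. The abstract does not state the objective's form, so the product-of-squared-residuals loss used here is an assumption, as are the names `make_mlr_data` and `product_residual_objective`. The tensor initialization and the optimization step are not shown.

```python
import random

def make_mlr_data(models, n, seed=0):
    """Draw n samples; each label comes from one randomly chosen linear model."""
    rng = random.Random(seed)
    d = len(models[0])
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        w = rng.choice(models)  # hidden component, never revealed to the learner
        y = sum(wi * xi for wi, xi in zip(w, x))
        data.append((x, y))
    return data

def product_residual_objective(candidates, data):
    """Assumed objective: mean over samples of the product, over candidate
    models, of squared residuals. A sample contributes zero as soon as any
    one candidate explains it exactly."""
    total = 0.0
    for x, y in data:
        term = 1.0
        for w in candidates:
            r = y - sum(wi * xi for wi, xi in zip(w, x))
            term *= r * r
        total += term
    return total / len(data)

true_models = [[1.0, -2.0, 0.5], [-1.5, 0.0, 2.0]]
data = make_mlr_data(true_models, n=200)

# Noiseless data: the objective vanishes at the ground truth.
loss_at_truth = product_residual_objective(true_models, data)

# Shifting every coefficient by 0.1 makes the objective strictly positive.
perturbed = [[wi + 0.1 for wi in w] for w in true_models]
loss_perturbed = product_residual_objective(perturbed, data)
```

The objective is non-convex because it is a product of quadratics in the candidate models, yet the ground truth is a global minimizer in the noiseless case. That is the kind of structure under which a local strong convexity argument near the truth becomes plausible.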


Generalized Correspondence-LDA Models (GC-LDA) for Identifying Functional Regions in the Brain

Neural Information Processing Systems

This paper presents Generalized Correspondence-LDA (GC-LDA), a generalization of the Correspondence-LDA model that allows for variable spatial representations to be associated with topics, and increased flexibility in terms of the strength of the correspondence between data types induced by the model. We present three variants of GC-LDA, each of which associates topics with a different spatial representation, and apply them to a corpus of neuroimaging data. In the context of this dataset, each topic corresponds to a functional brain region, where the region's spatial extent is captured by a probability distribution over neural activity, and the region's cognitive function is captured by a probability distribution over linguistic terms. We illustrate the qualitative improvements offered by GC-LDA in terms of the types of topics extracted with alternative spatial representations, as well as the model's ability to incorporate a-priori knowledge from the neuroimaging literature. We furthermore demonstrate that the novel features of GC-LDA improve predictions for missing data.


Active Learning with Oracle Epiphany

Neural Information Processing Systems

We present a theoretical analysis of active learning with more realistic interactions with human oracles. Previous empirical studies have shown that oracles abstain on difficult queries until they have accumulated enough information to make label decisions. We formalize this phenomenon with an "oracle epiphany model" and analyze active learning query complexity under such oracles for both the realizable and the agnostic cases. Our analysis shows that active learning is possible with oracle epiphany, but incurs an additional cost depending on when the epiphany happens. Our results suggest new, principled active learning approaches with realistic oracles.
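To make the abstaining behavior concrete, here is a minimal sketch of an oracle that declines to label queries in a difficult region until it has seen enough of them. The abstract does not specify the epiphany mechanism, so the deterministic threshold rule is a simplifying assumption, and the class `EpiphanyOracle` and its parameters are hypothetical.

```python
class EpiphanyOracle:
    """Toy oracle: abstains (returns None) inside a difficult region until
    that region has been queried `threshold` times, then answers normally."""

    def __init__(self, label_fn, region_fn, threshold):
        self.label_fn = label_fn      # true labeling function
        self.region_fn = region_fn    # maps x to a region id, or None if easy
        self.threshold = threshold    # queries needed before the epiphany
        self.seen = {}                # per-region query counts

    def query(self, x):
        region = self.region_fn(x)
        if region is None:
            return self.label_fn(x)   # easy query: always answered
        count = self.seen.get(region, 0) + 1
        self.seen[region] = count
        if count < self.threshold:
            return None               # still abstaining
        return self.label_fn(x)       # epiphany reached for this region

oracle = EpiphanyOracle(
    label_fn=lambda x: int(x >= 0.5),
    region_fn=lambda x: "boundary" if 0.4 <= x <= 0.6 else None,
    threshold=3,
)
answers = [oracle.query(x) for x in (0.9, 0.45, 0.55, 0.5, 0.45)]
```

In this toy run the first two boundary queries are wasted on abstentions, which illustrates the additional query cost that depends on when the epiphany happens.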


Upgrade his lawn game: Get 15% off the robot mower Dad didn't know he needed

PCWorld

From spring to fall, every weekend -- and even more frequently if it rains -- a couple of hours go into maintaining the lawn. What if you had all those hours back to yourself without worrying that your yard would turn into a jungle? This is the problem Segway's Navimow i series robot lawnmowers aim to solve. Available at a discounted price of 849* (regular price 999), the Navimow i105N can mow up to 1/8 of an acre without needing your input for anything. For larger lawns there is also the currently discounted 1,099* (usual price 1,299) Navimow i110N, which can mow up to 1/4 of an acre on a single charge.


5c8cb735a1ce65dac514233cbd5576d6-AuthorFeedback.pdf

Neural Information Processing Systems

First of all, we want to thank every reviewer for their valuable notes and comments. In particular, we will discuss the tuning time of the algorithms. Our paper is based on a standard GBDT score function (as, e.g., in [21]). The algorithm is easy to derive from our paper, when you replace a leaf size in Eq. 6 with sum ... Performance of this Hessian-based sampling is even better (see Table 1), and we will add these results to the paper. We will add this to the paper.


604b37ea63ea51fa5fb3d8a89ec056e6-AuthorFeedback.pdf

Neural Information Processing Systems

We got a clear sense of where more clarification would be helpful. To what solution do neural nets (trained w. GD on this network simulates the unnormalized exponentiated gradient algorithm (EGU). Previously it was thought that GD cannot take advantage of the sparsity of the solution. What is the surprising insight?


From Boltzmann Machines to Neural Networks and Back Again

Neural Information Processing Systems

Graphical models are powerful tools for modeling high-dimensional data, but learning graphical models in the presence of latent variables is well known to be difficult. In this work we give new results for learning Restricted Boltzmann Machines, probably the most well-studied class of latent variable models.


Exponential Quantum Communication Advantage in Distributed Inference and Learning

Neural Information Processing Systems

Training and inference with large machine learning models that far exceed the memory capacity of individual devices necessitate the design of distributed architectures, forcing one to contend with communication constraints. We present a framework for distributed computation over a quantum network in which data is encoded into specialized quantum states. We prove that for models within this framework, inference and training using gradient descent can be performed with exponentially less communication compared to their classical analogs, and with relatively modest overhead relative to standard gradient-based methods. We show that certain graph neural networks are particularly amenable to implementation within this framework, and moreover present empirical evidence that they perform well on standard benchmarks. To our knowledge, this is the first example of exponential quantum advantage for a generic class of machine learning problems that holds regardless of the data encoding cost. Moreover, we show that models in this class can encode highly nonlinear features of their inputs, and their expressivity increases exponentially with model depth. We also delineate the space of models for which exponential communication advantages hold by showing that they cannot hold for linear classification. Communication of quantum states that potentially limit the amount of information that can be extracted from them about the data and model parameters may also lead to improved privacy guarantees for distributed computation. Taken as a whole, these findings form a promising foundation for distributed machine learning over quantum networks.


Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization

Zifeng Zhuang

Neural Information Processing Systems

Specifically, this non-iterative paradigm allows us to conduct inner-level optimization (value estimation) in training, while performing outer-level optimization (policy extraction) in testing. Naturally, such a paradigm raises three core questions that are not fully answered by prior non-iterative offline RL counterparts like reward-conditioned policies: Q1) What information should we transfer from the inner-level to the outer-level? Q2) What should we pay attention to when exploiting the transferred information for safe/confident outer-level optimization? Q3) What are the benefits of concurrently conducting outer-level optimization during testing? Motivated by model-based optimization (MBO), we propose DROP (Design fROm Policies), which fully answers the above questions. Specifically, in the inner-level, DROP decomposes offline data into multiple subsets and learns an MBO score model (A1). To exploit the score model safely in the outer-level, we explicitly learn a behavior embedding and introduce a conservative regularization (A2). During testing, we show that DROP permits test-time adaptation, enabling adaptive inference across states (A3). Empirically, we find that DROP, compared to prior non-iterative offline RL counterparts, gains an average improvement probability of more than 80%, and achieves comparable or better performance compared to prior iterative baselines.


FairJob: A Real-World Dataset for Fairness in Online Systems

Neural Information Processing Systems

We introduce a fairness-aware dataset for job recommendation in advertising, designed to foster research in algorithmic fairness within real-world scenarios. It was collected and prepared to comply with privacy standards and business confidentiality. An additional challenge is the lack of access to protected user attributes such as gender, for which we propose a solution to obtain a proxy estimate. Despite being anonymized and including a proxy for a sensitive attribute, our dataset preserves predictive power and maintains a realistic and challenging benchmark. This dataset addresses a significant gap in the availability of fairness-focused resources for high-impact domains like advertising, where the actual impact is whether or not users gain access to valuable employment opportunities, and where balancing fairness and utility is a common industrial challenge. We also explore various stages in the advertising process where unfairness can occur and introduce a method to compute a fair utility metric for job recommendations in online systems from a biased dataset. Experimental evaluations of bias mitigation techniques on the released dataset demonstrate potential improvements in fairness and the associated trade-offs with utility.