Goto

Collaborating Authors

 Genre


Forward-Backward Greedy Algorithms for General Convex Smooth Functions over A Cardinality Constraint

arXiv.org Machine Learning

We consider forward-backward greedy algorithms for solving sparse feature selection problems with general convex smooth functions. A state-of-the-art greedy method, the Forward-Backward greedy algorithm (FoBa-obj) requires to solve a large number of optimization problems, thus it is not scalable for large-size problems. The FoBa-gdt algorithm, which uses the gradient information for feature selection at each forward iteration, significantly improves the efficiency of FoBa-obj. In this paper, we systematically analyze the theoretical properties of both forward-backward greedy algorithms. Our main contributions are: 1) We derive better theoretical bounds than existing analyses regarding FoBa-obj for general smooth convex functions; 2) We show that FoBa-gdt achieves the same theoretical performance as FoBa-obj under the same condition: restricted strong convexity condition. Our new bounds are consistent with the bounds of a special case (least squares) and fills a previously existing theoretical gap for general convex smooth functions; 3) We show that the restricted strong convexity condition is satisfied if the number of independent samples is more than $\bar{k}\log d$ where $\bar{k}$ is the sparsity number and $d$ is the dimension of the variable; 4) We apply FoBa-gdt (with the conditional random field objective) to the sensor selection problem for human indoor activity recognition and our results show that FoBa-gdt outperforms other methods (including the ones based on forward greedy selection and L1-regularization).


Insights from the Wikipedia Contest (IEEE Contest for Data Mining 2011)

arXiv.org Machine Learning

The Wikimedia Foundation has recently observed that newly joining editors on Wikipedia are increasingly failing to integrate into the Wikipedia editors' community, i.e. the community is becoming increasingly harder to penetrate [1]. To sustain healthy growth of the community, the Wikimedia Foundation aims to quantitatively understand the factors that determine the editing behavior, and explain why most new editors become inactive soon after joining. As a step towards this broader goal, the Wikimedia foundation sponsored the ICDM (IEEE International Conference for Data Mining) contest [2] for the year 2011. The objective for the participants was to develop models to predict the number of edits that an editor will make in future five months based on the editing history of the editor. Here we describe the approach we followed for developing predictive models towards this goal, the results that we obtained and the modeling insights that we gained from this exercise. In addition, towards the broader goal of Wikimedia Foundation, we also summarize the factors that emerged during our model building exercise as powerful predictors of future editing activity.


Cortical prediction markets

arXiv.org Artificial Intelligence

We investigate cortical learning from the perspective of mechanism design. First, we show that discretizing standard models of neurons and synaptic plasticity leads to rational agents maximizing simple scoring rules. Second, our main result is that the scoring rules are proper, implying that neurons faithfully encode expected utilities in their synaptic weights and encode high-scoring outcomes in their spikes. Third, with this foundation in hand, we propose a biologically plausible mechanism whereby neurons backpropagate incentives which allows them to optimize their usefulness to the rest of cortex. Finally, experiments show that networks that backpropagate incentives can learn simple tasks. Keywords: incentives for cooperation, multiagent learning, biologically-inspired approaches, prediction markets 1. Introduction How does the brain encode information about the environment into its structure [26]? Inspired by recent work in prediction markets, this paper investigates cortical learning and the neural code from the perspective of mechanism design [15, 18, 2, 3, 1]. To the best of our knowledge it is the first paper to do so.


Approximated Infomax Early Stopping: Revisiting Gaussian RBMs on Natural Images

arXiv.org Machine Learning

We pursue an early stopping technique that helps Gaussian Restricted Boltzmann Machines (GRBMs) to gain good natural image representations in terms of overcompleteness and data fitting. GRBMs are widely considered as an unsuitable model for natural images because they gain non-overcomplete representations which include uniform filters that do not represent useful image features. We have recently found that GRBMs once gain and subsequently lose useful filters during their training, contrary to this common perspective. We attribute this phenomenon to a tradeoff between overcompleteness of GRBM representations and data fitting. To gain GRBM representations that are overcomplete and fit data well, we propose a measure for GRBM representation quality, approximated mutual information, and an early stopping technique based on this measure. The proposed method boosts performance of classifiers trained on GRBM representations.


Differentially Private Data Releasing for Smooth Queries with Synthetic Database Output

arXiv.org Machine Learning

Machine learning is often conducted on datasets containing sensitive information, such as medical records, commercial data, etc. The benefit of learning from such data is tremendous. But when releasing sensitive data, one must take privacy into consideration, and has to tradeoff between the accuracy and the amount of privacy loss of the individuals in the database. In this paper we study differential privacy [11], which has become a standard concept of privacy. Differential privacy guarantees that almost nothing new can be learned from the database that contains one specific individual's information compared with that from the database without that individual's information. More concretely, a mechanism which releases information about the database is said to preserve differential privacy, if the change of a single database element does not affect the probability distribution of the output significantly. Therefore differential privacy provides strong guarantees against attacks; the risk of any individual to submit her information to the database is very small.


Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

arXiv.org Machine Learning

We investigate the relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a non-Gaussian distribution. The proof exploits a combination of ideas from the geometry of exponential families, junction tree theory and convex analysis. These population-level results have various consequences for graph selection methods, both known and novel, including a novel method for structure estimation for missing or corrupted observations. We provide nonasymptotic guarantees for such methods and illustrate the sharpness of these predictions via simulations.


Quantile Regression for Large-scale Applications

arXiv.org Machine Learning

Quantile regression is a method to estimate the quantiles of the conditional distribution of a response variable, and as such it permits a much more accurate portrayal of the relationship between the response variable and observed covariates than methods such as Least-squares or Least Absolute Deviations regression. It can be expressed as a linear program, and, with appropriate preprocessing, interior-point methods can be used to find a solution for moderately large problems. Dealing with very large problems, \emph(e.g.), involving data up to and beyond the terabyte regime, remains a challenge. Here, we present a randomized algorithm that runs in nearly linear time in the size of the input and that, with constant probability, computes a $(1+\epsilon)$ approximate solution to an arbitrary quantile regression problem. As a key step, our algorithm computes a low-distortion subspace-preserving embedding with respect to the loss function of quantile regression. Our empirical evaluation illustrates that our algorithm is competitive with the best previous work on small to medium-sized problems, and that in addition it can be implemented in MapReduce-like environments and applied to terabyte-sized problems.


Constraint Solvers for User Interface Layout

arXiv.org Artificial Intelligence

Constraints have played an important role in the construction of GUIs, where they are mainly used to define the layout of the widgets. Resizing behavior is very important in GUIs because areas have domain specific parameters such as form the resizing of windows. If linear objective function is used and window is resized then error is not distributed equally. To distribute the error equally, a quadratic objective function is introduced. Different algorithms are widely used for solving linear constraints and quadratic problems in a variety of different scientific areas. The linear relxation, Kaczmarz, direct and linear programming methods are common methods for solving linear constraints for GUI layout. The interior point and active set methods are most commonly used techniques to solve quadratic programming problems. Current constraint solvers designed for GUI layout do not use interior point methods for solving a quadratic objective function subject to linear equality and inequality constraints. In this paper, performance aspects and the convergence speed of interior point and active set methods are compared along with one most commonly used linear programming method when they are implemented for graphical user interface layout. The performance and convergence of the proposed algorithms are evaluated empirically using randomly generated UI layout specifications of various sizes. The results show that the interior point algorithms perform significantly better than the Simplex method and QOCA-solver, which uses the active set method implementation for solving quadratic optimization.


Solver Scheduling via Answer Set Programming

arXiv.org Artificial Intelligence

Although Boolean Constraint Technology has made tremendous progress over the last decade, the efficacy of state-of-the-art solvers is known to vary considerably across different types of problem instances and is known to depend strongly on algorithm parameters. This problem was addressed by means of a simple, yet effective approach using handmade, uniform and unordered schedules of multiple solvers in ppfolio, which showed very impressive performance in the 2011 SAT Competition. Inspired by this, we take advantage of the modeling and solving capacities of Answer Set Programming (ASP) to automatically determine more refined, that is, non-uniform and ordered solver schedules from existing benchmarking data. We begin by formulating the determination of such schedules as multi-criteria optimization problems and provide corresponding ASP encodings. The resulting encodings are easily customizable for different settings and the computation of optimum schedules can mostly be done in the blink of an eye, even when dealing with large runtime data sets stemming from many solvers on hundreds to thousands of instances. Also, the fact that our approach can be customized easily enabled us to swiftly adapt it to generate parallel schedules for multi-processor machines.


Reasoning about Independence in Probabilistic Models of Relational Data

arXiv.org Artificial Intelligence

We extend the theory of d-separation to cases in which data instances are not independent and identically distributed. We show that applying the rules of d-separation directly to the structure of probabilistic models of relational data inaccurately infers conditional independence. We introduce relational d-separation, a theory for deriving conditional independence facts from relational models. We provide a new representation, the abstract ground graph, that enables a sound, complete, and computationally efficient method for answering d-separation queries about relational models, and we present empirical results that demonstrate effectiveness.