Goto

Collaborating Authors

 Search


Optimal Solutions for Sparse Principal Component Analysis

arXiv.org Artificial Intelligence

Given a sample covariance matrix, we examine the problem of maximizing the variance explained by a linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This is known as sparse principal component analysis and has a wide array of applications in machine learning and engineering. We formulate a new semidefinite relaxation to this problem and derive a greedy algorithm that computes a full set of good solutions for all target numbers of non zero coefficients, with total complexity O(n^3), where n is the number of variables. We then use the same relaxation to derive sufficient conditions for global optimality of a solution, which can be tested in O(n^3) per pattern. We discuss applications in subset selection and sparse recovery and show on artificial examples and biological data that our algorithm does provide globally optimal solutions in many cases.


Stationary probability density of stochastic search processes in global optimization

arXiv.org Artificial Intelligence

A method for the construction of approximate analytical expressions for the stationary marginal densities of general stochastic search processes is proposed. By the marginal densities, regions of the search space that with high probability contain the global optima can be readily defined. The density estimation procedure involves a controlled number of linear operations, with a computational cost per iteration that grows linearly with problem size.


Compressed Pattern Databases

Journal of Artificial Intelligence Research

A pattern database (PDB) is a heuristic function implemented as a lookup table that stores the lengths of optimal solutions for subproblem instances. Standard PDBs have a distinct entry in the table for each subproblem instance. In this paper we investigate compressing PDBs by merging several entries into one, thereby allowing the use of PDBs that exceed available memory in their uncompressed form. We introduce a number of methods for determining which entries to merge and discuss their relative merits. These vary from domain-independent approaches that allow any set of entries in the PDB to be merged, to more intelligent methods that take into account the structure of the problem. The choice of the best compression method is based on domain-dependent attributes. We present experimental results on a number of combinatorial problems, including the four-peg Towers of Hanoi problem, the sliding-tile puzzles, and the Top-Spin puzzle. For the Towers of Hanoi, we show that the search time can be reduced by up to three orders of magnitude by using compressed PDBs compared to uncompressed PDBs of the same size. More modest improvements were observed for the other domains.


Graph Abstraction in Real-time Heuristic Search

Journal of Artificial Intelligence Research

Real-time heuristic search methods are used by situated agents in applications that require the amount of planning per move to be independent of the problem size. Such agents plan only a few actions at a time in a local search space and avoid getting trapped in local minima by improving their heuristic function over time. We extend a wide class of real-time search algorithms with automatically-built state abstraction and prove completeness and convergence of the resulting family of algorithms. We then analyze the impact of abstraction in an extensive empirical study in real-time pathfinding. Abstraction is found to improve efficiency by providing better trading offs between planning time, learning speed and other negatively correlated performance measures.


Heuristic Search and Information Visualization Methods for School Redistricting

AI Magazine

We describe an application of AI search and information visualization techniques to the problem of school redistricting, in which students are assigned to home schools within a county or school district. Because of the complexity of the decision-making problem, tools are needed to help end users generate, evaluate, and compare alternative school assignment plans. A key goal of our research is to aid users in finding multiple qualitatively different redistricting plans that represent different trade-offs in the decision space. We show the resulting plans using novel visualization methods that we have developed for summarizing and comparing alternative plans.


Heuristic Search and Information Visualization Methods for School Redistricting

AI Magazine

We describe an application of AI search and information visualization techniques to the problem of school redistricting, in which students are assigned to home schools within a county or school district. This is a multicriteria optimization problem in which competing objectives, such as school capacity, busing costs, and socioeconomic distribution, must be considered. Because of the complexity of the decision-making problem, tools are needed to help end users generate, evaluate, and compare alternative school assignment plans. A key goal of our research is to aid users in finding multiple qualitatively different redistricting plans that represent different trade-offs in the decision space. We present heuristic search methods that can be used to find a set of qualitatively different plans, and give empirical results of these search methods on population data from the school district of Howard County, Maryland. We show the resulting plans using novel visualization methods that we have developed for summarizing and comparing alternative plans.


Expressive Commerce and Its Application to Sourcing: How We Conducted $35 Billion of Generalized Combinatorial Auctions

AI Magazine

Sourcing professionals buy several trillion dollars worth of goods and services yearly. We introduced a new paradigm called expressive commerceand applied it to sourcing. It combines the advantages of highly expressive human negotiation with the advantages of electronic reverse auctions. The idea is that supply and demand are expressed in drastically greater detail than in traditional electronic auctions and are algorithmically cleared. This creates a Pareto efficiency improvement in the allocation (a win-win between the buyer and the sellers), but the market-clearing problem is a highly complex combinatorial optimization problem. We developed the world's fastest tree search algorithms for solving it. We have hosted $35 billion of sourcing using the technology and created $4.4 billion of hard-dollar savings plus numerous harder-to-quantify benefits. The suppliers also benefited by being able to express production efficiencies and creativity, and through exposure problem removal. Supply networks were redesigned, with quantitative understanding of the trade-offs, and implemented in weeks instead of months.


Learning Semantic Definitions of Online Information Sources

Journal of Artificial Intelligence Research

The Internet contains a very large number of information sources providing many types of data from weather forecasts to travel deals and financial information. These sources can be accessed via Web-forms, Web Services, RSS feeds and so on. In order to make automated use of these sources, we need to model them semantically, but writing semantic descriptions for Web Services is both tedious and error prone. In this paper we investigate the problem of automatically generating such models. We introduce a framework for learning Datalog definitions of Web sources. In order to learn these definitions, our system actively invokes the sources and compares the data they produce with that of known sources of information. It then performs an inductive logic search through the space of plausible source definitions in order to learn the best possible semantic model for each new source. In this paper we perform an empirical evaluation of the system using real-world Web sources. The evaluation demonstrates the effectiveness of the approach, showing that we can automatically learn complex models for real sources in reasonable time. We also compare our system with a complex schema matching system, showing that our approach can handle the kinds of problems tackled by the latter.


A preliminary analysis on metaheuristics methods applied to the Haplotype Inference Problem

arXiv.org Artificial Intelligence

Haplotype Inference is a challenging problem in bioinformatics that consists in inferring the basic genetic constitution of diploid organisms on the basis of their genotype. This information allows researchers to perform association studies for the genetic variants involved in diseases and the individual responses to therapeutic agents. A notable approach to the problem is to encode it as a combinatorial problem (under certain hypotheses, such as the pure parsimony criterion) and to solve it using off-the-shelf combinatorial optimization techniques. The main methods applied to Haplotype Inference are either simple greedy heuristic or exact methods (Integer Linear Programming, Semidefinite Programming, SAT encoding) that, at present, are adequate only for moderate size instances. We believe that metaheuristic and hybrid approaches could provide a better scalability. Moreover, metaheuristics can be very easily combined with problem specific heuristics and they can also be integrated with tree-based search techniques, thus providing a promising framework for hybrid systems in which a good trade-off between effectiveness and efficiency can be reached. In this paper we illustrate a feasibility study of the approach and discuss some relevant design issues, such as modeling and design of approximate solvers that combine constructive heuristics, local search-based improvement strategies and learning mechanisms. Besides the relevance of the Haplotype Inference problem itself, this preliminary analysis is also an interesting case study because the formulation of the problem poses some challenges in modeling and hybrid metaheuristic solver design that can be generalized to other problems.


Mixed Integer Linear Programming For Exact Finite-Horizon Planning In Decentralized Pomdps

arXiv.org Artificial Intelligence

We consider the problem of finding an n-agent joint-policy for the optimal finite-horizon control of a decentralized Pomdp (Dec-Pomdp). This is a problem of very high complexity (NEXP-hard in n >= 2). In this paper, we propose a new mathematical programming approach for the problem. Our approach is based on two ideas: First, we represent each agent's policy in the sequence-form and not in the tree-form, thereby obtaining a very compact representation of the set of joint-policies. Second, using this compact representation, we solve this problem as an instance of combinatorial optimization for which we formulate a mixed integer linear program (MILP). The optimal solution of the MILP directly yields an optimal joint-policy for the Dec-Pomdp. Computational experience shows that formulating and solving the MILP requires significantly less time to solve benchmark Dec-Pomdp problems than existing algorithms. For example, the multi-agent tiger problem for horizon 4 is solved in 72 secs with the MILP whereas existing algorithms require several hours to solve it.