Goto

Collaborating Authors

 Country


Optimal Packing of High-Precision Rectangles

AAAI Conferences

The rectangle-packing problem consists of ๏ฌnding an enclosing rectangle of smallest area that can contain a given set of rectangles without overlap. Our new benchmark includes rectangles of successively higher precision, challenging the previous state-of-the-art, which enumerates all locations for placing rectangles, as well as all bounding box widths and heights up to the optimal box. We instead limit the rectanglesโ€™ coordinates and bounding box dimensions to the set of subset sums of the rectanglesโ€™ dimensions. We also dynamically prune values by learning from infeasible subtrees and constrain the problem by replacing rectangles and empty space with larger rectangles. Compared to the previous state-of-the-art, we test 4,500 times fewer bounding boxes on the high-precision benchmark and solve N =9 over two orders of magnitude faster. We also present all optimal solutions up to N =15, which requires 39 bits of precision to solve. Finally, on the open problem of whether or not one can pack a particular in๏ฌnite series of rectangles into the unit square, we pack the ๏ฌrst 50,000 such rectangles witha greedy heuristic and conjecture that the entire in๏ฌnite series can ๏ฌt..


Core-Guided Binary Search Algorithms for Maximum Satisfiability

AAAI Conferences

Several MaxSAT algorithms based on iterative SAT solving have been proposed in recent years. These algorithms are in general the most ef๏ฌcient for real-world applications. Existing data indicates that, among MaxSAT algorithms based on iterative SAT solving, the most ef๏ฌcient ones are core-guided, i.e. algorithms which guide the search by iteratively computing unsatis๏ฌable subformulas (or cores). For weighted MaxSAT, core-guided algorithms exhibit a number of important drawbacks, including a possibly exponential number of iterations and the use of a large number of auxiliary variables. This paper develops two new algorithms for (weighted) MaxSAT that address these two drawbacks. The ๏ฌrst MaxSAT algorithm implements core-guided iterative SAT solving with binary search. The second algorithm extends the ๏ฌrst one by exploiting disjoint cores. The empirical evaluation shows that core-guided binary search is competitive with current MaxSAT solvers.


Heuristic Search for Large Problems With Real Costs

AAAI Conferences

The memory requirements of basic best-first heuristic search algorithms like A* make them infeasible for solving large problems. External disk storage is cheap and plentiful com- pared to the cost of internal RAM. Unfortunately, state-of- the-art external memory search algorithms either rely on brute-force search techniques, such as breadth-first search, or they rely on all node values falling in a narrow range of in- tegers, and thus perform poorly on real-world domains with real-valued costs. We present a new general-purpose algo- rithm, PEDAL, that uses external memory and parallelism to perform a best-first heuristic search capable of solving large problems with real costs. We show theoretically that PEDAL is I/O efficient and empirically that it is both better on a stan- dard unit-cost benchmark, surpassing internal IDA* on the 15-puzzle, and gives far superior performance on problems with real costs.


The Compressed Differential Heuristic

AAAI Conferences

The differential heuristic (DH) is an effective memory-based heuristic for explicit state spaces. In this paper we aim to improve its performance and memory usage. We introduce a compression method for DHs which stores only a portion of the original uncompressed DH, while preserving enough information to enable efficient search. Compressed DHs (CDH) are flexible and can be tuned to fit any size of memory, even smaller than the size of the state space. Furthermore, CDHs can be built without the need to create and store the entire uncompressed DH. Experimental results across different domains show that, for a given amount of memory, a CDH significantly outperforms an uncompressed DH.


On the Complexity of BDDs for State Space Search: A Case Study in Connect Four

AAAI Conferences

Symbolic search using BDDs usually saves huge amounts of memory, while in some domains its savings are moderate at best. It is an open problem to determine if BDDs work well for a certain domain. Motivated by finding evidences for BDD growths for state space search, in this paper we are concerned with symbolic search in the domain of Connect Four. We prove that there is a variable ordering for which the set of all possible states โ€“ when continuing after a terminal state has been reached โ€“ can be represented by polynomial sized BDDs, whereas the termination criterion leads to an exponential number of nodes in the BDD given any variable ordering.


Optimal Graph Search with Iterated Graph Cuts

AAAI Conferences

Informed search algorithms such as A* use heuristics to focus exploration on states with low total path cost. To the extent that heuristics underestimate forward costs, a wider cost radius of suboptimal states will be explored. For many weighted graphs, however, a small distance in terms of cost may encompass a large fraction of the unweighted graph. We present a new informed search algorithm, Iterative Monotonically Bounded A* (IMBA*), which first proves that no optimal paths exist in a bounded cut of the graph before considering larger cuts. We prove that IMBA* has the same optimality and completeness guarantees as A* and, in a non-uniform pathfinding application, we empirically demonstrate substantial speed improvements over classic A*.


A Data Mining Approach to the Diagnosis of Tuberculosis by Cascading Clustering and Classification

arXiv.org Artificial Intelligence

In this paper, a methodology for the automated detection and classification of Tuberculosis(TB) is presented. Tuberculosis is a disease caused by mycobacterium which spreads through the air and attacks low immune bodies easily. Our methodology is based on clustering and classification that classifies TB into two categories, Pulmonary Tuberculosis(PTB) and retroviral PTB(RPTB) that is those with Human Immunodeficiency Virus (HIV) infection. Initially K-means clustering is used to group the TB data into two clusters and assigns classes to clusters. Subsequently multiple different classification algorithms are trained on the result set to build the final classifier model based on K-fold cross validation method. This methodology is evaluated using 700 raw TB data obtained from a city hospital. The best obtained accuracy was 98.7% from support vector machine (SVM) compared to other classifiers. The proposed approach helps doctors in their diagnosis decisions and also in their treatment planning procedures for different categories.


Selecting Attributes for Sport Forecasting using Formal Concept Analysis

arXiv.org Artificial Intelligence

In order to address complex systems, apply pattern recongnition on their evolution could play an key role to understand their dynamics. Global patterns are required to detect emergent concepts and trends, some of them with qualitative nature. Formal Concept Analysis (FCA) is a theory whose goal is to discover and to extract Knowledge from qualitative data. It provides tools for reasoning with implication basis (and association rules). Implications and association rules are usefull to reasoning on previously selected attributes, providing a formal foundation for logical reasoning. In this paper we analyse how to apply FCA reasoning to increase confidence in sports betting, by means of detecting temporal regularities from data. It is applied to build a Knowledge Based system for confidence reasoning.


Accurate Estimators for Improving Minwise Hashing and b-Bit Minwise Hashing

arXiv.org Machine Learning

Minwise hashing is the standard technique in the context of search and databases for efficiently estimating set (e.g., high-dimensional 0/1 vector) similarities. Recently, b-bit minwise hashing was proposed which significantly improves upon the original minwise hashing in practice by storing only the lowest b bits of each hashed value, as opposed to using 64 bits. b-bit hashing is particularly effective in applications which mainly concern sets of high similarities (e.g., the resemblance >0.5). However, there are other important applications in which not just pairs of high similarities matter. For example, many learning algorithms require all pairwise similarities and it is expected that only a small fraction of the pairs are similar. Furthermore, many applications care more about containment (e.g., how much one object is contained by another object) than the resemblance. In this paper, we show that the estimators for minwise hashing and b-bit minwise hashing used in the current practice can be systematically improved and the improvements are most significant for set pairs of low resemblance and high containment.


On the Evaluation Criterions for the Active Learning Processes

arXiv.org Machine Learning

In many data mining applications collection of sufficiently large datasets is the most time consuming and expensive. On the other hand, industrial methods of data collection create huge databases, and make difficult direct applications of the advanced machine learning algorithms. To address the above problems, we consider active learning (AL), which may be very efficient either for the experimental design or for the data filtering. In this paper we demonstrate using the online evaluation opportunity provided by the AL Challenge that quite competitive results may be produced using a small percentage of the available data. Also, we present several alternative criteria, which may be useful for the evaluation of the active learning processes. The author of this paper attended special presentation in Barcelona, where results of the WCCI 2010 AL Challenge were discussed.