Plotting

 Country


Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning

arXiv.org Machine Learning

For supervised and unsupervised learning, positive definite kernels allow to use large and potentially infinite dimensional feature spaces with a computational cost that only depends on the number of observations. This is usually done through the penalization of predictor functions by Euclidean or Hilbertian norms. In this paper, we explore penalizing by sparsity-inducing norms such as the l1-norm or the block l1-norm. We assume that the kernel decomposes into a large sum of individual basis kernels which can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a hierarchical multiple kernel learning framework, in polynomial time in the number of selected kernels. This framework is naturally applied to non linear variable selection; our extensive simulations on synthetic datasets and datasets from the UCI repository show that efficiently exploring the large feature space through sparsity-inducing norms leads to state-of-the-art predictive performance.


Predictive Hypothesis Identification

arXiv.org Machine Learning

While statistics focusses on hypothesis testing and on estimating (properties of) the true sampling distribution, in machine learning the performance of learning algorithms on future data is the primary issue. In this paper we bridge the gap with a general principle (PHI) that identifies hypotheses with best predictive performance. This includes predictive point and interval estimation, simple and composite hypothesis testing, (mixture) model selection, and others as special cases. For concrete instantiations we will recover well-known methods, variations thereof, and new ones. PHI nicely justifies, reconciles, and blends (a reparametrization invariant variation of) MAP, ML, MDL, and moment estimation. One particular feature of PHI is that it can genuinely deal with nested hypotheses.


Necessary and Sufficient Conditions for Success of the Nuclear Norm Heuristic for Rank Minimization

arXiv.org Machine Learning

Minimizing the rank of a matrix subject to constraints is a challenging problem that arises in many applications in control theory, machine learning, and discrete geometry. This class of optimization problems, known as rank minimization, is NP-HARD, and for most practical problems there are no efficient algorithms that yield exact solutions. A popular heuristic algorithm replaces the rank function with the nuclear norm--equal to the sum of the singular values--of the decision variable. In this paper, we provide a necessary and sufficient condition that quantifies when this heuristic successfully finds the minimum rank solution of a linear constraint set. We additionally provide a probability distribution over instances of the affine rank minimization problem such that instances sampled from this distribution satisfy our conditions for success with overwhelming probability provided the number of constraints is appropriately large. Finally, we give empirical evidence that these probabilistic bounds provide accurate predictions of the heuristic's performance in non-asymptotic scenarios.


Applications of Universal Source Coding to Statistical Analysis of Time Series

arXiv.org Artificial Intelligence

We show how universal codes can be used for solving some of the most important statistical problems for time series. By definition, a universal code (or a universal lossless data compressor) can compress any sequence generated by a stationary and ergodic source asymptotically to the Shannon entropy, which, in turn, is the best achievable ratio for lossless data compressors. We consider finite-alphabet and real-valued time series and the following problems: estimation of the limiting probabilities for finite-alphabet time series and estimation of the density for real-valued time series, the on-line prediction, regression, classification (or problems with side information) for both types of the time series and the following problems of hypothesis testing: goodness-of-fit testing, or identity testing, and testing of serial independence. It is important to note that all problems are considered in the framework of classical mathematical statistics and, on the other hand, everyday methods of data compression (or archivers) can be used as a tool for the estimation and testing. It turns out, that quite often the suggested methods and tests are more powerful than known ones when they are applied in practice.


MOOPPS: An Optimization System for Multi Objective Scheduling

arXiv.org Artificial Intelligence

In the current paper, we present an optimization system solving multi objective production scheduling problems (MOOPPS). The identification of Pareto optimal alternatives or at least a close approximation of them is possible by a set of implemented metaheuristics. Necessary control parameters can easily be adjusted by the decision maker as the whole software is fully menu driven. This allows the comparison of different metaheuristic algorithms for the considered problem instances. Results are visualized by a graphical user interface showing the distribution of solutions in outcome space as well as their corresponding Gantt chart representation. The identification of a most preferred solution from the set of efficient solutions is supported by a module based on the aspiration interactive method (AIM). The decision maker successively defines aspiration levels until a single solution is chosen. After successfully competing in the finals in Ronneby, Sweden, the MOOPPS software has been awarded the European Academic Software Award 2002 (http://www.bth.se/llab/easa_2002.nsf)


Variable Neighborhood Search for the University Lecturer-Student Assignment Problem

arXiv.org Artificial Intelligence

The paper presents a study of local search heuristics in general and variable neighborhood search in particular for the resolution of an assignment problem studied in the practical work of universities. Here, students have to be assigned to scientific topics which are proposed and supported by members of staff. The problem involves the optimization under given preferences of students which may be expressed when applying for certain topics. It is possible to observe that variable neighborhood search leads to superior results for the tested problem instances. One instance is taken from an actual case, while others have been generated based on the real world data to support the analysis with a deeper analysis. An extension of the problem has been formulated by integrating a second objective function that simultaneously balances the workload of the members of staff while maximizing utility of the students. The algorithmic approach has been prototypically implemented in a computer system. One important aspect in this context is the application of the research work to problems of other scientific institutions, and therefore the provision of decision support functionalities.


Proposition of the Interactive Pareto Iterated Local Search Procedure - Elements and Initial Experiments

arXiv.org Artificial Intelligence

The article presents an approach to interactively solve multi-objective optimization problems. While the identification of efficient solutions is supported by computational intelligence techniques on the basis of local search, the search is directed by partial preference information obtained from the decision maker. An application of the approach to biobjective portfolio optimization, modeled as the well-known knapsack problem, is reported, and experimental results are reported for benchmark instances taken from the literature. In brief, we obtain encouraging results that show the applicability of the approach to the described problem.


An application of the Threshold Accepting metaheuristic for curriculum based course timetabling

arXiv.org Artificial Intelligence

The article presents a local search approach for the solution of timetabling problems in general, with a particular implementation for competition track 3 of the International Timetabling Competition 2007 (ITC 2007). The heuristic search procedure is based on Threshold Accepting to overcome local optima. A stochastic neighborhood is proposed and implemented, randomly removing and reassigning events from the current solution. The overall concept has been incrementally obtained from a series of experiments, which we describe in each (sub)section of the paper. In result, we successfully derived a potential candidate solution approach for the finals of track 3 of the ITC 2007.


Bin Packing Under Multiple Objectives - a Heuristic Approximation Approach

arXiv.org Artificial Intelligence

The article proposes a heuristic approximation approach to the bin packing problem under multiple objectives. In addition to the traditional objective of minimizing the number of bins, the heterogeneousness of the elements in each bin is minimized, leading to a biobjective formulation of the problem with a tradeoff between the number of bins and their heterogeneousness. An extension of the Best-Fit approximation algorithm is presented to solve the problem. Experimental investigations have been carried out on benchmark instances of different size, ranging from 100 to 1000 items. Encouraging results have been obtained, showing the applicability of the heuristic approach to the described problem.


A framework for the interactive resolution of multi-objective vehicle routing problems

arXiv.org Artificial Intelligence

The article presents a framework for the resolution of rich vehicle routing problems which are difficult to address with standard optimization techniques. We use local search on the basis on variable neighborhood search for the construction of the solutions, but embed the techniques in a flexible framework that allows the consideration of complex side constraints of the problem such as time windows, multiple depots, heterogeneous fleets, and, in particular, multiple optimization criteria. In order to identify a compromise alternative that meets the requirements of the decision maker, an interactive procedure is integrated in the resolution of the problem, allowing the modification of the preference information articulated by the decision maker. The framework is prototypically implemented in a computer system. First results of test runs on multiple depot vehicle routing problems with time windows are reported.