Optimization
How to Calibrate the Scores of Biased Reviewers by Quadratic Programming
Roos, Magnus (Heinrich-Heine-Universität) | Rothe, Jörg (Heinrich-Heine-Universität) | Scheuermann, Björn (Julius-Maximilians-Universität Würzburg)
Peer reviewing is the key ingredient of evaluating the quality of scientific work. Based on the review scores assigned by the individual reviewers to the submissions, program committees of conferences and journal editors decide which papers to accept for publication and which to reject. However, some reviewers may be more rigorous than others, they may be biased one way or the other, and they often have highly subjective preferences over the papers they review. Moreover, each reviewer usually has only a very local view, as he or she evaluates only a small fraction of the submissions. Despite all these shortcomings, the review scores obtained need to be aggregrated in order to globally rank all submissions and to make the acceptance/rejection decision. A common method is to simply take the average of each submission's review scores, possibly weighted by the reviewers' confidence levels. Unfortunately, the global ranking thus produced often suffers a certain unfairness, as the reviewers' biases and limitations are not taken into account. We propose a method for calibrating the scores of reviewers that are potentially biased and blindfolded by having only partial information. Our method uses a maximum likelihood estimator, which estimates both the bias of each individual reviewer and the unknown "ideal" score of each submission. This yields a quadratic program whose solution transforms the individual review scores into calibrated, globally comparable scores. We argue why our method results in a fairer and more reasonable global ranking than simply taking the average of scores. To show its usefulness, we test our method empirically using real-world data.
Inner Regions and Interval Linearizations for Global Optimization
Trombettoni, Gilles (INRIA, I3S, Université) | Araya, Ignacio (Nice-Sophia) | Neveu, Bertrand (UTFSM) | Chabert, Gilles (Imagine, LIGM, Université)
Researchers from interval analysis and constraint (logic) programming communities have studied intervals for their ability to manage infinite solution sets of numerical constraint systems. In particular, inner regions represent subsets of the search space in which all points are solutions. Our main contribution is the use of recent and new inner region extraction algorithms in the upper bounding phase of constrained global optimization. Convexification is a major key for efficiently lower bounding the objective function. We have adapted the convex interval taylorization proposed by Lin and Stadherr for producing a reliable outer and inner polyhedral approximation of the solution set and a linearization of the objective function. Other original ingredients are part of our optimizer, including an efficient interval constraint propagation algorithm exploiting monotonicity of functions. We end up with a new framework for reliable continuous constrained global optimization. Our interval B&B is implemented in the interval-based explorer Ibex and extends this free C++ library. Our strategy outperforms the best reliable global optimizers.
Distributed Constraint Optimization Under Stochastic Uncertainty
Léauté, Thomas (Ecole Polytechnique Federale de Lausanne (EPFL)) | Faltings, Boi (Ecole Polytechnique Federale de Lausanne (EPFL))
In many real-life optimization problems involving multiple agents, the rewards are not necessarily known exactly in advance, but rather depend on sources of exogenous uncertainty. For instance, delivery companies might have to coordinate to choose who should serve which foreseen customer, under uncertainty in the locations of the customers. The framework of Distributed Constraint Optimization under Stochastic Uncertainty was proposed to model such problems; in this paper, we generalize this formalism by introducing the concept of evaluation functions that model various optimization criteria. We take the example of three such evaluation functions, expectation , consensus , and robustness , and we adapt and generalize two previous algorithms accordingly. Our experimental results on a class of Vehicle Routing Problems show that incomplete algorithms are not only cheaper than complete ones (in terms of simulated time , Non-Concurrent Constraint Checks , and information exchange) , but they are also often able to find the optimal solution. We also show that exchanging more information about the dependencies of their respective cost functions on the sources of uncertainty can help the agents discover higher-quality solutions.
Optimal Packing of High-Precision Rectangles
Huang, Eric (Palo Alto Research Center) | Korf, Richard E. (University of California, Los Angeles)
The rectangle-packing problem consists of finding an enclosing rectangle of smallest area that can contain a given set of rectangles without overlap. Our new benchmark includes rectangles of successively higher precision, challenging the previous state-of-the-art, which enumerates all locations for placing rectangles, as well as all bounding box widths and heights up to the optimal box. We instead limit the rectangles’ coordinates and bounding box dimensions to the set of subset sums of the rectangles’ dimensions. We also dynamically prune values by learning from infeasible subtrees and constrain the problem by replacing rectangles and empty space with larger rectangles. Compared to the previous state-of-the-art, we test 4,500 times fewer bounding boxes on the high-precision benchmark and solve N =9 over two orders of magnitude faster. We also present all optimal solutions up to N =15, which requires 39 bits of precision to solve. Finally, on the open problem of whether or not one can pack a particular infinite series of rectangles into the unit square, we pack the first 50,000 such rectangles witha greedy heuristic and conjecture that the entire infinite series can fit..
Approximation-Guided Evolutionary Multi-Objective Optimization
Bringmann, Karl (Max Planck Institute for Informatic) | Friedrich, Tobias (Max Planck Institute for Informatic) | Neumann, Frank (The University of Adelaide) | Wagner, Markus (The University of Adelaide)
Multi-objective optimization problems arise frequently in applications but can often only be solved approximately by heuristic approaches. Evolutionary algorithms have been widely used to tackle multi-objective problems. These algorithms use different measures to ensure diversity in the objective space but are not guided by a formal notion of approximation. We present a new framework of an evolutionary algorithm for multi-objective optimization that allows to work with a formal notion of approximation. Our experimental results show that our approach outperforms state-of-the-art evolutionary algorithms in terms of the quality of the approximation that is obtained in particular for problems with many objectives.
User Similarity from Linked Taxonomies: Subjective Assessments of Items
Nakatsuji, Makoto (NTT Cyber Solutions Laboratories) | Fujiwara, Yasuhiro (NTT Cyber Space Laboratories) | Uchiyama, Toshio (NTT Cyber Solutions Laboratories) | Fujimura, Ko (NTT Cyber Solutions Laboratories)
Subjective assessments (SAs) are assigned by users against items, such as ’elegant’ and ’gorgeous’, and are common in reviews/tags in many online-sites. However, previous studies fail to effectively use SAs for improving recommendations because few users rate the same items with the same SAs, which triggers the sparsity problem in collaborative filtering. We propose a novel algorithm that links a taxonomy of items to a taxonomy of SAs to assess user interests in detail. That is, it merges the SAs assigned by users against an item into subjective classes (SCs) and reflects the SAs/SCs assigned to an item to its classes. Thus, it can measure the similarity of users from not only SAs/SCs assigned to items but also their classes, which overcomes the sparsity problem. Our evaluation, which uses data from a popular restaurant review site, shows that our method generates more accurate recommendations than previous methods. Furthermore, we find that SAs frequently assigned on a few item classes are more useful than those widely assigned against many item classes in terms of recommendation accuracy.
Combining Machine Learning and Optimization Techniques to Determine 3-D Structures of Polypeptides
Dorn, Marcio (Federal University of Rio Grande do Sul) | Buriol, Luciana Salete (Federal University of Rio Grande do Sul) | Lamb, Luis da Cunha (Federal University of Rio Grande do Sul)
One of the main research problems in Structural Bioinformatics is the analysis and prediction of three-dimensional structures (3-D) of polypeptides or proteins. The 1990’s Genome projects resulted in a large increase in the number of protein sequences. However, the number of identified 3-D protein structures has not followed the same trend.The determination of protein structure is experimentally expensive and time consuming. This makes scientists largely dependent on computational methods that can predict correct 3-D protein structures only from extended and full amino acid sequences. Several computational methodologies and algorithms have been proposed as a solution to the Protein Structure Prediction (PSP) problem. We briefly describe the AI techniques we have been used to tackle this problem.
Large Linear Classification When Data Cannot Fit in Memory
Yu, Hsiang-Fu (National Taiwan University) | Hsieh, Cho-Jui (National Taiwan University) | Chang, Kai-Wei (National Taiwan University) | Lin, Chih-Jen (National Taiwan University)
Linear classification is a useful tool for dealing with large-scale data in applications such as document classification and natural language processing. Recent developments of linear classification have shown that the training process can be efficiently conducted. However, when the data size exceeds the memory capacity, most training methods suffer from very slow convergence due to the severe disk swapping. Although some methods have attempted to handle such a situation, they are usually too complicated to support some important functions such as parameter selection. In this paper, we introduce a block minimization framework for data larger than memory. Under the framework, a solver splits data into blocks and stores them into separate files. Then, at each time, the solver trains a data block loaded from disk. Although the framework is simple, the experimental results show that it effectively handles a data set 20 times larger than the memory capacity.
A System for Providing Differentiated QoS in Retail Banking
Mehta, Sameep (IBM Research - India) | Chafle, Girish (IBM India Software Lab) | Parija, Gyana (IBM Research - India) | Kedia, Vikas (Google Inc.)
In today's services driven economic environment, it is imperative for organizations to provide better quality service experience to differentiate and grow their business. Customer satisfaction (C-SAT) is the key driver for retention and growth in Retail Banking. Wait time, the time spent by a customer at the branch before getting serviced, contributes significantly to C-SAT. Due to high footfall, it is improbable to improve the wait time of every customer walking in the branch. Therefore, banks in developing countries are strategically looking to segment its customers and services and offer differentiated QoS based service delivery. In this work, we present a system for customer segmentation, and scheduling based on historic value of the customer and characteristics of current service request. We describe the system and give mathematical formulation of the scheduling problem and the associated heuristics. We present results and experience of deployment of this solution in multiple branches of a leading bank in India.
Coordinating Logistics Operations with Privacy Guarantees
Léauté, Thomas (Ecole Polytechnique Federale de Lausanne (EPFL)) | Faltings, Boi (Ecole Polytechnique Federale de Lausanne (EPFL))
Several logistics service providers serve a certain number of customers, geographically spread over an area of operations. They would like to coordinate their operations so as to minimize overall cost. At the same time, they would like to keep information about their costs, constraints and preferences private, thus precluding conventional negotiation. We show how AI techniques, in particular Distributed Constraint Optimization (DCOP), can be integrated with cryptographic techniques to allow such coordination without revealing agents' private information. The problem of assigning customers to companies is formulated as a DCOP, for which we propose two novel, privacy-preserving algorithms. We compare their performances and privacy properties on a set of Vehicle Routing Problem benchmarks.