Goto

Collaborating Authors

 Optimization


ECC: Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model

arXiv.org Machine Learning

Many DNN-enabled vision applications constantly operate under severe energy constraints such as unmanned aerial vehicles, Augmented Reality headsets, and smartphones. Designing DNNs that can meet a stringent energy budget is becoming increasingly important. This paper proposes ECC, a framework that compresses DNNs to meet a given energy constraint while minimizing accuracy loss. The key idea of ECC is to model the DNN energy consumption via a novel bilinear regression function. The energy estimate model allows us to formulate DNN compression as a constrained optimization that minimizes the DNN loss function over the energy constraint. The optimization problem, however, has nontrivial constraints. Therefore, existing deep learning solvers do not apply directly. We propose an optimization algorithm that combines the essence of the Alternating Direction Method of Multipliers (ADMM) framework with gradient-based learning algorithms. The algorithm decomposes the original constrained optimization into several subproblems that are solved iteratively and efficiently. ECC is also portable across different hardware platforms without requiring hardware knowledge. Experiments show that ECC achieves higher accuracy under the same or lower energy budget compared to state-of-the-art resource-constrained DNN compression techniques.


Classification using Ensemble Learning under Weighted Misclassification Loss

arXiv.org Machine Learning

Binary classification rules based on covariates typically depend on simple loss functions such as zero-one misclassification. Some cases may require more complex loss functions. For example, individual-level monitoring of HIV-infected individuals on antiretroviral therapy (ART) requires periodic assessment of treatment failure, defined as having a viral load (VL) value above a certain threshold. In some resource limited settings, VL tests may be limited by cost or technology, and diagnoses are based on other clinical markers. Depending on scenario, higher premium may be placed on avoiding false-positives which brings greater cost and reduced treatment options. Here, the optimal rule is determined by minimizing a weighted misclassification loss/risk. We propose a method for finding and cross-validating optimal binary classification rules under weighted misclassification loss. We focus on rules comprising a prediction score and an associated threshold, where the score is derived using an ensemble learner. Simulations and examples show that our method, which derives the score and threshold jointly, more accurately estimates overall risk and has better operating characteristics compared with methods that derive the score first and the cutoff conditionally on the score especially for finite samples.


A Logarithmic Barrier Method For Proximal Policy Optimization

arXiv.org Machine Learning

Proximal policy optimization(PPO) [ Schulman et al., 2017 ] has been proposed as a first-order optimization method for reinforcement learning. We should notice that an exterior penalty method is used in it. Often, the minimizers of the exterior penalty functions approach feasibility only in the limits as the penalty parameter grows increasingly large. Therefore, it may result in the low level of sampling efficiency. This method, which we call proximal policy optimization with barrier method (PPO-B), keeps almost all advantageous spheres of PPO such as easy implementation and good generalization. Specifically, a new surrogate objective with interior penalty method is proposed to avoid the defect arose from exterior penalty method. Conclusions can be draw that PPO-B is able to outperform PPO in terms of sampling efficiency since PPO-B achieved clearly better performance on Atari and Mujoco environment than PPO.


Online Decisioning Meta-Heuristic Framework for Large Scale Black-Box Optimization

arXiv.org Artificial Intelligence

Out of practical concerns and with the expectation to achieve high overall efficiency of the resource utilization, this paper transforms the large scale black-box optimization problems with limited resources into online decision problems from the perspective of dynamic multi-armed bandits, a simplified view of Markov decision processes. The proposed Online Decisioning Meta-heuristic framework (ODM) is particularly well suited for real-world applications, with flexible compatibility for various kinds of costs, interfaces for easy heuristic articulation as well as fewer hyper-parameters for less variance in performance. Experimental results on benchmark functions suggest that ODM has demonstrated significant capabilities for online decisioning. Furthermore, when ODM is articulated with three heuristics, competitive performance can be achieved on benchmark problems with search dimensions up to 10000.


A near Pareto optimal approach to student-supervisor allocation with two sided preferences and workload balance

arXiv.org Artificial Intelligence

Students are usually allocated tosupervisors for their projects by means of a centralized human decision maker or by means of interactions between students and staff members. The decision makers have to take into consideration the preferences of both students and supervisors with respect to the conduct of the project, as well as departmental constraintssuch as minimum and maximum levels of workload (in terms of supervision) for each supervisor. This situation results in an extremely time consuming process, and a suboptimal allocation due to a large and complex search space faced by human decision makers. Automating this process by applying artificial intelligence techniques may enhance the process in terms of satisfaction and performance of students with these individual projects. In this article, we present a genetic algorithm for matching students to supervisors accordingto both students' and supervisors' preferences and the constraints of the department. The rationale behind this problem is matching an appropriate student with a supervisor for the development of an individual project.The problem of matching students to supervisors, or students to projects [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], is a subclass of the wider problem of matching between two sets, one of the most studied fields in computer sciencedue to its applications to a wide range of domains such as the hospital/residents (HR) or the college admission (CA) problem [14, 15, 16]. Particularly, the student-supervisor allocation problem solved in this article can be considered as an instance of the CA problem with lower and upper quotas, where the colleges are the supervisors, both colleges and students (i.e., supervisors andstudents in our case) have some representation of preferences on each other for the conduct of a project, and the minimum and maximum quotas are the minimum and maximum number of students to be supervised by staff members.


Multi-Tasking Evolutionary Algorithm (MTEA) for Single-Objective Continuous Optimization

arXiv.org Machine Learning

ULTI-task learning[3], [23] is a subfield of machine learning, particularly transfer learning [17], [22], [24], [25], which uses auxiliary data or knowledge from related/similar tasks to facilitate the learning in a new task. As a result, a learning model for the new task can be built with much less task-specific training data. Or, in other words, with the same amount of task-specific data, a much better model could be trained. In multi-task learning, multiple related learning tasks are performed simultaneously using a (partially) shared model representation. As a result, the common information contained in these related tasks can be exploited to improve the learning efficiency and generalization performance of each task-specific model. Multi-task optimization (MTO) [6], [12], [16], [19] applies multi-task learning to optimization to study how to effectively and efficiently tackle multiple optimization problems simultaneously. Evolutionarymulti-tasking [16], or multifactorial optimization (MFO)[12], is an emerging subfield of MTO, which integrates evolutionary computation and multi-task learning.


Embedding Push and Pull Search in the Framework of Differential Evolution for Solving Constrained Single-objective Optimization Problems

arXiv.org Artificial Intelligence

This paper proposes a push and pull search method in the framework of differential evolution (PPS-DE) to solve constrained single-objective optimization problems (CSOPs). More specifically, two sub-populations, including the top and bottom sub-populations, are collaborated with each other to search global optimal solutions efficiently. The top sub-population adopts the pull and pull search (PPS) mechanism to deal with constraints, while the bottom sub-population use the superiority of feasible solutions (SF) technique to deal with constraints. In the top sub-population, the search process is divided into two different stages --- push and pull stages.An adaptive DE variant with three trial vector generation strategies is employed in the proposed PPS-DE. In the top sub-population, all the three trial vector generation strategies are used to generate offsprings, just like in CoDE. In the bottom sub-population, a strategy adaptation, in which the trial vector generation strategies are periodically self-adapted by learning from their experiences in generating promising solutions in the top sub-population, is used to choose a suitable trial vector generation strategy to generate one offspring. Furthermore, a parameter adaptation strategy from LSHADE44 is employed in both sup-populations to generate scale factor $F$ and crossover rate $CR$ for each trial vector generation strategy. Twenty-eight CSOPs with 10-, 30-, and 50-dimensional decision variables provided in the CEC2018 competition on real parameter single objective optimization are optimized by the proposed PPS-DE. The experimental results demonstrate that the proposed PPS-DE has the best performance compared with the other seven state-of-the-art algorithms, including AGA-PPS, LSHADE44, LSHADE44+IDE, UDE, IUDE, $\epsilon$MAg-ES and C$^2$oDE.


Low-rank semidefinite programming for the MAX2SAT problem

arXiv.org Artificial Intelligence

This paper proposes a new algorithm for solving MAX2SAT problems based on combining search methods with semidefinite programming approaches. Semidefinite programming techniques are well-known as a theoretical tool for approximating maximum satisfiability problems, but their application has traditionally been very limited by their speed and randomized nature. Our approach overcomes this difficult by using a recent approach to low-rank semidefinite programming, specialized to work in an incremental fashion suitable for use in an exact search algorithm. The method can be used both within complete or incomplete solver, and we demonstrate on a variety of problems from recent competitions. Our experiments show that the approach is faster (sometimes by orders of magnitude) than existing state-of-the-art complete and incomplete solvers, representing a substantial advance in search methods specialized for MAX2SAT problems.


On the Convergence of Federated Optimization in Heterogeneous Networks

arXiv.org Machine Learning

Modern networks of remote devices, such as mobile phones, wearable devices, and autonomous vehicles, generate massive amounts of data each day. Federated learning involves training statistical models directly on these devices, and introduces novel statistical and systems challenges that require a fundamental departure from standard methods designed for distributed optimization in data center environments. From a statistical perspective, each device collects data in a non-identical and heterogeneous fashion, and the number of data points on each device may also vary significantly. Federated optimization methods must therefore be designed in a robust fashion in order to provably converge when dealing with heterogeneous statistical data. From a systems perspective, the size of the network and high cost of communication impose two additional constraints on federated optimization methods: (i) limited network participation, and (ii) high communication costs. In terms of participation, at each communication round, proposed methods should only require a small number of devices to be active. As most devices have only short windows of availability, communicating with the entire network at once can be prohibitively expensive. In terms of communication, proposed methods should allow for Preprint.


A Tutorial on Distance Metric Learning: Mathematical Foundations, Algorithms and Software

arXiv.org Machine Learning

This paper describes the discipline of distance metric learning, a branch of machine learning that aims to learn distances from the data. Distance metric learning can be useful to improve similarity learning algorithms, and also has applications in dimensionality reduction. We describe the distance metric learning problem and analyze its main mathematical foundations. We discuss some of the most popular distance metric learning techniques used in classification, showing their goals and the required information to understand and use them. Furthermore, we present a Python package that collects a set of 17 distance metric learning techniques explained in this paper, with some experiments to evaluate the performance of the different algorithms. Finally, we discuss several possibilities of future work in this topic.