Goto

Collaborating Authors

 Perez, Guillaume


Multi-level projection with exponential parallel speedup; Application to sparse auto-encoders neural networks

arXiv.org Artificial Intelligence

The $\ell_{1,\infty}$ norm is an efficient structured projection but the complexity of the best algorithm is unfortunately $\mathcal{O}\big(n m \log(n m)\big)$ for a matrix in $\mathbb{R}^{n\times m}$. In this paper, we propose a new bi-level projection method for which we show that the time complexity for the $\ell_{1,\infty}$ norm is only $\mathcal{O}\big(n m \big)$ for a matrix in $\mathbb{R}^{n\times m}$, and $\mathcal{O}\big(n + m \big)$ with full parallel power. We generalize our method to tensors and we propose a new multi-level projection, having an induced decomposition that yields a linear parallel speedup up to an exponential speedup factor, resulting in a time complexity lower-bounded by the sum of the dimensions, instead of the product of the dimensions. we provide a large base of implementation of our framework for bi-level and tri-level (matrices and tensors) for various norms and provides also the parallel implementation. Experiments show that our projection is $2$ times faster than the actual fastest Euclidean algorithms while providing same accuracy and better sparsity in neural networks applications.


A Constraint Programming Model for Scheduling the Unloading of Trains in Ports: Extended

arXiv.org Artificial Intelligence

In this paper, we propose a model to schedule the next 24 hours of operations in a bulk cargo port to unload bulk cargo trains onto stockpiles. It is a problem that includes multiple parts such as splitting long trains into shorter ones and the routing of bulk material through a configurable network of conveyors to the stockpiles. Managing such trains (up to three kilometers long) also requires specialized equipment. The real world nature of the problem specification implies the necessity to manage heterogeneous data. Indeed, when new equipment is added (e.g. dumpers) or a new type of wagon comes in use, older or different equipment will still be in use as well. All these details need to be accounted for. In fact, avoiding a full deadlock of the facility after a new but ineffective schedule is produced. In this paper, we provide a detailed presentation of this real world problem and its associated data. This allows us to propose an effective constraint programming model to solve this problem. We also discuss the model design and the different implementations of the propagators that we used in practice. Finally, we show how this model, coupled with a large neighborhood search, was able to find 24 hour schedules efficiently.


Near-Linear Time Projection onto the $\ell_{1,\infty}$ Ball; Application to Sparse Autoencoders

arXiv.org Artificial Intelligence

Looking for sparsity is nowadays crucial to speed up the training of large-scale neural networks. Projections onto the $\ell_{1,2}$ and $\ell_{1,\infty}$ are among the most efficient techniques to sparsify and reduce the overall cost of neural networks. In this paper, we introduce a new projection algorithm for the $\ell_{1,\infty}$ norm ball. The worst-case time complexity of this algorithm is $\mathcal{O}\big(nm+J\log(nm)\big)$ for a matrix in $\mathbb{R}^{n\times m}$. $J$ is a term that tends to 0 when the sparsity is high, and to $nm$ when the sparsity is low. Its implementation is easy and it is guaranteed to converge to the exact solution in a finite time. Moreover, we propose to incorporate the $\ell_{1,\infty}$ ball projection while training an autoencoder to enforce feature selection and sparsity of the weights. Sparsification appears in the encoder to primarily do feature selection due to our application in biology, where only a very small part ($<2\%$) of the data is relevant. We show that both in the biological case and in the general case of sparsity that our method is the fastest.


Constrained Machine Learning: The Bagel Framework

arXiv.org Artificial Intelligence

Machine learning models are widely used for real-world applications, such as document analysis and vision. Constrained machine learning problems are problems where learned models have to both be accurate and respect constraints. For continuous convex constraints, many works have been proposed, but learning under combinatorial constraints is still a hard problem. The goal of this paper is to broaden the modeling capacity of constrained machine learning problems by incorporating existing work from combinatorial optimization. We propose first a general framework called BaGeL (Branch, Generate and Learn) which applies Branch and Bound to constrained learning problems where a learning problem is generated and trained at each node until only valid models are obtained. Because machine learning has specific requirements, we also propose an extended table constraint to split the space of hypotheses.


Efficient Projection Algorithms onto the Weighted l1 Ball

arXiv.org Artificial Intelligence

Projected gradient descent has been proved efficient in many optimization and machine learning problems. The weighted $\ell_1$ ball has been shown effective in sparse system identification and features selection. In this paper we propose three new efficient algorithms for projecting any vector of finite length onto the weighted $\ell_1$ ball. The first two algorithms have a linear worst case complexity. The third one has a highly competitive performances in practice but the worst case has a quadratic complexity. These new algorithms are efficient tools for machine learning methods based on projected gradient descent such as compress sensing, feature selection. We illustrate this effectiveness by adapting an efficient compress sensing algorithm to weighted projections. We demonstrate the efficiency of our new algorithms on benchmarks using very large vectors. For instance, it requires only 8 ms, on an Intel I7 3rd generation, for projecting vectors of size $10^7$.


Parallel Algorithms for Operations on Multi-Valued Decision Diagrams

AAAI Conferences

Multi-valued Decision Diagrams (MDDs) have been extensively studied in the last ten years. Recently, efficient algorithms implementing operators such as reduction, union, intersection, difference, etc., have been designed. They directly deal with the graph structure of the MDD and a time reduction of several orders of magnitude in comparison to other existing algorithms have been observed. These operators have permitted a new look at MDDs, because extremely large MDDs can finally be manipulated as shown by the models used to solve complex application in music generation. However, MDDs become so large (50GB) that minutes are sometimes required to perform some operations. In order to accelerate the manipulation of MDDs, parallel algorithms are required. In this paper, we introduce such algorithms. We carefully design them in order to overcome inherent difficulties of the parallelization of sequential algorithms such as data dependencies, software lock-out, false sharing, or load balancing. As a result, we observe a speed-up , i.e. ratio between parallel and sequential runtimes, growing linearly with the number of cores.


Soft and Cost MDD Propagators

AAAI Conferences

Recent developments of efficient propagators, operations and creation methods for MDDs allow us to directly build efficient MDD-based models, without the need for intermediate data structures. In this paper, we take another step in this direction by improving the propagators of cost MDDs. In addition, we introduce a soft MDD propagator in order to deal with unsatisfiable problems. This directly offers cost and soft versions for table constraints and any constraints which can be represented by an MDD (regular, slide, knapsack...).


Efficient Operations On MDDs for Building Constraint Programming Models

AAAI Conferences

For instance, phrase generation problem involves domains having more than d 10, 000 values. Thus, We propose improved algorithms for defining the we cannot use an algorithm whose time or space complexity most common operations on Multi-Valued Decision is mainly based on Ω(nd), where n is the number of nodes Diagrams (MDDs): creation, reduction, complement, of the MDDs. Therefore, we need to improve the algorithms intersection, union, difference, symmetric performing the main operations on MDDs: creation, reduction difference, complement of union and complement and combinations. of intersection. Then, we show that with these algorithms The new creation algorithm we propose, exploits the origin and thanks to the recent development of an of the definition of the MDD. If the MDD represents an automaton efficient algorithm establishing arc consistency for (like with a regular constraint) or a repeated pattern MDD based constraints (MDD4R), we can simply (like with dynamic programming), then its creation may be solve some problems by modeling them as a set of sped-up.