AITopics | Optimization

Optimization

A New Optimization Layer for Real-Time Bidding Advertising Campaigns

Micchi, Gianluca, Soheily-Khah, Saeid, Turner, Jacob

arXiv.org Artificial Intelligence, Aug-7-2018

While it is relatively easy to start an online advertising campaign, obtaining a high Key Performance Indicator (KPI) can be challenging. A large body of work already addresses this problem, and platforms known as demand-side platforms (DSPs) are available on the market to handle such optimization. From the advertiser's point of view, each DSP is a different black box, with its own pros and cons, that needs to be configured. To take advantage of the strengths of every DSP, advertisers are well advised to use a combination of them when setting up their campaigns. In this paper, we propose an algorithm that lets advertisers add an optimization layer on top of DSPs. The algorithm we introduce, called SKOTT, maximizes the chosen KPI by optimally configuring the DSPs and putting them in competition with each other. SKOTT is a highly specialized iterative algorithm, loosely based on gradient descent, that is made up of three independent sub-routines, each dealing with a different problem: partitioning the budget, setting the desired average bid, and preventing under-delivery. One of the novelties of our approach is that it takes the perspective of the advertisers rather than the DSPs. Synthetic market data is used to evaluate the efficiency of SKOTT against other state-of-the-art approaches adapted from similar problems. The results illustrate the benefits of our proposal, which greatly outperforms the other methods.

algorithm, optimization problem, upstream oil & gas, (23 more...)

arXiv.org Artificial Intelligence

1808.03147

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (1.00)

Industry:

Marketing (1.00)
Energy > Oil & Gas > Upstream (1.00)
Information Technology > Services (0.88)
Banking & Finance > Trading (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Data Science > Data Mining > Big Data (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Hashing with Binary Matrix Pursuit

Cakir, Fatih, He, Kun, Sclaroff, Stan

arXiv.org Machine Learning, Aug-6-2018

We propose theoretical and empirical improvements for two-stage hashing methods. First, we provide a theoretical analysis of the quality of the binary codes and show that, under mild assumptions, a residual learning scheme can construct binary codes that fit any neighborhood structure with arbitrary accuracy. Second, we show that with high-capacity hash functions such as CNNs, binary code inference can be greatly simplified for many standard neighborhood definitions, yielding smaller optimization problems and more robust codes. Incorporating our findings, we propose a novel two-stage hashing method that significantly outperforms previous hashing studies on widely used image retrieval benchmarks.

artificial intelligence, binary code, machine learning, (18 more...)

arXiv.org Machine Learning

1808.0199

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Lexington (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.94)
(2 more...)

Fast Variance Reduction Method with Stochastic Batch Size

Liu, Xuanqing, Hsieh, Cho-Jui

arXiv.org Machine Learning, Aug-6-2018

In this paper we study a family of variance reduction methods with randomized batch size: at each step, the algorithm first randomly chooses the batch size and then selects a batch of samples to conduct a variance-reduced stochastic update. We give the linear convergence rate of this framework for composite functions, and show that the strategy achieving the optimal convergence rate per data access is to always choose a batch size of 1, which is equivalent to the SAGA algorithm. However, because of cache and disk I/O effects in computer architectures, the number of data accesses does not reflect the running time: 1) random memory access is much slower than sequential access, and 2) when the data is too big to fit into memory, disk seeks take even longer. After taking these effects into account, a batch size of 1 is no longer optimal, so we propose a new algorithm called SAGA++ and show how to calculate the optimal average batch size theoretically. Our algorithm outperforms SAGA and other existing batched and stochastic solvers on real datasets. In addition, we conduct a precise analysis to compare different update rules for variance reduction methods, showing that SAGA++ converges faster than SVRG in theory.
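
The batch-size-1 case mentioned in the abstract corresponds to the standard SAGA update, which can be sketched as follows. This is a minimal illustration on a least-squares objective, not the SAGA++ algorithm from the paper; the function name `saga_least_squares`, the step size, and the toy data are assumptions made for this example.

```python
import random

def saga_least_squares(X, y, lr=0.1, epochs=1000, seed=0):
    """Plain SAGA (batch size 1) for f(w) = (1/n) * sum_i 0.5 * (x_i . w - y_i)^2."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    # Table of the last gradient seen for each sample. At w = 0 the
    # per-sample gradient (x_i . w - y_i) * x_i equals -y_i * x_i.
    table = [[-y[i] * X[i][j] for j in range(d)] for i in range(n)]
    # Running average of the stored gradients.
    avg = [sum(table[i][j] for i in range(n)) / n for j in range(d)]
    for _ in range(epochs * n):
        i = rng.randrange(n)
        residual = sum(X[i][j] * w[j] for j in range(d)) - y[i]
        g_new = [residual * X[i][j] for j in range(d)]
        for j in range(d):
            # Variance-reduced step: fresh gradient, minus stored one, plus average.
            w[j] -= lr * (g_new[j] - table[i][j] + avg[j])
            # Keep the running average consistent with the table update below.
            avg[j] += (g_new[j] - table[i][j]) / n
        table[i] = g_new
    return w

# Toy data generated from y = 2 + 3 * x (first feature is a constant bias term).
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
X = [[1.0, x] for x in xs]
y = [2.0 + 3.0 * x for x in xs]
w = saga_least_squares(X, y)
```

Each step touches a single randomly chosen sample, which is exactly the random-access pattern the abstract identifies as slow in practice.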

artificial intelligence, batch size, machine learning, (14 more...)

arXiv.org Machine Learning

1808.02169

Country:

North America > United States > California > Yolo County > Davis (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.31)

Designing Adaptive Neural Networks for Energy-Constrained Image Classification

Stamoulis, Dimitrios, Chin, Ting-Wu, Prakash, Anand Krishnan, Fang, Haocheng, Sajja, Sribhuvan, Bognar, Mitchell, Marculescu, Diana

arXiv.org Machine Learning, Aug-6-2018

As convolutional neural networks (CNNs) enable state-of-the-art computer vision applications, their high energy consumption has emerged as a key impediment to their deployment on embedded and mobile devices. Towards efficient image classification under hardware constraints, prior work has proposed adaptive CNNs, i.e., systems of networks with different accuracy and computation characteristics, where a selection scheme adaptively selects the network to be evaluated for each input image. While previous efforts have investigated different network selection schemes, we find that they do not necessarily result in energy savings when deployed on mobile systems. The key limitation of existing methods is that they learn only how data should be processed among the CNNs and not the network architectures, with each network being treated as a black box. To address this limitation, we pursue a more powerful design paradigm where the architecture settings of the CNNs are treated as hyper-parameters to be globally optimized. We cast the design of adaptive CNNs as a hyper-parameter optimization problem with respect to energy, accuracy, and communication constraints imposed by the mobile device. To efficiently solve this problem, we adapt Bayesian optimization to the properties of the design space, reaching near-optimal configurations in a few tens of function evaluations. Our method reduces the energy consumed for image classification on a mobile device by up to 6x, compared to the best previously published work that uses CNNs as black boxes. Finally, we evaluate two image classification practices, i.e., classifying all images locally versus over the cloud, under energy and communication constraints.

artificial intelligence, machine learning, optimization, (17 more...)

arXiv.org Machine Learning

1808.0155

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology (0.69)
Energy (0.49)
Education (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

NIMFA: A Python Library for Nonnegative Matrix Factorization

Zitnik, Marinka, Zupan, Blaz

arXiv.org Machine Learning, Aug-6-2018

NIMFA is an open-source Python library that provides a unified interface to nonnegative matrix factorization algorithms. It includes implementations of state-of-the-art factorization methods, initialization approaches, and quality scoring. It supports both dense and sparse matrix representations. NIMFA's component-based implementation and hierarchical design should help users employ already implemented techniques or design and code new strategies for matrix factorization tasks.
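
For readers unfamiliar with the underlying technique, nonnegative matrix factorization approximates a nonnegative matrix V by a product W H of two nonnegative factors. The sketch below uses the classic multiplicative update rules for the Frobenius-norm objective in plain Python. It deliberately does not use NIMFA's API; the helper names (`matmul`, `transpose`, `frobenius_error`, `nmf_multiplicative`) and the toy matrix are assumptions made for this illustration.

```python
import random

def matmul(A, B):
    """Dense matrix product of two lists-of-lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, WH) for v, r in zip(vr, rr)) ** 0.5

def nmf_multiplicative(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor V (n x m, nonnegative) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random initialization.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise.
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), elementwise.
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

# Toy nonnegative matrix with an exact rank-2 factorization.
V = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [1.0, 3.0, 3.0]]
W, H = nmf_multiplicative(V, rank=2)
error = frobenius_error(V, W, H)
```

Because every update multiplies by a ratio of nonnegative quantities, the factors stay nonnegative without an explicit projection step.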

artificial intelligence, machine learning, optimization problem, (15 more...)

arXiv.org Machine Learning

1808.01743

Country:

North America > United States (0.16)
North America > Canada > Ontario > Toronto (0.16)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.05)
(3 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Multi-Objective Cognitive Model: a supervised approach for multi-subject fMRI analysis

Yousefnezhad, Muhammad, Zhang, Daoqiang

arXiv.org Machine Learning, Aug-5-2018

To decode the human brain, Multivariate Pattern (MVP) classification generates cognitive models from functional Magnetic Resonance Imaging (fMRI) datasets. In the standard MVP pipeline, brain patterns in a multi-subject fMRI dataset must be mapped to a shared space, and a classification model is then generated from the mapped patterns. However, MVP models may not provide stable performance on a new fMRI dataset because the standard pipeline uses disjoint steps to generate them. Indeed, each step in the pipeline has its own objective function and an independent optimization approach, so the best solution of one step may not be optimal for the next steps. To tackle this issue, this paper introduces the Multi-Objective Cognitive Model (MOCM), which uses an integrated objective function for MVP analysis rather than those disjoint steps. To solve the integrated problem, we propose a customized multi-objective optimization approach in which all possible solutions are first generated, and our method then ranks and selects the robust solutions as the final results. Empirical studies confirm that the proposed method achieves superior performance in comparison with other techniques. Keywords: Multi-Objective Cognitive Model, fMRI Analysis, Multivariate Pattern, Multi-Objective Optimization. Introduction: One of the primary goals in neuroscience is to understand how the neural activities in the human brain can be mapped to different cognitive tasks. The authors are with the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China.

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Machine Learning

1808.01642

Country:

Asia > China > Jiangsu Province > Nanjing (0.44)
North America > United States > California (0.28)

Genre:

Research Report (1.00)
Workflow (0.93)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Multi-objective optimization to explicitly account for model complexity when learning Bayesian Networks

Cazzaniga, Paolo, Nobile, Marco S., Ramazzotti, Daniele

arXiv.org Machine Learning, Aug-3-2018

Bayesian Networks have been widely used in recent decades in many fields to describe statistical dependencies among random variables. In general, learning the structure of such models is a problem of considerable theoretical interest that still poses many challenges. On the one hand, this is a well-known NP-complete problem, which is made harder in practice by the huge search space of possible solutions. On the other hand, the phenomenon of I-equivalence, i.e., different graphical structures underpinning the same set of statistical dependencies, may lead to multimodal fitness landscapes, further hindering maximum likelihood approaches to the task. Despite all these difficulties, greedy search methods based on a likelihood score coupled with a regularization term to account for model complexity have been shown to be surprisingly effective in practice. In this paper, we formulate the task of learning the structure of Bayesian Networks as an optimization problem based on a likelihood score. However, our approach does not adjust this score by means of any of the complexity terms proposed in the literature; instead, it accounts directly for the complexity of the discovered solutions by exploiting a multi-objective optimization procedure. To this end, we adopt NSGA-II and define the first objective function to be the likelihood of a solution and the second to be the number of selected arcs. We thoroughly analyze the behavior of our method on a wide set of simulated data, and we discuss its performance considering the goodness of the inferred solutions both in terms of their objective functions and with respect to the retrieved structure. Our results show that NSGA-II can converge to solutions characterized by better likelihood and fewer arcs than classic approaches, although, paradoxically, these solutions are frequently characterized by a lower similarity to the target network.
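
The two objectives named in the abstract (maximize likelihood, minimize the number of arcs) induce a Pareto dominance relation, which is the basis of the ranking step in NSGA-II. The sketch below extracts only the first non-dominated front from a list of candidates; it is not a full NSGA-II implementation, and the function names and the candidate scores are hypothetical values chosen for illustration.

```python
def dominates(a, b):
    """Return True if candidate a dominates candidate b.

    Each candidate is a (log_likelihood, num_arcs) pair. Higher likelihood
    is better and fewer arcs is better.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical (log_likelihood, num_arcs) scores for six network structures.
candidates = [
    (-120.0, 8),
    (-100.0, 12),
    (-100.0, 15),  # same likelihood as the previous one but more arcs
    (-130.0, 8),   # same arcs as the first one but lower likelihood
    (-90.0, 20),
    (-125.0, 5),
]
front = pareto_front(candidates)
```

No complexity penalty is added to the likelihood here: the trade-off between fit and number of arcs is left explicit in the front, which mirrors the design choice described in the abstract.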

artificial intelligence, machine learning, nsga-ii, (17 more...)

arXiv.org Machine Learning

1808.01345

Country:

Europe > Italy > Lombardy > Milan (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Inferring Parameters Through Inverse Multiobjective Optimization

Dong, Chaosheng, Zeng, Bo

arXiv.org Machine Learning, Aug-2-2018

Given a set of observed human decisions, inverse optimization has been developed and used to infer the underlying decision making problem. The majority of existing studies assume that the decision making problem has a single objective function, and attribute data divergence to noise, errors, or bounded rationality, which, however, could lead to a corrupted inference when decisions are tradeoffs among multiple criteria. In this paper, we take a data-driven approach and design a more sophisticated inverse optimization formulation to explicitly infer the parameters of a multiobjective decision making problem from noisy observations. This framework, together with our mathematical analyses and advanced algorithm developments, demonstrates a strong capacity for estimating critical parameters, decoupling "interpretable" components from noise or errors, deriving the denoised optimal decisions, and ensuring statistical significance. In particular, for the whole decision maker population, if suitable conditions hold, we will be able to understand the overall diversity and the distribution of their preferences over multiple criteria, which is important when a precise inference on every single decision maker is practically unnecessary or infeasible. Numerical results from a large number of experiments are reported to confirm the effectiveness of our inverse optimization model and the computational efficacy of the developed algorithms.

objective function, optimization problem, upstream oil & gas, (20 more...)

arXiv.org Machine Learning

1808.00935

Country: North America > United States (0.27)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas > Upstream (0.46)
Banking & Finance > Trading (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

A Robust Genetic Algorithm for Learning Temporal Specifications from Data

Nenzi, Laura, Silvetti, Simone, Bartocci, Ezio, Bortolussi, Luca

arXiv.org Artificial Intelligence, Aug-1-2018

We consider the problem of mining signal temporal logic requirements from a dataset of regular (good) and anomalous (bad) trajectories of a dynamical system. We assume that the training set is labeled by human experts and that we have access only to a limited amount of data, which is typically noisy. We provide a systematic approach to synthesize both the syntactical structure and the parameters of the temporal logic formula using a two-step procedure: first, we leverage a novel evolutionary algorithm for learning the structure of the formula; second, we perform the parameter synthesis operating on the statistical emulation of the average robustness of a candidate formula w.r.t. its parameters. We compare our results with our previous work [BufoBSBLB14] and with a recently proposed decision-tree-based method [bombara_decision_2016]. We present experimental results on two case studies: an anomalous trajectory detection problem for a naval surveillance system and the characterization of an ineffective respiratory effort, showing the usefulness of our work.
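
The "robustness" of a temporal logic formula is a signed margin: positive when a trajectory satisfies the formula, negative when it violates it. The sketch below computes it for two very simple formulas under the usual quantitative semantics (minimum over time for "always", maximum over time for "eventually"). It is an assumption that this matches the robustness notion used in the paper; the function names, the threshold, and the sample trajectories are hypothetical.

```python
def robustness_always_below(signal, threshold):
    """Robustness of 'always (x < threshold)' over the whole trajectory.

    The margin of the predicate at each time step is threshold - x, and
    'always' takes the worst case (minimum) over time.
    """
    return min(threshold - x for x in signal)

def robustness_eventually_above(signal, threshold):
    """Robustness of 'eventually (x > threshold)' over the whole trajectory.

    The margin at each time step is x - threshold, and 'eventually' takes
    the best case (maximum) over time.
    """
    return max(x - threshold for x in signal)

# Hypothetical sampled trajectories of a single signal.
regular = [0.2, 0.5, 0.4]
anomalous = [0.2, 1.3, 0.4]

# A candidate threshold separates the two classes when the regular
# trajectory has positive robustness and the anomalous one negative.
r_good = robustness_always_below(regular, 1.0)
r_bad = robustness_always_below(anomalous, 1.0)
separates = r_good > 0 and r_bad < 0
```

Parameter synthesis can then be viewed as searching for the threshold that maximizes this kind of separation between the labeled classes.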

evolutionary algorithm, formula, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1711.06202

Country:

Europe > Austria > Vienna (0.14)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning

Larsen, Eric, Lachapelle, Sébastien, Bengio, Yoshua, Frejinger, Emma, Lacoste-Julien, Simon, Lodi, Andrea

arXiv.org Machine Learning, Jul-31-2018

The paper provides a methodological contribution at the intersection of machine learning and operations research. Namely, we propose a methodology to quickly predict solution summaries (i.e., solution descriptions at a given level of detail) to discrete stochastic optimization problems. We approximate the solutions based on supervised learning and the training dataset consists of a large number of deterministic problems that have been solved independently and offline. Uncertainty regarding a missing subset of the inputs is addressed through sampling and aggregation methods. Our motivating application concerns booking decisions of intermodal containers on double-stack trains. Under perfect information, this is the so-called load planning problem and it can be formulated by means of integer linear programming. However, the formulation cannot be used for the application at hand because of the restricted computational budget and unknown container weights. The results show that standard deep learning algorithms allow one to predict descriptions of solutions with high accuracy in very short time (milliseconds or less).
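
The "sampling and aggregation" idea for missing inputs can be sketched generically: draw plausible values for the unknown inputs, run the trained predictor on each completed input, and aggregate the predictions. The code below is a minimal illustration that averages scalar predictions; the function name, the stand-in predictor, and the uniform weight distribution are all assumptions, and the paper's actual models and aggregation methods may differ.

```python
import random
from statistics import mean

def predict_with_missing_inputs(predictor, known, sample_missing,
                                n_samples=100, seed=0):
    """Average a predictor's output over sampled values of the missing inputs.

    predictor(known, missing) returns a scalar prediction;
    sample_missing(rng) draws one plausible value for the missing inputs.
    """
    rng = random.Random(seed)
    predictions = [predictor(known, sample_missing(rng)) for _ in range(n_samples)]
    return mean(predictions)

# Hypothetical stand-in for a trained model: it predicts how many containers
# fit under a fixed weight capacity, given an unknown average container weight.
def toy_predictor(known, avg_weight):
    return min(known["num_containers"], known["capacity"] / avg_weight)

estimate = predict_with_missing_inputs(
    toy_predictor,
    known={"num_containers": 40, "capacity": 600.0},
    sample_missing=lambda rng: rng.uniform(10.0, 30.0),  # assumed weight range
)
```

Because each prediction is a cheap forward pass rather than an integer program solve, averaging over many samples stays within a small computational budget.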

artificial intelligence, container, machine learning, (17 more...)

arXiv.org Machine Learning

1807.11876

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Transportation > Freight & Logistics Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)
