AITopics

Combinatorial optimization is a common theme in computer science which underlies a considerable variety of problems. In contrast to the continuous setting, combinatorial problems require special solution strategies, and it's hard to come by generic schemes like gradient methods for continuous domains. We follow a standard construction of a parametric sampling distribution that transforms the problem to the continuous domain, allowing us to optimize the expectation of a given objective using estimates of the gradient. In spite of the apparent generality, such constructions are known to suffer from highly variable gradient estimates, and thus require careful tuning that is done in a problem specific manner. We show that a simple trick of converting the objective values to their cumulative probabilities fixes the distribution of the objective, allowing us to derive an online optimization algorithm that can be applied in a generic fashion. As an experimental benchmark we use the task of finding cliques in undirected graphs, and we show that our method, even when blindly applied, consistently outperforms related methods. Notably, on the DIMACS clique benchmark, our method approaches the performance of the best clique finding algorithms without access to the graph structure, and only through objective function evaluations, thus providing significant evidence to the generality and effectivity of our method.

algorithm, artificial intelligence, machine learning, (18 more...)

1802.0903

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Wang, Yingfei, Nascimento, Juliana Martins Do, Powell, Warren

Dynamic Bidding for Advance Commitments in Truckload Brokerage Markets

Truckload brokerages, a $100 billion/year industry in the U.S., plays the critical role of matching shippers with carriers, often to move loads several days into the future. Brokerages not only have to find companies that will agree to move a load, the brokerage often has to find a price that both the shipper and carrier will agree to. The price not only varies by shipper and carrier, but also by the traffic lanes and other variables such as commodity type. Brokerages have to learn about shipper and carrier response functions by offering a price and observing whether each accepts the quote. We propose a knowledge gradient policy with bootstrap aggregation for high-dimensional contextual settings to guide price experimentation by maximizing the value of information. The learning policy is tested using a newly developed, carefully calibrated fleet simulator that includes a stochastic lookahead policy that simulates fleet movements, as well as the stochastic modeling of driver assignments and the carrier's load commitment policies with advance booking.

artificial intelligence, information, machine learning, (17 more...)

1802.08976

Country: North America > United States (0.66)

Genre: Research Report (1.00)

Industry:

Transportation > Freight & Logistics Services (1.00)
Transportation > Ground > Road (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Lê-Huu, D. Khuê, Paragios, Nikos

Continuous Relaxation of MAP Inference: A Nonconvex Perspective

In this paper, we study a nonconvex continuous relaxation of MAP inference in discrete Markov random fields (MRFs). We show that for arbitrary MRFs, this relaxation is tight, and a discrete stationary point of it can be easily reached by a simple block coordinate descent algorithm. In addition, we study the resolution of this relaxation using popular gradient methods, and further propose a more effective solution using a multilinear decomposition framework based on the alternating direction method of multipliers (ADMM). Experiments on many real-world problems demonstrate that the proposed ADMM significantly outperforms other nonconvex relaxation based methods, and compares favorably with state of the art MRF optimization algorithms in different settings.

artificial intelligence, machine learning, runtime 0, (19 more...)

1802.07796

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Wang, Yining, Du, Simon, Balakrishnan, Sivaraman, Singh, Aarti

Stochastic Zeroth-order Optimization in High Dimensions

We consider the problem of optimizing a high-dimensional convex function using stochastic zeroth-order queries. Under sparsity assumptions on the gradients or function values, we present two algorithms: a successive component/feature selection algorithm and a noisy mirror descent algorithm using Lasso gradient estimates, and show that both algorithms have convergence rates that de- pend only logarithmically on the ambient dimension of the problem. Empirical results confirm our theoretical findings and show that the algorithms we design outperform classical zeroth-order optimization methods in the high-dimensional setting.

artificial intelligence, machine learning, optimization, (16 more...)

1710.10551

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Baydin, Atilim Gunes, Cornish, Robert, Rubio, David Martinez, Schmidt, Mark, Wood, Frank

Online Learning Rate Adaptation with Hypergradient Descent

We introduce a general method for improving the convergence rate of gradient-based optimizers that is easy to implement and works well in practice. We demonstrate the effectiveness of the method in a range of optimization problems by applying it to stochastic gradient descent, stochastic gradient descent with Nesterov momentum, and Adam, showing that it significantly reduces the need for the manual tuning of the initial learning rate for these commonly used algorithms. Our method works by dynamically updating the learning rate during optimization using the gradient with respect to the learning rate of the update rule itself. Computing this "hypergradient" needs little additional computation, requires only one extra copy of the original gradient to be stored in memory, and relies upon nothing more than what is provided by reverse-mode automatic differentiation.

algorithm, artificial intelligence, machine learning, (15 more...)

1703.04782

Country:

North America > United States (0.28)
North America > Canada (0.28)
Europe > United Kingdom > England (0.14)

Genre:

Research Report (0.85)
Instructional Material (0.68)

Industry: Education > Educational Setting > Online (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

arXiv.org Artificial IntelligenceFeb-24-2018

PSO-based Fuzzy Markup Language for Student Learning Performance Evaluation and Educational Application

Lee, Chang-Shing, Wang, Mei-Hui, Wang, Chi-Shiang, Teytaud, Olivier, Liu, Jialin, Lin, Su-Wei, Hung, Pi-Hsia

This paper proposes an agent with particle swarm optimization (PSO) based on a Fuzzy Markup Language (FML) for students learning performance evaluation and educational applications, and the proposed agent is according to the response data from a conventional test and an item response theory. First, we apply a GS-based parameter estimation mechanism to estimate the items parameters according to the response data, and then to compare its results with those of an IRT-based Bayesian parameter estimation mechanism. In addition, we propose a static-IRT test assembly mechanism to assemble a form for the conventional test. The presented FML-based dynamic assessment mechanism infers the probability of making a correct response to the item for a student with various abilities. Moreover, this paper also proposes a novel PFML learning mechanism for optimizing the parameters between items and students. Finally, we adopt a K-fold cross validation mechanism to evaluate the performance of the proposed agent. Experimental results show that the novel PFML learning mechanism for the parameter estimation and learning optimization performs favorably. We believe the proposed PFML will be a reference for education research and pedagogy and an important co-learning mechanism for future human-machine educational applications.

artificial intelligence, evolutionary algorithm, machine learning, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TFUZZ.2018.2810814

1802.08822

Country:

Europe (0.67)
Asia > Taiwan (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Chen, Yudong, Chi, Yuejie

Harnessing Structures in Big Data via Guaranteed Low-Rank Matrix Estimation

arXiv.org Machine LearningFeb-23-2018

Low-rank modeling plays a pivotal role in signal processing and machine learning, with applications ranging from collaborative filtering, video surveillance, medical imaging, to dimensionality reduction and adaptive filtering. Many modern high-dimensional data and interactions thereof can be modeled as lying approximately in a low-dimensional subspace or manifold, possibly with additional structures, and its proper exploitations lead to significant reduction of costs in sensing, computation and storage. In recent years, there is a plethora of progress in understanding how to exploit low-rank structures using computationally efficient procedures in a provable manner, including both convex and nonconvex approaches. On one side, convex relaxations such as nuclear norm minimization often lead to statistically optimal procedures for estimating low-rank matrices, where first-order methods are developed to address the computational challenges; on the other side, there is emerging evidence that properly designed nonconvex procedures, such as projected gradient descent, often provide globally optimal solutions with a much lower computational cost in many problems. This survey article will provide a unified overview of these recent advances on low-rank matrix estimation from incomplete measurements. Attention is paid to rigorous characterization of the performance of these algorithms, and to problems where the low-rank matrix have additional structural properties that require new algorithmic designs and theoretical analysis.

artificial intelligence, data mining, machine learning, (18 more...)

1802.08397

Country: North America > United States (0.46)

Genre: Overview (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)

Massias, Mathurin, Gramfort, Alexandre, Salmon, Joseph

Dual Extrapolation for Faster Lasso Solvers

arXiv.org Machine LearningFeb-22-2018

Convex sparsity-inducing regularizations are ubiquitous in high-dimension machine learning, but their non-differentiability requires the use of iterative solvers. To accelerate such solvers, state-of-the-art approaches consist in reducing the size of the optimization problem at hand. In the context of regression, this can be achieved either by discarding irrelevant features (screening techniques) or by prioritizing features likely to be included in the support of the solution (working set techniques). Duality comes into play at several steps in these techniques. Here, we propose an extrapolation technique starting from a sequence of iterates in the dual that leads to the construction of an improved dual point. This enables a tighter control of optimality as used in stopping criterion, as well as better screening performance of Gap Safe rules. Finally, we propose a working set strategy based on an aggressive use of Gap Safe rules and our new dual point construction, which improves state-of-the-art time performance on Lasso problems.

artificial intelligence, machine learning, optimization problem, (16 more...)

1802.07481

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Takada, Masaaki, Suzuki, Taiji, Fujisawa, Hironori

Independently Interpretable Lasso: A New Regularizer for Sparse Regression with Uncorrelated Variables

arXiv.org Machine LearningFeb-22-2018

Sparse regularization such as $\ell_1$ regularization is a quite powerful and widely used strategy for high dimensional learning problems. The effectiveness of sparse regularization has been supported practically and theoretically by several studies. However, one of the biggest issues in sparse regularization is that its performance is quite sensitive to correlations between features. Ordinary $\ell_1$ regularization can select variables correlated with each other, which results in deterioration of not only its generalization error but also interpretability. In this paper, we propose a new regularization method, "Independently Interpretable Lasso" (IILasso). Our proposed regularizer suppresses selecting correlated variables, and thus each active variable independently affects the objective variable in the model. Hence, we can interpret regression coefficients intuitively and also improve the performance by avoiding overfitting. We analyze theoretical property of IILasso and show that the proposed method is much advantageous for its sign recovery and achieves almost minimax optimal convergence rate. Synthetic and real data analyses also indicate the effectiveness of IILasso.

artificial intelligence, iilasso, machine learning, (16 more...)

1711.01796

Country: Asia > Japan (0.28)

Genre: Research Report (0.83)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Nespoli, Lorenzo, Medici, Vasco

An Unsupervised Method for Estimating the Global Horizontal Irradiance from Photovoltaic Power Measurements

arXiv.org Machine LearningFeb-22-2018

In this paper, we present a method to determine the global horizontal irradiance (GHI) from the power measurements of one or more PV systems, located in the same neighborhood. The method is completely unsupervised and is based on a physical model of a PV plant. The precise assessment of solar irradiance is pivotal for the forecast of the electric power generated by photovoltaic (PV) plants. However, on-ground measurements are expensive and are generally not performed for small and medium-sized PV plants. Satellite-based services represent a valid alternative to on site measurements, but their space-time resolution is limited. Results from two case studies located in Switzerland are presented. The performance of the proposed method at assessing GHI is compared with that of free and commercial satellite services. Our results show that the presented method is generally better than satellite-based services, especially at high temporal resolutions.

artificial intelligence, evolutionary algorithm, machine learning, (19 more...)

1706.06878

Country:

Europe (1.00)
North America > United States (0.68)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Renewable > Solar (1.00)

Technology:

Information Technology > Data Science (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.46)