AITopics | Mathematical & Statistical Methods

Mathematical & Statistical Methods

MILP for the Multi-objective VM Reassignment Problem

Saber, Takfarinas, Ventresque, Anthony, Marques-Silva, Joao, Thorburn, James, Murphy, Liam

arXiv.org Artificial Intelligence Mar-18-2021

Machine Reassignment is a challenging problem for constraint programming (CP) and mixed-integer linear programming (MILP) approaches, especially given the size of data centres. The multi-objective version of the Machine Reassignment Problem is even more challenging and it seems unlikely for CP or MILP to obtain good results in this context. As a result, the first approaches to address this problem have been based on other optimisation methods, including metaheuristics. In this paper we study under which conditions a mixed-integer optimisation solver, such as IBM ILOG CPLEX, can be used for the Multi-objective Machine Reassignment Problem. We show that it is useful only for small or medium-scale data centres and with some relaxations, such as an optimality tolerance gap and a limited number of directions explored in the search space. Building on this study, we also investigate a hybrid approach, feeding a metaheuristic with the results of CPLEX, and we show that the gains are important in terms of quality of the set of Pareto solutions (+126.9% against the metaheuristic alone and +17.8% against CPLEX alone) and number of solutions (8.9 times more than CPLEX), while the processing time increases only by 6% in comparison to CPLEX for execution times larger than 100 seconds.
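The abstract measures results by the quality and size of the set of Pareto solutions. As a minimal sketch of what that set is, assuming every objective is minimised, the filter below keeps only the candidates that no other candidate dominates. The function name `pareto_front` and the two-objective tuples are hypothetical illustrations, not data or code from the paper.

```python
def pareto_front(points):
    """Return the points that no other point dominates (all objectives minimised)."""

    def dominates(p, q):
        # p dominates q if it is no worse on every objective and strictly better on one.
        no_worse = all(x <= y for x, y in zip(p, q))
        strictly_better = any(x < y for x, y in zip(p, q))
        return no_worse and strictly_better

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective_1, objective_2) scores for six candidate reassignments.
candidates = [(3, 5), (4, 4), (5, 3), (5, 5), (6, 2), (4, 6)]
front = pareto_front(candidates)
print(front)
```

Here (5, 5) is dominated by (4, 4) and (4, 6) by (3, 5), so four candidates remain on the front.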

cplex, execution time, vector, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICTAI.2015.20

2103.1041

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.56)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.54)

Bellman equation

#artificialintelligence Mar-15-2021, 20:36:01 GMT

A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming.[1] It writes the "value" of a decision problem at a certain point in time in terms of the payoff from some initial choices and the "value" of the remaining decision problem that results from those initial choices. The Bellman equation was first applied to engineering control theory and to other topics in applied mathematics, and subsequently became an important tool in economic theory, though the basic concepts of dynamic programming are prefigured in John von Neumann and Oskar Morgenstern's Theory of Games and Economic Behavior and Abraham Wald's sequential analysis. In continuous-time optimization problems, the analogous equation is a partial differential equation that is called the Hamilton–Jacobi–Bellman equation.[4][5] In discrete time, any multi-stage optimization problem can be solved by analyzing the appropriate Bellman equation.
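To make the recursive structure concrete, here is a minimal value-iteration sketch for a deterministic, discounted, discrete-time problem. It repeatedly applies the update V(s) = max over a of [r(s, a) + discount * V(next(s, a))] until the values stop changing. The two-state problem, the reward numbers, and the names `value_iteration`, `rewards`, and `transitions` are assumptions made for illustration only.

```python
def value_iteration(rewards, transitions, discount=0.9, tol=1e-10):
    """Apply the Bellman update until the largest change falls below tol."""
    values = {state: 0.0 for state in rewards}
    while True:
        new_values = {
            state: max(
                # Immediate payoff plus discounted value of the remaining problem.
                rewards[state][action] + discount * values[transitions[state][action]]
                for action in rewards[state]
            )
            for state in rewards
        }
        if max(abs(new_values[s] - values[s]) for s in rewards) < tol:
            return new_values
        values = new_values


# Hypothetical problem: staying in B pays 2 per step, staying in A pays 1,
# and moving between the states pays nothing.
rewards = {"A": {"stay": 1.0, "move": 0.0}, "B": {"stay": 2.0, "move": 0.0}}
transitions = {"A": {"stay": "A", "move": "B"}, "B": {"stay": "B", "move": "A"}}
values = value_iteration(rewards, transitions)
print(values)
```

With a discount of 0.9 the fixed point is V(B) = 2 / (1 - 0.9) = 20 and V(A) = 0.9 * 20 = 18: from A it is better to move once and then stay in B than to collect 1 forever, which is worth only 10.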

bellman equation, equation, optimization problem, (15 more...)

#artificialintelligence

Industry: Banking & Finance (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Escaping Saddle Points with Stochastically Controlled Stochastic Gradient Methods

Liang, Guannan, Tong, Qianqian, Zhu, Chunjiang, Bi, Jinbo

arXiv.org Machine Learning Mar-12-2021

Stochastically controlled stochastic gradient (SCSG) methods have been proved to converge efficiently to first-order stationary points which, however, can be saddle points in nonconvex optimization. It has been observed that a stochastic gradient descent (SGD) step introduces anisotropic noise around saddle points for deep learning and non-convex half-space learning problems, which indicates that SGD satisfies the correlated negative curvature (CNC) condition for these problems. Therefore, we propose to use a separate SGD step to help the SCSG method escape from strict saddle points, resulting in the CNC-SCSG method. The SGD step plays a role similar to noise injection but is more stable. We prove that the resultant algorithm converges to a second-order stationary point with a convergence rate of $\tilde{O}(\epsilon^{-2} \log(1/\epsilon))$, where $\epsilon$ is the pre-specified error tolerance. This convergence rate is independent of the problem dimension, and is faster than that of CNC-SGD. A more general framework is further designed to incorporate the proposed CNC-SCSG into any first-order method so that the method can escape saddle points. Simulation studies illustrate that the proposed algorithm can escape saddle points in far fewer epochs than the gradient descent methods perturbed by either noise injection or an SGD step.

algorithm, saddle point, stationary point, (16 more...)

arXiv.org Machine Learning

2103.04413

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Connecticut > Tolland County > Storrs (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.40)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Empirical Mode Modeling: A data-driven approach to recover and forecast nonlinear dynamics from noisy data

Park, Joseph, Pao, Gerald M, Stabenau, Erik, Sugihara, George, Lorimer, Thomas

arXiv.org Machine Learning Mar-10-2021

Data-driven, model-free analytics are natural choices for discovery and forecasting of complex, nonlinear systems. Methods that operate in the system state-space require either an explicit multidimensional state-space or one approximated from available observations. Since observational data are frequently sampled with noise, the noise can corrupt the state-space representation, degrading analytical performance. Here, we evaluate the synthesis of empirical mode decomposition with empirical dynamic modeling, which we term empirical mode modeling, to increase the information content of state-space representations in the presence of noise. Evaluation of a mathematical application and an ecologically important geophysical application across three different state-space representations suggests that empirical mode modeling may be a useful technique for data-driven, model-free, state-space analysis in the presence of noise.

imf, representation, salinity, (14 more...)

arXiv.org Machine Learning

2103.07281

Country:

North America > United States > Florida > Miami-Dade County > Homestead (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.50)

Industry: Government > Regional Government (0.47)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)
Information Technology > Data Science > Data Quality > Data Cleaning (0.40)

Low-Rank Sinkhorn Factorization

Scetbon, Meyer, Cuturi, Marco, Peyré, Gabriel

arXiv.org Machine Learning Mar-8-2021

Several recent applications of optimal transport (OT) theory to machine learning have relied on regularization, notably entropy and the Sinkhorn algorithm. Because matrix-vector products are pervasive in the Sinkhorn algorithm, several works have proposed to approximate kernel matrices appearing in its iterations using low-rank factors. Another route lies instead in imposing low-rank constraints on the feasible set of couplings considered in OT problems, with no approximation of either cost or kernel matrices. This route was first explored by Forrow et al., 2018, who proposed an algorithm tailored for the squared Euclidean ground cost, using a proxy objective that can be solved through the machinery of regularized 2-Wasserstein barycenters. Building on this, we introduce in this work a generic approach that aims at solving, in full generality, the OT problem under low-rank constraints with arbitrary costs. Our algorithm relies on an explicit factorization of low-rank couplings as a product of sub-coupling factors linked by a common marginal; similar to an NMF approach, we alternately update these factors. We prove the non-asymptotic stationary convergence of this algorithm and illustrate its efficiency on benchmark experiments.
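For context, the sketch below shows the standard entropy-regularised Sinkhorn iteration that the abstract takes as its starting point. It is not the low-rank factorization the paper proposes. Each update is a matrix-vector product with the Gibbs kernel exp(-cost / epsilon), which is the step that low-rank kernel approximations aim to speed up. The function name `sinkhorn`, the marginals, the cost matrix, and the parameter values are all hypothetical.

```python
import math


def sinkhorn(a, b, cost, epsilon=0.5, iterations=200):
    """Return an approximate coupling with row marginals a and column marginals b."""
    n, m = len(a), len(b)
    # Gibbs kernel of the cost matrix.
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Rescale rows, then columns; each line is one kernel matrix-vector product.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical marginals and ground cost on two points.
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost)
for row in plan:
    print(row)
```

The returned plan is a full n-by-m matrix. A low-rank approach such as the one described above would instead constrain the coupling to a product of smaller factors.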

algorithm, diag, submission and formatting instruction, (12 more...)

arXiv.org Machine Learning

2103.04737

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Signal Processing on the Permutahedron: Tight Spectral Frames for Ranked Data Analysis

Chen, Ellen, DeJong, Jennifer, Halverson, Tom, Shuman, David I

arXiv.org Machine Learning Mar-6-2021

Ranked data sets, where m judges/voters specify a preference ranking of n objects/candidates, are increasingly prevalent in contexts such as political elections, computer vision, recommender systems, and bioinformatics. The vote counts for each ranking can be viewed as an n! data vector lying on the permutahedron, which is a Cayley graph of the symmetric group with vertices labeled by permutations and an edge when two permutations differ by an adjacent transposition. Leveraging combinatorial representation theory and recent progress in signal processing on graphs, we investigate a novel, scalable transform method to interpret and exploit structure in ranked data. We represent data on the permutahedron using an overcomplete dictionary of atoms, each of which captures both smoothness information about the data (typically the focus of spectral graph decomposition methods in graph signal processing) and structural information about the data (typically the focus of symmetry decomposition methods from representation theory). These atoms have a more naturally interpretable structure than any known basis for signals on the permutahedron, and they form a Parseval frame, ensuring beneficial numerical properties such as energy preservation. We develop specialized algorithms and open software that take advantage of the symmetry and structure of the permutahedron to improve the scalability of the proposed method, making it more applicable to the high-dimensional ranked data found in applications.
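The graph described in the abstract can be built directly for small n. The sketch below is an illustration only, not the authors' software: it enumerates all n! rankings as vertices and joins two rankings when one is obtained from the other by swapping two adjacent positions. Treating "adjacent transposition" as a swap of neighbouring positions is an assumption, and the name `permutahedron_edges` is hypothetical.

```python
from itertools import permutations


def permutahedron_edges(n):
    """Return the vertices and edges of the permutahedron graph on n objects."""
    vertices = list(permutations(range(n)))
    edges = set()
    for perm in vertices:
        for i in range(n - 1):
            # Swap the entries at positions i and i + 1 to get a neighbour.
            neighbour = list(perm)
            neighbour[i], neighbour[i + 1] = neighbour[i + 1], neighbour[i]
            # frozenset makes the edge undirected, so each pair is stored once.
            edges.add(frozenset((perm, tuple(neighbour))))
    return vertices, edges


vertices, edges = permutahedron_edges(4)
print(len(vertices), len(edges))
```

For n = 4 there are 4! = 24 vertices, each with n - 1 = 3 neighbours, giving 24 * 3 / 2 = 36 edges. A vote-count vector assigns one number to each of these 24 vertices, which is why its length grows as n! and scalability becomes the central concern.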

analysis coefficient, eigenvector, graph, (14 more...)

arXiv.org Machine Learning

2103.0415

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.05)
North America > United States > Minnesota > Ramsey County > Saint Paul (0.04)
North America > United States > California > Alameda County > Hayward (0.04)

Genre: Research Report (0.63)

Industry: Government > Voting & Elections (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.45)

If We Draw Graphs Like This, We Can Change Computers Forever

#artificialintelligence Mar-4-2021, 14:55:21 GMT

Jacob Holm was flipping through proofs from an October 2019 research paper he and colleague Eva Rotenberg--an associate professor in the department of applied mathematics and computer science at the Technical University of Denmark--had published online, when he discovered their findings had unwittingly given away a solution to a centuries-old graph problem. Holm, an assistant professor of computer science at the University of Copenhagen, was relieved no one had caught the solution first. "It was a real 'Eureka!' moment," he says. Holm and Rotenberg were trying to find a shortcut for determining whether a graph is "planar"--that is, if it could be drawn flat on a surface without any of its lines crossing each other (flat drawings of a graph are also called "embeddings"). "Putting it very bluntly, we formally quantified why something is a terrible drawing." To mathematicians, a graph often looks different than what most of us are taught in school.

graph, planarity, rotenberg, (14 more...)

#artificialintelligence

Country: Europe > Denmark > Capital Region > Copenhagen (0.25)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.41)

Contrastive learning of strong-mixing continuous-time stochastic processes

Liu, Bingbin, Ravikumar, Pradeep, Risteski, Andrej

arXiv.org Machine Learning Mar-3-2021

One of the paradigms of learning from unlabeled data that has seen a lot of recent work in various application domains is "self-supervised learning". These methods supervise the training process with information inherent to the data without requiring human annotations, and have been applied across computer vision, natural language processing, reinforcement learning and scientific domains. Despite the popularity, they are still not very well understood--both on the theoretical and empirical front--often requiring extensive trial and error to find the right pairing of architecture and learning method. In particular, it is often hard to pin down what exactly these methods are trying to learn, and it is even harder to determine what is their statistical and algorithmic complexity. The specific family of self-supervised approaches we focus on in this work is contrastive learning, which constructs different types of tuples by utilizing certain structures in the data and trains the model to identify the types. For an example in vision, Chen et al. (2020) apply two random augmentations (e.g.

complexity, learning, transition kernel, (14 more...)

arXiv.org Machine Learning

2103.0274

Country:

Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.41)

"Number Theory," by Rosanna Warren

The New Yorker Mar-1-2021, 11:00:00 GMT

The four-and-a-half-foot black-backed rat snake swayed up and across the kitchen screen door, seeking a way in. So we know we're living with a patient You sit taut in your chair, whispering, as you probe the gaps between prime numbers. The opening through which your thought will glide suddenly into a lit space and be at home. In a shaky house, where wasps gnaw the walls.

number theory, rosanna warren

The New Yorker

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)

A Stein Goodness of fit Test for Exponential Random Graph Models

Xu, Wenkai, Reinert, Gesine

arXiv.org Machine Learning Feb-28-2021

We propose and analyse a novel nonparametric goodness-of-fit testing procedure for exchangeable exponential random graph models (ERGMs) when a single network realisation is observed. The test determines how likely it is that the observation is generated from a target unnormalised ERGM density. Our test statistics are derived from a kernel Stein discrepancy, a divergence constructed via Stein's method using functions in a reproducing kernel Hilbert space, combined with a discrete Stein operator for ERGMs. The test is a Monte Carlo test based on simulated networks from the target ERGM. We show theoretical properties for the testing procedure for a class of ERGMs. Simulation studies and real network applications are presented.

artificial intelligence, gkss, machine learning, (15 more...)

arXiv.org Machine Learning

2103.0058

Country:

Asia (0.28)
Europe > United Kingdom > England (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.72)