Goto

Collaborating Authors

 Yorke-Smith, Neil


Generalized Decision Focused Learning under Imprecise Uncertainty--Theoretical Study

arXiv.org Artificial Intelligence

Decision Focused Learning has emerged as a critical paradigm for integrating machine learning with downstream optimisation. Despite its promise, existing methodologies predominantly rely on probabilistic models and focus narrowly on task objectives, overlooking the nuanced challenges posed by epistemic uncertainty, non-probabilistic modelling approaches, and the integration of uncertainty into optimisation constraints. This paper bridges these gaps by introducing innovative frameworks: (i) a non-probabilistic lens for epistemic uncertainty representation, leveraging intervals (the least informative uncertainty model), Contamination (hybrid model), and probability boxes (the most informative uncertainty model); (ii) methodologies to incorporate uncertainty into constraints, expanding Decision-Focused Learning's utility in constrained environments; (iii) the adoption of Imprecise Decision Theory for ambiguity-rich decision-making contexts; and (iv) strategies for addressing sparse data challenges. Empirical evaluations on benchmark optimisation problems demonstrate the efficacy of these approaches in improving decision quality and robustness and dealing with said gaps.


Learning optimal objective values for MILP

arXiv.org Artificial Intelligence

Modern Mixed Integer Linear Programming (MILP) solvers use the Branch-and-Bound algorithm together with a plethora of auxiliary components that speed up the search. In recent years, there has been an explosive development in the use of machine learning for enhancing and supporting these algorithmic components. Within this line, we propose a methodology for predicting the optimal objective value, or, equivalently, predicting if the current incumbent is optimal. For this task, we introduce a predictor based on a graph neural network (GNN) architecture, together with a set of dynamic features. Experimental results on diverse benchmarks demonstrate the efficacy of our approach, achieving high accuracy in the prediction task and outperforming existing methods. These findings suggest new opportunities for integrating ML-driven predictions into MILP solvers, enabling smarter decision-making and improved performance.


Machine Learning Augmented Branch and Bound for Mixed Integer Linear Programming

arXiv.org Artificial Intelligence

Mixed Integer Linear Programming (MILP) is a pillar of mathematical optimization that offers a powerful modeling language for a wide range of applications. During the past decades, enormous algorithmic progress has been made in solving MILPs, and many commercial and academic software packages exist. Nevertheless, the availability of data, both from problem instances and from solvers, and the desire to solve new problems and larger (real-life) instances, trigger the need for continuing algorithmic development. MILP solvers use branch and bound as their main component. In recent years, there has been an explosive development in the use of machine learning algorithms for enhancing all main tasks involved in the branch-and-bound algorithm, such as primal heuristics, branching, cutting planes, node selection and solver configuration decisions. This paper presents a survey of such approaches, addressing the vision of integration of machine learning and mathematical optimization as complementary technologies, and how this integration can benefit MILP solving. In particular, we give detailed attention to machine learning algorithms that automatically optimize some metric of branch-and-bound efficiency. We also address how to represent MILPs in the context of applying learning algorithms, MILP benchmarks and software.


Robust Losses for Decision-Focused Learning

arXiv.org Artificial Intelligence

Optimization models used to make discrete decisions often contain uncertain parameters that are context-dependent and are estimated through prediction. To account for the quality of the decision made based on the prediction, decision-focused learning (end-to-end predict-then-optimize) aims at training the predictive model to minimize regret, i.e., the loss incurred by making a suboptimal decision. Despite the challenge of this loss function being possibly non-convex and in general non-differentiable, effective gradient-based learning approaches have been proposed to minimize the expected loss, using the empirical loss as a surrogate. However, empirical regret can be an ineffective surrogate because the uncertainty in the optimization model makes the empirical regret unequal to the expected regret in expectation. To illustrate the impact of this inequality, we evaluate the effect of aleatoric and epistemic uncertainty on the accuracy of empirical regret as a surrogate. Next, we propose three robust loss functions that more closely approximate expected regret. Experimental results show that training two state-of-the-art decision-focused learning approaches using robust regret losses improves test-sample empirical regret in general while keeping computational time equivalent relative to the number of training epochs.


Adaptive parallelization of multi-agent simulations with localized dynamics

arXiv.org Artificial Intelligence

Agent-based modelling constitutes a versatile approach to representing and simulating complex systems. Studying large-scale systems is challenging because of the computational time required for the simulation runs: scaling is at least linear in system size (number of agents). Given the inherently modular nature of MABSs, parallel computing is a natural approach to overcoming this challenge. However, because of the shared information and communication between agents, parellelization is not simple. We present a protocol for shared-memory, parallel execution of MABSs. This approach is useful for models that can be formulated in terms of sequential computations, and that involve updates that are localized, in the sense of involving small numbers of agents. The protocol has a bottom-up and asynchronous nature, allowing it to deal with heterogeneous computation in an adaptive, yet graceful manner. We illustrate the potential performance gains on exemplar cultural dynamics and disease spreading MABSs.


On Training Neural Networks with Mixed Integer Programming

arXiv.org Machine Learning

Recent work has shown potential in using Mixed Integer Programming (MIP) solvers to optimize certain aspects of neural networks (NN). However little research has gone into training NNs with solvers. State of the art methods to train NNs are typically gradient-based and require significant data, computation on GPUs and extensive hyper-parameter tuning. In contrast, training with MIP solvers should not require GPUs or hyper-parameter tuning but can likely not handle large amounts of data. This work builds on recent advances that train binarized NNs using MIP solvers. We go beyond current work by formulating new MIP models to increase the amount of data that can be used and to train non-binary integer-valued networks. Our results show that comparable results to using gradient descent can be achieved when minimal data is available.


A Coupled Operational Semantics for Goals and Commitments

Journal of Artificial Intelligence Research

Commitments capture how an agent relates to another agent, whereas goals describe states of the world that an agent is motivated to bring about.  Commitments are elements of the social state of a set of agents whereas goals are elements of the private states of individual agents.  It makes intuitive sense that goals and commitments are understood as being complementary to each other. More importantly, an agent's goals and commitments ought to be coherent, in the sense that an agent's goals would lead it to adopt or modify relevant commitments and an agent's commitments would lead it to adopt or modify relevant goals.  However, despite the intuitive naturalness of the above connections, they have not been adequately studied in a formal framework. This article provides a combined operational semantics for goals and commitments by relating their respective life cycles as a basis for how these concepts (1) cohere for an individual agent and (2) engender cooperation among agents.  Our semantics yields important desirable properties of convergence of the configurations of cooperating agents, thereby delineating some theoretically well-founded yet practical modes of cooperation in a multiagent system.


TrustSVD: Collaborative Filtering with Both the Explicit and Implicit Influence of User Trust and of Item Ratings

AAAI Conferences

Collaborative filtering suffers from the problems of data sparsity and cold start, which dramatically degrade recommendation performance. To help resolve these issues, we propose TrustSVD, a trust-based matrix factorization technique. By analyzing the social trust data from four real-world data sets, we conclude that not only the explicit but also the implicit influence of both ratings and trust should be taken into consideration in a recommendation model. Hence, we build on top of a state-of-the-art recommendation algorithm SVD++ which inherently involves the explicit and implicit influence of rated items, by further incorporating both the explicit and implicit influence of trusted users on the prediction of items for an active user. To our knowledge, the work reported is the first to extend SVD++ with social trust information. Experimental results on the four data sets demonstrate that our approach TrustSVD achieves better accuracy than other ten counterparts, and can better handle the concerned issues.


Incrementally Solving STNs by Enforcing Partial Path Consistency

AAAI Conferences

Efficient management and propagation of temporal constraints is important for temporal planning as well as for scheduling. During plan development, new events and temporal constraints are added and existing constraints may be tightened; the consistency of the whole temporal network is frequently checked; and results of constraint propagation guide further search. Recent work shows that enforcing partial path consistency provides an efficient means of propagating temporal information for the popular Simple Temporal Network (STN). We show that partial path consistency can be enforced incrementally, thus exploiting the similarities of the constraint network between subsequent edge tightenings. We prove that the worst-case time complexity of our algorithm can be bounded both by the number of edges in the chordal graph (which is better than the previous bound of the number of vertices squared), and by the degree of the chordal graph times the number of vertices incident on updated edges. We show that for many sparse graphs, the latter bound is better than that of the previously best-known approaches. In addition, our algorithm requires space only linear in the number of edges of the chordal graph, whereas earlier work uses space quadratic in the number of vertices. Finally, empirical results show that when incrementally solving sparse STNs, stemming from problems such as Hierarchical Task Network planning, our approach outperforms extant algorithms.


Introduction to the Special Issue on "Usable AI"

AI Magazine

When creating algorithms or systems that are supposed to be used by people, we should be able to adopt a "binocular" view of users' interaction with intelligent systems: a view that regards the design of interaction and the design of intelligent algorithms as interrelated parts of a single design problem. This special issue offers a coherent set of articles on two levels of generality that illustrate the binocular view and help readers to adopt it.