Metareasoning
Dynamic operator management in meta-heuristics using reinforcement learning: an application to permutation flowshop scheduling problems
Mamaghan, Maryam Karimi, Mohammadi, Mehrdad, Dullaert, Wout, Vigo, Daniele, Pirayesh, Amir
Using a portfolio of multiple search operators with different characteristics has been shown to improve the exploration and exploitation ability and, consequently, to enhance the overall performance of the meta-heuristics in solving different combinatorial optimization problems (COPs) [1, 2, 3, 4, 5]. From a theoretical perspective, the search space of a COP represents a non-stationary environment, meaning that the performance of different search operators varies depending on the region of the search space being explored. An operator working well in one region might be less effective in another region. Accordingly, incorporating a portfolio of diverse operators into a meta-heuristic is expected to enhance its overall performance [6]. For every COP, numerous search operators are available in the literature (either variations of the same operator with different configurations or entirely distinct operators), with the possibility of proposing new ones. Since the operators' performance is not pre-determined but rather dependent on the algorithm's performance on specific problems/instances, predicting the operators' performance proves challenging. Even if the most efficient operators could be determined, the order in which these efficient operators should be involved during the search process remains undetermined. Hence, optimizing the performance of a metaheuristic with multiple operators for solving different problem instances is always challenging [6, 7, 8, 9]. We label this problem as operator management problem in meta-heuristics, wherein the user should address two questions: What operators should I include in the portfolio?, and How (in which order) should I involve the in-portfolio operators during the search process?
Metareasoning in uncertain environments: a meta-BAMDP framework
Godara, Prakhar, Aléman, Tilman Diego, Yu, Angela J.
In decision-making scenarios, \textit{reasoning} can be viewed as an algorithm $P$ that makes a choice of an action $a^* \in \mathcal{A}$, aiming to optimize some outcome such as maximizing the value function of a Markov decision process (MDP). However, executing $P$ itself may bear some costs (time, energy, limited capacity, etc.) and needs to be considered alongside explicit utility obtained by making the choice in the underlying decision problem. Such costs need to be taken into account in order to accurately model human behavior, as well as optimizing AI planning, as all physical systems are bound to face resource constraints. Finding the right $P$ can itself be framed as an optimization problem over the space of reasoning processes $P$, generally referred to as \textit{metareasoning}. Conventionally, human metareasoning models assume that the agent knows the transition and reward distributions of the underlying MDP. This paper generalizes such models by proposing a meta Bayes-Adaptive MDP (meta-BAMDP) framework to handle metareasoning in environments with unknown reward/transition distributions, which encompasses a far larger and more realistic set of planning problems that humans and AI systems face. As a first step, we apply the framework to two-armed Bernoulli bandit (TABB) tasks, which have often been used to study human decision making. Owing to the meta problem's complexity, our solutions are necessarily approximate, but nevertheless robust within a range of assumptions that are arguably realistic for human decision-making scenarios. These results offer a normative framework for understanding human exploration under cognitive constraints. This integration of Bayesian adaptive strategies with metareasoning enriches both the theoretical landscape of decision-making research and practical applications in designing AI systems that plan under uncertainty and resource constraints.
Algorithm selection by rational metareasoning as a model of human strategy selection
Selecting the right algorithm is an important problem in computer science, because the algorithm often has to exploit the structure of the input to be efficient. The human mind faces the same challenge. Therefore, solutions to the algorithm selection problem can inspire models of human strategy selection and vice versa. Here, we view the algorithm selection problem as a special case of metareasoning and derive a solution that outperforms existing methods in sorting algorithm selection. We apply our theory to model how people choose between cognitive strategies and test its prediction in a behavioral experiment. We find that people quickly learn to adaptively choose between cognitive strategies. People's choices in our experiment are consistent with our model but inconsistent with previous theories of human strategy selection. Rational metareasoning appears to be a promising framework for reverse-engineering how people choose among cognitive strategies and translating the results into better solutions to the algorithm selection problem.
A Meta-heuristic Approach to Estimate and Explain Classifier Uncertainty
Houston, Andrew, Cosma, Georgina
Trust is a crucial factor affecting the adoption of machine learning (ML) models. Qualitative studies have revealed that end-users, particularly in the medical domain, need models that can express their uncertainty in decision-making allowing users to know when to ignore the model's recommendations. However, existing approaches for quantifying decision-making uncertainty are not model-agnostic, or they rely on complex statistical derivations that are not easily understood by laypersons or end-users, making them less useful for explaining the model's decision-making process. This work proposes a set of class-independent meta-heuristics that can characterize the complexity of an instance in terms of factors are mutually relevant to both human and ML decision-making. The measures are integrated into a meta-learning framework that estimates the risk of misclassification. The proposed framework outperformed predicted probabilities in identifying instances at risk of being misclassified. The proposed measures and framework hold promise for improving model development for more complex instances, as well as providing a new means of model abstention and explanation.
Algorithm selection by rational metareasoning as a model of human strategy selection
Selecting the right algorithm is an important problem in computer science, because the algorithm often has to exploit the structure of the input to be efficient. The human mind faces the same challenge. Therefore, solutions to the algorithm selection problem can inspire models of human strategy selection and vice versa. Here, we view the algorithm selection problem as a special case of metareasoning and derive a solution that outperforms existing methods in sorting algorithm selection. We apply our theory to model how people choose between cognitive strategies and test its prediction in a behavioral experiment.
Cserna
When minimizing makespan during off-line planning, the fastest action sequence to reach a particular state is, by definition, preferred. When trying to reach a goal quickly in on-line planning, previous work has inherited that assumption: the faster of two paths that both reach the same state is usually considered to dominate the slower one. In this short paper, we point out that, when planning happens concurrently with execution, selecting a slower action can allow additional time for planning, leading to better plans. We present Slo'RTS, a metareasoning planning algorithm that estimates whether the expected improvement in future decision-making from this increased planning time is enough to make up for the increased duration of the selected action. Using simple benchmarks, we show that Slo'RTS can yield shorter time-to-goal than a conventional planner. This generalizes previous work on metareasoning in on-line planning and highlights the inherent uncertainty present in an on-line setting.
Ideal Partition of Resources for Metareasoning
We can achieve significant gains in the value of computation by metareasoning about the nature or extent of base-level problem solving before executing a solution. However, resources that are irrevocably committed to metareasoning are not available for executing a solution. Thus, it is important to determine the portion of resources we wish to apply to metareasoning and control versus to the execution of a solution plan. Recent research on rational agency has highlighted the importance of limiting the consumption of resources by metareasoning machinery. We shall introduce the metareasoning-partition problem–the problem of ideally apportioning costly reasoning resources to planning a solution versus applying resource to executing a solution to a problem. We exercise prototypical metareasoning-partition models to probe the relationships between time allocated to metareasoning and to execution for different problem classes.
Ideal Partition of Resources for Metareasoning
We can achieve significant gains in the value of computation by metareasoning about the nature or extent of base-level problem solving before executing a solution. However, resources that are irrevocably committed to metareasoning are not available for executing a solution. Thus, it is important to determine the portion of resources we wish to apply to metareasoning and control versus to the execution of a solution plan. Recent research on rational agency has highlighted the importance of limiting the consumption of resources by metareasoning machinery. We shall introduce the metareasoning-partition problem--the problem of ideally apportioning costly reasoning resources to planning a solution versus applying resource to executing a solution to a problem. We exercise prototypical metareasoning-partition models to probe the relationships between time allocated to metareasoning and to execution for different problem classes. Finally, we examine the value of metareasoning in the context of our functional analyses. This work was supported by a NASA Fellowship under Grant NCC-220-51, by the National Science Foundation under Grant IRI-8703710, and by the U.S. Army Research Office under Grant P-25514-EL. Computing facilities were provided by the SUMEX-AIM Resource under NLM Grant LM05208.
Automated Machine Learning, Bounded Rationality, and Rational Metareasoning
Hüllermeier, Eyke, Mohr, Felix, Tornede, Alexander, Wever, Marcel
The notion of bounded rationality originated from the insight that perfectly rational behavior cannot be realized by agents with limited cognitive or computational resources. Research on bounded rationality, mainly initiated by Herbert Simon, has a longstanding tradition in economics and the social sciences, but also plays a major role in modern AI and intelligent agent design. Taking actions under bounded resources requires an agent to reflect on how to use these resources in an optimal way - hence, to reason and make decisions on a meta-level. In this paper, we will look at automated machine learning (AutoML) and related problems from the perspective of bounded rationality, essentially viewing an AutoML tool as an agent that has to train a model on a given set of data, and the search for a good way of doing so (a suitable "ML pipeline") as deliberation on a meta-level.
The Multi-phase spatial meta-heuristic algorithm for public health emergency transportation
Irany, Fariba Afrin, Iyer, Arnav, Flores, Rubenia Borge, Mikler, Armin R.
The delivery of Medical Countermeasures(MCMs) for mass prophylaxis in the case of a bio-terrorist attack is an active research topic that has interested the research community over the past decades. The objective of this study is to design an efficient algorithm for the Receive Reload and Store Problem(RSS) in which we aim to find feasible routes to deliver MCMs to a target population considering time, physical, and human resources, and capacity limitations. For doing this, we adapt the p-median problem to the POD-based emergency response planning procedures and propose an efficient algorithm solution to perform the p-median in reasonable computational time. We present RE-PLAN, the Response PLan Analyzer system that contains some RSS solutions developed at The Center for Computational Epidemiology and Response Analysis (CeCERA) at the University of North Texas. Finally, we analyze a study case where we show how the computational performance of the algorithm can impact the process of decision making and emergency planning in the short and long terms.