AITopics

2310.12819

Country:

Europe > Finland (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > Canada > Quebec > Montreal (0.04)
(8 more...)

Genre: Research Report > New Finding (0.93)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Bairamian, Daniel, Marcotte, Philippe, Romoff, Joshua, Robert, Gabriel, Nowrouzezahrai, Derek

Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play

arXiv.org Artificial IntelligenceNov-28-2023

Recent advances in Competitive Self-Play (CSP) have achieved, or even surpassed, human level performance in complex game environments such as Dota 2 and StarCraft II using Distributed Multi-Agent Reinforcement Learning (MARL). One core component of these methods relies on creating a pool of learning agents -- consisting of the Main Agent, past versions of this agent, and Exploiter Agents -- where Exploiter Agents learn counter-strategies to the Main Agents. A key drawback of these approaches is the large computational cost and physical time that is required to train the system, making them impractical to deploy in highly iterative real-life settings such as video game productions. In this paper, we propose the Minimax Exploiter, a game theoretic approach to exploiting Main Agents that leverages knowledge of its opponents, leading to significant increases in data efficiency. We validate our approach in a diversity of settings, including simple turn based games, the arcade learning environment, and For Honor, a modern video game. The Minimax Exploiter consistently outperforms strong baselines, demonstrating improved stability and data efficiency, leading to a robust CSP-MARL method that is both flexible and easy to deploy.

agent, exploiter, main agent, (13 more...)

2311.1719

Country:

North America > Canada > Quebec > Montreal (0.29)
North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.41)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)

arXiv.org Machine LearningNov-28-2023

Optimal minimax rate of learning interaction kernels

Wang, Xiong, Seroussi, Inbar, Lu, Fei

Nonparametric estimation of nonlocal interaction kernels is crucial in various applications involving interacting particle systems. The inference challenge, situated at the nexus of statistical learning and inverse problems, comes from the nonlocal dependency. A central question is whether the optimal minimax rate of convergence for this problem aligns with the rate of $M^{-\frac{2\beta}{2\beta+1}}$ in classical nonparametric regression, where $M$ is the sample size and $\beta$ represents the smoothness exponent of the radial kernel. Our study confirms this alignment for systems with a finite number of particles. We introduce a tamed least squares estimator (tLSE) that attains the optimal convergence rate for a broad class of exchangeable distributions. The tLSE bridges the smallest eigenvalue of random matrices and Sobolev embedding. This estimator relies on nonasymptotic estimates for the left tail probability of the smallest eigenvalue of the normal matrix. The lower minimax rate is derived using the Fano-Tsybakov hypothesis testing method. Our findings reveal that provided the inverse problem in the large sample limit satisfies a coercivity condition, the left tail probability does not alter the bias-variance tradeoff, and the optimal minimax rate remains intact. Our tLSE method offers a straightforward approach for establishing the optimal minimax rate for models with either local or nonlocal dependency.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2311.16852

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Ben-Michael, Eli, Imai, Kosuke, Jiang, Zhichao

Policy Learning with Asymmetric Counterfactual Utilities

arXiv.org Machine LearningNov-28-2023

Data-driven decision making plays an important role even in high stakes settings like medicine and public policy. Learning optimal policies from observed data requires a careful formulation of the utility function whose expected value is maximized across a population. Although researchers typically use utilities that depend on observed outcomes alone, in many settings the decision maker's utility function is more properly characterized by the joint set of potential outcomes under all actions. For example, the Hippocratic principle to "do no harm" implies that the cost of causing death to a patient who would otherwise survive without treatment is greater than the cost of forgoing life-saving treatment. We consider optimal policy learning with asymmetric counterfactual utility functions of this form that consider the joint set of potential outcomes. We show that asymmetric counterfactual utilities lead to an unidentifiable expected utility function, and so we first partially identify it. Drawing on statistical decision theory, we then derive minimax decision rules by minimizing the maximum expected utility loss relative to different alternative policies. We show that one can learn minimax loss decision rules from observed data by solving intermediate classification problems, and establish that the finite sample excess expected utility loss of this procedure is bounded by the regret of these intermediate classifiers. We apply this conceptual framework and methodology to the decision about whether or not to use right heart catheterization for patients with possible pulmonary hypertension.

artificial intelligence, decision support system, machine learning, (16 more...)

arXiv.org Machine Learning

2206.10479

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > New Finding (0.45)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
(3 more...)

Rizvee, Redwan Ahmed, Hassan, Raheeb, Khan, Md. Mosaddek

A Graph Neural Network-Based QUBO-Formulated Hamiltonian-Inspired Loss Function for Combinatorial Optimization using Reinforcement Learning

Quadratic Unconstrained Binary Optimization (QUBO) is a generic technique to model various NP-hard Combinatorial Optimization problems (CO) in the form of binary variables. Ising Hamiltonian is used to model the energy function of a system. QUBO to Ising Hamiltonian is regarded as a technique to solve various canonical optimization problems through quantum optimization algorithms. Recently, PI-GNN, a generic framework, has been proposed to address CO problems over graphs based on Graph Neural Network (GNN) architecture. They introduced a generic QUBO-formulated Hamiltonian-inspired loss function that was directly optimized using GNN. PI-GNN is highly scalable but there lies a noticeable decrease in the number of satisfied constraints when compared to problem-specific algorithms and becomes more pronounced with increased graph densities. Here, We identify a behavioral pattern related to it and devise strategies to improve its performance. Another group of literature uses Reinforcement learning (RL) to solve the aforementioned NP-hard problems using problem-specific reward functions. In this work, we also focus on creating a bridge between the RL-based solutions and the QUBO-formulated Hamiltonian. We formulate and empirically evaluate the compatibility of the QUBO-formulated Hamiltonian as the generic reward function in the RL-based paradigm in the form of rewards. Furthermore, we also introduce a novel Monty Carlo Tree Search-based strategy with GNN where we apply a guided search through manual perturbation of node labels during training. We empirically evaluated our methods and observed up to 44% improvement in the number of constraint violations compared to the PI-GNN.

graph, node, pi-gnn, (12 more...)

2311.16277

Country: Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Planning for the Efficient Updating of Mutual Fund Portfolios

de la Rosa, Tomás

Once there is a decision of rebalancing or updating a portfolio of funds, the process of changing the current portfolio to the target one, involves a set of transactions that are susceptible of being optimized. This is particularly relevant when managers have to handle the implications of different types of instruments. In this work we present linear programming and heuristic search approaches that produce plans for executing the update. The evaluation of our proposals shows cost improvements over the compared based strategy. The models can be easily extended to other realistic scenarios in which a holistic portfolio management is required

allocation, portfolio, transaction, (17 more...)

2311.16204

Country:

North America > United States > California > Los Angeles County > Santa Monica (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report (0.40)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.88)

A systematic study comparing hyperparameter optimization engines on tabular data

Kegl, Balazs

We run an independent comparison of all hyperparameter optimization (hyperopt) engines available in the Ray Tune library. We introduce two ways to normalize and aggregate statistics across data sets and models, one rank-based, and another one sandwiching the score between the random search score and the full grid search score. This affords us i) to rank the hyperopt engines, ii) to make generalized and statistically significant statements on how much they improve over random search, and iii) to make recommendations on which engine should be used to hyperopt a given learning algorithm. We find that most engines beat random search, but that only three of them (HEBO, AX, and BlendSearch) clearly stand out. We also found that some engines seem to specialize in hyperopting certain learning algorithms, which makes it tricky to use hyperopt in comparison studies, since the choice of the hyperopt technique may favor some of the models in the comparison.

engine, hyperparameter optimization engine, random search, (12 more...)

2311.15854

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Liu, Quanquan C., Srinivas, Vaidehi

The Predicted-Updates Dynamic Model: Offline, Incremental, and Decremental to Fully Dynamic Transformations

We formulate the predicted-updates dynamic model, one of the first beyond-worst-case models for dynamic algorithms, which generalizes a large set of well-studied dynamic models including the offline dynamic, incremental, and decremental models to the fully dynamic setting when given predictions about the update times of the elements. In the most basic form of our model, we receive a set of predicted update times for all of the updates that occur over the event horizon. We give a novel framework that "lifts" offline divide-and-conquer algorithms into the fully dynamic setting with little overhead. Using this, we are able to interpolate between the offline and fully dynamic settings; when the $\ell_1$ error of the prediction is linear in the number of updates, we achieve the offline runtime of the algorithm (up to $\mathrm{poly} \log n$ factors). Provided a fully dynamic backstop algorithm, our algorithm will never do worse than the backstop algorithm regardless of the prediction error. Furthermore, our framework achieves a smooth linear trade-off between $\ell_1$ error in the predictions and runtime. These correspond to the desiderata of consistency, robustness, and graceful degradation of the algorithms-with-predictions literature. We further extend our techniques to incremental and decremental settings, transforming algorithms in these settings when given predictions of only the deletion and insertion times, respectively. Our framework is general, and we apply it to obtain improved efficiency bounds over the state-of-the-art dynamic algorithms for a variety of problems including triconnectivity, planar digraph all pairs shortest paths, $k$-edge connectivity, and others, for prediction error of reasonable magnitude.

algorithm, dynamic algorithm, prediction, (16 more...)

2307.0889

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Virginia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(10 more...)

Genre: Research Report (0.64)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.45)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.35)

ACHO: Adaptive Conformal Hyperparameter Optimization

Doyle, Riccardo

Identifying optimal model parameters is deeply desirable for high prediction performance in machine learning, but challenging due to non-convexity and expensive search costs. Common approaches involving grid search - exhaustive iterative search of a confined parameter interval - or random search [1] - random sampling from a broader parameter space - display complimentary weaknesses and form no expectation of hyperparameter performance ahead of search. Focus has instead centered on search frameworks capable of forming hyperperameter performance expectations prior to sampling, generally dominated by Sequential Model-Based Optimization (SMBO) [2]. Early applications [2, 3] resulted in positive outperformance on expert consensus across a range of benchmarked datasets, leveraging Gaussian Process or Tree-structured Parzen estimators. Further expansions of the framework included search cost inclusion as an optimization criterion [4], early forms of online resource allocation and distributed search [5], unwanted parameter space pruning [6], or replacement of single estimators with ensemble methods [7].

dataset, estimator, val, (12 more...)

2207.03017

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.51)

Xie, Zikai, Evangelopoulos, Xenophon, Thacker, Joseph, Cooper, Andrew

Domain Knowledge Injection in Bayesian Search for New Materials

arXiv.org Artificial IntelligenceNov-25-2023

In this paper we propose DKIBO, a Bayesian optimization (BO) algorithm that accommodates domain knowledge to tune exploration in the search space. Bayesian optimization has recently emerged as a sample-efficient optimizer for many intractable scientific problems. While various existing BO frameworks allow the input of prior beliefs to accelerate the search by narrowing down the space, incorporating such knowledge is not always straightforward and can often introduce bias and lead to poor performance. Here we propose a simple approach to incorporate structural knowledge in the acquisition function by utilizing an additional deterministic surrogate model to enrich the approximation power of the Gaussian process. This is suitably chosen according to structural information of the problem at hand and acts a corrective term towards a better-informed sampling. We empirically demonstrate the practical utility of the proposed method by successfully injecting domain knowledge in a materials design task. We further validate our method's performance on different experimental settings and ablation analyses.

acquisition function, knowledge, optimization, (16 more...)

doi: 10.3233/FAIA230587

2311.15162

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)