AITopics

2605.30327

Country: Europe > Austria (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

arXiv.org Machine LearningMar-17-2026

Bayesian Inference for Missing Physics

Strouwen, Arno

Model-based approaches for (bio)process systems often suffer from incomplete knowledge of the underlying physical, chemical, or biological laws. Universal differential equations, which embed neural networks within differential equations, have emerged as powerful tools to learn this missing physics from experimental data. However, neural networks are inherently opaque, motivating their post-processing via symbolic regression to obtain interpretable mathematical expressions. Genetic algorithm-based symbolic regression is a popular approach for this post-processing step, but provides only point estimates and cannot quantify the confidence we should place in a discovered equation. We address this limitation by applying Bayesian symbolic regression, which uses Reversible Jump Markov Chain Monte Carlo to sample from the posterior distribution over symbolic expression trees. This approach naturally quantifies uncertainty in the recovered model structure. We demonstrate the methodology on a Lotka-Volterra predator-prey system and then show how a well-designed experiment leads to lower uncertainty in a fed-batch bioreactor case study.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2603.14918

Country: Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceMar-3-2025

Provably optimal decision trees with arbitrary splitting rules in polynomial time

He, Xi, Little, Max A.

In this paper, we introduce a generic data structure called decision trees, which integrates several well-known data structures, including binary search trees, K-D trees, binary space partition trees, and decision tree models from machine learning. We provide the first axiomatic definition of decision trees. These axioms establish a firm mathematical foundation for studying decision tree problems. We refer to decision trees that satisfy the axioms as proper decision trees. We prove that only proper decision trees can be uniquely characterized as K-permutations. Since permutations are among the most well-studied combinatorial structures, this characterization provides a fundamental basis for analyzing the combinatorial and algorithmic properties of decision trees. As a result of this advancement, we develop the first provably correct polynomial-time algorithm for solving the optimal decision tree problem. Our algorithm is derived using a formal program derivation framework, which enables step-by-step equational reasoning to construct side-effect-free programs with guaranteed correctness. The derived algorithm is correct by construction and is applicable to decision tree problems defined by any splitting rules that adhere to the axioms and any objective functions that can be specified in a given form. Examples include the decision tree problems where splitting rules are defined by axis-parallel hyperplanes, arbitrary hyperplanes, and hypersurfaces. By extending the axioms, we can potentially address a broader range of problems. Moreover, the derived algorithm can easily accommodate various constraints, such as tree depth and leaf size, and is amenable to acceleration techniques such as thinning method.

algorithm, decision tree, splitting rule, (16 more...)

2503.01455

Country:

Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Consolo, Antonio, Amaldi, Edoardo, Manno, Andrea

Soft regression trees: a model variant and a decomposition training algorithm

arXiv.org Artificial IntelligenceJan-10-2025

Decision trees are widely used for classification and regression tasks in a variety of application fields due to their interpretability and good accuracy. During the past decade, growing attention has been devoted to globally optimized decision trees with deterministic or soft splitting rules at branch nodes, which are trained by optimizing the error function over all the tree parameters. In this work, we propose a new variant of soft multivariate regression trees (SRTs) where, for every input vector, the prediction is defined as the linear regression associated to a single leaf node, namely, the leaf node obtained by routing the input vector from the root along the branches with higher probability. SRTs exhibit the conditional computational property, i.e., each prediction depends on a small number of nodes (parameters), and our nonlinear optimization formulation for training them is amenable to decomposition. After showing a universal approximation result for SRTs, we present a decomposition training algorithm including a clustering-based initialization procedure and a heuristic for reassigning the input vectors along the tree. Under mild assumptions, we establish asymptotic convergence guarantees. Experiments on 15 wellknown datasets indicate that our SRTs and decomposition algorithm yield higher accuracy and robustness compared with traditional soft regression trees trained using the nonlinear optimization formulation of Blanquero et al., and a significant reduction in training times as well as a slightly better average accuracy compared with the mixed-integer optimization approach of Bertsimas and Dunn. We also report a comparison with the Random Forest ensemble method.

artificial intelligence, decision tree learning, machine learning, (18 more...)

2501.05942

Country:

Europe (0.45)
North America > United States > California (0.27)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Mao, Qiangqiang, Cao, Yankai

Can a Single Tree Outperform an Entire Forest?

arXiv.org Machine LearningNov-25-2024

The prevailing mindset is that a single decision tree underperforms classic random forests in testing accuracy, despite its advantages in interpretability and lightweight structure. This study challenges such a mindset by significantly improving the testing accuracy of an oblique regression tree through our gradient-based entire tree optimization framework, making its performance comparable to the classic random forest. Our approach reformulates tree training as a differentiable unconstrained optimization task, employing a scaled sigmoid approximation strategy. To ameliorate numerical instability, we propose an algorithmic scheme that solves a sequence of increasingly accurate approximations. Additionally, a subtree polish strategy is implemented to reduce approximation errors accumulated across the tree. Extensive experiments on 16 datasets demonstrate that our optimized tree outperforms the classic random forest by an average of $2.03\%$ improvements in testing accuracy.

accuracy, artificial intelligence, machine learning, (18 more...)

2411.17003

Country: North America > Canada > British Columbia (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceJan-14-2024

BET: Explaining Deep Reinforcement Learning through The Error-Prone Decisions

Liu, Xiao, Zhao, Jie, Chen, Wubing, Tan, Mao, Su, Yongxing

Despite the impressive capabilities of Deep Reinforcement Learning (DRL) agents in many challenging scenarios, their black-box decision-making process significantly limits their deployment in safety-sensitive domains. Several previous self-interpretable works focus on revealing the critical states of the agent's decision. However, they cannot pinpoint the error-prone states. To address this issue, we propose a novel self-interpretable structure, named Backbone Extract Tree (BET), to better explain the agent's behavior by identify the error-prone states. At a high level, BET hypothesizes that states in which the agent consistently executes uniform decisions exhibit a reduced propensity for errors. To effectively model this phenomenon, BET expresses these states within neighborhoods, each defined by a curated set of representative states. Therefore, states positioned at a greater distance from these representative benchmarks are more prone to error. We evaluate BET in various popular RL environments and show its superiority over existing self-interpretable models in terms of explanation fidelity. Furthermore, we demonstrate a use case for providing explanations for the agents in StarCraft II, a sophisticated multi-agent cooperative game. To the best of our knowledge, we are the first to explain such a complex scenarios using a fully transparent structure.

agent, bone, self-interpretable model, (15 more...)

2401.07263

Country:

Asia > Indonesia (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Aldinucci, Tommaso, Lapucci, Matteo

Loss-Optimal Classification Trees: A Generalized Framework and the Logistic Case

arXiv.org Machine LearningDec-7-2023

The Classification Tree (CT) is one of the most common models in interpretable machine learning. Although such models are usually built with greedy strategies, in recent years, thanks to remarkable advances in Mixer-Integer Programming (MIP) solvers, several exact formulations of the learning problem have been developed. In this paper, we argue that some of the most relevant ones among these training models can be encapsulated within a general framework, whose instances are shaped by the specification of loss functions and regularizers. Next, we introduce a novel realization of this framework: specifically, we consider the logistic loss, handled in the MIP setting by a linear piece-wise approximation, and couple it with $\ell_1$-regularization terms. The resulting Optimal Logistic Tree model numerically proves to be able to induce trees with enhanced interpretability features and competitive generalization capabilities, compared to the state-of-the-art MIP-based approaches.

artificial intelligence, machine learning, node, (16 more...)

2306.00857

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts (0.04)
(4 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

D'Onofrio, Federico, Grani, Giorgio, Monaci, Marta, Palagi, Laura

Margin Optimal Classification Trees

arXiv.org Artificial IntelligenceOct-8-2023

In recent years, there has been growing attention to interpretable machine learning models which can give explanatory insights on their behaviour. Thanks to their interpretability, decision trees have been intensively studied for classification tasks and, due to the remarkable advances in mixed integer programming (MIP), various approaches have been proposed to formulate the problem of training an Optimal Classification Tree (OCT) as a MIP model. We present a novel mixed integer quadratic formulation for the OCT problem, which exploits the generalization capabilities of Support Vector Machines for binary classification. Our model, denoted as Margin Optimal Classification Tree (MARGOT), encompasses maximum margin multivariate hyperplanes nested in a binary tree structure. To enhance the interpretability of our approach, we analyse two alternative versions of MARGOT, which include feature selection constraints inducing sparsity of the hyperplanes' coefficients. First, MARGOT has been tested on non-linearly separable synthetic datasets in a 2-dimensional feature space to provide a graphical representation of the maximum margin approach. Finally, the proposed models have been tested on benchmark datasets from the UCI repository. The MARGOT formulation turns out to be easier to solve than other OCT approaches, and the generated tree better generalizes on new observations. The two interpretable versions effectively select the most relevant features, maintaining good prediction quality.

dataset, hyperparameter, node, (15 more...)

doi: 10.1016/j.cor.2023.106441

2210.10567

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Italy > Lazio > Rome (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.88)

arXiv.org Artificial IntelligenceJul-25-2023

Feature Importance Measurement based on Decision Tree Sampling

Huang, Chao, Das, Diptesh, Tsuda, Koji

Most of the existing methods are ad hoc, and do not have explicit Random forest is effective for prediction tasks control over the size and accuracy of a DT. For e.g., there are but the randomness of tree generation hinders interpretability greedy splitting-based (Quinlan, 1986; 2014; Breiman et al., in feature importance analysis. To 1984), Bayesian-based (Denison et al., 1998; Chipman et al., address this, we proposed DT-Sampler, a SATbased 1998; 2002; 2010; Letham et al., 2015), branch-and-bound method for measuring feature importance methods (Angelino et al., 2017) for DT construction.

artificial intelligence, decision tree learning, machine learning, (13 more...)

2307.13333

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)

arXiv.org Artificial IntelligenceNov-17-2021

The Faulty GPS Problem: Shortest Time Paths in Networks with Unreliable Directions

Alpern, Steve

This paper optimizes motion planning when there is a known risk that the road choice suggested by a Satnav (GPS) is not on a shortest path. At every branch node of a network Q, a Satnav (GPS) points to the arc leading to the destination, or home node, H - but only with a high known probability p. Always trusting the Satnav's suggestion may lead to an infinite cycle. If one wishes to reach H in least expected time, with what probability q=q(Q,p) should one trust the pointer (if not, one chooses randomly among the other arcs)? We call this the Faulty Satnav (GPS) Problem. We also consider versions where the trust probability q can depend on the degree of the current node and a `treasure hunt' where two searchers try to reach H first. The agent searching for H need not be a car, that is just a familiar example -- it could equally be a UAV receiving unreliable GPS information. This problem has its origin not in driver frustration but in the work of Fonio et al (2017) on ant navigation, where the pointers correspond to pheromone markers pointing to the nest. Neither the driver or ant will know the exact process by which a choice (arc) is suggested, which puts the problem into the domain of how much to trust an option suggested by AI.

arc, node, probability, (15 more...)

2111.09093

Genre: Research Report (0.64)

Technology:

Information Technology > Game Theory (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)
Information Technology > Artificial Intelligence > Robots (0.34)