AITopics | impurity

impurity

Empowering Decision Trees via Shape Function Branching

Neural Information Processing Systems, Jun-21-2026, 22:32:59 GMT

Decision trees are prized for their interpretability and strong performance on tabular data. Yet, their reliance on simple axis-aligned linear splits often forces deep, complex structures to capture non-linear feature effects, undermining human comprehension of the constructed tree. To address this limitation, we propose a novel generalization of a decision tree, the Shape Generalized Tree (SGT), in which each internal node applies a learnable axis-aligned shape function to a single feature, enabling rich, non-linear partitioning in one split. As users can easily visualize each node's shape function, SGTs are inherently interpretable and provide intuitive, visual explanations of the model's decision mechanisms. To learn SGTs from data, we propose ShapeCART, an efficient induction algorithm for SGTs. We further extend the SGT framework to bivariate shape functions (S$^2$GT) and multi-way trees (SGT$_K$), and present Shape$^2$CART and ShapeCART$_K$, extensions to ShapeCART for learning S$^2$GTs and SGT$_K$s, respectively. Experiments on various datasets show that SGTs achieve superior performance with reduced model size compared to traditional axis-aligned linear trees.
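
The idea of a shape-function split can be illustrated with a small sketch. This is not ShapeCART: the piecewise-constant shape function, its bin edges and values, the rule "go left when f(x) <= 0", and the toy data are all assumptions made for illustration. The sketch only compares the weighted Gini impurity of the best single threshold split with that of one hand-picked shape-function split on a feature whose effect is non-monotone.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_impurity(xs, ys, go_left):
    """Size-weighted Gini impurity of the two children produced by go_left."""
    left = [y for x, y in zip(xs, ys) if go_left(x)]
    right = [y for x, y in zip(xs, ys) if not go_left(x)]
    n = len(ys)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def make_shape_split(bin_edges, bin_values):
    """Hypothetical piecewise-constant shape function f; route x left when f(x) <= 0."""
    def go_left(x):
        # Index of the bin that x falls into (bin_edges must be sorted).
        idx = sum(x >= edge for edge in bin_edges)
        return bin_values[idx] <= 0.0
    return go_left

# Toy data (assumed): class 1 only occurs in the middle of the feature range.
xs = [0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9]
ys = [0, 0, 1, 1, 1, 0, 0]

# Best axis-aligned threshold split of the form x <= t.
best_threshold = min(split_impurity(xs, ys, lambda x, t=t: x <= t) for t in xs)

# One shape-function split: negative outside [0.3, 0.7), positive inside.
shape_split = make_shape_split([0.3, 0.7], [-1.0, 1.0, -1.0])
shape_impurity = split_impurity(xs, ys, shape_split)

print(best_threshold, shape_impurity)
```

On this toy data no single threshold separates the classes, while the assumed shape function isolates the middle interval in one split; a threshold tree would need at least two levels to do the same.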

artificial intelligence, decision tree learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre: Research Report > Experimental Study (0.45)

Industry: Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.45)

A Debiased MDI Feature Importance Measure for Random Forests

Neural Information Processing Systems, Feb-12-2026, 13:47:45 GMT

In particular, interpreting Random Forests (RFs) [2] and their variants [14, 28, 27, 29, 1, 12] has become an important area of research due to the wide-ranging applications of RFs in various scientific areas, such as genome-wide association studies (GWAS) [7], gene expression microarray [13, 23], and gene regulatory networks [9].

artificial intelligence, machine learning, mdi-oob, (16 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.35)

702cafa3bb4c9c86e4a3b6834b45aedd-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-12-2026, 13:47:31 GMT

mdi-oob, oob sample, reviewer 2, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Modeling & Simulation (0.36)
Information Technology > Artificial Intelligence > Machine Learning (0.33)

Principled Federated Random Forests for Heterogeneous Data

Khellaf, Rémi, Scornet, Erwan, Bellet, Aurélien, Josse, Julie

arXiv.org Machine Learning, Feb-4-2026

Random Forests (RF) are among the most powerful and widely used predictive models for centralized tabular data, yet few methods exist to adapt them to the federated learning setting. Unlike most federated learning approaches, the piecewise-constant nature of RF prevents exact gradient-based optimization. As a result, existing federated RF implementations rely on unprincipled heuristics: for instance, aggregating decision trees trained independently on clients fails to optimize the global impurity criterion, even under simple distribution shifts. We propose FedForest, a new federated RF algorithm for horizontally partitioned data that naturally accommodates diverse forms of client data heterogeneity, from covariate shift to more complex outcome shift mechanisms. We prove that our splitting procedure, based on aggregating carefully chosen client statistics, closely approximates the split selected by a centralized algorithm. Moreover, FedForest allows splits on client indicators, enabling a non-parametric form of personalization that is absent from prior federated random forest methods. Empirically, we demonstrate that the resulting federated forests closely match centralized performance across heterogeneous benchmarks while remaining communication-efficient.
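
Why aggregating client statistics can recover a centralized split score, while aggregating independently trained trees cannot, is easiest to see in the simplest case. The sketch below is not FedForest: it assumes a fixed candidate threshold, Gini impurity as the criterion, and per-client class counts as the shared statistics, and all function names and data are hypothetical. Under those assumptions the server obtains exactly the impurity a centralized learner would compute, without seeing raw data.

```python
from collections import Counter

def gini_from_counts(counts):
    """Gini impurity computed from a mapping of class label to count."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def client_statistics(xs, ys, threshold):
    """Computed locally on a client: class counts on each side of the threshold."""
    left = Counter(y for x, y in zip(xs, ys) if x <= threshold)
    right = Counter(y for x, y in zip(xs, ys) if x > threshold)
    return left, right

def aggregated_split_impurity(stats):
    """Computed on the server: sum the client counts, then score the split."""
    left, right = Counter(), Counter()
    for client_left, client_right in stats:
        left.update(client_left)
        right.update(client_right)
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    return (n_left / n) * gini_from_counts(left) + (n_right / n) * gini_from_counts(right)

# Two clients with shifted feature distributions (toy data, assumed).
clients = [
    ([0.1, 0.2, 0.3], [0, 0, 1]),
    ([0.6, 0.7, 0.9], [1, 1, 0]),
]
threshold = 0.25

federated = aggregated_split_impurity(
    [client_statistics(xs, ys, threshold) for xs, ys in clients]
)

# Reference: the same score computed on the pooled data.
pooled_xs = [x for xs, _ in clients for x in xs]
pooled_ys = [y for _, ys in clients for y in ys]
centralized = aggregated_split_impurity(
    [client_statistics(pooled_xs, pooled_ys, threshold)]
)

print(federated, centralized)
```

The equality holds here because class counts are sufficient for the Gini score of a fixed threshold. Choosing candidate thresholds and handling outcome shift are the harder parts, and the sketch does not address them.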

artificial intelligence, machine learning, principled federated random forest, (16 more...)

arXiv.org Machine Learning

2602.03258

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Monterey County > Monterey (0.04)
Europe > Switzerland (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Empowering Decision Trees via Shape Function Branching

Upadhya, Nakul, Cohen, Eldan

arXiv.org Artificial Intelligence, Oct-23-2025

Decision trees are prized for their interpretability and strong performance on tabular data. Yet, their reliance on simple axis-aligned linear splits often forces deep, complex structures to capture non-linear feature effects, undermining human comprehension of the constructed tree. To address this limitation, we propose a novel generalization of a decision tree, the Shape Generalized Tree (SGT), in which each internal node applies a learnable axis-aligned shape function to a single feature, enabling rich, non-linear partitioning in one split. As users can easily visualize each node's shape function, SGTs are inherently interpretable and provide intuitive, visual explanations of the model's decision mechanisms. To learn SGTs from data, we propose ShapeCART, an efficient induction algorithm for SGTs. We further extend the SGT framework to bivariate shape functions (S$^2$GT) and multi-way trees (SGT$_K$), and present Shape$^2$CART and ShapeCART$_K$, extensions to ShapeCART for learning S$^2$GTs and SGT$_K$s, respectively. Experiments on various datasets show that SGTs achieve superior performance with reduced model size compared to traditional axis-aligned linear trees.

artificial intelligence, machine learning, sgt 3, (16 more...)

arXiv.org Artificial Intelligence

2510.1904

Country: North America > United States (0.67)

Genre:

Research Report > Experimental Study (0.45)
Research Report > New Finding (0.45)

Industry: Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.45)

ACT: Agentic Classification Tree

Grari, Vincent, Arni, Tim, Laugel, Thibault, Lamprier, Sylvain, Zou, James, Detyniecki, Marcin

arXiv.org Artificial Intelligence, Oct-23-2025

When used in high-stakes settings, AI systems are expected to produce decisions that are transparent, interpretable, and auditable, a requirement increasingly expected by regulations. Decision trees such as CART provide clear and verifiable rules, but they are restricted to structured tabular data and cannot operate directly on unstructured inputs such as text. In practice, large language models (LLMs) are widely used for such data, yet prompting strategies such as chain-of-thought or prompt optimization still rely on free-form reasoning, limiting their ability to ensure trustworthy behaviors. We present the Agentic Classification Tree (ACT), which extends decision-tree methodology to unstructured inputs by formulating each split as a natural-language question, refined through impurity-based evaluation and LLM feedback via TextGrad. Experiments on text benchmarks show that ACT matches or surpasses prompting-based baselines while producing transparent and interpretable decision paths.
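
Scoring a natural-language question as a split can be sketched as follows. This is not the ACT implementation: the LLM's yes/no answers are replaced by hypothetical keyword and length checks, the TextGrad refinement step is omitted, and the questions, texts, and labels are invented for illustration. The sketch only shows how candidate questions could be ranked by the weighted Gini impurity of the partition they induce.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def question_impurity(texts, labels, answer):
    """Weighted Gini impurity of the yes/no partition induced by a question."""
    yes = [y for t, y in zip(texts, labels) if answer(t)]
    no = [y for t, y in zip(texts, labels) if not answer(t)]
    n = len(labels)
    return len(yes) / n * gini(yes) + len(no) / n * gini(no)

# Hypothetical stand-ins for an LLM answering each question about a text.
candidate_questions = {
    "Does the text mention a refund?": lambda t: "refund" in t.lower(),
    "Is the text longer than five words?": lambda t: len(t.split()) > 5,
}

# Toy labeled texts (assumed).
texts = [
    "I want a refund for this order",
    "Refund my payment now",
    "Thanks, the delivery was quick and easy",
    "Great service",
]
labels = ["complaint", "complaint", "praise", "praise"]

scores = {
    question: question_impurity(texts, labels, answer)
    for question, answer in candidate_questions.items()
}
best_question = min(scores, key=scores.get)

print(best_question, scores[best_question])
```

In a real system each `answer` call would query a language model, so the impurity score doubles as a measurable signal for deciding which question wording to keep.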

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2509.26433

Country: Europe (1.00)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Consumer Health (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.91)

702cafa3bb4c9c86e4a3b6834b45aedd-Paper.pdf

Neural Information Processing Systems, Oct-2-2025, 23:38:27 GMT

artificial intelligence, decision tree learning, machine learning, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.30)

comments, we organize our responses as follows

Neural Information Processing Systems, Oct-2-2025, 23:38:13 GMT

We thank the reviewers for their valuable feedback that will significantly improve our paper. This is indeed a limitation of Theorem 1. The CHIP data included in our simulation studies shows that MDI-oob works in this setting. We plan to add this plot in our supplementary material. Reviewers 2 and 3: Give theoretical/empirical evidence that MDI-oob can "debias" MDI. Empirically, we compute the MDI-oob for the first simulation.

artificial intelligence, machine learning, oob sample, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Modeling & Simulation (0.36)
Information Technology > Artificial Intelligence > Machine Learning (0.33)

Human-AI Synergy in Adaptive Active Learning for Continuous Lithium Carbonate Crystallization Optimization

Masouleh, Shayan S. Mousavi, Sanz, Corey A., Jansonius, Ryan P., Cronin, Cara, Hein, Jason E., Hattrick-Simpers, Jason

arXiv.org Artificial Intelligence, Jul-28-2025

As demand for high-purity lithium surges with the growth of the electric vehicle (EV) industry, cost-effective extraction from lower-grade North American sources like the Smackover Formation is critical. These resources, unlike high-purity South American brines, require innovative purification techniques to be economically viable. Continuous crystallization is a promising method for producing battery-grade lithium carbonate, but its optimization is challenged by a complex parameter space and limited data. This study introduces a Human-in-the-Loop (HITL) assisted active learning framework to optimize the continuous crystallization of lithium carbonate. By integrating human expertise with data-driven insights, our approach accelerates the optimization of lithium extraction from challenging sources. Our results demonstrate the framework's ability to rapidly adapt to new data, significantly improving the process's tolerance to critical impurities like magnesium from the industry standard of a few hundred ppm to as high as 6000 ppm. This breakthrough makes the exploitation of low-grade, impurity-rich lithium resources feasible, potentially reducing the need for extensive pre-refinement processes. By leveraging artificial intelligence, we have refined operational parameters and demonstrated that lower-grade materials can be used without sacrificing product quality. This advancement is a significant step towards economically harnessing North America's vast lithium reserves, such as those in the Smackover Formation, and enhancing the sustainability of the global lithium supply chain.

artificial intelligence, battery, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.19316

Country: