AITopics

2306.04646

Country:

Asia > Kazakhstan (0.62)
North America > United States > Maryland > Prince George's County > College Park (0.06)
South America > Argentina (0.04)
Africa > Sub-Saharan Africa (0.04)

Genre: Research Report (0.40)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.36)

Guillame-Bert, Mathieu, Bruch, Sebastian, Stotz, Richard, Pfeifer, Jan

Yggdrasil Decision Forests: A Fast and Extensible Decision Forests Library

Yggdrasil Decision Forests is a library for the training, serving and interpretation of decision forest models, targeted both at research and production work, implemented in C++, and available in C++, command line interface, Python (under the name TensorFlow Decision Forests), JavaScript, Go, and Google Sheets (under the name Simple ML for Sheets). The library has been developed organically since 2018 following a set of four design principles applicable to machine learning libraries and frameworks: simplicity of use, safety of use, modularity and high-level abstraction, and integration with other machine learning libraries. In this paper, we describe those principles in detail and present how they have been used to guide the design of the library. We then showcase the use of our library on a set of classical machine learning problems. Finally, we report a benchmark comparing our library to related solutions.

dataset, learner, library, (13 more...)

doi: 10.1145/3580305.3599933

2212.02934

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.05)

Genre: Research Report (0.50)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.71)

Alvarez, Jose M., Scott, Kristen M., Ruggieri, Salvatore, Berendt, Bettina

Domain Adaptive Decision Trees: Implications for Accuracy and Fairness

In uses of pre-trained machine learning models, it is a known issue that the target population in which the model is being deployed may not have been reflected in the source population with which the model was trained. This can result in a biased model when deployed, leading to a reduction in model performance. One risk is that, as the population changes, certain demographic groups will be under-served or otherwise disadvantaged by the model, even as they become more represented in the target population. The field of domain adaptation proposes techniques for a situation where label data for the target population does not exist, but some information about the target distribution does exist. In this paper we contribute to the domain adaptation literature by introducing domain-adaptive decision trees (DADT). We focus on decision trees given their growing popularity due to their interpretability and performance relative to other more complex models. With DADT we aim to improve the accuracy of models trained in a source domain (or training data) that differs from the target domain (or test data). We propose an in-processing step that adjusts the information gain split criterion with outside information corresponding to the distribution of the target population. We demonstrate DADT on real data and find that it improves accuracy over a standard decision tree when testing in a shifted target population. We also study the change in fairness under demographic parity and equal opportunity. Results show an improvement in fairness with the use of DADT.
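The in-processing step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the adjustment amounts to replacing the branch probabilities P(X = v), normally estimated from the source data, with probabilities known for the target population. The function names and the `target_probs` argument are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, target_probs=None):
    """Information gain of splitting `labels` on a categorical attribute.

    values: attribute value of each source example.
    labels: class label of each source example.
    target_probs: optional dict mapping attribute value -> P_target(X = v).
        Hypothetical argument: when given, it replaces the branch weights
        estimated from the source data (an assumed form of the adjustment).
    """
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)

    n = len(labels)
    conditional = 0.0
    for v, ys in groups.items():
        if target_probs is None:
            weight = len(ys) / n  # standard, source-estimated weight
        else:
            weight = target_probs.get(v, 0.0)  # outside target information
        conditional += weight * entropy(ys)
    return entropy(labels) - conditional
```

With `values = ["a", "a", "b", "b"]` and `labels = [0, 1, 1, 1]`, branch "a" is impure and branch "b" is pure, so shifting target mass toward "b" (for example `{"a": 0.1, "b": 0.9}`) raises the gain this sketch assigns to the split. Because the label entropy is still estimated on the source data, the adjusted quantity is a heuristic score rather than a true mutual information.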

artificial intelligence, knowledge, machine learning, (18 more...)

doi: 10.1145/3593013.3594008

2302.13846

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Nazir, Usman, Uppal, Momin, Tahir, Muhammad, Khalid, Zubair

Feature Selection on Sentinel-2 Multi-spectral Imagery for Efficient Tree Cover Estimation

This paper proposes a multi-spectral random forest classifier with suitable feature selection and masking for tree cover estimation in urban areas. The key feature of the proposed classifier is that it filters out the built-up region using spectral indices and then applies random forest classification to the remaining mask with carefully selected features. Using Sentinel-2 satellite imagery, we evaluate the performance of the proposed technique on a specified area (approximately 82 acres) of Lahore University of Management Sciences (LUMS) and demonstrate that our method outperforms a conventional random forest classifier as well as state-of-the-art methods such as the European Space Agency (ESA) WorldCover 10m 2020 product and a DeepLabv3 deep learning architecture.
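The mask-then-classify pipeline described in this abstract might look roughly like the sketch below. The abstract does not name the spectral indices it uses, so this sketch assumes the Normalized Difference Built-up Index (NDBI) for masking and NDVI as an example feature. The threshold of 0.0 and all function names are hypothetical. For Sentinel-2, red, near-infrared and short-wave infrared are commonly taken from bands B4, B8 and B11.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def non_built_up_mask(nir, swir, ndbi_threshold=0.0):
    """Flag pixels to keep for classification (True = not built-up).

    nir, swir: per-pixel reflectances as flat lists of equal length.
    NDBI = (SWIR - NIR) / (SWIR + NIR); pixels above the (assumed)
    threshold are treated as built-up and dropped.
    """
    return [normalized_difference(s, n) <= ndbi_threshold
            for n, s in zip(nir, swir)]


def features_for_classifier(red, nir, swir, ndbi_threshold=0.0):
    """Return (pixel_index, [NDVI, NDBI]) for each pixel that survives the mask."""
    keep = non_built_up_mask(nir, swir, ndbi_threshold)
    rows = []
    for i, kept in enumerate(keep):
        if kept:
            ndvi = normalized_difference(nir[i], red[i])
            ndbi = normalized_difference(swir[i], nir[i])
            rows.append((i, [ndvi, ndbi]))
    return rows
```

The surviving rows would then be passed to a random forest classifier; the masked pixels are never classified, which is where the efficiency gain in this sketch comes from.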

artificial intelligence, imagery, machine learning, (17 more...)

2306.06073

Country:

Asia > Pakistan > Punjab > Lahore Division > Lahore (0.26)
Europe > Finland (0.05)
Africa > Ethiopia (0.05)

Genre: Research Report > Promising Solution (0.34)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.39)
Government > Space Agency (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.78)

Waller, Madeleine, Rodrigues, Odinaldo, Cocarascu, Oana

Bias Mitigation Methods for Binary Classification Decision-Making Systems: Survey and Recommendations

Bias mitigation methods for binary classification decision-making systems have been widely researched due to the ever-growing importance of designing fair machine learning processes that are impartial and do not discriminate against individuals or groups based on protected personal characteristics. In this paper, we present a structured overview of the research landscape for bias mitigation methods, report on their benefits and limitations, and provide recommendations for the development of future bias mitigation methods for binary classification.

artificial intelligence, classification, machine learning, (15 more...)

2305.2002

Country:

Oceania > Australia > Western Australia > Perth (0.04)
North America > United States > West Virginia (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.68)

Industry:

Law (1.00)
Information Technology > Security & Privacy (0.68)
Banking & Finance (0.67)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)

Amoukou, Salim I., Brunel, Nicolas J. B.

Rethinking Counterfactual Explanations as Local and Regional Counterfactual Policies

From a more practical point of view, recent studies [Pawelczyk et al., 2022] show that the prescribed counterfactual recourses are often not implemented exactly by individuals and demonstrate that most state-of-the-art CE algorithms are very likely to fail in this noisy environment. To address these issues, we propose a probabilistic framework that gives a sparse local counterfactual rule for each observation, providing rules that give a range of values capable of changing decisions with high probability. These rules serve as a summary of diverse counterfactual explanations and yield robust recourses. We further aggregate these local rules into a regional counterfactual rule, identifying shared recourses for subgroups of the data. Our local and regional rules are derived from the Random Forest algorithm, which offers statistical guarantees and fidelity to data distribution by selecting recourses in high-density regions. Moreover, our rules are sparse as we first select the smallest set of variables having a high probability of changing the decision. We have conducted experiments to validate the effectiveness of our counterfactual rules in comparison to standard CE and recent similar attempts. Our methods are available as a Python package.

artificial intelligence, machine learning, natural language, (15 more...)

2209.14568

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > California (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)

Patton, Andrew J., Simsek, Yasin

Generalized Autoregressive Score Trees and Forests

arXiv.org Machine Learning, May-30-2023

We propose methods to improve the forecasts from generalized autoregressive score (GAS) models (Creal et al., 2013; Harvey, 2013) by localizing their parameters using decision trees and random forests. These methods avoid the curse of dimensionality faced by kernel-based approaches, and allow one to draw on information from multiple state variables simultaneously. We apply the new models to four distinct empirical analyses, and in all applications the proposed new methods significantly outperform the baseline GAS model. In our applications to stock return volatility and density prediction, the optimal GAS tree model reveals a leverage effect and a variance risk premium effect. Our study of stock-bond dependence finds evidence of a flight-to-quality effect in the optimal GAS forest forecasts, while our analysis of high-frequency trade durations uncovers a volume-volatility effect.
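As a rough illustration of what "localizing parameters with a tree" could mean, the sketch below runs a Gaussian GAS(1,1) variance recursion whose parameters are chosen by a depth-one tree on a state variable. It is an assumption-laden sketch, not the authors' estimator: the names `gas_variance_filter` and `stump`, the choice of a single threshold split, and the timing of the state variable are all hypothetical. The recursion itself uses the standard inverse-information scaling for a Gaussian variance, under which the scaled score is y_t^2 - f_t.

```python
def gas_variance_filter(returns, states, params_for_state, f0):
    """Filter a variance path with a Gaussian GAS(1,1) recursion.

    returns: observed returns y_t.
    states: a state variable x_t observed alongside each return.
    params_for_state: function x_t -> (omega, alpha, beta); hypothetical
        hook through which a tree assigns leaf-specific parameters.
    f0: initial variance.
    Returns the list [f_0, f_1, ..., f_T] of filtered variances.
    """
    path = [f0]
    f = f0
    for y, x in zip(returns, states):
        omega, alpha, beta = params_for_state(x)
        scaled_score = y * y - f  # inverse-information-scaled Gaussian score
        f = omega + alpha * scaled_score + beta * f
        path.append(f)
    return path


def stump(threshold, left_params, right_params):
    """Depth-one tree: pick a parameter triple by thresholding the state."""
    def choose(x):
        return left_params if x <= threshold else right_params
    return choose
```

For example, `gas_variance_filter([1.0, 2.0], [-1.0, 1.0], stump(0.0, (0.1, 0.1, 0.8), (0.2, 0.2, 0.7)), 1.0)` applies the left-leaf parameters at the first step and the right-leaf parameters at the second. The sketch does not enforce positivity of the variance; a practical implementation would constrain the parameters or model the log-variance.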

artificial intelligence, decision tree learning, machine learning, (20 more...)

2305.18991

Country:

Asia > Middle East > Israel (0.04)
North America > United States > New York (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Banking & Finance > Trading (1.00)
Banking & Finance > Economy (0.93)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Radhakrishna, Sriram, Balasubramanyam, Adithya

Pedestrian Intention Classifier using ID3 Modelled Decision Trees for IoT Edge Devices

arXiv.org Artificial Intelligence, May-30-2023

Road accidents involving autonomous vehicles commonly occur in situations where a (pedestrian) obstacle presents itself in the path of the moving vehicle very suddenly, leaving the robot even less time to react to the change in scene. To tackle this issue, we propose a novel algorithmic implementation that classifies the intent of a single arbitrarily chosen pedestrian in a two-dimensional frame into logic states in a procedural manner using quaternions generated from a MediaPipe pose estimation model. This bypasses the need to employ relatively high-latency deep-learning algorithms, primarily because depth perception is not necessary and because most IoT edge devices impose an implicit cap on computational resources. The model achieved an average testing accuracy of 83.56% with a reliable variance of 0.0042 while operating with an average latency of 48 milliseconds, demonstrating multiple notable advantages over the current standard of using spatio-temporal convolutional networks for these perceptive tasks.

algorithm, artificial intelligence, machine learning, (19 more...)

2304.00206

Country: Asia > India > Karnataka > Bengaluru (0.05)

Genre: Research Report (1.00)

Industry:

Information Technology (0.68)
Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.93)

Sadaiappan, Balamurugan, Balakrishnan, Preethiya, CR, Vishal, Vijayan, Neethu T, Subramanian, Mahendran, Gauns, Mangesh U

Applications of Machine Learning in Chemical and Biological Oceanography

arXiv.org Artificial Intelligence, May-29-2023

Machine learning (ML) refers to computer algorithms that predict a meaningful output or categorize complex systems based on a large amount of data. ML is applied in various areas including natural science, engineering, space exploration, and even game development. This review focuses on the use of machine learning in the field of chemical and biological oceanography. ML is a promising tool for predicting global fixed nitrogen levels, partial carbon dioxide pressure, and other chemical properties. Machine learning is also used in biological oceanography to detect planktonic forms from various images (i.e., microscopy, FlowCAM, and video recorders), spectrometers, and other signal processing techniques. Moreover, ML has successfully classified mammals using their acoustics and detected endangered mammalian and fish species in a specific environment. Most importantly, using environmental data, ML proved to be an effective method for predicting hypoxic conditions and harmful algal bloom events, an essential measurement in terms of environmental monitoring. Furthermore, machine learning has been used to construct a number of databases for various species that will be useful to other researchers, and the creation of new algorithms will help the marine research community better comprehend the chemistry and biology of the ocean.

accuracy, evolutionary algorithm, machine learning, (19 more...)

2209.11557

Country:

North America > United States (1.00)
Asia (1.00)
Atlantic Ocean (0.93)
Europe > United Kingdom > England (0.28)

Genre: Research Report (1.00)

Industry:

Materials > Chemicals (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
Energy > Oil & Gas (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

State, Laura, Ruggieri, Salvatore, Turini, Franco

Reason to explain: Interactive contrastive explanations (REASONX)

arXiv.org Artificial Intelligence, May-29-2023

Many high-performing machine learning models are not interpretable. As they are increasingly used in decision scenarios that can critically affect individuals, it is necessary to develop tools to better understand their outputs. Popular explanation methods include contrastive explanations. However, they suffer from several shortcomings, among others an insufficient incorporation of background knowledge and a lack of interactivity. While (dialogue-like) interactivity is important to better communicate an explanation, background knowledge has the potential to significantly improve the quality of explanations, e.g., by adapting the explanation to the needs of the end-user. To close this gap, we present REASONX, an explanation tool based on Constraint Logic Programming (CLP). REASONX provides interactive contrastive explanations that can be augmented by background knowledge, and it can operate under a setting of under-specified information, leading to increased flexibility in the provided explanations. REASONX computes factual and contrastive decision rules, as well as closest contrastive examples. It provides explanations for decision trees, which can be the ML models under analysis, or global/local surrogate models of any ML model. While the core part of REASONX is built on CLP, we also provide a program layer that allows the explanations to be computed via Python, making the tool accessible to a wider audience. We illustrate the capability of REASONX on a synthetic data set and on a well-developed example in the credit domain. In both cases, we show how REASONX can be flexibly used and tailored to the needs of the user.
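To make the notions of factual and contrastive decision rules concrete, the sketch below reads both off a small decision tree by plain traversal. It does not use REASONX or Constraint Logic Programming, and it handles neither background knowledge nor under-specified instances. The nested-dict tree representation and both function names are hypothetical.

```python
def factual_rule(tree, instance):
    """Return (conditions, label) for the path `instance` follows in `tree`.

    tree: nested dicts; internal nodes have 'feature', 'threshold', 'left'
        and 'right'; leaves have 'label'. (Hypothetical representation.)
    instance: dict mapping feature name -> numeric value.
    """
    conditions = []
    node = tree
    while "label" not in node:
        feat, thr = node["feature"], node["threshold"]
        if instance[feat] <= thr:
            conditions.append(f"{feat} <= {thr}")
            node = node["left"]
        else:
            conditions.append(f"{feat} > {thr}")
            node = node["right"]
    return conditions, node["label"]


def contrastive_rules(tree, label, prefix=()):
    """List the condition paths of all leaves whose label differs from `label`."""
    if "label" in tree:
        return [list(prefix)] if tree["label"] != label else []
    feat, thr = tree["feature"], tree["threshold"]
    left = contrastive_rules(tree["left"], label, prefix + (f"{feat} <= {thr}",))
    right = contrastive_rules(tree["right"], label, prefix + (f"{feat} > {thr}",))
    return left + right
```

In a toy credit setting, the factual rule states why an applicant received the current decision, and each contrastive rule describes a region of the feature space in which the tree would decide otherwise.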

explanation, logic & formal reasoning, machine learning, (22 more...)

2305.18143

Country:

Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)