Brophy, Jonathan
Adapting and Evaluating Influence-Estimation Methods for Gradient-Boosted Decision Trees
Brophy, Jonathan, Hammoudeh, Zayd, Lowd, Daniel
Influence estimation analyzes how changes to the training data can lead to different model predictions; this analysis can help us better understand these predictions, the models making those predictions, and the data sets they're trained on. However, most influence-estimation techniques are designed for deep learning models with continuous parameters. Gradient-boosted decision trees (GBDTs) are a powerful and widely used class of models; however, these models are black boxes with opaque decision-making processes. In the pursuit of better understanding GBDT predictions and generally improving these models, we adapt recent and popular influence-estimation methods designed for deep learning models to GBDTs. Specifically, we adapt representer-point methods and TracIn, denoting our new methods TREX and BoostIn, respectively; source code is available at https://github.com/jjbrophy47/tree_influence. We compare these methods to LeafInfluence and other baselines using 5 different evaluation measures on 22 real-world data sets with 4 popular GBDT implementations. These experiments give us a comprehensive overview of how different approaches to influence estimation work in GBDT models. We find BoostIn is an efficient influence-estimation method for GBDTs that performs as well as or better than existing work while being four orders of magnitude faster. Our evaluation also suggests the gold-standard approach of leave-one-out (LOO) retraining consistently identifies the single most influential training example but performs poorly at finding the most influential set of training examples for a given target prediction.
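To make the TracIn adaptation concrete, here is a minimal TracIn-style sketch for a boosted ensemble: influence is accumulated as per-iteration gradient products, restricted to iterations where the training and test examples fall into the same leaf. The function name, array layout, and leaf-sharing rule are illustrative assumptions for exposition, not BoostIn's actual formula or code.

```python
import numpy as np

def tracin_style_influence(train_grads, test_grads, train_leaves, test_leaves):
    """Toy TracIn-style influence score for a boosted ensemble (hedged sketch).

    train_grads:  (n_train, n_iter) per-iteration loss gradients, training examples
    test_grads:   (n_iter,)         per-iteration loss gradients, one test example
    train_leaves: (n_train, n_iter) leaf index of each training example per iteration
    test_leaves:  (n_iter,)         leaf index of the test example per iteration

    Returns an (n_train,) score: gradient products summed only over iterations
    in which a training example shares a leaf with the test example.
    """
    shared = (train_leaves == test_leaves[None, :]).astype(float)  # (n_train, n_iter)
    return (train_grads * test_grads[None, :] * shared).sum(axis=1)
```

A positive score marks a training example whose gradient updates pushed the model in the same direction as the test example's loss gradient, i.e. a proponent of the prediction; negative scores mark opponents.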
TREX: Tree-Ensemble Representer-Point Explanations
Brophy, Jonathan, Lowd, Daniel
How can we identify the training examples that contribute most to the prediction of a tree ensemble? In this paper, we introduce TREX, an explanation system that provides instance-attribution explanations for tree ensembles, such as random forests and gradient-boosted trees. TREX builds on the representer point framework previously developed for explaining deep neural networks. Since tree ensembles are non-differentiable, we define a kernel that captures the structure of the specific tree ensemble. By using this kernel in kernel logistic regression or a support vector machine, TREX builds a surrogate model that approximates the original tree ensemble. The weights in the kernel expansion of the surrogate model are used to define the global or local importance of each training example. Our experiments show that TREX's surrogate model accurately approximates the tree ensemble; its global importance weights are more effective in dataset debugging than the previous state-of-the-art; its explanations identify the most influential samples better than alternative methods under the remove-and-retrain evaluation framework; it runs orders of magnitude faster than alternative methods; and its local explanations can identify and explain errors due to domain mismatch.
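One simple way to realize a kernel that "captures the structure of the specific tree ensemble" is a leaf-agreement kernel: two examples are similar in proportion to the number of trees that route them to the same leaf. The sketch below is an illustrative stand-in of that idea, assuming leaf indices have already been extracted from a fitted ensemble; it is not TREX's exact kernel definition.

```python
import numpy as np

def tree_kernel(leaves_a, leaves_b):
    """Leaf-agreement kernel between two sets of examples (hedged sketch).

    leaves_a: (n_a, n_trees) leaf index of each example in each tree
    leaves_b: (n_b, n_trees) leaf index of each example in each tree

    K[i, j] counts the trees in which example i (from A) and example j
    (from B) land in the same leaf, so K is a valid similarity matrix
    one can feed to kernel logistic regression or an SVM.
    """
    return (leaves_a[:, None, :] == leaves_b[None, :, :]).sum(axis=2)
```

Fitting a kernel machine with this Gram matrix yields one dual weight per training example; the magnitude and sign of those weights then serve as the global importance scores the abstract describes.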
DART: Data Addition and Removal Trees
Brophy, Jonathan, Lowd, Daniel
How can we add or remove training data for a machine learning model after it has already been trained on that data? In this paper, we introduce DART, a variant of random forests that supports adding and removing training data with minimal retraining. Data updates in DART are exact, meaning that adding or removing examples from a DART model yields exactly the same model as retraining from scratch on the updated data. DART uses two techniques to make updates efficient. The first is to cache data statistics at each node and training data at each leaf, so that only the necessary subtrees are retrained. The second is to choose the split variable randomly at the upper levels of each tree, so that the choice is completely independent of the data and never needs to change. At the lower levels, split variables are chosen to greedily maximize a split criterion such as Gini index or mutual information. By adjusting the number of random-split levels, DART can trade off between more accurate predictions and more efficient updates. In experiments on ten real-world datasets and one synthetic dataset, we find that DART is orders of magnitude faster than retraining from scratch while sacrificing very little in terms of predictive performance.
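The caching idea can be illustrated for a single greedy node: store per-class counts for every candidate split so that deleting an example only decrements a few counters, and rebuild the subtree only if the cached best split changes. This is a minimal single-feature, binary-class sketch under assumed names; it is not DART's implementation.

```python
import numpy as np

class CachedSplitNode:
    """Hedged sketch of DART-style statistic caching at one greedy node.

    For each candidate threshold we cache (side, class) counts. Removing a
    training example decrements the affected counters in O(#thresholds) and
    we only need to retrain the subtree when the Gini-optimal split changes.
    """

    def __init__(self, X, y, thresholds):
        self.thresholds = thresholds
        self.counts = np.zeros((len(thresholds), 2, 2))  # [threshold, side, class]
        for xi, yi in zip(X, y):
            self._update(xi, yi, +1)
        self.best = self._best_split()

    def _update(self, xi, yi, delta):
        for t, thr in enumerate(self.thresholds):
            self.counts[t, int(xi > thr), yi] += delta

    def _gini(self, c):
        n = c.sum()
        return 0.0 if n == 0 else 1.0 - ((c / n) ** 2).sum()

    def _best_split(self):
        # Weighted Gini impurity for each candidate threshold; smaller is better.
        scores = []
        for t in range(len(self.thresholds)):
            left, right = self.counts[t, 0], self.counts[t, 1]
            n = left.sum() + right.sum()
            scores.append(
                (left.sum() * self._gini(left) + right.sum() * self._gini(right))
                / max(n, 1)
            )
        return int(np.argmin(scores))

    def remove(self, xi, yi):
        """Delete one example; return True if the split choice changed."""
        self._update(xi, yi, -1)
        new_best = self._best_split()
        changed = new_best != self.best
        self.best = new_best
        return changed
```

When `remove` returns `False`, the node's structure is untouched and the deletion was exact at zero retraining cost; random splits at the upper levels make this the common case.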
Collective Classification of Social Network Spam
Brophy, Jonathan (University of Oregon) | Lowd, Daniel (University of Oregon)
Unsolicited or unwanted messages are a byproduct of virtually every popular social media website. Spammers have become increasingly proficient at bypassing conventional spam filters, prompting a stronger effort to develop new methods that accurately detect spam while remaining robust against users who modify their behavior in order to avoid detection. This paper shows the usefulness of a relational model that works in conjunction with an independent model. First, an independent model is built using features that characterize individual comments and users, capturing the cases where spam is obvious. Second, a relational model is built, taking advantage of the interconnected nature of users and their comments. By feeding our initial predictions from the independent model into the relational model, we can start to propagate information about spammers and spam comments to jointly infer the labels of all spam comments at the same time. This allows us to capture the obfuscated spam comments missed by the independent model that are only found by looking at the relational structure of the social network. The results from our experiments demonstrate the viability of our method, and show that models utilizing the underlying structure of the social network are more effective at detecting spam than ones that do not.
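The pipeline of independent predictions followed by joint relational inference can be sketched with a simple propagation scheme: each node blends its independent-model prior with the current scores of its neighbors in the user-comment graph. This label-propagation stand-in (names and the blending rule are illustrative assumptions) conveys the idea, not the paper's exact relational model.

```python
import numpy as np

def propagate_spam_scores(priors, adjacency, alpha=0.5, n_iters=10):
    """Hedged sketch of joint inference over a comment/user graph.

    priors:    (n,)   independent-model spam probabilities
    adjacency: (n, n) row-normalized link weights between comments/users
    alpha:     how strongly neighbors' scores override the prior

    Each iteration blends every node's prior with its neighbors' current
    scores, propagating evidence about spammers through the network so
    that obfuscated spam near known spam gets pulled toward the spam label.
    """
    scores = priors.copy()
    for _ in range(n_iters):
        scores = (1 - alpha) * priors + alpha * adjacency @ scores
    return scores
```

A comment with a low independent score but strong links to high-scoring spammers ends up with an elevated joint score, which is exactly the obfuscated-spam case the independent model misses on its own.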