Wu, Ga
Towards Understanding the Feasibility of Machine Unlearning
Sarvmaili, Mahtab, Sajjad, Hassan, Wu, Ga
In light of recent privacy regulations, machine unlearning has attracted significant attention in the research community. However, current studies predominantly assess the overall success of unlearning approaches, overlooking the varying difficulty of unlearning individual training samples. As a result, the broader feasibility of machine unlearning remains under-explored. This paper presents a set of novel metrics for quantifying the difficulty of unlearning by jointly considering the properties of the target model and the data distribution. Specifically, we propose several heuristics to assess the conditions necessary for a successful unlearning operation, examine the variations in unlearning difficulty across different training samples, and present a ranking mechanism to identify the most challenging samples to unlearn. We highlight the effectiveness of the Kernelized Stein Discrepancy (KSD), a parameterized kernel function tailored to each model and dataset, as a heuristic for evaluating unlearning difficulty. Our approach is validated on multiple classification tasks and established machine unlearning algorithms, demonstrating the practical feasibility of unlearning operations across diverse scenarios.

Machine Unlearning (MU) (Cao & Yang, 2015) refers to a process that enables machine learning (ML) models to remove specific training data and reverse the corresponding influence of that data on the trained model while preserving the model's generalization. Although existing machine unlearning studies build on diverse theoretical foundations, the evaluation metrics they use are largely common, including 1) Data Erasure Completeness, 2) Unlearning Time Efficiency, 3) Resource Consumption, and 4) Privacy Preservation (Xu et al., 2024; Yang & Zhao, 2023; Shaik et al., 2023).
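A minimal sketch of how such a KSD-based difficulty heuristic might look, assuming the model's score function is approximated by input-gradients of the log-likelihood and an RBF base kernel; the function names (stein_kernel, ksd_difficulty) and the ranking rule are illustrative assumptions, not the paper's exact construction:

```python
# Illustrative sketch: ranking training samples by a Stein-kernel heuristic.
import torch

def stein_kernel(x1, x2, s1, s2, h=1.0):
    """Stein kernel u_p(x, x') for the RBF base kernel k = exp(-||x-x'||^2 / 2h)."""
    diff = x1 - x2
    k = torch.exp(-diff.dot(diff) / (2 * h))
    dk_dx1 = -k * diff / h                     # grad of k w.r.t. x1
    dk_dx2 = k * diff / h                      # grad of k w.r.t. x2
    trace_term = k * (x1.numel() / h - diff.dot(diff) / h**2)
    return k * s1.dot(s2) + s1.dot(dk_dx2) + s2.dot(dk_dx1) + trace_term

def score(model, x, y):
    """Assumed score: gradient of log p(y|x) w.r.t. the input x."""
    x = x.clone().requires_grad_(True)
    log_prob = torch.log_softmax(model(x.unsqueeze(0)), dim=-1)[0, y]
    return torch.autograd.grad(log_prob, x)[0]

def ksd_difficulty(model, xs, ys):
    """Order samples by Stein self-similarity; low values assumed harder to unlearn."""
    scores = [score(model, x, y) for x, y in zip(xs, ys)]
    vals = [stein_kernel(x.flatten(), x.flatten(), s.flatten(), s.flatten())
            for x, s in zip(xs, scores)]
    return torch.argsort(torch.stack(vals))
```

The self-similarity term costs one gradient per sample, which is what would make a per-sample difficulty ranking tractable at dataset scale.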
Data-centric Prediction Explanation via Kernelized Stein Discrepancy
Sarvmaili, Mahtab, Sajjad, Hassan, Wu, Ga
Existing example-based prediction explanation methods often bridge test and training data points through the model's parameters or latent representations. While these methods offer clues to the causes of model predictions, they often exhibit innate shortcomings, such as incurring significant computational overhead or producing coarse-grained explanations. This paper presents a Highly-precise and Data-centric Explanation (HD-Explain), a straightforward prediction explanation method exploiting properties of Kernelized Stein Discrepancy (KSD). Specifically, the KSD uniquely defines a parameterized kernel function for a trained model that encodes model-dependent data correlation. By leveraging this kernel function, one can efficiently identify the training samples that provide the best predictive support for a test point. We conducted thorough analyses and experiments across multiple classification domains, where we show that HD-Explain outperforms existing methods from various aspects, including 1) preciseness (fine-grained explanation), 2) consistency, and 3) computation efficiency, leading to a surprisingly simple, effective, and robust prediction explanation solution.
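A rough sketch of the retrieval step this describes, keeping only the leading term s(x)^T k(x, x') s(x') of the full Stein kernel for brevity; the names (input_score, explain) and the RBF base kernel are assumptions for illustration:

```python
# Illustrative sketch: retrieve training points that best support a prediction.
import torch

def input_score(model, x, y):
    """Assumed score: input-gradient of log p(y|x) under the trained model."""
    x = x.clone().requires_grad_(True)
    logp = torch.log_softmax(model(x.unsqueeze(0)), dim=-1)[0, y]
    return torch.autograd.grad(logp, x)[0].flatten()

def explain(model, x_test, y_pred, train_xs, train_ys, top_k=5, h=1.0):
    s_test = input_score(model, x_test, y_pred)
    sims = []
    for x, y in zip(train_xs, train_ys):
        s = input_score(model, x, y)
        rbf = torch.exp(-(x_test - x).flatten().pow(2).sum() / (2 * h))
        sims.append(rbf * s_test.dot(s))       # leading Stein-kernel term only
    return torch.topk(torch.stack(sims), top_k).indices  # most supportive samples
```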
Within-basket Recommendation via Neural Pattern Associator
Luo, Kai, Shen, Tianshu, Yao, Lan, Wu, Ga, Liblong, Aaron, Fehervari, Istvan, An, Ruijian, Ahmed, Jawad, Mishra, Harshit, Pujari, Charu
Within-basket recommendation (WBR) refers to the task of recommending items that complete a non-empty shopping basket during a shopping session. While the latest innovations in this space demonstrate remarkable performance improvement on benchmark datasets, they often overlook the complexity of user behaviors in practice, such as 1) co-existence of multiple shopping intentions, 2) multi-granularity of such intentions, and 3) interleaving behavior (switching intentions) in a shopping session. This paper presents Neural Pattern Associator (NPA), a deep item-association-mining model that explicitly models the aforementioned factors. Specifically, inspired by vector quantization, the NPA model learns to encode common user intentions (or item-combination patterns) as quantized representations (a.k.a. codebook), which permits identification of users' shopping intentions via attention-driven lookup during the reasoning phase, yielding coherent and self-interpretable recommendations. We evaluated the proposed NPA model across multiple extensive datasets, encompassing the domains of grocery e-commerce (shopping basket completion) and music (playlist extension), where our quantitative evaluations show that the NPA model significantly outperforms a wide range of existing WBR solutions, reflecting the benefit of explicitly modeling complex user intentions.
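A toy sketch of the codebook-attention idea the abstract describes, assuming basket items attend over a learned codebook of intention vectors whose aggregate then scores candidate items; layer sizes and names are illustrative, not the NPA architecture:

```python
# Illustrative sketch: attention-driven lookup over a learned intention codebook.
import torch
import torch.nn as nn

class CodebookAttention(nn.Module):
    def __init__(self, n_items, n_codes=64, dim=128):
        super().__init__()
        self.item_emb = nn.Embedding(n_items, dim)
        self.codebook = nn.Parameter(torch.randn(n_codes, dim))  # intention patterns

    def forward(self, basket):                            # basket: (B, L) item ids
        e = self.item_emb(basket)                         # (B, L, D)
        attn = torch.softmax(e @ self.codebook.T, dim=-1) # per-item code weights
        intent = attn @ self.codebook                     # quantized intention per item
        query = intent.mean(dim=1)                        # (B, D) aggregate basket intent
        return query @ self.item_emb.weight.T             # scores over all candidates

model = CodebookAttention(n_items=10_000)
scores = model(torch.randint(0, 10_000, (2, 5)))          # 2 baskets of 5 items
```

Because recommendations are read off attention weights over named codebook entries, the retrieved intentions are directly inspectable, which is one plausible reading of "self-interpretable" here.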
Self-supervised Representation Learning From Random Data Projectors
Sui, Yi, Wu, Tongzi, Cresswell, Jesse C., Wu, Ga, Stein, George, Huang, Xiao Shi, Zhang, Xiaochen, Volkovs, Maksims
Self-supervised representation learning (SSRL) has advanced considerably by exploiting the transformation invariance assumption under artificially designed data augmentations. While augmentation-based SSRL algorithms push the boundaries of performance in computer vision and natural language processing, they are often not directly applicable to other data modalities, and can conflict with application-specific data augmentation constraints. This paper presents an SSRL approach that can be applied to any data modality and network architecture because it does not rely on augmentations or masking. Specifically, we show that high-quality data representations can be learned by reconstructing random data projections. We evaluate the proposed approach on a wide range of representation learning tasks that span diverse modalities and real-world applications. We show that it outperforms multiple state-of-the-art SSRL baselines. Due to its wide applicability and strong empirical results, we argue that learning from randomness is a fruitful research direction worthy of attention and further study.
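A minimal sketch of the training signal described here, assuming one frozen random projector per target and one trainable predictor head each; architecture sizes and names are illustrative:

```python
# Illustrative sketch: learn an encoder by predicting random projections of the input.
import torch
import torch.nn as nn

d_in, d_rep, n_proj, d_proj = 784, 128, 8, 32
encoder = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(), nn.Linear(256, d_rep))

# Fixed, untrained random projectors of the raw input (the reconstruction targets).
projectors = [nn.Linear(d_in, d_proj) for _ in range(n_proj)]
for p in projectors:
    p.requires_grad_(False)

# One trainable predictor head per projector, mapping representation -> projection.
heads = nn.ModuleList(nn.Linear(d_rep, d_proj) for _ in range(n_proj))

opt = torch.optim.Adam(list(encoder.parameters()) + list(heads.parameters()), lr=1e-3)
x = torch.randn(64, d_in)                        # a batch of raw inputs, any modality
z = encoder(x)
loss = sum(nn.functional.mse_loss(h(z), p(x)) for h, p in zip(heads, projectors))
opt.zero_grad(); loss.backward(); opt.step()
```

Note that nothing in the objective touches augmentations or masking, which is why the recipe would transfer across modalities.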
fAux: Testing Individual Fairness via Gradient Alignment
Castiglione, Giuseppe, Wu, Ga, Srinivasa, Christopher, Prince, Simon
Machine learning models are vulnerable to biases that result in unfair treatment of individuals from different populations. Recent work that aims to test a model's fairness at the individual level relies either on domain knowledge to choose metrics or on input transformations that risk generating out-of-domain samples. We describe a new approach for testing individual fairness that has neither requirement. We propose a novel criterion for evaluating individual fairness and develop a practical testing method based on this criterion, which we call fAux (pronounced fox). It is based on comparing the derivatives of the predictions of the model under test with those of an auxiliary model that predicts the protected variable from the observed data. We show that the proposed method effectively identifies discrimination on both synthetic and real-world datasets, and has quantitative and qualitative advantages over contemporary methods.
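A rough sketch of the gradient-alignment test, assuming alignment is measured by cosine similarity between the input-gradients of the model under test and of the auxiliary model; the threshold and decision rule are illustrative assumptions:

```python
# Illustrative sketch: flag points where the target model's sensitivity aligns
# with a model trained to predict the protected attribute.
import torch

def faux_test(target, auxiliary, x, threshold=0.5):
    """x: a batch of inputs (B, ...); returns one boolean flag per input."""
    x1 = x.clone().requires_grad_(True)
    g_target = torch.autograd.grad(target(x1).sum(), x1)[0]
    x2 = x.clone().requires_grad_(True)
    g_aux = torch.autograd.grad(auxiliary(x2).sum(), x2)[0]
    cos = torch.nn.functional.cosine_similarity(
        g_target.flatten(1), g_aux.flatten(1), dim=1)
    return cos.abs() > threshold   # True = suspected violation (assumed rule)
```

Because the test only needs derivatives at observed points, it avoids synthesizing counterfactual inputs that may fall outside the data domain.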
Scalable Planning with Deep Neural Network Learned Transition Models
Wu, Ga, Say, Buser, Sanner, Scott
In many complex planning problems with factored, continuous state and action spaces such as Reservoir Control, Heating Ventilation and Air Conditioning (HVAC), and Navigation domains, it is difficult to obtain a model of the complex nonlinear dynamics that govern state evolution. However, the ubiquity of modern sensors allows us to collect large quantities of data from each of these complex systems and build accurate, nonlinear deep neural network models of their state transitions. But there remains one major problem for the task of control: how can we plan with deep-network-learned transition models without resorting to Monte Carlo Tree Search and other black-box transition model techniques that ignore model structure and do not easily extend to continuous domains? In this paper, we introduce two types of planning methods that can leverage deep neural network learned transition models: Hybrid Deep MILP Planner (HD-MILP-Plan) and Tensorflow Planner (TF-Plan). In HD-MILP-Plan, we make the critical observation that the Rectified Linear Unit (ReLU) transfer function for deep networks not only allows faster convergence of model learning, but also permits a direct compilation of the deep network transition model to a Mixed-Integer Linear Program (MILP) encoding, as sketched below. Further, we identify deep-network-specific optimizations for HD-MILP-Plan that improve performance over a base encoding and show that we can plan optimally with respect to the learned deep networks. In TF-Plan, we take advantage of the efficiency of auto-differentiation tools and GPU-based computation, encoding a subclass of purely continuous planning problems as Recurrent Neural Networks and directly optimizing the actions through backpropagation. We compare both planners and show that TF-Plan is able to approximate the optimal plans found by HD-MILP-Plan in less computation time. Hence this article offers two novel planners for continuous state and action domains with learned deep neural net transition models: one optimal method (HD-MILP-Plan) and a scalable alternative for large-scale problems (TF-Plan).
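A minimal sketch of the big-M trick behind the ReLU-to-MILP compilation, written against the open-source pulp interface rather than the paper's toolchain; the bound M, the toy pre-activation, and the objective are illustrative:

```python
# Illustrative sketch: a ReLU unit y = max(0, a) as an exact big-M MILP encoding.
import pulp

M = 100.0                                     # assumed bound on pre-activations
prob = pulp.LpProblem("relu_plan_step", pulp.LpMaximize)
a = pulp.LpVariable("preact", -M, M)          # pre-activation (e.g., w.x + b)
y = pulp.LpVariable("relu_out", 0, M)         # post-activation
z = pulp.LpVariable("active", cat="Binary")   # 1 iff the unit is active

prob += y >= a                                # active side: y cannot undershoot a
prob += y <= a + M * (1 - z)                  # when z=1, forces y <= a (so y = a)
prob += y <= M * z                            # when z=0, forces y = 0
prob += a == 3.5                              # stand-in for the learned affine map
prob += y                                     # toy objective over the network output
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.value(y))                          # -> 3.5, matching max(0, 3.5)
```

Chaining one such gadget per hidden unit and per time step yields a MILP whose optimum is an optimal plan with respect to the learned network, which is the optimality guarantee the abstract refers to.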
Scalable Nonlinear Planning with Deep Neural Network Learned Transition Models
Wu, Ga, Say, Buser, Sanner, Scott
In many real-world planning problems with factored, mixed discrete and continuous state and action spaces such as Reservoir Control, Heating Ventilation and Air Conditioning (HVAC), and Navigation domains, it is difficult to obtain a model of the complex nonlinear dynamics that govern state evolution. However, the ubiquity of modern sensors allows us to collect large quantities of data from each of these complex systems and build accurate, nonlinear deep neural network models of their state transitions. But there remains one major problem for the task of control: how can we plan with deep-network-learned transition models without resorting to Monte Carlo Tree Search and other black-box transition model techniques that ignore model structure and do not easily extend to mixed discrete and continuous domains? In this paper, we introduce two types of nonlinear planning methods that can leverage deep neural network learned transition models: Hybrid Deep MILP Planner (HD-MILP-Plan) and Tensorflow Planner (TF-Plan). In HD-MILP-Plan, we make the critical observation that the Rectified Linear Unit (ReLU) transfer function for deep networks not only allows faster convergence of model learning, but also permits a direct compilation of the deep network transition model to a Mixed-Integer Linear Program (MILP) encoding. Further, we identify deep-network-specific optimizations for HD-MILP-Plan that improve performance over a base encoding and show that we can plan optimally with respect to the learned deep networks. In TF-Plan, we take advantage of the efficiency of auto-differentiation tools and GPU-based computation, encoding a subclass of purely continuous planning problems as Recurrent Neural Networks and directly optimizing the actions through backpropagation. We compare both planners and show that TF-Plan is able to approximate the optimal plans found by HD-MILP-Plan in less computation time. Hence this article offers two novel planners for learned deep neural net transition models: one optimal method for mixed discrete and continuous state and actions (HD-MILP-Plan) and a scalable alternative for large-scale purely continuous state and action problems (TF-Plan).
Conditional Inference in Pre-trained Variational Autoencoders via Cross-coding
Wu, Ga, Domke, Justin, Sanner, Scott
Variational Autoencoders (VAEs) are a popular generative model, but one in which conditional inference can be challenging. If the decomposition into query and evidence variables is fixed, conditional VAEs provide an attractive solution. To support arbitrary queries, one is generally reduced to Markov Chain Monte Carlo sampling methods that can suffer from long mixing times. In this paper, we propose an idea we term cross-coding to approximate the distribution over the latent variables after conditioning on an evidence assignment to some subset of the variables. This allows generating query samples without retraining the full VAE. We experimentally evaluate three variations of cross-coding showing that (i) two can be quickly trained for different decompositions of evidence and query and (ii) they quantitatively and qualitatively outperform Hamiltonian Monte Carlo.
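A rough sketch of what a cross-coder could look like, assuming a frozen pre-trained decoder and a small network that maps an evidence assignment to a latent Gaussian, trained with a reconstruction-plus-KL objective; all sizes, names, and the loss weighting are illustrative assumptions, not the paper's three variants:

```python
# Illustrative sketch: fit a small cross-coder against a frozen VAE decoder.
import torch
import torch.nn as nn

d_z, d_x = 16, 784
decoder = nn.Sequential(nn.Linear(d_z, 256), nn.ReLU(), nn.Linear(256, d_x))
for p in decoder.parameters():              # pre-trained VAE decoder stays frozen
    p.requires_grad_(False)

ev_idx = torch.arange(0, 392)               # observed (evidence) dimensions
crosscoder = nn.Linear(len(ev_idx), 2 * d_z)  # outputs mean and log-variance
opt = torch.optim.Adam(crosscoder.parameters(), lr=1e-3)

x_ev = torch.rand(32, len(ev_idx))          # a batch of evidence assignments
mu, logvar = crosscoder(x_ev).chunk(2, dim=-1)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
recon = decoder(z)[:, ev_idx]               # reconstruct the evidence dims only
kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1).mean()
loss = nn.functional.mse_loss(recon, x_ev) + 1e-3 * kl
opt.zero_grad(); loss.backward(); opt.step()
# Query samples: decode z and read off the unobserved dimensions of decoder(z).
```

The point of the construction is that only the small cross-coder is trained per evidence/query decomposition, while the expensive VAE is reused untouched.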
Scalable Planning with Tensorflow for Hybrid Nonlinear Domains
Wu, Ga, Say, Buser, Sanner, Scott
Given recent deep learning results that demonstrate the ability to effectively optimize high-dimensional non-convex functions with gradient descent optimization on GPUs, we ask in this paper whether symbolic gradient optimization tools such as Tensorflow can be effective for planning in hybrid (mixed discrete and continuous) nonlinear domains with high-dimensional state and action spaces. To this end, we demonstrate that hybrid planning with Tensorflow and RMSProp gradient descent is competitive with mixed integer linear program (MILP) based optimization on piecewise linear planning domains (where we can compute optimal solutions) and substantially outperforms state-of-the-art interior point methods for nonlinear planning domains. Furthermore, we remark that Tensorflow is highly scalable, converging to a strong plan on a large-scale concurrent domain with a total of 576,000 continuous action parameters distributed over a horizon of 96 time steps and 100 parallel instances in only 4 minutes. We provide a number of insights that clarify such strong performance, including observations that despite long horizons, RMSProp avoids both the vanishing and exploding gradient problems. Together these results suggest a new frontier for highly scalable planning in nonlinear hybrid domains by leveraging GPUs and the power of recent advances in gradient descent with highly optimized toolkits like Tensorflow.
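A minimal sketch of planning by backpropagation as described here, written with PyTorch autodiff for brevity rather than Tensorflow; the transition and reward callables, horizon, and optimizer settings are illustrative:

```python
# Illustrative sketch: unroll a learned transition model like an RNN and
# optimize the action sequence directly with RMSProp.
import torch

def plan(transition, reward, s0, horizon=96, steps=500, lr=0.1, d_act=4):
    actions = torch.zeros(horizon, d_act, requires_grad=True)
    opt = torch.optim.RMSprop([actions], lr=lr)
    for _ in range(steps):
        s, total = s0, 0.0
        for t in range(horizon):              # unroll the horizon like an RNN
            s = transition(s, actions[t])     # learned neural dynamics
            total = total + reward(s, actions[t])
        loss = -total                         # maximize cumulative reward
        opt.zero_grad(); loss.backward(); opt.step()
    return actions.detach()
```

Concurrent instances batch naturally along a leading dimension of `actions` and `s0`, which is how this style of planner scales to hundreds of parallel problems on a GPU.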
Bayesian Model Averaging Naive Bayes (BMA-NB): Averaging over an Exponential Number of Feature Models in Linear Time
Wu, Ga, Sanner, Scott, Oliveira, Rodrigo F.S.C.
Naive Bayes (NB) is well-known to be a simple but effective classifier, especially when combined with feature selection. Unfortunately, feature selection methods are often greedy and thus cannot guarantee an optimal feature set is selected. An alternative to feature selection is to use Bayesian model averaging (BMA), which computes a weighted average over multiple predictors; when the different predictor models correspond to different feature sets, BMA has the advantage over feature selection that its predictions tend to have lower variance on average in comparison to any single model. In this paper, we show for the first time that it is possible to exactly evaluate BMA over the exponentially-sized powerset of NB feature models in time linear in the number of features; this yields an algorithm about as expensive to train as a single NB model with all features, yet one that provably converges to the globally optimal feature subset in the asymptotic limit of data. We evaluate this novel BMA-NB classifier on a range of datasets, showing that it never underperforms NB (as expected) and sometimes offers performance competitive (or superior) to classifiers such as SVMs and logistic regression while taking a fraction of the time to train.
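A sketch of the factorization that makes the powerset sum linear in the number of features, assuming an independent inclusion weight per feature; in the paper the weights come from the training-data evidence, whereas here they are left as free parameters for illustration:

```python
# Illustrative sketch: the 2^d subset sum collapses into d per-feature mixtures.
import numpy as np

def bma_nb_log_score(log_lik, log_prior_y, log_w_in, log_w_out):
    """log_lik: (d,) per-feature log p(x_i | y) for one class y.

    sum over subsets S of [d]:
        prod_{i in S} w_in_i * p(x_i|y) * prod_{i not in S} w_out_i
      = prod_i (w_out_i + w_in_i * p(x_i|y))        <- linear in d
    """
    per_feature = np.logaddexp(log_w_out, log_w_in + log_lik)
    return log_prior_y + per_feature.sum()

# Toy usage: compare scores across classes and predict the argmax.
d = 5
score_y1 = bma_nb_log_score(np.log(np.random.rand(d)), np.log(0.5),
                            np.full(d, np.log(0.5)), np.full(d, np.log(0.5)))
```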