Tang, Weijing
A Versatile Influence Function for Data Attribution with Non-Decomposable Loss
Deng, Junwei, Tang, Weijing, Ma, Jiaqi W.
The influence function, a technique rooted in robust statistics, has been adapted in modern machine learning for a novel application: data attribution -- quantifying how individual training data points affect a model's predictions. However, the common derivation of influence functions in the data attribution literature is limited to loss functions that decompose into a sum of individual data point losses, with the most prominent examples known as M-estimators. This restricts the application of influence functions to more complex learning objectives, which we refer to as non-decomposable losses, such as contrastive or ranking losses, where a unit loss term depends on multiple data points and cannot be decomposed further. In this work, we bridge this gap by revisiting the general formulation of the influence function from robust statistics, which extends beyond M-estimators. Based on this formulation, we propose a novel method, the Versatile Influence Function (VIF), that can be straightforwardly applied to machine learning models trained with any non-decomposable loss. In comparison to the classical approach in statistics, the proposed VIF is designed to fully leverage the power of auto-differentiation, thereby eliminating the need for case-specific derivations of each loss function. We demonstrate the effectiveness of VIF across three examples: Cox regression for survival analysis, node embedding for network analysis, and listwise learning-to-rank for information retrieval. In all cases, the influence estimated by VIF closely resembles the results obtained by brute-force leave-one-out retraining, while being up to $10^3$ times faster to compute. We believe VIF represents a significant advancement in data attribution, enabling efficient influence-function-based attribution across a wide range of machine learning paradigms, with broad potential for practical use cases.
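For intuition, the following is a minimal sketch of the classical influence function for a decomposable loss (the M-estimator setting that VIF generalizes), computed entirely with auto-differentiation. All names, the tiny logistic-regression setup, and the regularization strength are illustrative assumptions, not the paper's implementation.

```python
# Classical influence function for a decomposable loss, as a reference point
# for what VIF generalizes. Influence of up-weighting training point i on a
# query loss f: IF_i = -grad(f)^T H^{-1} grad(l_i), where H is the Hessian
# of the total training loss at the fitted parameters.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
X = torch.randn(20, 3)
y = (X[:, 0] > 0).float()

def total_loss(theta):
    # L2-regularized logistic regression; regularization keeps H invertible
    logits = X @ theta
    return F.binary_cross_entropy_with_logits(logits, y, reduction="sum") \
        + 0.1 * theta.pow(2).sum()

# Fit theta (a stand-in for the trained model) with LBFGS.
theta = torch.zeros(3, requires_grad=True)
opt = torch.optim.LBFGS([theta], max_iter=100)
def closure():
    opt.zero_grad()
    loss = total_loss(theta)
    loss.backward()
    return loss
opt.step(closure)
theta_hat = theta.detach()

H = torch.autograd.functional.hessian(total_loss, theta_hat)

def point_loss(t, i):
    # Loss of a single data point; point 0 serves as the query below.
    return F.binary_cross_entropy_with_logits(X[i:i + 1] @ t, y[i:i + 1])

tq = theta_hat.clone().requires_grad_(True)
g_query = torch.autograd.grad(point_loss(tq, 0), tq)[0]
ihvp = torch.linalg.solve(H, g_query)            # H^{-1} grad(f)

for i in range(3):                               # influence of a few points
    ti = theta_hat.clone().requires_grad_(True)
    g_i = torch.autograd.grad(point_loss(ti, i), ti)[0]
    print(i, float(-g_i @ ihvp))
```

The printed values approximate the change in the query loss caused by up-weighting each training point; brute-force leave-one-out retraining would check them directly.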
Minimax Regret Learning for Data with Heterogeneous Subgroups
Mo, Weibin, Tang, Weijing, Xue, Songkai, Liu, Yufeng, Zhu, Ji
Modern complex datasets often consist of various sub-populations. To develop robust and generalizable methods in the presence of sub-population heterogeneity, it is important to guarantee uniform learning performance rather than average performance. In many applications, prior information is often available on which sub-population or group each data point belongs to. Given the observed groups of data, we develop a min-max-regret (MMR) learning framework for general supervised learning, which aims to minimize the worst-group regret. Motivated by the regret-based decision-theoretic framework, the proposed MMR is distinguished from the value-based and risk-based robust learning methods in the existing literature. The regret criterion features several robustness and invariance properties simultaneously. In terms of generalizability, we develop a theoretical guarantee on the worst-case regret over a super-population of the meta data, which incorporates the observed sub-populations, their mixtures, as well as other unseen sub-populations that can be approximated by the observed ones. We demonstrate the effectiveness of our method through extensive simulation studies and an application to kidney transplantation data from hundreds of transplant centers.
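As a toy illustration of the worst-group-regret objective (a hedged sketch under simplified assumptions, not the paper's estimator): each group's regret is its risk under the shared model minus the risk of its own best model, and the maximum regret is reduced by plain subgradient descent.

```python
# Toy min-max-regret for linear regression across heterogeneous groups.
# The group construction, step size, and subgradient scheme are assumptions
# made for this sketch only.
import numpy as np

rng = np.random.default_rng(0)
groups = []
for shift in (0.0, 0.5, 2.0):                    # three sub-populations
    X = rng.normal(size=(200, 2))
    y = X @ (np.array([1.0, -1.0]) + shift) + rng.normal(scale=0.5, size=200)
    groups.append((X, y))

def risk(beta, X, y):
    return np.mean((X @ beta - y) ** 2)

# Each group's oracle risk: the risk of its own least-squares fit.
oracle = [risk(np.linalg.lstsq(X, y, rcond=None)[0], X, y) for X, y in groups]

beta = np.zeros(2)
for _ in range(500):
    regrets = [risk(beta, X, y) - r for (X, y), r in zip(groups, oracle)]
    g = int(np.argmax(regrets))                  # currently worst-off group
    Xg, yg = groups[g]
    beta -= 0.05 * 2 * Xg.T @ (Xg @ beta - yg) / len(yg)  # subgradient step
print("beta:", beta, "worst regret:", max(regrets))
```

Minimizing worst-group *risk* instead would favor the intrinsically noisiest group; the regret criterion nets out each group's best achievable risk first, which is the invariance property the abstract refers to.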
KL-divergence Based Deep Learning for Discrete Time Model
Liu, Li, Fang, Xiangeng, Wang, Di, Tang, Weijing, He, Kevin
Deep neural networks are modern machine learning models that have been exploited in survival analysis. Although previous works have demonstrated improvements, training an accurate deep learning model requires a huge amount of data, which may not be available in practice. To address this challenge, we develop a Kullback-Leibler-based (KL) deep learning procedure that integrates external survival prediction models with newly collected time-to-event data. Time-dependent KL discrimination information is utilized to measure the discrepancy between the external and internal data. To the best of our knowledge, this is the first work to use prior information to address the limited-data problem in deep learning for survival analysis. Simulation and real-data results show that the proposed model achieves better performance and higher robustness than previous works.
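To make the idea concrete, here is a minimal sketch of one plausible instantiation (assumptions throughout: the network architecture, the per-bin Bernoulli KL penalty, and the weight lam are illustrative, not the authors' code). A discrete-time hazard network is trained on internal data while a KL term pulls its per-bin hazards toward an external model's predictions.

```python
# Discrete-time neural hazard model with a KL penalty toward an external
# survival model. Hazards h_t are per-bin conditional event probabilities.
import torch
import torch.nn.functional as F

T = 10                                           # number of discrete time bins
net = torch.nn.Sequential(torch.nn.Linear(5, 16), torch.nn.ReLU(),
                          torch.nn.Linear(16, T))

def nll_discrete(hazard, time, event):
    # Event at bin t: prod_{s<t}(1-h_s) * h_t; censored at t: prod_{s<=t}(1-h_s)
    idx = torch.arange(hazard.shape[1]).unsqueeze(0)       # (1, T)
    before = (idx < time.unsqueeze(1)).float()             # bins survived
    ll = (before * torch.log1p(-hazard)).sum(1)
    h_t = hazard[torch.arange(len(time)), time]            # hazard at observed bin
    ll = ll + event * torch.log(h_t) + (1 - event) * torch.log1p(-h_t)
    return -ll.mean()

def bernoulli_kl(p, q):
    # KL(Bern(p) || Bern(q)), applied per subject and time bin
    return p * (torch.log(p) - torch.log(q)) \
        + (1 - p) * (torch.log1p(-p) - torch.log1p(-q))

x = torch.randn(64, 5)
time = torch.randint(0, T, (64,))
event = torch.bernoulli(torch.full((64,), 0.7))
external_hazard = torch.full((64, T), 0.1)       # stand-in external predictions

hazard = torch.sigmoid(net(x))
lam = 0.5                                        # assumed weight on prior information
loss = nll_discrete(hazard, time, event) \
    + lam * bernoulli_kl(external_hazard, hazard).mean()
loss.backward()
```

With a large lam the network stays close to the external model; with lam near zero it reduces to ordinary training on the internal data.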
Learning-to-Rank with Partitioned Preference: Fast Estimation for the Plackett-Luce Model
Ma, Jiaqi, Yi, Xinyang, Tang, Weijing, Zhao, Zhe, Hong, Lichan, Chi, Ed H., Mei, Qiaozhu
We investigate the Plackett-Luce (PL) model based listwise learning-to-rank (LTR) on data with partitioned preference, where a set of items are sliced into ordered and disjoint partitions, but the ranking of items within a partition is unknown. Given $N$ items with $M$ partitions, calculating the likelihood of data with partitioned preference under the PL model has a time complexity of $O(N+S!)$, where $S$ is the maximum size of the top $M-1$ partitions.
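For reference, a brute-force sketch that makes the factorial cost visible (illustrative only, not the paper's fast estimator): the likelihood sums the standard PL probability over every full ranking consistent with the partitions.

```python
# Brute-force Plackett-Luce likelihood for partitioned preference.
import itertools
import numpy as np

def pl_prob(ranking, w):
    # Standard PL probability of a full ranking (best first); w are item weights.
    p, rest = 1.0, w[list(ranking)].sum()
    for i in ranking:
        p *= w[i] / rest
        rest -= w[i]
    return p

def partitioned_likelihood(partitions, w):
    # Sum over all orderings consistent with the ordered partitions; each
    # block contributes up to S! permutations. (Summing over the last
    # block's internal orders is redundant -- they integrate out -- which is
    # why only the top M-1 partitions drive the factorial cost.)
    total = 0.0
    for perms in itertools.product(
            *(itertools.permutations(p) for p in partitions)):
        ranking = [i for block in perms for i in block]
        total += pl_prob(ranking, w)
    return total

w = np.exp(np.random.default_rng(0).normal(size=6))   # PL weights for 6 items
partitions = [[0, 1], [2, 3, 4], [5]]                 # ordered, disjoint blocks
print(partitioned_likelihood(partitions, w))
```

Even this tiny example evaluates 2! x 3! orderings; the factorial blow-up in the block size $S$ is exactly what the paper's fast estimation method avoids.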
SODEN: A Scalable Continuous-Time Survival Model through Ordinary Differential Equation Networks
Tang, Weijing, Ma, Jiaqi, Mei, Qiaozhu, Zhu, Ji
In this paper, we propose a flexible model for survival analysis using neural networks along with scalable optimization algorithms. One key technical challenge in directly applying maximum likelihood estimation (MLE) to censored data is that evaluating the objective function and its gradients with respect to model parameters requires the calculation of integrals. To address this challenge, we take a novel perspective: the MLE for censored data can be viewed as a differential-equation constrained optimization problem. Following this connection, we model the distribution of event time through an ordinary differential equation and utilize efficient ODE solvers and adjoint sensitivity analysis to numerically evaluate the likelihood and the gradients. Using this approach, we are able to 1) provide a broad family of continuous-time survival distributions without strong structural assumptions, 2) obtain powerful feature representations using neural networks, and 3) allow efficient estimation of the model in large-scale applications using stochastic gradient descent. Through both simulation studies and real-world data examples, we demonstrate the effectiveness of the proposed method in comparison to existing state-of-the-art deep learning survival analysis models.
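A minimal sketch of the ODE view follows (assumptions throughout: the tiny hazard network, the per-subject time rescaling, and the use of the third-party torchdiffeq solver; the paper additionally uses adjoint sensitivity analysis, available in torchdiffeq as odeint_adjoint).

```python
# Censored MLE via an ODE: the cumulative hazard Lambda(t; x) solves
# dLambda/dt = h(t, x), so log S(t) = -Lambda(t) comes from one ODE solve.
# Rescaling each subject's time to [0, 1] lets one solver call batch subjects.
import torch
from torchdiffeq import odeint  # pip install torchdiffeq

hazard_net = torch.nn.Sequential(torch.nn.Linear(5, 32), torch.nn.Softplus(),
                                 torch.nn.Linear(32, 1), torch.nn.Softplus())

def neg_log_lik(x, t_obs, event):
    # Lambda_i(t_i) = t_i * int_0^1 h(s * t_i, x_i) ds
    def dLam(s, Lam):
        inp = torch.cat([x, (s * t_obs).unsqueeze(1)], dim=1)
        return t_obs.unsqueeze(1) * hazard_net(inp)
    Lam = odeint(dLam, torch.zeros(len(x), 1), torch.tensor([0.0, 1.0]))
    Lam = Lam[-1].squeeze(1)                     # Lambda_i(t_i)
    h_T = hazard_net(torch.cat([x, t_obs.unsqueeze(1)], dim=1)).squeeze(1)
    # event: log h(T) - Lambda(T); censored: -Lambda(T)
    return -(event * torch.log(h_T + 1e-8) - Lam).mean()

x = torch.randn(8, 4)
t_obs = torch.rand(8) + 0.1
event = torch.bernoulli(torch.full((8,), 0.7))
loss = neg_log_lik(x, t_obs, event)
loss.backward()                                  # trainable end to end with SGD
```

The Softplus output keeps the hazard nonnegative, so the implied survival function $S(t) = \exp(-\Lambda(t))$ is a valid continuous-time distribution without further structural assumptions.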
A Flexible Generative Framework for Graph-based Semi-supervised Learning
Ma, Jiaqi, Tang, Weijing, Zhu, Ji, Mei, Qiaozhu
We consider a family of problems concerned with making predictions for the majority of unlabeled, graph-structured data samples based on a small proportion of labeled examples. Relational information among the data samples, often encoded in the graph or network structure, has been shown to be helpful for these semi-supervised learning tasks. However, conventional graph-based regularization methods and recent graph neural networks do not fully leverage the interrelations between the features, the graph, and the labels. We propose a flexible generative framework for graph-based semi-supervised learning, which models the joint distribution of the node features, labels, and graph structure. Borrowing insights from random graph models in the network science literature, this joint distribution can be instantiated using various distribution families. For the inference of missing labels, we exploit recent advances in scalable variational inference techniques to approximate the Bayesian posterior. We conduct thorough experiments on benchmark datasets for graph-based semi-supervised learning. Results show that the proposed methods outperform state-of-the-art models under most settings.
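As a toy instantiation of the joint-distribution idea (illustrative assumptions: Gaussian class-conditional features, a stochastic block model for edges, and exact posterior enumeration in place of the paper's variational inference):

```python
# Toy generative semi-supervised learning on a graph: p(x, y, G) factors into
# class-conditional Gaussian features and block-model edges given labels; a
# missing label is inferred from its posterior with the other labels fixed.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, K = 6, 2
y = np.array([0, 0, 0, 1, 1, 1])
mu = np.array([-1.0, 1.0])                       # class-conditional feature means
x = rng.normal(mu[y], 0.5)
B = np.array([[0.8, 0.1], [0.1, 0.8]])           # block edge probabilities
A = rng.binomial(1, B[y[:, None], y[None, :]])   # directed edges, for simplicity
np.fill_diagonal(A, 0)

i = 2                                            # pretend node i's label is missing
logp = np.zeros(K)
for k in range(K):
    yk = y.copy(); yk[i] = k
    logp[k] = norm.logpdf(x[i], mu[k], 0.5)      # p(x_i | y_i = k)
    p_edge = B[k, yk]                            # p(A_ij = 1 | y_i = k, y_j)
    mask = np.arange(n) != i
    logp[k] += np.sum(A[i, mask] * np.log(p_edge[mask])
                      + (1 - A[i, mask]) * np.log(1 - p_edge[mask]))
post = np.exp(logp - logp.max()); post /= post.sum()
print(post)                                      # posterior over node i's label
```

Both the feature term and the edge term inform the posterior, which is the interrelation between features, graph, and labels that the framework is designed to exploit; at scale, enumeration is replaced by scalable variational inference.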