Wu, Yulun
AuditVotes: A Framework Towards More Deployable Certified Robustness for Graph Neural Networks
Lai, Yuni, Zhu, Yulin, Sun, Yixuan, Wu, Yulun, Xiao, Bin, Li, Gaolei, Li, Jianhua, Zhou, Kai
Despite advancements in Graph Neural Networks (GNNs), adaptive attacks continue to challenge their robustness. Certified robustness based on randomized smoothing has emerged as a promising solution, offering provable guarantees that a model's predictions remain stable under adversarial perturbations within a specified range. However, existing methods face a critical trade-off between accuracy and robustness, as achieving stronger robustness requires introducing greater noise into the input graph. This excessive randomization degrades data quality and disrupts prediction consistency, limiting the practical deployment of certifiably robust GNNs in real-world scenarios where both accuracy and robustness are essential. To address this challenge, we propose \textbf{AuditVotes}, the first framework to achieve both high clean accuracy and certifiably robust accuracy for GNNs. It integrates randomized smoothing with two key components, \underline{au}gmentation and con\underline{dit}ional smoothing, aiming to improve data quality and prediction consistency. The augmentation, acting as a pre-processing step, de-noises the randomized graph, significantly improving data quality and clean accuracy. The conditional smoothing, serving as a post-processing step, employs a filtering function to selectively count votes, thereby filtering low-quality predictions and improving voting consistency. Extensive experimental results demonstrate that AuditVotes significantly enhances clean accuracy, certified robustness, and empirical robustness while maintaining high computational efficiency. Notably, compared to baseline randomized smoothing, AuditVotes improves clean accuracy by $437.1\%$ and certified accuracy by $409.3\%$ when the attacker can arbitrarily insert $20$ edges on the Cora-ML dataset, representing a substantial step toward deploying certifiably robust GNNs in real-world applications.
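The abstract describes the AuditVotes pipeline only at a high level; the following is a minimal illustrative sketch, not the authors' implementation, of how augmentation and conditional vote filtering could sit around a randomized-smoothing loop. The symmetric edge-flip noise, the augment, filter_fn, and base_classifier callables, and all constants are assumptions introduced here.

```python
import numpy as np

def smoothed_predict(adj, base_classifier, augment, filter_fn,
                     flip_prob=0.1, n_samples=1000, n_classes=7, rng=None):
    """Conditional randomized smoothing over graph structure (illustrative sketch).

    adj:             dense 0/1 adjacency matrix (numpy array)
    base_classifier: maps an adjacency matrix to (predicted class, confidence)
    augment:         pre-processing step that de-noises a randomized graph
    filter_fn:       returns True if a single prediction is trustworthy enough to count
    """
    rng = rng or np.random.default_rng(0)
    votes = np.zeros(n_classes, dtype=int)
    for _ in range(n_samples):
        # Randomized smoothing: flip each potential edge independently.
        flips = rng.random(adj.shape) < flip_prob
        noisy_adj = np.where(flips, 1 - adj, adj)
        # Augmentation: de-noise the randomized graph before classification.
        denoised_adj = augment(noisy_adj)
        pred, confidence = base_classifier(denoised_adj)
        # Conditional smoothing: only predictions accepted by the filter are counted.
        if filter_fn(pred, confidence):
            votes[pred] += 1
    return int(votes.argmax()), votes
```

Only the voting mechanics are shown; the certification step that converts vote counts into a provable robustness radius is omitted.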
Zero-shot Meta-learning for Tabular Prediction Tasks with Adversarially Pre-trained Transformer
Wu, Yulun, Bergman, Doron L.
We present an Adversarially Pre-trained Transformer (APT) that is able to perform zero-shot meta-learning on tabular prediction tasks without pre-training on any real-world dataset, extending the recent development of Prior-Data Fitted Networks (PFNs) and TabPFN. Specifically, APT is pre-trained with adversarial synthetic data agents, which continually shift their underlying data-generating distribution and deliberately challenge the model with different synthetic datasets. In addition, we propose a mixture block architecture that is able to handle classification tasks with an arbitrary number of classes, addressing the class size limitation -- a crucial weakness of prior deep tabular zero-shot learners. In experiments, we show that our framework matches state-of-the-art performance on small classification tasks without filtering on dataset characteristics such as number of classes and number of missing values, while maintaining an average runtime under one second. On common benchmark dataset suites in both classification and regression, we show that adversarial pre-training was able to enhance TabPFN's performance. In our analysis, we demonstrate that the adversarial synthetic data agents were able to generate a more diverse collection of data compared to the ordinary random generator in TabPFN. In addition, we demonstrate that our mixture block neural design has improved generalizability and greatly accelerated pre-training.
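The abstract does not specify how the adversarial synthetic data agents parameterize or update their data-generating distributions, so the toy loop below only illustrates the adversarial pattern: propose a shifted synthetic task distribution and keep the shift if it makes the current model's loss larger. The random-linear task family and the noise/threshold parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_synthetic_task(theta, n=64, d=4):
    """Hypothetical synthetic data agent: theta parameterizes a random linear
    classification task whose noise scale and class threshold the agent can shift."""
    X = rng.normal(size=(n, d))
    w = rng.normal(size=d)
    logits = X @ w + theta["noise"] * rng.normal(size=n)
    y = (logits > theta["threshold"]).astype(int)
    return X, y

def adversarial_agent_step(model_loss, theta, step=0.1):
    """One adversarial round: the agent perturbs its distribution parameters and
    keeps the perturbation that makes the current model's loss larger."""
    X, y = sample_synthetic_task(theta)
    base = model_loss(X, y)
    proposal = {k: v + step * rng.normal() for k, v in theta.items()}
    Xp, yp = sample_synthetic_task(proposal)
    return proposal if model_loss(Xp, yp) > base else theta
```

In an actual pre-training loop the transformer would be trained on the tasks each agent emits while the agents keep shifting; that outer loop is not shown here.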
Counterfactual Generative Modeling with Variational Causal Inference
Wu, Yulun, McConnell, Louie, Iriondo, Claudia
Estimating an individual's potential outcomes under counterfactual treatments is a challenging task for traditional causal inference and supervised learning approaches when the outcome is high-dimensional (e.g. gene expressions, facial images) and covariates are relatively limited. In this case, to predict one's outcomes under counterfactual treatments, it is crucial to leverage individual information contained in its high-dimensional observed outcome in addition to the covariates. Prior works using variational inference in counterfactual generative modeling have focused on neural adaptations and model variants within the conditional variational autoencoder formulation, which we argue is fundamentally ill-suited to the notion of counterfactuals in causal inference. In this work, we present a novel variational Bayesian causal inference framework and its theoretical backing to properly handle counterfactual generative modeling tasks, through which we are able to conduct counterfactual supervision end-to-end during training without any counterfactual samples, and encourage latent disentanglement that aids the correct identification of causal effects in counterfactual generations. In experiments, we demonstrate the advantage of our framework compared to state-of-the-art models in counterfactual generative modeling on multiple benchmarks.
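As a mechanical illustration only: the sketch below shows the generic "encode the factual outcome into an individual latent, then decode under a swapped treatment" mechanism that counterfactual generative models share. It is deliberately written as a plain conditional-VAE-style module, which the paper argues is not sufficient on its own; the paper's actual variational causal objective (counterfactual supervision and disentanglement terms) is not reproduced here, and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class CounterfactualGenerator(nn.Module):
    """Illustrative sketch (not the authors' architecture): an encoder reads the
    factual outcome y, covariates x, and treatment t into a latent z; a decoder
    generates outcomes from (z, x, t), so swapping t yields a counterfactual draw."""

    def __init__(self, y_dim, x_dim, t_dim, z_dim=32, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(y_dim + x_dim + t_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * z_dim))            # mean and log-variance of q(z | y, x, t)
        self.decoder = nn.Sequential(
            nn.Linear(z_dim + x_dim + t_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, y_dim))                 # mean of p(y | z, x, t)

    def forward(self, y, x, t, t_cf=None):
        mu, logvar = self.encoder(torch.cat([y, x, t], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        y_factual = self.decoder(torch.cat([z, x, t], dim=-1))
        # Counterfactual generation: keep the individual's latent, swap the treatment.
        y_counterfactual = None if t_cf is None else self.decoder(torch.cat([z, x, t_cf], dim=-1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        recon = (y_factual - y).pow(2).sum(-1).mean()
        return recon + kl, y_counterfactual           # ELBO-style loss; the paper's objective differs
```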
Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer
Shirakawa, Toru, Li, Yi, Wu, Yulun, Qiu, Sky, Li, Yuxuan, Zhao, Mingduo, Iso, Hiroyasu, van der Laan, Mark
We propose Deep Longitudinal Targeted Minimum Loss-based Estimation (Deep LTMLE), a novel approach to estimate the counterfactual mean of outcome under dynamic treatment policies in longitudinal problem settings. Our approach utilizes a transformer architecture with heterogeneous type embedding trained using temporal-difference learning. After obtaining an initial estimate using the transformer, following the targeted minimum loss-based likelihood estimation (TMLE) framework, we statistically correct for the bias commonly associated with machine learning algorithms. Furthermore, our method facilitates statistical inference by providing 95% confidence intervals grounded in asymptotic statistical theory. Simulation results demonstrate our method's superior performance over existing approaches, particularly in complex, long time-horizon scenarios. It remains effective in small-sample, short-duration contexts, matching the performance of asymptotically efficient estimators. To demonstrate our method in practice, we applied it to estimate counterfactual mean outcomes for standard versus intensive blood pressure management strategies in a real-world cardiovascular epidemiology cohort study.
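The transformer and temporal-difference components cannot be reconstructed from the abstract, but the bias-correction step it refers to follows the standard TMLE recipe. Below is a single-time-point targeting step for the policy "always treat", shown only to make that correction concrete; Deep LTMLE applies an analogous update in the longitudinal setting, and the clipping constants here are arbitrary.

```python
import numpy as np
import statsmodels.api as sm
from scipy.special import logit, expit

def tmle_update(y, a, q_obs, q_pol, g_hat):
    """One-step TMLE targeting of an initial ML fit (illustrative, single time point).

    y:     binary outcomes
    a:     observed treatments (0/1)
    q_obs: initial ML predictions of E[Y | A = a_i, X_i]
    q_pol: initial ML predictions of E[Y | A = 1, X_i] (under the policy)
    g_hat: estimated propensities P(A = 1 | X_i)
    """
    g = np.clip(g_hat, 1e-3, 1 - 1e-3)
    clever = a / g                                    # "clever covariate" H(A, X)
    offset = logit(np.clip(q_obs, 1e-3, 1 - 1e-3))
    # Fit the fluctuation parameter epsilon by logistic regression with an offset.
    eps = sm.GLM(y, clever[:, None], family=sm.families.Binomial(),
                 offset=offset).fit().params[0]
    # Targeted predictions under the policy, then the plug-in counterfactual mean.
    q_star = expit(logit(np.clip(q_pol, 1e-3, 1 - 1e-3)) + eps / g)
    return q_star.mean()
```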
MERTech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model With Multi-Task Finetuning
Li, Dichucheng, Ma, Yinghao, Wei, Weixing, Kong, Qiuqiang, Wu, Yulun, Che, Mingjin, Xia, Fan, Benetos, Emmanouil, Li, Wei
Instrument playing techniques (IPTs) constitute a pivotal component of musical expression. However, the development of automatic IPT detection methods suffers from limited labeled data and inherent class imbalance issues. In this paper, we propose to apply a self-supervised learning model pre-trained on large-scale unlabeled music data and finetune it on IPT detection tasks. This approach addresses data scarcity and class imbalance challenges. Recognizing the significance of pitch in capturing the nuances of IPTs and the importance of onset in locating IPT events, we investigate multi-task finetuning with pitch and onset detection as auxiliary tasks. Additionally, we apply a post-processing approach for event-level prediction, where an IPT activation initiates an event only if the onset output confirms an onset in that frame. Our method outperforms prior approaches in both frame-level and event-level metrics across multiple IPT benchmark datasets. Further experiments demonstrate the efficacy of multi-task finetuning on each IPT class.
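The onset-gated post-processing described in the abstract can be made concrete with a short decoding routine. The sketch below is one plausible reading of it, with hypothetical thresholds: an IPT activation opens an event only in frames where the onset output also fires, and the event closes when the activation drops.

```python
import numpy as np

def frame_to_events(ipt_prob, onset_prob, ipt_thresh=0.5, onset_thresh=0.5):
    """Onset-gated event decoding for one IPT class (illustrative sketch).

    ipt_prob, onset_prob: per-frame probabilities, shape (T,)
    returns a list of (start_frame, end_frame) events
    """
    active = ipt_prob >= ipt_thresh
    onsets = onset_prob >= onset_thresh
    events, start = [], None
    for t in range(len(active)):
        if start is None and active[t] and onsets[t]:
            start = t                      # onset confirms the beginning of an event
        elif start is not None and not active[t]:
            events.append((start, t))      # activation dropped: close the event
            start = None
    if start is not None:
        events.append((start, len(active)))
    return events
```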
Advancing Transformer's Capabilities in Commonsense Reasoning
Zhou, Yu, Han, Yunqiu, Zhou, Hanyu, Wu, Yulun
Recent advances in general-purpose pre-trained language models have shown great potential in commonsense reasoning. However, current works still perform poorly on standard commonsense reasoning benchmarks including the Com2Sense Dataset. We argue that this is due to a disconnect with current cutting-edge machine learning methods. In this work, we aim to bridge the gap by introducing current ML-based methods to improve general-purpose pre-trained language models in the task of commonsense reasoning. Specifically, we experiment with and systematically evaluate methods including knowledge transfer, model ensemble, and introducing an additional pairwise contrastive objective. Our best model outperforms the strongest previous works by ~15\% absolute gains in Pairwise Accuracy and ~8.7\% absolute gains in Standard Accuracy.
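The abstract does not give the exact form of the pairwise contrastive objective, so the snippet below shows one common realization, a margin ranking term over the complementary sentence pairs in Com2Sense; the margin value and the way the plausibility scores are produced are assumptions.

```python
import torch
import torch.nn.functional as F

def pairwise_contrastive_loss(score_true, score_false, margin=1.0):
    """Hypothetical pairwise objective for complementary Com2Sense pairs: push the
    model's plausibility score for the sensical statement above the score for its
    non-sensical counterpart by at least `margin`.

    score_true / score_false: scalar plausibility scores per pair, shape (B,)
    """
    return F.relu(margin - (score_true - score_false)).mean()

# Typical use (names are placeholders): add the pairwise term to the ordinary
# per-sentence classification loss.
# loss = bce_loss + lambda_pair * pairwise_contrastive_loss(s_true, s_false)
```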
Predicting Cellular Responses with Variational Causal Inference and Refined Relational Information
Wu, Yulun, Barton, Robert A., Wang, Zichen, Ioannidis, Vassilis N., De Donno, Carlo, Price, Layne C., Voloch, Luis F., Karypis, George
Predicting the responses of a cell under perturbations may bring important benefits to drug discovery and personalized therapeutics. In this work, we propose a novel graph variational Bayesian causal inference framework to predict a cell's gene expressions under counterfactual perturbations (perturbations that this cell did not factually receive), leveraging information representing biological knowledge in the form of gene regulatory networks (GRNs) to aid individualized cellular response predictions. Aiming at a data-adaptive GRN, we also developed an adjacency matrix updating technique for graph convolutional networks and used it to refine GRNs during pre-training, which generated more insights on gene relations and enhanced model performance. Additionally, we propose a robust estimator within our framework for the asymptotically efficient estimation of the marginal perturbation effect, which previous works have yet to carry out. With extensive experiments, we demonstrate the advantage of our approach over state-of-the-art deep learning models for individual response prediction. Studying a cell's response to genetic, chemical, and physical perturbations is fundamental in understanding various biological processes and can lead to important applications such as drug discovery and personalized therapies.
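The adjacency-updating technique is not detailed in the abstract; the layer below sketches the general idea of a data-adaptive GRN by treating the adjacency matrix as a trainable parameter initialized from a prior network. The normalization and layer sizes are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class GCNLayerWithLearnableGRN(nn.Module):
    """Sketch of refining a gene regulatory network during training: the adjacency
    matrix starts from a prior GRN but is kept as a trainable parameter, so gradients
    can rewire gene-gene edges while the GCN learns (details assumed, not from the paper)."""

    def __init__(self, prior_adj, in_dim, out_dim):
        super().__init__()
        self.adj = nn.Parameter(prior_adj.clone().float())   # data-adaptive GRN, shape (G, G)
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        # x: per-gene features, shape (G, in_dim).
        # Keep edge weights non-negative and row-normalize before message passing.
        a = torch.relu(self.adj)
        a = a / a.sum(dim=-1, keepdim=True).clamp(min=1e-6)
        return torch.relu(self.lin(a @ x))
```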
Frame-Level Multi-Label Playing Technique Detection Using Multi-Scale Network and Self-Attention Mechanism
Li, Dichucheng, Che, Mingjin, Meng, Wenwu, Wu, Yulun, Yu, Yi, Xia, Fan, Li, Wei
Instrument playing technique (IPT) is a key element in enhancing the vividness of musical performance. As shown by the Guzheng numbered musical notation (a musical notation system widely used in China) in Fig.1, a complete automatic music transcription (AMT) system should contain IPT information in addition to pitch and onset information. IPT detection aims to classify the types of IPTs and locate the associated IPT boundaries in audio. IPT detection and modeling can be utilized in many applications of music information retrieval (MIR), like performance analysis [1] and AMT [2]. The research on IPT detection is still in its early stage. With the advancements in deep learning, deep neural networks have been increasingly used in more recent work [8, 9]. In [10], a convolutional recurrent neural network (CRNN) based model was proposed to classify IPTs in audio sequences concatenated by cello notes from 5 sound banks. To alleviate the computational redundancy caused by the sliding window in [10], Wang et al. [11] proposed a fully convolutional network (FCN) based end-to-end method to detect IPTs in segments concatenated by isolated Erhu notes. In [12], an additional onset detector was used, and its output was fused with IPT prediction in a post-processing step to improve the accuracy of IPT detection from monophonic audio sequences concatenated by
Variational Causal Inference
Wu, Yulun, Price, Layne C., Wang, Zichen, Ioannidis, Vassilis N., Barton, Robert A., Karypis, George
Estimating an individual's potential outcomes under counterfactual treatments is a challenging task for traditional causal inference and supervised learning approaches when the outcome is high-dimensional (e.g. gene expressions, impulse responses, human faces) and covariates are relatively limited. In this case, to construct one's outcome under a counterfactual treatment, it is crucial to leverage individual information contained in its observed factual outcome on top of the covariates. We propose a deep variational Bayesian framework that rigorously integrates two main sources of information for outcome construction under a counterfactual treatment: one source is the individual features embedded in the high-dimensional factual outcome; the other source is the response distribution of similar subjects (subjects with the same covariates) that factually received this treatment of interest.
Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery
Wu, Yulun, Choma, Nicholas, Chen, Andrew, Cashman, Mikaela, Prates, Érica T., Shah, Manesh, Vergara, Verónica G. Melesse, Clyde, Austin, Brettin, Thomas S., de Jong, Wibe A., Kumar, Neeraj, Head, Martha S., Stevens, Rick L., Nugent, Peter, Jacobson, Daniel A., Brown, James B.
We developed the Distilled Graph Attention Policy Network (DGAPN), a curiosity-driven reinforcement learning model to generate novel graph-structured chemical representations that optimize user-defined objectives by efficiently navigating a physically constrained domain. The framework is examined on the task of generating molecules that are designed to bind, noncovalently, to functional sites of SARS-CoV-2 proteins. We present a spatial Graph Attention Network (sGAT) that leverages self-attention over both node and edge attributes as well as encoding spatial structure -- this capability is of considerable interest in areas such as molecular and synthetic biology and drug discovery. An attentional policy network is then introduced to learn decision rules for a dynamic, fragment-based chemical environment, and state-of-the-art policy gradient techniques are employed to train the network with enhanced stability. Exploration is efficiently encouraged by incorporating innovation reward bonuses learned and proposed by random network distillation. In experiments, our framework achieved outstanding results compared to state-of-the-art algorithms, while increasing the diversity of proposed molecules and reducing the complexity of paths to chemical synthesis.
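Random network distillation, which the abstract credits for the innovation reward bonuses, is a standard technique and can be sketched independently of DGAPN itself. The module below computes an intrinsic novelty bonus as the prediction error against a fixed random target network; the state featurization and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class RNDCuriosity(nn.Module):
    """Random network distillation bonus (illustrative sketch): a fixed random target
    network and a trained predictor; the predictor's error on a state is large for
    novel states and is added to the extrinsic reward as an exploration bonus."""

    def __init__(self, state_dim, feat_dim=64):
        super().__init__()
        self.target = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                    nn.Linear(128, feat_dim))
        self.predictor = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                       nn.Linear(128, feat_dim))
        for p in self.target.parameters():       # the target network stays fixed
            p.requires_grad_(False)

    def forward(self, state):
        with torch.no_grad():
            target_feat = self.target(state)
        error = (self.predictor(state) - target_feat).pow(2).mean(dim=-1)
        # Use `error` as the intrinsic bonus; minimizing it also trains the predictor.
        return error
```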