AITopics | Zhou, Kai

Plotting

Zhou, Kai

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

AuditVotes: A Framework Towards More Deployable Certified Robustness for Graph Neural Networks

Lai, Yuni, Zhu, Yulin, Sun, Yixuan, Wu, Yulun, Xiao, Bin, Li, Gaolei, Li, Jianhua, Zhou, Kai

arXiv.org Artificial IntelligenceMar-29-2025

Despite advancements in Graph Neural Networks (GNNs), adaptive attacks continue to challenge their robustness. Certified robustness based on randomized smoothing has emerged as a promising solution, offering provable guarantees that a model's predictions remain stable under adversarial perturbations within a specified range. However, existing methods face a critical trade-off between accuracy and robustness, as achieving stronger robustness requires introducing greater noise into the input graph. This excessive randomization degrades data quality and disrupts prediction consistency, limiting the practical deployment of certifiably robust GNNs in real-world scenarios where both accuracy and robustness are essential. To address this challenge, we propose \textbf{AuditVotes}, the first framework to achieve both high clean accuracy and certifiably robust accuracy for GNNs. It integrates randomized smoothing with two key components, \underline{au}gmentation and con\underline{dit}ional smoothing, aiming to improve data quality and prediction consistency. The augmentation, acting as a pre-processing step, de-noises the randomized graph, significantly improving data quality and clean accuracy. The conditional smoothing, serving as a post-processing step, employs a filtering function to selectively count votes, thereby filtering low-quality predictions and improving voting consistency. Extensive experimental results demonstrate that AuditVotes significantly enhances clean accuracy, certified robustness, and empirical robustness while maintaining high computational efficiency. Notably, compared to baseline randomized smoothing, AuditVotes improves clean accuracy by $437.1\%$ and certified accuracy by $409.3\%$ when the attacker can arbitrarily insert $20$ edges on the Cora-ML datasets, representing a substantial step toward deploying certifiably robust GNNs in real-world applications.

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2503.22998

Country:

Asia (0.68)
North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Physics-Conditioned Diffusion Models for Lattice Gauge Theory

Zhu, Qianteng, Aarts, Gert, Wang, Wei, Zhou, Kai, Wang, Lingxiao

arXiv.org Artificial IntelligenceFeb-8-2025

We develop diffusion models for simulating lattice gauge theories, where stochastic quantization is explicitly incorporated as a physical condition for sampling. We demonstrate the applicability of this novel sampler to U(1) gauge theory in two spacetime dimensions and find that a model trained at a small inverse coupling constant can be extrapolated to larger inverse coupling regions without encountering the topological freezing problem. Additionally, the trained model can be employed to sample configurations on different lattice sizes without requiring further training. The exactness of the generated samples is ensured by incorporating Metropolis-adjusted Langevin dynamics into the generation process. Furthermore, we demonstrate that this approach enables more efficient sampling of topological quantities compared to traditional algorithms such as Hybrid Monte Carlo and Langevin simulations.

artificial intelligence, configuration, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.05504

Country:

Asia > China > Guangdong Province (0.14)
Asia > Japan > Honshū (0.14)

Genre: Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Physics-Driven Learning for Inverse Problems in Quantum Chromodynamics

Aarts, Gert, Fukushima, Kenji, Hatsuda, Tetsuo, Ipp, Andreas, Shi, Shuzhe, Wang, Lingxiao, Zhou, Kai

arXiv.org Artificial IntelligenceJan-9-2025

The integration of deep learning techniques and physics-driven designs is reforming the way we address inverse problems, in which accurate physical properties are extracted from complex data sets. This is particularly relevant for quantum chromodynamics (QCD), the theory of strong interactions, with its inherent limitations in observational data and demanding computational approaches. This perspective highlights advances and potential of physics-driven learning methods, focusing on predictions of physical quantities towards QCD physics, and drawing connections to machine learning(ML). It is shown that the fusion of ML and physics can lead to more efficient and reliable problem-solving strategies. Key ideas of ML, methodology of embedding physics priors, and generative models as inverse modelling of physical probability distributions are introduced. Specific applications cover first-principle lattice calculations, and QCD physics of hadrons, neutron stars, and heavy-ion collisions. These examples provide a structured and concise overview of how incorporating prior knowledge such as symmetry, continuity and equations into deep learning designs can address diverse inverse problems across different physical sciences.

artificial intelligence, doi, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1038/s42254-024-00798-x

2501.0558

Country:

Europe (1.00)
Asia > China (0.46)
Asia > Japan > Honshū (0.28)

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas > Upstream (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Implementing Trust in Non-Small Cell Lung Cancer Diagnosis with a Conformalized Uncertainty-Aware AI Framework in Whole-Slide Images

Zhang, Xiaoge, Wang, Tao, Yan, Chao, Najdawi, Fedaa, Zhou, Kai, Ma, Yuan, Cheung, Yiu-ming, Malin, Bradley A.

arXiv.org Artificial IntelligenceDec-27-2024

Ensuring trustworthiness is fundamental to the development of artificial intelligence (AI) that is considered societally responsible, particularly in cancer diagnostics, where a misdiagnosis can have dire consequences. Current digital pathology AI models lack systematic solutions to address trustworthiness concerns arising from model limitations and data discrepancies between model deployment and development environments. To address this issue, we developed TRUECAM, a framework designed to ensure both data and model trustworthiness in non-small cell lung cancer subtyping with whole-slide images. TRUECAM integrates 1) a spectral-normalized neural Gaussian process for identifying out-of-scope inputs and 2) an ambiguity-guided elimination of tiles to filter out highly ambiguous regions, addressing data trustworthiness, as well as 3) conformal prediction to ensure controlled error rates. We systematically evaluated the framework across multiple large-scale cancer datasets, leveraging both task-specific and foundation models, illustrate that an AI model wrapped with TRUECAM significantly outperforms models that lack such guidance, in terms of classification accuracy, robustness, interpretability, and data efficiency, while also achieving improvements in fairness.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2501.00053

Country:

Asia (0.68)
North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MPPO: Multi Pair-wise Preference Optimization for LLMs with Arbitrary Negative Samples

Xie, Shuo, Zhu, Fangzhi, Wang, Jiahui, Wen, Lulu, Dai, Wei, Chen, Xiaowei, Zhu, Junxiong, Zhou, Kai, Zheng, Bo

arXiv.org Artificial IntelligenceDec-13-2024

Aligning Large Language Models (LLMs) with human feedback is crucial for their development. Existing preference optimization methods such as DPO and KTO, while improved based on Reinforcement Learning from Human Feedback (RLHF), are inherently derived from PPO, requiring a reference model that adds GPU memory resources and relies heavily on abundant preference data. Meanwhile, current preference optimization research mainly targets single-question scenarios with two replies, neglecting optimization with multiple replies, which leads to a waste of data in the application. This study introduces the MPPO algorithm, which leverages the average likelihood of model responses to fit the reward function and maximizes the utilization of preference data. Through a comparison of Point-wise, Pair-wise, and List-wise implementations, we found that the Pair-wise approach achieves the best performance, significantly enhancing the quality of model responses. Experimental results demonstrate MPPO's outstanding performance across various benchmarks. On MT-Bench, MPPO outperforms DPO, ORPO, and SimPO. Notably, on Arena-Hard, MPPO surpasses DPO and ORPO by substantial margins. These achievements underscore the remarkable advantages of MPPO in preference optimization tasks.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.15244

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Dynamic Ensemble Reasoning for LLM Experts

Hu, Jinwu, Wang, Yufeng, Zhang, Shuhai, Zhou, Kai, Chen, Guohao, Hu, Yu, Xiao, Bin, Tan, Mingkui

arXiv.org Artificial IntelligenceDec-10-2024

Ensemble reasoning for the strengths of different LLM experts is critical to achieving consistent and satisfactory performance on diverse inputs across a wide range of tasks. However, existing LLM ensemble methods are either computationally intensive or incapable of leveraging complementary knowledge among LLM experts for various inputs. In this paper, we propose a Dynamic Ensemble Reasoning paradigm, called DER to integrate the strengths of multiple LLM experts conditioned on dynamic inputs. Specifically, we model the LLM ensemble reasoning problem as a Markov Decision Process (MDP), wherein an agent sequentially takes inputs to request knowledge from an LLM candidate and passes the output to a subsequent LLM candidate. Moreover, we devise a reward function to train a DER-Agent to dynamically select an optimal answering route given the input questions, aiming to achieve the highest performance with as few computational resources as possible. Last, to fully transfer the expert knowledge from the prior LLMs, we develop a Knowledge Transfer Prompt (KTP) that enables the subsequent LLM candidates to transfer complementary knowledge effectively. Experiments demonstrate that our method uses fewer computational resources to achieve better performance compared to state-of-the-art baselines.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.07448

Country: Asia > China (0.28)

Genre: Research Report (0.40)

Industry: Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Diffusion models learn distributions generated by complex Langevin dynamics

Habibi, Diaa E., Aarts, Gert, Wang, Lingxiao, Zhou, Kai

arXiv.org Artificial IntelligenceDec-2-2024

The probability distribution effectively sampled by a complex Langevin process for theories with a sign problem is not known a priori and notoriously hard to understand. Diffusion models, a class of generative AI, can learn distributions from data. In this contribution, we explore the ability of diffusion models to learn the distributions created by a complex Langevin process.

artificial intelligence, diffusion model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2412.01919

Country:

Europe (0.68)
Asia > China (0.29)
Asia > Japan (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

On learning higher-order cumulants in diffusion models

Aarts, Gert, Habibi, Diaa E., Wang, Lingxiao, Zhou, Kai

arXiv.org Artificial IntelligenceOct-28-2024

To analyse how diffusion models learn correlations beyond Gaussian ones, we study the behaviour of higher-order cumulants, or connected n-point functions, under both the forward and backward process. We derive explicit expressions for the moment- and cumulant-generating functionals, in terms of the distribution of the initial data and properties of forward process. It is shown analytically that during the forward process higher-order cumulants are conserved in models without a drift, such as the variance-expanding scheme, and that therefore the endpoint of the forward process maintains nontrivial correlations. We demonstrate that since these correlations are encoded in the score function, higher-order cumulants are learnt in the backward process, also when starting from a normal prior. We confirm our analytical results in an exactly solvable toy model with nonzero cumulants and in scalar lattice field theory.

artificial intelligence, cumulant, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.21212

Country:

Asia (0.93)
Europe > United Kingdom (0.67)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Add feedback

Defense-as-a-Service: Black-box Shielding against Backdoored Graph Models

Yang, Xiao, Zhou, Kai, Lai, Yuni, Li, Gaolei

arXiv.org Artificial IntelligenceOct-7-2024

With the trend of large graph learning models, business owners tend to employ a model provided by a third party to deliver business services to users. However, these models might be backdoored, and malicious users can submit trigger-embedded inputs to manipulate the model predictions. Current graph backdoor defenses have several limitations: 1) depending on model-related details, 2) requiring additional model fine-tuning, and 3) relying upon extra explainability tools, all of which are infeasible under stringent privacy policies. To address those limitations, we propose GraphProt, which allows resource-constrained business owners to rely on third parties to avoid backdoor attacks on GNN-based graph classifiers. Our GraphProt is model-agnostic and only relies on the input graph. The key insight is to leverage subgraph information for prediction, thereby mitigating backdoor effects induced by triggers. GraphProt comprises two components: clustering-based trigger elimination and robust subgraph ensemble. Specifically, we first propose feature-topology clustering that aims to remove most of the anomalous subgraphs (triggers). Moreover, we design subgraph sampling strategies based on feature-topology clustering to build a robust classifier via majority vote. Experimental results across three backdoor attacks and six benchmark datasets demonstrate that GraphProt significantly reduces the backdoor attack success rate while preserving the model accuracy on regular graph classification tasks.

artificial intelligence, graphprot, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.04916

Country: Asia > China (0.14)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.91)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Security & Privacy (0.91)

Add feedback

Spikewhisper: Temporal Spike Backdoor Attacks on Federated Neuromorphic Learning over Low-power Devices

Fu, Hanqing, Li, Gaolei, Wu, Jun, Li, Jianhua, Lin, Xi, Zhou, Kai, Liu, Yuchen

arXiv.org Artificial IntelligenceMar-27-2024

Federated neuromorphic learning (FedNL) leverages event-driven spiking neural networks and federated learning frameworks to effectively execute intelligent analysis tasks over amounts of distributed low-power devices but also perform vulnerability to poisoning attacks. The threat of backdoor attacks on traditional deep neural networks typically comes from time-invariant data. However, in FedNL, unknown threats may be hidden in time-varying spike signals. In this paper, we start to explore a novel vulnerability of FedNL-based systems with the concept of time division multiplexing, termed Spikewhisper, which allows attackers to evade detection as much as possible, as multiple malicious clients can imperceptibly poison with different triggers at different timeslices. In particular, the stealthiness of Spikewhisper is derived from the time-domain divisibility of global triggers, in which each malicious client pastes only one local trigger to a certain timeslice in the neuromorphic sample, and also the polarity and motion of each local trigger can be configured by attackers. Extensive experiments based on two different neuromorphic datasets demonstrate that the attack success rate of Spikewispher is higher than the temporally centralized attacks. Besides, it is validated that the effect of Spikewispher is sensitive to the trigger duration.

artificial intelligence, machine learning, spikewhisper, (15 more...)

arXiv.org Artificial Intelligence

2403.18607

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback