Zhang, Zhaoxi
Not All Edges are Equally Robust: Evaluating the Robustness of Ranking-Based Federated Learning
Gong, Zirui, Zhang, Yanjun, Zhang, Leo Yu, Zhang, Zhaoxi, Xiang, Yong, Pan, Shirui
Federated Ranking Learning (FRL) is a state-of-the-art FL framework that stands out for its communication efficiency and resilience to poisoning attacks. It diverges from the traditional FL framework in two ways: 1) it leverages discrete rankings instead of gradient updates, significantly reducing communication costs and limiting the potential space for malicious updates, and 2) it uses majority voting on the server side to establish the global ranking, ensuring that individual updates have minimal influence since each client contributes only a single vote. These features enhance the system's scalability and position FRL as a promising paradigm for FL training. However, our analysis reveals that FRL is not inherently robust, as certain edges are particularly vulnerable to poisoning attacks. Through a theoretical investigation, we prove the existence of these vulnerable edges and establish a lower bound and an upper bound for identifying them in each layer. Based on this finding, we introduce a novel local model poisoning attack against FRL, namely the Vulnerable Edge Manipulation (VEM) attack. The VEM attack focuses on identifying and perturbing the most vulnerable edges in each layer and leveraging an optimization-based approach to maximize the attack's impact. Through extensive experiments on benchmark datasets, we demonstrate that our attack achieves an overall 53.23% attack impact and is 3.7x more impactful than existing methods. Our findings highlight significant vulnerabilities in ranking-based FL systems and underline the urgency for the development of new robust FL frameworks.
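To illustrate the aggregation step the VEM attack targets, here is a minimal sketch of server-side majority voting over client edge rankings. This is not the paper's implementation; the rank-sum scoring and the function name `majority_vote_ranking` are simplifying assumptions for illustration only.

```python
import numpy as np

def majority_vote_ranking(client_rankings, num_edges):
    """Aggregate client edge rankings into a global ranking by
    summing per-edge rank positions and re-sorting (a simplified
    stand-in for FRL's server-side majority vote)."""
    scores = np.zeros(num_edges)
    for ranking in client_rankings:
        # ranking[pos] is the edge index at position pos (best first);
        # an edge accumulates its position as a score (lower = better)
        for pos, edge in enumerate(ranking):
            scores[edge] += pos
    return np.argsort(scores)  # global ranking, best edge first

# three honest clients rank edge 2 best; one malicious client dissents,
# but a single vote barely moves the aggregate
rankings = [[2, 0, 1], [2, 1, 0], [2, 0, 1], [0, 1, 2]]
global_rank = majority_vote_ranking(rankings, 3)
```

In this toy setting the lone dissenting client cannot displace edge 2 from the top, which is the robustness property the abstract describes; the VEM attack's point is that edges whose vote margins are narrow can still be flipped.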
DuoLift-GAN: Reconstructing CT from Single-view and Biplanar X-Rays with Generative Adversarial Networks
Zhang, Zhaoxi, Ying, Yueliang
Computed tomography (CT) provides highly detailed three-dimensional (3D) medical images but is costly, time-consuming, and often inaccessible in intraoperative settings (Organization et al. 2011). Recent advancements have explored reconstructing 3D chest volumes from sparse 2D X-rays, such as single-view or orthogonal double-view images. However, current models tend to process 2D images in a planar manner, prioritizing visual realism over structural accuracy. In this work, we introduce DuoLift Generative Adversarial Networks (DuoLift-GAN), a novel architecture with dual branches that independently elevate 2D images and their features into 3D representations. These 3D outputs are merged into a unified 3D feature map and decoded into a complete 3D chest volume, enabling richer 3D information capture. We also present a masked loss function that directs reconstruction towards critical anatomical regions, improving structural accuracy and visual quality. This paper demonstrates that DuoLift-GAN significantly enhances reconstruction accuracy while achieving superior visual realism compared to existing methods.
A Survey of Uncertainty Estimation in LLMs: Theory Meets Practice
Huang, Hsiu-Yuan, Yang, Yutong, Zhang, Zhaoxi, Lee, Sanwoo, Wu, Yunfang
As large language models (LLMs) continue to evolve, understanding and quantifying the uncertainty in their predictions is critical for enhancing application credibility. However, the existing literature relevant to LLM uncertainty estimation often relies on heuristic approaches, lacking systematic classification of the methods. In this survey, we clarify the definitions of uncertainty and confidence, highlighting their distinctions and implications for model predictions. On this basis, we integrate theoretical perspectives, including Bayesian inference, information theory, and ensemble strategies, to categorize various classes of uncertainty estimation methods derived from heuristic approaches. Additionally, we address challenges that arise when applying these methods to LLMs. We also explore techniques for incorporating uncertainty into diverse applications, including out-of-distribution detection, data annotation, and question clarification. Our review provides insights into uncertainty estimation from both definitional and theoretical angles, contributing to a comprehensive understanding of this critical aspect in LLMs. We aim to inspire the development of more reliable and effective uncertainty estimation approaches for LLMs in real-world scenarios.
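One of the ensemble-style strategies surveyed can be sketched very simply: sample the model several times and take the entropy of the empirical answer distribution as an uncertainty score. The code below is an illustrative sketch of that idea, not a method from the survey; the function name and toy answer strings are assumptions.

```python
import math
from collections import Counter

def predictive_entropy(samples):
    """Entropy of the empirical answer distribution over repeated
    model samples; higher entropy indicates higher estimated
    uncertainty about the answer."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# a model that answers consistently vs. one that vacillates
confident = predictive_entropy(["Paris"] * 9 + ["Lyon"])
uncertain = predictive_entropy(["Paris", "Lyon", "Nice", "Paris", "Lyon"])
```

Comparing the two scores shows the consistent sampler receives the lower entropy, which is the signal such methods threshold for tasks like out-of-distribution detection or question clarification.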
Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery
Wang, Guankun, Bai, Long, Nah, Wan Jun, Wang, Jie, Zhang, Zhaoxi, Chen, Zhen, Wu, Jinlin, Islam, Mobarakol, Liu, Hongbin, Ren, Hongliang
Recent advancements in Surgical Visual Question Answering (Surgical-VQA) and related region grounding have shown great promise for robotic and medical applications, addressing the critical need for automated methods in personalized surgical mentorship. However, existing models primarily provide simple structured answers and struggle with complex scenarios due to their limited capability in recognizing long-range dependencies and aligning multimodal information. In this paper, we introduce Surgical-LVLM, a novel personalized large vision-language model tailored for complex surgical scenarios. Leveraging the pre-trained large vision-language model and specialized Visual Perception LoRA (VP-LoRA) blocks, our model excels in understanding complex visual-language tasks within surgical contexts. In addressing the visual grounding task, we propose the Token-Interaction (TIT) module, which strengthens the interaction between the grounding module and the language responses of the Large Visual Language Model (LVLM) after projecting them into the latent space. We demonstrate the effectiveness of Surgical-LVLM on several benchmarks, including EndoVis-17-VQLA, EndoVis-18-VQLA, and a newly introduced EndoVis Conversations dataset, which sets new performance standards. Our work contributes to advancing the field of automated surgical mentorship by providing a context-aware solution.
TeleChat Technical Report
Wang, Zihan, Liu, Xinzhang, Liu, Shixuan, Yao, Yitong, Huang, Yuyao, He, Zhongjiang, Li, Xuelong, Li, Yongxiang, Che, Zhonghao, Zhang, Zhaoxi, Wang, Yan, Wang, Xin, Pu, Luwen, Xu, Huihan, Fang, Ruiyu, Zhao, Yu, Zhang, Jie, Huang, Xiaomeng, Lu, Zhilong, Peng, Jiaxin, Zheng, Wenjun, Wang, Shiquan, Yang, Bingkai, He, Xuewei, Jiang, Zhuoru, Xie, Qiyi, Zhang, Yanhan, Li, Zhongqiu, Shi, Lingling, Fu, Weiwei, Zhang, Yin, Huang, Zilu, Xiong, Sishi, Zhang, Yuxiang, Wang, Chao, Song, Shuangyong
In this technical report, we present TeleChat, a collection of large language models (LLMs) with 3 billion, 7 billion, and 12 billion parameters. It includes pretrained language models as well as fine-tuned chat models that are aligned with human preferences. TeleChat is initially pretrained on an extensive corpus containing a diverse collection of English and Chinese texts, comprising trillions of tokens. Subsequently, the model is fine-tuned to align with human preferences, following a detailed methodology that we describe. We evaluate the performance of TeleChat on various tasks, including language understanding, mathematics, reasoning, code generation, and knowledge-based question answering. Our findings indicate that TeleChat achieves performance comparable to other open-source models of similar size across a wide range of public benchmarks. To support future research and applications utilizing LLMs, we release the fine-tuned model checkpoints of TeleChat's 7B and 12B variants, along with code and a portion of our pretraining data, to the public community.
Masked Language Model Based Textual Adversarial Example Detection
Zhang, Xiaomei, Zhang, Zhaoxi, Zhong, Qi, Zheng, Xufei, Zhang, Yanjun, Hu, Shengshan, Zhang, Leo Yu
Adversarial attacks are a serious threat to the reliable deployment of machine learning models in safety-critical applications. They can misguide current models into incorrect predictions by slightly modifying the inputs. Recently, substantial work has shown that adversarial examples tend to deviate from the underlying data manifold of normal examples, whereas pre-trained masked language models can fit the manifold of normal NLP data. To explore how to use the masked language model in adversarial detection, we propose a novel textual adversarial example detection method, namely Masked Language Model-based Detection (MLMD), which can produce clearly distinguishable signals between normal examples and adversarial examples by exploring the changes in manifolds induced by the masked language model. MLMD offers plug-and-play usage (i.e., no need to retrain the victim model) for adversarial defense and is agnostic to the classification task, the victim model's architecture, and the attack method to be defended against. We evaluate MLMD on various benchmark textual datasets, widely studied machine learning models, and state-of-the-art (SOTA) adversarial attacks (in total $3*4*4 = 48$ settings). Experimental results show that MLMD can achieve strong performance, with detection accuracy up to 0.984, 0.967, and 0.901 on the AG-NEWS, IMDB, and SST-2 datasets, respectively. Additionally, MLMD is superior, or at least comparable, to the SOTA detection defenses in detection accuracy and F1 score. Among many defenses based on the off-manifold assumption of adversarial examples, this work offers a new angle for capturing the manifold change. The code for this work is openly accessible at \url{https://github.com/mlmddetection/MLMDdetection}.
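The mask-reconstruct-compare idea behind MLMD can be sketched as a small loop: mask each word in turn, let a masked language model fill it back in, and count how often the victim classifier's label flips. The sketch below is illustrative only, not the paper's implementation; `classify` and `unmask` are hypothetical stand-ins for the victim model and the masked language model.

```python
def mlmd_score(text, classify, unmask):
    """Mask each word, restore it with a masked LM, and measure the
    label-flip rate of the victim classifier. Adversarial inputs,
    sitting off the normal-data manifold, tend to flip more often."""
    words = text.split()
    base = classify(text)
    flips = 0
    for i in range(len(words)):
        masked = words[:i] + ["[MASK]"] + words[i + 1:]
        restored = unmask(" ".join(masked))  # masked-LM reconstruction
        if classify(restored) != base:
            flips += 1
    return flips / max(len(words), 1)
```

In practice the per-position flip signal (or the shift in class probabilities) is fed to a simple detector that thresholds or classifies it; the toy `classify`/`unmask` pair here only demonstrates the control flow.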
The Modeling of SDL Aiming at Knowledge Acquisition in Automatic Driving
Gu, Zecang, Liang, Yin, Zhang, Zhaoxi
In this paper, we propose a theory for solving the multi-target control problem by introducing it into a machine learning framework for automatic driving, exploring how the knowledge of excellent drivers can be acquired. Several core problems in automatic driving have not yet been fully addressed, such as the optimal way to control the multi-target objective functions of energy saving, safe driving, headway distance control, and comfortable driving, as well as the limitations of the networks that automatic driving relies on and of high-performance chips such as GPUs in complex driving environments. To address these problems, we developed a new theory that maps multi-target objective functions defined in different spaces into a single space, and on this basis introduced SDL (Super Deep Learning), a machine learning framework for optimal multi-target control based on knowledge acquisition. We present optimal multi-target control by combining the fuzzy relationships among the multi-target objective functions with the knowledge of excellent drivers acquired through machine learning. Theoretically, the impact of this method will exceed that of the fuzzy control methods used in automatic trains.