AITopics | Zhu, Zhihao

Collaborating Authors

Zhu, Zhihao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Salient Temporal Encoding for Dynamic Scene Graph Generation

Zhu, Zhihao

arXiv.org Artificial IntelligenceMar-15-2025

Representing a dynamic scene using a structured spatial-temporal scene graph is a novel and particularly challenging task. To tackle this task, it is crucial to learn the temporal interactions between objects in addition to their spatial relations. Due to the lack of explicitly annotated temporal relations in current benchmark datasets, most of the existing spatial-temporal scene graph generation methods build dense and abstract temporal connections among all objects across frames. However, not all temporal connections are encoding meaningful temporal dynamics. We propose a novel spatial-temporal scene graph generation method that selectively builds temporal connections only between temporal-relevant objects pairs and represents the temporal relations as explicit edges in the scene graph. The resulting sparse and explicit temporal representation allows us to improve upon strong scene graph generation baselines by up to $4.4\%$ in Scene Graph Detection. In addition, we show that our approach can be leveraged to improve downstream vision tasks. Particularly, applying our approach to action recognition, shows 0.6\% gain in mAP in comparison to the state-of-the-art

artificial intelligence, machine learning, spatial reasoning, (19 more...)

arXiv.org Artificial Intelligence

2503.14524

Country:

North America > Canada (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.93)

Add feedback

TDDBench: A Benchmark for Training data detection

Zhu, Zhihao, Yang, Yi, Lian, Defu

arXiv.org Artificial IntelligenceNov-5-2024

Metric-based methods rely on the analysis of certain statistical properties of a target model's output, such as confidence scores, prediction probabilities, or loss values, to distinguish between training data and non-training data. Specifically, Metric-loss (Yeom et al., 2018) is the first metricbased detection method, predicting that data points with a loss below a certain threshold are part of the training data for the target model. Similarly, other works have proposed using the maximum confidence of the target model output (denoted as Metric-conf (Song et al., 2019)), the correctness of the target model output (denoted as Metric-corr (Leino & Fredrikson, 2020)), the entropy of prediction probability distributions (denoted as Metric-ent (Shokri et al., 2017; Song & Mittal, 2021)), and modified entropy of the prediction (denoted as Metric-ment (Song & Mittal, 2021)). Learning-based methods involve training an auxiliary classifier (meta-classifier) to distinguish between training data and non-training data. In the literature, neural networks (NNs) are often employed as the auxiliary classifier. The primary differences between learning-based TDD methods lie in the choice of input features for the auxiliary classifier. Earlier work (Shokri et al., 2017) has proposed using the original prediction vector of the target model (denoted as Learn-original). Other works have suggested using the top-3 prediction confidences (denoted as Learn-top3 (Salem et al., 2019)), the sorted prediction vector (denoted as Learn-sorted (Salem et al., 2019)), the true label of the example combined with the prediction vector (denoted as Learn-label

artificial intelligence, machine learning, target model, (16 more...)

arXiv.org Artificial Intelligence

2411.03363

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Understanding Privacy Risks of Embeddings Induced by Large Language Models

Zhu, Zhihao, Shao, Ninglu, Lian, Defu, Wu, Chenwang, Liu, Zheng, Yang, Yi, Chen, Enhong

arXiv.org Artificial IntelligenceApr-25-2024

Large language models (LLMs) show early signs of artificial general intelligence but struggle with hallucinations. One promising solution to mitigate these hallucinations is to store external knowledge as embeddings, aiding LLMs in retrieval-augmented generation. However, such a solution risks compromising privacy, as recent studies experimentally showed that the original text can be partially reconstructed from text embeddings by pre-trained language models. The significant advantage of LLMs over traditional pre-trained models may exacerbate these concerns. To this end, we investigate the effectiveness of reconstructing original knowledge and predicting entity attributes from these embeddings when LLMs are employed. Empirical findings indicate that LLMs significantly improve the accuracy of two evaluated tasks over those from pre-trained models, regardless of whether the texts are in-distribution or out-of-distribution. This underscores a heightened potential for LLMs to jeopardize user privacy, highlighting the negative consequences of their widespread use. We further discuss preliminary strategies to mitigate this risk.

attack model, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2404.16587

Country: North America > United States (0.48)

Genre: Research Report > Experimental Study (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents

Ma, Chang, Zhang, Junlei, Zhu, Zhihao, Yang, Cheng, Yang, Yujiu, Jin, Yaohui, Lan, Zhenzhong, Kong, Lingpeng, He, Junxian

arXiv.org Artificial IntelligenceJan-23-2024

Evaluating large language models (LLMs) as general-purpose agents is essential for understanding their capabilities and facilitating their integration into practical applications. However, the evaluation process presents substantial challenges. A primary obstacle is the benchmarking of agent performance across diverse scenarios within a unified framework, especially in maintaining partially-observable environments and ensuring multi-round interactions. Moreover, current evaluation frameworks mostly focus on the final success rate, revealing few insights during the process and failing to provide a deep understanding of the model abilities. To address these challenges, we introduce AgentBoard, a pioneering comprehensive benchmark and accompanied open-source evaluation framework tailored to analytical evaluation of LLM agents. AgentBoard offers a fine-grained progress rate metric that captures incremental advancements as well as a comprehensive evaluation toolkit that features easy assessment of agents for multi-faceted analysis through interactive visualization. This not only sheds light on the capabilities and limitations of LLM agents but also propels the interpretability of their performance to the forefront. Ultimately, AgentBoard serves as a significant step towards demystifying agent behaviors and accelerating the development of stronger LLM agents.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2401.13178

Country:

North America > United States (0.28)
Asia > Middle East > UAE (0.14)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.67)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

Add feedback

Emergency Localization for Mobile Ground Users: An Adaptive UAV Trajectory Planning Method

Zhu, Zhihao, He, Jiafan, Hou, Luyang, Xu, Lianming, Zhu, Wendi, Wang, Li

arXiv.org Artificial IntelligenceJan-14-2024

In emergency search and rescue scenarios, the quick location of trapped people is essential. However, disasters can render the Global Positioning System (GPS) unusable. Unmanned aerial vehicles (UAVs) with localization devices can serve as mobile anchors due to their agility and high line-of-sight (LoS) probability. Nonetheless, the number of available UAVs during the initial stages of disaster relief is limited, and innovative methods are needed to quickly plan UAV trajectories to locate non-uniformly distributed dynamic targets while ensuring localization accuracy. To address this challenge, we design a single UAV localization method without hovering, use the maximum likelihood estimation (MLE) method to estimate the location of mobile users and define the upper bound of the localization error by considering users' movement.Combining this localization method and localization error-index, we utilize the enhanced particle swarm optimization (EPSO) algorithm and edge access strategy to develop a low complexity localization-oriented adaptive trajectory planning algorithm. Simulation results demonstrate that our method outperforms other baseline algorithms, enabling faster localization without compromising localization accuracy.

evolutionary algorithm, machine learning, uav, (18 more...)

arXiv.org Artificial Intelligence

2401.07256

Country: Asia > China (0.29)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Robotics & Automation (0.34)
Aerospace & Defense > Aircraft (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.86)
(2 more...)

Add feedback

Model Stealing Attack against Recommender System

Zhu, Zhihao, Fan, Rui, Wu, Chenwang, Yang, Yi, Lian, Defu, Chen, Enhong

arXiv.org Artificial IntelligenceDec-26-2023

Recent studies have demonstrated the vulnerability of recommender systems to data privacy attacks. However, research on the threat to model privacy in recommender systems, such as model stealing attacks, is still in its infancy. Some adversarial attacks have achieved model stealing attacks against recommender systems, to some extent, by collecting abundant training data of the target model (target data) or making a mass of queries. In this paper, we constrain the volume of available target data and queries and utilize auxiliary data, which shares the item set with the target data, to promote model stealing attacks. Although the target model treats target and auxiliary data differently, their similar behavior patterns allow them to be fused using an attention mechanism to assist attacks. Besides, we design stealing functions to effectively extract the recommendation list obtained by querying the target model. Experimental results show that the proposed methods are applicable to most recommender systems and various scenarios and exhibit excellent attack performance on multiple datasets.

artificial intelligence, machine learning, target model, (17 more...)

arXiv.org Artificial Intelligence

2312.11571

Country: North America > United States (0.31)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity

Zhu, Zhihao, Wu, Chenwang, Fan, Rui, Yang, Yi, Lian, Defu, Chen, Enhong

arXiv.org Artificial IntelligenceDec-26-2023

Recent research demonstrates that GNNs are vulnerable to the model stealing attack, a nefarious endeavor geared towards duplicating the target model via query permissions. However, they mainly focus on node classification tasks, neglecting the potential threats entailed within the domain of graph classification tasks. Furthermore, their practicality is questionable due to unreasonable assumptions, specifically concerning the large data requirements and extensive model knowledge. To this end, we advocate following strict settings with limited real data and hard-label awareness to generate synthetic data, thereby facilitating the stealing of the target model. Specifically, following important data generation principles, we introduce three model stealing attacks to adapt to different actual scenarios: MSA-AU is inspired by active learning and emphasizes the uncertainty to enhance query value of generated samples; MSA-AD introduces diversity based on Mixup augmentation strategy to alleviate the query inefficiency issue caused by over-similar samples generated by MSA-AU; MSA-AUD combines the above two strategies to seamlessly integrate the authenticity, uncertainty, and diversity of the generated samples. Finally, extensive experiments consistently demonstrate the superiority of the proposed methods in terms of concealment, query efficiency, and stealing performance.

artificial intelligence, machine learning, target model, (18 more...)

arXiv.org Artificial Intelligence

2312.10943

Country: Asia > Middle East > Israel (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models

Huang, Yuzhen, Bai, Yuzhuo, Zhu, Zhihao, Zhang, Junlei, Zhang, Jinghan, Su, Tangjun, Liu, Junteng, Lv, Chuancheng, Zhang, Yikai, Lei, Jiayi, Fu, Yao, Sun, Maosong, He, Junxian

arXiv.org Artificial IntelligenceNov-6-2023

New NLP benchmarks are urgently needed to align with the rapid development of large language models (LLMs). We present C-Eval, the first comprehensive Chinese evaluation suite designed to assess advanced knowledge and reasoning abilities of foundation models in a Chinese context. C-Eval comprises multiple-choice questions across four difficulty levels: middle school, high school, college, and professional. The questions span 52 diverse disciplines, ranging from humanities to science and engineering. C-Eval is accompanied by C-Eval Hard, a subset of very challenging subjects in C-Eval that requires advanced reasoning abilities to solve. We conduct a comprehensive evaluation of the most advanced LLMs on C-Eval, including both English- and Chinese-oriented models. Results indicate that only GPT-4 could achieve an average accuracy of over 60%, suggesting that there is still significant room for improvement for current LLMs. We anticipate C-Eval will help analyze important strengths and shortcomings of foundation models, and foster their development and growth for Chinese users.

accuracy, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2305.08322

Country:

Europe (0.46)
Asia > China (0.29)
North America > United States > Maryland (0.14)

Genre: Research Report (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Setting (0.74)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Resisting Graph Adversarial Attack via Cooperative Homophilous Augmentation

Zhu, Zhihao, Wu, Chenwang, Zhou, Min, Liao, Hao, Lian, Defu, Chen, Enhong

arXiv.org Artificial IntelligenceNov-15-2022

Recent studies show that Graph Neural Networks(GNNs) are vulnerable and easily fooled by small perturbations, which has raised considerable concerns for adapting GNNs in various safety-critical applications. In this work, we focus on the emerging but critical attack, namely, Graph Injection Attack(GIA), in which the adversary poisons the graph by injecting fake nodes instead of modifying existing structures or node attributes. Inspired by findings that the adversarial attacks are related to the increased heterophily on perturbed graphs (the adversary tends to connect dissimilar nodes), we propose a general defense framework CHAGNN against GIA through cooperative homophilous augmentation of graph data and model. Specifically, the model generates pseudo-labels for unlabeled nodes in each round of training to reduce heterophilous edges of nodes with distinct labels. The cleaner graph is fed back to the model, producing more informative pseudo-labels. In such an iterative manner, model robustness is then promisingly enhanced. We present the theoretical analysis of the effect of homophilous augmentation and provide the guarantee of the proposal's validity. Experimental results empirically demonstrate the effectiveness of CHAGNN in comparison with recent state-of-the-art defense methods on diverse real-world datasets.

artificial intelligence, graph, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2211.08068

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)

Add feedback

Automatic Graphics Program Generation using Attention-Based Hierarchical Decoder

Zhu, Zhihao, Xue, Zhan, Yuan, Zejian

arXiv.org Machine LearningOct-26-2018

Recent progress on deep learning has made it possible to automatically transform the screenshot of Graphic User Interface (GUI) into code by using the encoder-decoder framework. While the commonly adopted image encoder (e.g., CNN network), might be capable of extracting image features to the desired level, interpreting these abstract image features into hundreds of tokens of code puts a particular challenge on the decoding power of the RNN-based code generator. Considering the code used for describing GUI is usually hierarchically structured, we propose a new attention-based hierarchical code generation model, which can describe GUI images in a finer level of details, while also being able to generate hierarchically structured code in consistency with the hierarchical layout of the graphic elements in the GUI. Our model follows the encoder-decoder framework, all the components of which can be trained jointly in an end-to-end manner. The experimental results show that our method outperforms other current state-of-the-art methods on both a publicly available GUI-code dataset as well as a dataset established by our own.

deep learning, lstm, neural network, (21 more...)

arXiv.org Machine Learning

1810.11536

Country:

Europe (1.00)
North America > United States (0.47)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback