AITopics | Zhou, Hanxu

Collaborating Authors

Zhou, Hanxu

White-box Multimodal Jailbreaks Against Large Vision-Language Models

Wang, Ruofan, Ma, Xingjun, Zhou, Hanxu, Ji, Chuanjun, Ye, Guangnan, Jiang, Yu-Gang

arXiv.org Artificial Intelligence, May-28-2024

Recent advancements in Large Vision-Language Models (VLMs) have underscored their superiority in various multimodal tasks. However, the adversarial robustness of VLMs has not been fully explored. Existing methods mainly assess robustness through unimodal adversarial attacks that perturb images, while assuming inherent resilience against text-based attacks. Unlike existing attacks, we propose a more comprehensive strategy that jointly attacks both the text and image modalities to exploit a broader spectrum of vulnerabilities within VLMs. Specifically, we propose a dual optimization objective aimed at guiding the model to generate affirmative responses with high toxicity. Our attack method begins by optimizing an adversarial image prefix from random noise to generate diverse harmful responses in the absence of text input, thus imbuing the image with toxic semantics. Subsequently, an adversarial text suffix is integrated and co-optimized with the adversarial image prefix to maximize the probability of eliciting affirmative responses to various harmful instructions. The discovered adversarial image prefix and text suffix are collectively denoted as a Universal Master Key (UMK). When integrated into various malicious queries, UMK can circumvent the alignment defenses of VLMs and lead to the generation of objectionable content, known as a jailbreak. The experimental results demonstrate that our universal attack strategy can effectively jailbreak MiniGPT-4 with a 96% success rate, highlighting the vulnerability of VLMs and the urgent need for new alignment strategies.

large language model, machine learning, natural language, (16 more...)

arXiv:2405.17894

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Understanding Time Series Anomaly State Detection through One-Class Classification

Zhou, Hanxu, Zhang, Yuan, Leng, Guangjie, Wang, Ruofan, Xu, Zhi-Qin John

arXiv.org Artificial Intelligence, Feb-2-2024

Research on time series anomaly detection has long focused on finding outliers within a given time series. This matches some practical problems, but in other applications the question is different: given a standard time series, how can one judge whether another test time series deviates from it? This question is closer to the problem studied in one-class classification (OCC). In this article, we therefore redefine the time series anomaly detection problem through OCC and call the result the 'time series anomaly state detection problem'. We first use stochastic processes and hypothesis testing to rigorously define this problem and its corresponding anomalies. Then, we use time series classification data to construct an artificial dataset for the problem. We compile 38 anomaly detection algorithms and modify some of them to handle this problem. Finally, through extensive experiments, we fairly compare the performance of these time series anomaly detection algorithms, providing insights and directions for future research.
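
The one-class framing in this abstract can be illustrated with a minimal Python sketch. It is not the paper's formal definition, nor one of the 38 benchmarked algorithms: it assumes each series is summarized by its mean alone and flags a test series whose mean lies more than a chosen number of standard deviations from the reference means. The names `fit_reference`, `is_anomalous_state`, `make_series`, and `z_threshold` are hypothetical, and the sine-wave data is purely illustrative.

```python
import math
import statistics


def fit_reference(reference_series):
    """Summarize normal-state series by the mean and spread of their per-series means."""
    means = [statistics.fmean(series) for series in reference_series]
    return statistics.fmean(means), statistics.stdev(means)


def is_anomalous_state(test_series, ref_mean, ref_std, z_threshold=3.0):
    """Return True if the test series' mean deviates strongly from the reference."""
    z_score = abs(statistics.fmean(test_series) - ref_mean) / ref_std
    return z_score > z_threshold


def make_series(offset, length=100, period=20):
    """Toy series: a sine wave shifted vertically by `offset` (illustrative data only)."""
    return [math.sin(2 * math.pi * t / period) + offset for t in range(length)]


# Reference ("standard") series differ only by small vertical offsets.
reference = [make_series(0.01 * k) for k in range(-5, 6)]
ref_mean, ref_std = fit_reference(reference)

# A series close to the reference, and one shifted far away from it.
normal_flag = is_anomalous_state(make_series(0.02), ref_mean, ref_std)
shifted_flag = is_anomalous_state(make_series(1.0), ref_mean, ref_std)
```

The mean is a deliberately weak summary chosen for brevity. A realistic detector would compare richer features of the whole series, but the structure stays the same: fit on normal-state data only, then score unseen series against that fit.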

artificial intelligence, data mining, machine learning, (12 more...)

arXiv:2402.02007

Country:

Asia > China (0.14)
Asia > Thailand (0.14)
Asia > Taiwan (0.14)

Genre: Research Report (1.00)

Industry:

Energy (0.67)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Understanding the Initial Condensation of Convolutional Neural Networks

Zhou, Zhangchen, Zhou, Hanxu, Li, Yuqing, Xu, Zhi-Qin John

arXiv.org Artificial Intelligence, May-17-2023

Previous research has shown that fully-connected networks with small initialization and gradient-based training methods exhibit a phenomenon known as condensation during training. This phenomenon refers to the input weights of hidden neurons condensing into isolated orientations during training, revealing an implicit bias towards simple solutions in the parameter space. However, the impact of neural network structure on condensation has not been investigated yet. In this study, we focus on the investigation of convolutional neural networks (CNNs). Our experiments suggest that when subjected to small initialization and gradient-based training methods, kernel weights within the same CNN layer also cluster together during training, demonstrating a significant degree of condensation. Theoretically, we demonstrate that in a finite training period, kernels of a two-layer CNN with small initialization will converge to one or a few directions. This work represents a step towards a better understanding of the non-linear training behavior exhibited by neural networks with specialized structures.
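
One way to quantify the clustering of kernel weights described here is pairwise cosine similarity between flattened kernels. The Python sketch below is an illustrative measure and not necessarily the metric used in the paper. The function names, the 0.95 threshold, the choice to count anti-parallel kernels as aligned, and the example kernels are all assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two weight vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def condensed_fraction(weight_vectors, threshold=0.95):
    """Fraction of vector pairs that are nearly parallel or anti-parallel."""
    count = len(weight_vectors)
    pairs = [(i, j) for i in range(count) for j in range(i + 1, count)]
    aligned = sum(
        1
        for i, j in pairs
        if abs(cosine_similarity(weight_vectors[i], weight_vectors[j])) >= threshold
    )
    return aligned / len(pairs)


# Hypothetical flattened 2x2 kernels from a single layer.
kernels = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],      # same direction as the first kernel
    [-0.5, -1.0, -1.5, -2.0],  # opposite direction to the first kernel
    [4.0, -3.0, 2.0, -1.0],    # orthogonal to the first kernel
]
fraction = condensed_fraction(kernels)
```

Here three of the six kernel pairs are aligned, so `fraction` is 0.5. A value that rises towards 1 over training would indicate that the kernels are collapsing onto a few directions.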

artificial intelligence, machine learning, neural network, (18 more...)

arXiv:2305.09947

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards Understanding the Condensation of Two-layer Neural Networks at Initial Training

Xu, Zhi-Qin John, Zhou, Hanxu, Luo, Tao, Zhang, Yaoyu

arXiv.org Artificial Intelligence, May-29-2021

Studying the implicit regularization effect of the nonlinear training dynamics of neural networks (NNs) is important for understanding why over-parameterized neural networks often generalize well on real datasets. Empirically, existing works have shown that, with a small initialization, the weights of NNs condense on isolated orientations. These condensation dynamics imply that, during training, NNs can learn features from the training data with a network configuration effectively equivalent to a much smaller network. In this work, we show that the multiplicity of the activation function's root at the origin is a key factor in understanding condensation at the initial stage of training. Our experiments suggest that the maximal number of condensed orientations is twice the multiplicity. Our theoretical analysis confirms the experiments in two cases: activation functions of multiplicity one, and one-dimensional input. This work takes a step towards understanding how small initialization implicitly leads NNs to condensation at the initial stage of training, which lays a solid foundation for future study of the nonlinear dynamics of NNs and their implicit regularization effect at later stages of training.
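
For readers unfamiliar with the term, the multiplicity of a root at the origin is commonly defined through vanishing derivatives. The notation below ($\sigma$ for the activation function, $p$ for the multiplicity) is assumed here for illustration and is not taken from the abstract.

```latex
% sigma has a root of multiplicity p at the origin when
\sigma^{(k)}(0) = 0 \quad \text{for } k = 0, 1, \dots, p-1,
\qquad \sigma^{(p)}(0) \neq 0 .
% Example: \tanh has multiplicity p = 1, since \tanh(0) = 0 and \tanh'(0) = 1.
% Under this notation, the abstract's empirical claim reads:
% the number of condensed orientations is at most 2p.
```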

condensation, neural network, telecommunications, (17 more...)

arXiv:2105.11686

Country:

North America > Canada (0.29)
North America > United States (0.28)

Genre: Research Report > Experimental Study (0.34)

Industry:

Telecommunications > Networks (0.34)
Information Technology > Networks (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Deep frequency principle towards understanding why deeper learning is faster

Xu, Zhi-Qin John, Zhou, Hanxu

arXiv.org Machine Learning, Jul-28-2020

Understanding the effect of depth in deep learning is a critical problem. In this work, we use Fourier analysis to empirically provide a promising mechanism for understanding why deeper learning is faster. To this end, we separate a deep neural network into two parts, a pre-condition component and a learning component, where the output of the pre-condition component is the input of the learning component. Based on experiments with deep networks and real datasets, we propose a deep frequency principle: during training, the effective target function for a deeper hidden layer is biased towards lower frequencies. Therefore, the learning component effectively learns a lower-frequency function if the pre-condition component has more layers. Given the well-studied frequency principle, i.e., that deep neural networks learn lower-frequency functions faster, the deep frequency principle provides a reasonable explanation of why deeper learning is faster. We believe these empirical studies will be valuable for future theoretical studies of the effect of depth in deep learning.
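
The notion of a function having "more low frequency" can be made concrete with a toy measurement: the share of a sampled signal's spectral power that falls below a cutoff frequency. The Python sketch below works on a one-dimensional signal with a naive discrete Fourier transform. That is a simplifying assumption, since the paper's experiments concern hidden-layer functions on real datasets. The name `low_frequency_ratio`, the cutoff value, and the sine-wave inputs are hypothetical.

```python
import cmath
import math


def low_frequency_ratio(signal, cutoff):
    """Share of spectral power at DFT frequency indices 0..cutoff (inclusive)."""
    n = len(signal)
    power = []
    # Only non-negative frequencies are needed for a real-valued signal.
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        power.append(abs(coeff) ** 2)
    total = sum(power)
    return sum(power[: cutoff + 1]) / total if total > 0 else 0.0


n = 64
slow = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]   # 2 cycles
fast = [math.sin(2 * math.pi * 20 * t / n) for t in range(n)]  # 20 cycles

slow_ratio = low_frequency_ratio(slow, cutoff=5)
fast_ratio = low_frequency_ratio(fast, cutoff=5)
```

With a cutoff of 5, the slowly varying signal has nearly all of its power in the low band and the fast one has almost none. Under the deep frequency principle, one would expect a ratio of this kind to be higher for the effective target function of a deeper hidden layer.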

deep learning, frequency principle, neural network, (14 more...)

arXiv:2007.14313

Country: Asia > China (0.29)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)