AITopics

2305.15708

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(6 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceMay-24-2023

AdvFunMatch: When Consistent Teaching Meets Adversarial Robustness

Wu, Zihui, Gao, Haichang, Zhou, Bingqian, Wang, Ping

\emph{Consistent teaching} is an effective paradigm for implementing knowledge distillation (KD), where both student and teacher models receive identical inputs, and KD is treated as a function matching task (FunMatch). However, one limitation of FunMatch is that it does not account for the transfer of adversarial robustness, a model's resistance to adversarial attacks. To tackle this problem, we propose a simple but effective strategy called Adversarial Function Matching (AdvFunMatch), which aims to match distributions for all data points within the $\ell_p$-norm ball of the training data, in accordance with consistent teaching. Formulated as a min-max optimization problem, AdvFunMatch identifies the worst-case instances that maximizes the KL-divergence between teacher and student model outputs, which we refer to as "mismatched examples," and then matches the outputs on these mismatched examples. Our experimental results show that AdvFunMatch effectively produces student models with both high clean accuracy and robustness. Furthermore, we reveal that strong data augmentations (\emph{e.g.}, AutoAugment) are beneficial in AdvFunMatch, whereas prior works have found them less effective in adversarial training. Code is available at \url{https://gitee.com/zihui998/adv-fun-match}.

mismatched example, student model, teacher model, (16 more...)

2305.147

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California > Santa Clara County > San Jose (0.04)
(17 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Education (1.00)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceMay-24-2023

Topic-Guided Self-Introduction Generation for Social Media Users

Xu, Chunpu, Li, Jing, Li, Piji, Yang, Min

Millions of users are active on social media. To allow users to better showcase themselves and network with others, we explore the auto-generation of social media self-introduction, a short sentence outlining a user's personal interests. While most prior work profiles users with tags (e.g., ages), we investigate sentence-level self-introductions to provide a more natural and engaging way for users to know each other. Here we exploit a user's tweeting history to generate their self-introduction. The task is non-trivial because the history content may be lengthy, noisy, and exhibit various personal interests. To address this challenge, we propose a novel unified topic-guided encoder-decoder (UTGED) framework; it models latent topics to reflect salient user interest, whose topic mixture then guides encoding a user's history and topic words control decoding their self-introduction. For experiments, we collect a large-scale Twitter dataset, and extensive results show the superiority of our UTGED to the advanced encoder-decoder models without topic modeling.

computational linguistic, machine learning, natural language, (17 more...)

2305.15138

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Hong Kong (0.04)
(27 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (1.00)
Education > Educational Setting (0.94)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceMay-24-2023

Pre-training Multi-party Dialogue Models with Latent Discourse Inference

Li, Yiyang, Huang, Xinting, Bi, Wei, Zhao, Hai

Multi-party dialogues are more difficult for models to understand than one-to-one two-party dialogues, since they involve multiple interlocutors, resulting in interweaving reply-to relations and information flows. To step over these obstacles, an effective way is to pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying. However, due to the lack of explicitly annotated discourse labels in multi-party dialogue corpora, previous works fail to scale up the pre-training process by putting aside the unlabeled multi-party conversational data for nothing. To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model by unsupervised latent variable inference methods. Experiments on multiple downstream tasks show that our pre-trained model outperforms strong baselines by large margins and achieves state-of-the-art (SOTA) results, justifying the effectiveness of our method. The official implementation of this paper is available at https://github.com/EricLee8/MPD_EMVI.

addressee, machine learning, natural language, (19 more...)

2305.15175

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
Asia > China > Shanghai > Shanghai (0.04)
(14 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Chertkov, Andrei, Tsymboi, Olga, Pautov, Mikhail, Oseledets, Ivan

Translate your gibberish: black-box adversarial attack on machine translation systems

Neural networks are deployed widely in natural language processing tasks on the industrial scale, and perhaps the most often they are used as compounds of automatic machine translation systems. In this work, we present a simple approach to fool state-of-the-art machine translation tools in the task of translation from Russian to English and vice versa. Using a novel black-box gradient-free tensor-based optimizer, we show that many online translation tools, such as Google, DeepL, and Yandex, may both produce wrong or offensive translations for nonsensical adversarial input queries and refuse to translate seemingly benign input phrases. This vulnerability may interfere with understanding a new language and simply worsen the user's experience while using machine translation systems, and, hence, additional improvements of these tools are required to establish better translation.

artificial intelligence, machine learning, natural language, (16 more...)

2303.10974

Country:

Asia > Russia (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(8 more...)

Genre: Research Report (0.40)

Industry:

Education (0.68)
Transportation > Air (0.62)
Information Technology > Security & Privacy (0.43)
Government > Military (0.43)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

OPORP: One Permutation + One Random Projection

Li, Ping, Li, Xiaoyun

Consider two $D$-dimensional data vectors (e.g., embeddings): $u, v$. In many embedding-based retrieval (EBR) applications where the vectors are generated from trained models, $D=256\sim 1024$ are common. In this paper, OPORP (one permutation + one random projection) uses a variant of the ``count-sketch'' type of data structures for achieving data reduction/compression. With OPORP, we first apply a permutation on the data vectors. A random vector $r$ is generated i.i.d. with moments: $E(r_i) = 0, E(r_i^2)=1, E(r_i^3) =0, E(r_i^4)=s$. We multiply (as dot product) $r$ with all permuted data vectors. Then we break the $D$ columns into $k$ equal-length bins and aggregate (i.e., sum) the values in each bin to obtain $k$ samples from each data vector. One crucial step is to normalize the $k$ samples to the unit $l_2$ norm. We show that the estimation variance is essentially: $(s-1)A + \frac{D-k}{D-1}\frac{1}{k}\left[ (1-\rho^2)^2 -2A\right]$, where $A\geq 0$ is a function of the data ($u,v$). This formula reveals several key properties: (1) We need $s=1$. (2) The factor $\frac{D-k}{D-1}$ can be highly beneficial in reducing variances. (3) The term $\frac{1}{k}(1-\rho^2)^2$ is a substantial improvement compared with $\frac{1}{k}(1+\rho^2)$, which corresponds to the un-normalized estimator. We illustrate that by letting the $k$ in OPORP to be $k=1$ and repeat the procedure $m$ times, we exactly recover the work of ``very spars random projections'' (VSRP). This immediately leads to a normalized estimator for VSRP which substantially improves the original estimator of VSRP. In summary, with OPORP, the two key steps: (i) the normalization and (ii) the fixed-length binning scheme, have considerably improved the accuracy in estimating the cosine similarity, which is a routine (and crucial) task in modern embedding-based retrieval (EBR) applications.

data mining, machine learning, natural language, (19 more...)

2302.03505

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(36 more...)

Genre: Research Report (0.81)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Impact of Light and Shadow on Robustness of Deep Neural Networks

Hu, Chengyin, Shi, Weiwen, Li, Chao, Sun, Jialiang, Wang, Donghua, Wu, Junqi, Tang, Guijian

Deep neural networks (DNNs) have made remarkable strides in various computer vision tasks, including image classification, segmentation, and object detection. However, recent research has revealed a vulnerability in advanced DNNs when faced with deliberate manipulations of input data, known as adversarial attacks. Moreover, the accuracy of DNNs is heavily influenced by the distribution of the training dataset. Distortions or perturbations in the color space of input images can introduce out-of-distribution data, resulting in misclassification. In this work, we propose a brightness-variation dataset, which incorporates 24 distinct brightness levels for each image within a subset of ImageNet. This dataset enables us to simulate the effects of light and shadow on the images, so as is to investigate the impact of light and shadow on the performance of DNNs. In our study, we conduct experiments using several state-of-the-art DNN architectures on the aforementioned dataset. Through our analysis, we discover a noteworthy positive correlation between the brightness levels and the loss of accuracy in DNNs. Furthermore, we assess the effectiveness of recently proposed robust training techniques and strategies, including AugMix, Revisit, and Free Normalizer, using the ResNet50 architecture on our brightness-variation dataset. Our experimental results demonstrate that these techniques can enhance the robustness of DNNs against brightness variation, leading to improved performance when dealing with images exhibiting varying brightness levels.

artificial intelligence, machine learning, neural network, (16 more...)

2305.14165

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hu, Chengyin, Shi, Weiwen

Adversarial Catoptric Light: An Effective, Stealthy and Robust Physical-World Attack to DNNs

Deep neural networks (DNNs) have demonstrated exceptional success across various tasks, underscoring the need to evaluate the robustness of advanced DNNs. However, traditional methods using stickers as physical perturbations to deceive classifiers present challenges in achieving stealthiness and suffer from printing loss. Recent advancements in physical attacks have utilized light beams such as lasers and projectors to perform attacks, where the optical patterns generated are artificial rather than natural. In this study, we introduce a novel physical attack, adversarial catoptric light (AdvCL), where adversarial perturbations are generated using a common natural phenomenon, catoptric light, to achieve stealthy and naturalistic adversarial attacks against advanced DNNs in a black-box setting. We evaluate the proposed method in three aspects: effectiveness, stealthiness, and robustness. Quantitative results obtained in simulated environments demonstrate the effectiveness of the proposed method, and in physical scenarios, we achieve an attack success rate of 83.5%, surpassing the baseline. We use common catoptric light as a perturbation to enhance the stealthiness of the method and make physical samples appear more natural. Robustness is validated by successfully attacking advanced and robust DNNs with a success rate over 80% in all cases. Additionally, we discuss defense strategy against AdvCL and put forward some light-based physical attacks.

artificial intelligence, deep learning, machine learning, (16 more...)

2209.11739

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Washington > King County > Seattle (0.05)
(18 more...)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Hu, Chengyin, Shi, Weiwen

Impact of Colour Variation on Robustness of Deep Neural Networks

Deep neural networks (DNNs) have have shown state-of-the-art performance for computer vision applications like image classification, segmentation and object detection. Whereas recent advances have shown their vulnerability to manual digital perturbations in the input data, namely adversarial attacks. The accuracy of the networks is significantly affected by the data distribution of their training dataset. Distortions or perturbations on color space of input images generates out-of-distribution data, which make networks more likely to misclassify them. In this work, we propose a color-variation dataset by distorting their RGB color on a subset of the ImageNet with 27 different combinations. The aim of our work is to study the impact of color variation on the performance of DNNs. We perform experiments on several state-of-the-art DNN architectures on the proposed dataset, and the result shows a significant correlation between color variation and loss of accuracy. Furthermore, based on the ResNet50 architecture, we demonstrate some experiments of the performance of recently proposed robust training techniques and strategies, such as Augmix, revisit, and free normalizer, on our proposed dataset. Experimental results indicate that these robust training techniques can improve the robustness of deep networks to color variation.

artificial intelligence, machine learning, neural network, (17 more...)

2209.02832

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(15 more...)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Message Intercommunication for Inductive Relation Reasoning

Liang, Ke, Meng, Lingyuan, Zhou, Sihang, Wang, Siwei, Tu, Wenxuan, Liu, Yue, Liu, Meng, Liu, Xinwang

Inductive relation reasoning for knowledge graphs, aiming to infer missing links between brand-new entities, has drawn increasing attention. The models developed based on Graph Inductive Learning, called GraIL-based models, have shown promising potential for this task. However, the uni-directional message-passing mechanism hinders such models from exploiting hidden mutual relations between entities in directed graphs. Besides, the enclosing subgraph extraction in most GraIL-based models restricts the model from extracting enough discriminative information for reasoning. Consequently, the expressive ability of these models is limited. To address the problems, we propose a novel GraIL-based inductive relation reasoning model, termed MINES, by introducing a Message Intercommunication mechanism on the Neighbor-Enhanced Subgraph. Concretely, the message intercommunication mechanism is designed to capture the omitted hidden mutual information. It introduces bi-directed information interactions between connected entities by inserting an undirected/bi-directed GCN layer between uni-directed RGCN layers. Moreover, inspired by the success of involving more neighbors in other graph-based tasks, we extend the neighborhood area beyond the enclosing subgraph to enhance the information collection for inductive relation reasoning. Extensive experiments on twelve inductive benchmark datasets demonstrate that our MINES outperforms existing state-of-the-art models, and show the effectiveness of our intercommunication mechanism and reasoning on the neighbor-enhanced subgraph.

data mining, machine learning, subgraph, (20 more...)

2305.14074

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
(15 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(3 more...)