AITopics | Yang, Min

Plotting

Yang, Min

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

JADE: A Linguistics-based Safety Evaluation Platform for Large Language Models

Zhang, Mi, Pan, Xudong, Yang, Min

arXiv.org Artificial IntelligenceDec-10-2023

In this paper, we present JADE, a targeted linguistic fuzzing platform which strengthens the linguistic complexity of seed questions to simultaneously and consistently break a wide range of widely-used LLMs categorized in three groups: eight open-sourced Chinese, six commercial Chinese and four commercial English LLMs. JADE generates three safety benchmarks for the three groups of LLMs, which contain unsafe questions that are highly threatening: the questions simultaneously trigger harmful generation of multiple LLMs, with an average unsafe generation ratio of $70\%$ (please see the table below), while are still natural questions, fluent and preserving the core unsafe semantics. We release the benchmark demos generated for commercial English LLMs and open-sourced English LLMs in the following link: https://github.com/whitzard-ai/jade-db. For readers who are interested in evaluating on more questions generated by JADE, please contact us. JADE is based on Noam Chomsky's seminal theory of transformational-generative grammar. Given a seed question with unsafe intention, JADE invokes a sequence of generative and transformational rules to increment the complexity of the syntactic structure of the original question, until the safety guardrail is broken. Our key insight is: Due to the complexity of human language, most of the current best LLMs can hardly recognize the invariant evil from the infinite number of different syntactic structures which form an unbound example space that can never be fully covered. Technically, the generative/transformative rules are constructed by native speakers of the languages, and, once developed, can be used to automatically grow and transform the parse tree of a given question, until the guardrail is broken. For more evaluation results and demo, please check our website: https://whitzard-ai.github.io/jade.html.

large language model, linguistics-based safety evaluation platform, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2311.00286

Country:

Oceania > Australia (0.14)
Europe > United Kingdom > England (0.14)
Europe > Italy (0.14)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

BELT: Old-School Backdoor Attacks can Evade the State-of-the-Art Defense with Backdoor Exclusivity Lifting

Qiu, Huming, Sun, Junjie, Zhang, Mi, Pan, Xudong, Yang, Min

arXiv.org Artificial IntelligenceDec-8-2023

Deep neural networks (DNNs) are susceptible to backdoor attacks, where malicious functionality is embedded to allow attackers to trigger incorrect classifications. Old-school backdoor attacks use strong trigger features that can easily be learned by victim models. Despite robustness against input variation, the robustness however increases the likelihood of unintentional trigger activations. This leaves traces to existing defenses, which find approximate replacements for the original triggers that can activate the backdoor without being identical to the original trigger via, e.g., reverse engineering and sample overlay. In this paper, we propose and investigate a new characteristic of backdoor attacks, namely, backdoor exclusivity, which measures the ability of backdoor triggers to remain effective in the presence of input variation. Building upon the concept of backdoor exclusivity, we propose Backdoor Exclusivity LifTing (BELT), a novel technique which suppresses the association between the backdoor and fuzzy triggers to enhance backdoor exclusivity for defense evasion. Extensive evaluation on three popular backdoor benchmarks validate, our approach substantially enhances the stealthiness of four old-school backdoor attacks, which, after backdoor exclusivity lifting, is able to evade six state-of-the-art backdoor countermeasures, at almost no cost of the attack success rate and normal utility. For example, one of the earliest backdoor attacks BadNet, enhanced by BELT, evades most of the state-of-the-art defenses including ABS and MOTH which would otherwise recognize the backdoored model.

artificial intelligence, backdoor, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2312.04902

Country:

Europe > Netherlands (0.14)
Europe > Germany (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

AutoAlign: Fully Automatic and Effective Knowledge Graph Alignment enabled by Large Language Models

Zhang, Rui, Su, Yixin, Trisedya, Bayu Distiawan, Zhao, Xiaoyan, Yang, Min, Cheng, Hong, Qi, Jianzhong

arXiv.org Artificial IntelligenceNov-13-2023

The task of entity alignment between knowledge graphs (KGs) aims to identify every pair of entities from two different KGs that represent the same entity. Many machine learning-based methods have been proposed for this task. However, to our best knowledge, existing methods all require manually crafted seed alignments, which are expensive to obtain. In this paper, we propose the first fully automatic alignment method named AutoAlign, which does not require any manually crafted seed alignments. Specifically, for predicate embeddings, AutoAlign constructs a predicate-proximity-graph with the help of large language models to automatically capture the similarity between predicates across two KGs. For entity embeddings, AutoAlign first computes the entity embeddings of each KG independently using TransE, and then shifts the two KGs' entity embeddings into the same vector space by computing the similarity between entities based on their attributes. Thus, both predicate alignment and entity alignment can be done without manually crafted seed alignments. AutoAlign is not only fully automatic, but also highly effective. Experiments using real-world KGs show that AutoAlign improves the performance of entity alignment significantly compared to state-of-the-art methods.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2307.11772

Country: Asia > China (0.46)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Physics-Informed Generator-Encoder Adversarial Networks with Latent Space Matching for Stochastic Differential Equations

Gao, Ruisong, Yang, Min, Zhang, Jin

arXiv.org Artificial IntelligenceNov-3-2023

We propose a new class of physics-informed neural networks, called Physics-Informed Generator-Encoder Adversarial Networks, to effectively address the challenges posed by forward, inverse, and mixed problems in stochastic differential equations. In these scenarios, while the governing equations are known, the available data consist of only a limited set of snapshots for system parameters. Our model consists of two key components: the generator and the encoder, both updated alternately by gradient descent. In contrast to previous approaches of directly matching the approximated solutions with real snapshots, we employ an indirect matching that operates within the lower-dimensional latent feature space. This method circumvents challenges associated with high-dimensional inputs and complex data distributions, while yielding more accurate solutions compared to existing neural network solvers. In addition, the approach also mitigates the training instability issues encountered in previous adversarial frameworks in an efficient manner. Numerical results provide compelling evidence of the effectiveness of the proposed method in solving different types of stochastic differential equations.

artificial intelligence, machine learning, physics-informed generator-encoder adversarial network, (2 more...)

arXiv.org Artificial Intelligence

2311.01708

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.73)

Add feedback

Expression Syntax Information Bottleneck for Math Word Problems

Xiong, Jing, Li, Chengming, Yang, Min, Hu, Xiping, Hu, Bin

arXiv.org Artificial IntelligenceOct-24-2023

Math Word Problems (MWP) aims to automatically solve mathematical questions given in texts. Previous studies tend to design complex models to capture additional information in the original text so as to enable the model to gain more comprehensive features. In this paper, we turn our attention in the opposite direction, and work on how to discard redundant features containing spurious correlations for MWP. To this end, we design an Expression Syntax Information Bottleneck method for MWP (called ESIB) based on variational information bottleneck, which extracts essential features of expression syntax tree while filtering latent-specific redundancy containing syntax-irrelevant features. The key idea of ESIB is to encourage multiple models to predict the same expression syntax tree for different problem representations of the same problem by mutual learning so as to capture consistent information of expression syntax tree and discard latent-specific redundancy. To improve the generalization ability of the model and generate more diverse expressions, we design a self-distillation loss to encourage the model to rely more on the expression syntax information in the latent space. Experimental results on two large-scale benchmarks show that our model not only achieves state-of-the-art results but also generates more diverse solutions. The code is available.

information, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2310.15664

Country: Asia > China > Guangdong Province (0.30)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Constructive Large Language Models Alignment with Diverse Feedback

Yu, Tianshu, Lin, Ting-En, Wu, Yuchuan, Yang, Min, Huang, Fei, Li, Yongbin

arXiv.org Artificial IntelligenceOct-11-2023

In recent research on large language models (LLMs), there has been a growing emphasis on aligning these models with human values to reduce the impact of harmful content. However, current alignment methods often rely solely on singular forms of human feedback, such as preferences, annotated labels, or natural language critiques, overlooking the potential advantages of combining these feedback types. This limitation leads to suboptimal performance, even when ample training data is available. In this paper, we introduce Constructive and Diverse Feedback (CDF) as a novel method to enhance LLM alignment, inspired by constructivist learning theory. Our approach involves collecting three distinct types of feedback tailored to problems of varying difficulty levels within the training dataset. Specifically, we exploit critique feedback for easy problems, refinement feedback for medium problems, and preference feedback for hard problems. By training our model with this diversified feedback, we achieve enhanced alignment performance while using less training data. To assess the effectiveness of CDF, we evaluate it against previous methods in three downstream tasks: question answering, dialog generation, and text summarization. Experimental results demonstrate that CDF achieves superior performance even with a smaller training dataset.

language model alignment, large language model, machine learning, (3 more...)

arXiv.org Artificial Intelligence

2310.0645

Genre: Research Report > New Finding (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Valley: Video Assistant with Large Language model Enhanced abilitY

Luo, Ruipu, Zhao, Ziwang, Yang, Min, Dong, Junwei, Li, Da, Lu, Pengcheng, Wang, Tao, Hu, Linmei, Qiu, Minghui, Wei, Zhongyu

arXiv.org Artificial IntelligenceOct-8-2023

Large language models (LLMs), with their remarkable conversational capabilities, have demonstrated impressive performance across various applications and have emerged as formidable AI assistants. In view of this, it raises an intuitive question: Can we harness the power of LLMs to build multimodal AI assistants for visual applications? Recently, several multi-modal models have been developed for this purpose. They typically pre-train an adaptation module to align the semantics of the vision encoder and language model, followed by fine-tuning on instruction-following data. However, despite the success of this pipeline in image and language understanding, its effectiveness in joint video and language understanding has not been widely explored. In this paper, we aim to develop a novel multi-modal foundation model capable of comprehending video, image, and language within a general framework. To achieve this goal, we introduce Valley, a Video Assistant with Large Language model Enhanced abilitY. The Valley consists of a LLM, a temporal modeling module, a visual encoder, and a simple projection module designed to bridge visual and textual modes. To empower Valley with video comprehension and instruction-following capabilities, we construct a video instruction dataset and adopt a two-stage tuning procedure to train it. Specifically, we employ ChatGPT to facilitate the construction of task-oriented conversation data encompassing various tasks, including multi-shot captions, long video descriptions, action recognition, causal relationship inference, etc. Subsequently, we adopt a pre-training-then-instructions-tuned pipeline to align visual and textual modalities and improve the instruction-following capability of Valley. Qualitative experiments demonstrate that Valley has the potential to function as a highly effective video assistant that can make complex video understanding scenarios easy.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2306.07207

Country: Asia > China (0.14)

Genre: Research Report (0.50)

Industry: Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.38)

Add feedback

Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models

Gao, Haoyu, Lin, Ting-En, Li, Hangyu, Yang, Min, Wu, Yuchuan, Ma, Wentao, Li, Yongbin

arXiv.org Artificial IntelligenceSep-22-2023

Task-oriented dialogue (TOD) systems facilitate users in executing various activities via multi-turn dialogues, but Large Language Models (LLMs) often struggle to comprehend these intricate contexts. In this study, we propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of LLMs in multi-turn dialogues. This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks. Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts, demonstrating its potential as a powerful tool in enhancing LLMs' comprehension in complex dialogue tasks.

artificial intelligence, large language model, natural language

arXiv.org Artificial Intelligence

2309.1294

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

VDialogUE: A Unified Evaluation Benchmark for Visually-grounded Dialogue

Li, Yunshui, Hui, Binyuan, Yin, Zhaochao, He, Wanwei, Luo, Run, Long, Yuxing, Yang, Min, Huang, Fei, Li, Yongbin

arXiv.org Artificial IntelligenceSep-13-2023

Visually-grounded dialog systems, which integrate multiple modes of communication such as text and visual inputs, have become an increasingly popular area of investigation. However, the absence of a standardized evaluation framework poses a challenge in assessing the development of this field. To this end, we propose \textbf{VDialogUE}, a \textbf{V}isually-grounded \textbf{Dialog}ue benchmark for \textbf{U}nified \textbf{E}valuation. It defines five core multi-modal dialogue tasks and covers six datasets. Furthermore, in order to provide a comprehensive assessment of the model's performance across all tasks, we developed a novel evaluation metric called VDscore, which is based on the Analytic Hierarchy Process~(AHP) method. Additionally, we present a straightforward yet efficient baseline model, named \textbf{VISIT}~(\textbf{VIS}ually-grounded d\textbf{I}alog \textbf{T}ransformer), to promote the advancement of general multi-modal dialogue systems. It progressively builds its multi-modal foundation and dialogue capability via a two-stage pre-training strategy. We believe that the VDialogUE benchmark, along with the evaluation scripts and our baseline models, will accelerate the development of visually-grounded dialog systems and lead to the development of more sophisticated and effective pre-trained models.

artificial intelligence, unified evaluation benchmark, visually-grounded dialogue, (1 more...)

arXiv.org Artificial Intelligence

2309.07387

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

MIRA: Cracking Black-box Watermarking on Deep Neural Networks via Model Inversion-based Removal Attacks

Lu, Yifan, Li, Wenxuan, Zhang, Mi, Pan, Xudong, Yang, Min

arXiv.org Artificial IntelligenceSep-6-2023

To protect the intellectual property of well-trained deep neural networks (DNNs), black-box DNN watermarks, which are embedded into the prediction behavior of DNN models on a set of specially-crafted samples, have gained increasing popularity in both academy and industry. Watermark robustness is usually implemented against attackers who steal the protected model and obfuscate its parameters for watermark removal. Recent studies empirically prove the robustness of most black-box watermarking schemes against known removal attempts. In this paper, we propose a novel Model Inversion-based Removal Attack (\textsc{Mira}), which is watermark-agnostic and effective against most of mainstream black-box DNN watermarking schemes. In general, our attack pipeline exploits the internals of the protected model to recover and unlearn the watermark message. We further design target class detection and recovered sample splitting algorithms to reduce the utility loss caused by \textsc{Mira} and achieve data-free watermark removal on half of the watermarking schemes. We conduct comprehensive evaluation of \textsc{Mira} against ten mainstream black-box watermarks on three benchmark datasets and DNN architectures. Compared with six baseline removal attacks, \textsc{Mira} achieves strong watermark removal effects on the covered watermarks, preserving at least $90\%$ of the stolen model utility, under more relaxed or even no assumptions on the dataset availability.

artificial intelligence, machine learning, model inversion-based removal attack, (3 more...)

arXiv.org Artificial Intelligence

2309.03466

Genre: Research Report (1.00)

Industry:

Transportation > Air (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback