AITopics | Banff

Collaborating Authors

Banff

Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees

Delgrange, Florent, Nowé, Ann, Pérez, Guillermo A.

arXiv.org Artificial IntelligenceApr-21-2023

Although deep reinforcement learning (DRL) has many success stories, the large-scale deployment of policies learned through these advanced techniques in safety-critical scenarios is hindered by their lack of formal guarantees. Variational Markov Decision Processes (VAE-MDPs) are discrete latent space models that provide a reliable framework for distilling formally verifiable controllers from any RL policy. While the related guarantees address relevant practical aspects such as the satisfaction of performance and safety properties, the VAE approach suffers from several learning flaws (posterior collapse, slow learning speed, poor dynamics estimates), primarily due to the absence of abstraction and representation guarantees to support latent optimization. We introduce the Wasserstein auto-encoded MDP (WAE-MDP), a latent space model that fixes those issues by minimizing a penalized form of the optimal transport between the behaviors of the agent executing the original policy and the distilled policy, for which the formal guarantees apply. Our approach yields bisimulation guarantees while learning the distilled policy, allowing concrete optimization of the abstraction and representation model quality. Our experiments show that, besides distilling policies up to 10 times faster, the latent model quality is indeed better in general. Moreover, we present experiments from a simple time-to-failure verification algorithm on the latent space. The fact that our approach enables such simple verification techniques highlights its applicability.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2303.12558

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Austria > Vienna (0.14)
(17 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
(2 more...)

Add feedback

Classification and Uncertainty Quantification of Corrupted Data using Semi-Supervised Autoencoders

Joppich, Philipp, Dorn, Sebastian, De Candido, Oliver, Utschick, Wolfgang, Knollmüller, Jakob

arXiv.org Artificial IntelligenceApr-20-2023

Parametric and non-parametric classifiers often have to deal with real-world data, where corruptions like noise, occlusions, and blur are unavoidable - posing significant challenges. We present a probabilistic approach to classify strongly corrupted data and quantify uncertainty, despite the model only having been trained with uncorrupted data. A semi-supervised autoencoder trained on uncorrupted data is the underlying architecture. We use the decoding part as a generative model for realistic data and extend it by convolutions, masking, and additive Gaussian noise to describe imperfections. This constitutes a statistical inference task in terms of the optimal latent space activations of the underlying uncorrupted datum. We solve this problem approximately with Metric Gaussian Variational Inference (MGVI). The supervision of the autoencoder's latent space allows us to classify corrupted data directly under uncertainty with the statistically inferred latent space activations. Furthermore, we demonstrate that the model uncertainty strongly depends on whether the classification is correct or wrong, setting a basis for a statistical "lie detector" of the classification. Independent of that, we show that the generative model can optimally restore the uncorrupted datum by decoding the inferred latent space activations.

artificial intelligence, classification, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.3390/psf2022005012

2105.13393

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
North America > Canada > Quebec > Montreal (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
(8 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.98)

Add feedback

Masked Language Model Based Textual Adversarial Example Detection

Zhang, Xiaomei, Zhang, Zhaoxi, Zhong, Qi, Zheng, Xufei, Zhang, Yanjun, Hu, Shengshan, Zhang, Leo Yu

arXiv.org Artificial IntelligenceApr-19-2023

Adversarial attacks are a serious threat to the reliable deployment of machine learning models in safety-critical applications. They can misguide current models to predict incorrectly by slightly modifying the inputs. Recently, substantial work has shown that adversarial examples tend to deviate from the underlying data manifold of normal examples, whereas pre-trained masked language models can fit the manifold of normal NLP data. To explore how to use the masked language model in adversarial detection, we propose a novel textual adversarial example detection method, namely Masked Language Model-based Detection (MLMD), which can produce clearly distinguishable signals between normal examples and adversarial examples by exploring the changes in manifolds induced by the masked language model. MLMD features a plug and play usage (i.e., no need to retrain the victim model) for adversarial defense and it is agnostic to classification tasks, victim model's architectures, and to-be-defended attack methods. We evaluate MLMD on various benchmark textual datasets, widely studied machine learning models, and state-of-the-art (SOTA) adversarial attacks (in total $3*4*4 = 48$ settings). Experimental results show that MLMD can achieve strong performance, with detection accuracy up to 0.984, 0.967, and 0.901 on AG-NEWS, IMDB, and SST-2 datasets, respectively. Additionally, MLMD is superior, or at least comparable to, the SOTA detection defenses in detection accuracy and F1 score. Among many defenses based on the off-manifold assumption of adversarial examples, this work offers a new angle for capturing the manifold change. The code for this work is openly accessible at \url{https://github.com/mlmddetection/MLMDdetection}.

machine learning, manifold, natural language, (19 more...)

arXiv.org Artificial Intelligence

2304.08767

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Chongqing Province > Chongqing (0.04)
(21 more...)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Evil from Within: Machine Learning Backdoors through Hardware Trojans

Warnecke, Alexander, Speith, Julian, Möller, Jan-Niklas, Rieck, Konrad, Paar, Christof

arXiv.org Artificial IntelligenceApr-18-2023

Backdoors pose a serious threat to machine learning, as they can compromise the integrity of security-critical systems, such as self-driving cars. While different defenses have been proposed to address this threat, they all rely on the assumption that the hardware on which the learning models are executed during inference is trusted. In this paper, we challenge this assumption and introduce a backdoor attack that completely resides within a common hardware accelerator for machine learning. Outside of the accelerator, neither the learning model nor the software is manipulated, so that current defenses fail. To make this attack practical, we overcome two challenges: First, as memory on a hardware accelerator is severely limited, we introduce the concept of a minimal backdoor that deviates as little as possible from the original model and is activated by replacing a few model parameters only. Second, we develop a configurable hardware trojan that can be provisioned with the backdoor and performs a replacement only when the specific target model is processed. We demonstrate the practical feasibility of our attack by implanting our hardware trojan into the Xilinx Vitis AI DPU, a commercial machine-learning accelerator. We configure the trojan with a minimal backdoor for a traffic-sign recognition system. The backdoor replaces only 30 (0.069%) model parameters, yet it reliably manipulates the recognition once the input contains a backdoor trigger. Our attack expands the hardware circuit of the accelerator by 0.24% and induces no run-time overhead, rendering a detection hardly possible. Given the complex and highly distributed manufacturing process of current hardware, our work points to a new threat in machine learning that is inaccessible to current security mechanisms and calls for hardware to be manufactured only in fully trusted environments.

artificial intelligence, backdoor, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2304.08411

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > District of Columbia > Washington (0.14)
(38 more...)

Genre: Research Report (1.00)

Industry:

Semiconductors & Electronics (1.00)
Information Technology > Security & Privacy (1.00)
Transportation > Ground > Road (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Order-Disorder: Imitation Adversarial Attacks for Black-box Neural Ranking Models

Liu, Jiawei, Kang, Yangyang, Tang, Di, Song, Kaisong, Sun, Changlong, Wang, Xiaofeng, Lu, Wei, Liu, Xiaozhong

arXiv.org Artificial IntelligenceApr-18-2023

Neural text ranking models have witnessed significant advancement and are increasingly being deployed in practice. Unfortunately, they also inherit adversarial vulnerabilities of general neural models, which have been detected but remain underexplored by prior studies. Moreover, the inherit adversarial vulnerabilities might be leveraged by blackhat SEO to defeat better-protected search engines. In this study, we propose an imitation adversarial attack on black-box neural passage ranking models. We first show that the target passage ranking model can be transparentized and imitated by enumerating critical queries/candidates and then train a ranking imitation model. Leveraging the ranking imitation model, we can elaborately manipulate the ranking results and transfer the manipulation attack to the target ranking model. For this purpose, we propose an innovative gradient-based attack method, empowered by the pairwise objective function, to generate adversarial triggers, which causes premeditated disorderliness with very few tokens. To equip the trigger camouflages, we add the next sentence prediction loss and the language model fluency constraint to the objective function. Experimental results on passage ranking demonstrate the effectiveness of the ranking imitation attack model and adversarial triggers against various SOTA neural ranking models. Furthermore, various mitigation analyses and human evaluation show the effectiveness of camouflages when facing potential mitigation approaches. To motivate other scholars to further investigate this novel and important problem, we make the experiment data and code publicly available.

information retrieval, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2209.06506

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(24 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

Add feedback

Likelihood-Based Generative Radiance Field with Latent Space Energy-Based Model for 3D-Aware Disentangled Image Representation

Zhu, Yaxuan, Xie, Jianwen, Li, Ping

arXiv.org Machine LearningApr-16-2023

We propose the NeRF-LEBM, a likelihood-based top-down 3D-aware 2D image generative model that incorporates 3D representation via Neural Radiance Fields (NeRF) and 2D imaging process via differentiable volume rendering. The model represents an image as a rendering process from 3D object to 2D image and is conditioned on some latent variables that account for object characteristics and are assumed to follow informative trainable energy-based prior models. We propose two likelihood-based learning frameworks to train the NeRF-LEBM: (i) maximum likelihood estimation with Markov chain Monte Carlo-based inference and (ii) variational inference with the reparameterization trick. We study our models in the scenarios with both known and unknown camera poses. Experiments on several benchmark datasets demonstrate that the NeRF-LEBM can infer 3D object structures from 2D images, generate 2D images with novel views and objects, learn from incomplete 2D images, and learn from 2D images with known or unknown camera poses.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2304.07918

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
(18 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
(2 more...)

Add feedback

Mining ℰℒ⊥ Bases with Adaptable Role Depth

Guimarães, Ricardo (Department of Informatics, University of Bergen) | Ozaki, Ana (Department of Informatics, University of Bergen) | Persia, Cosimo (Department of Informatics, University of Bergen) | Sertkaya, Baris (a:1:{s:5:"en_US";s:9:"Prof. Dr.";})

Journal of Artificial Intelligence ResearchApr-13-2023

In Formal Concept Analysis, a base for a finite structure is a set of implications that characterizes all valid implications of the structure. This notion can be adapted to the context of Description Logic, where the base consists of a set of concept inclusions instead of implications. In this setting, concept expressions can be arbitrarily large. Thus, it is not clear whether a finite base exists and, if so, how large concept expressions may need to be. We first revisit results in the literature for mining ℰℒ⊥ bases from finite interpretations. Those mainly focus on finding a finite base or on fixing the role depth but potentially losing some of the valid concept inclusions with higher role depth. We then present a new strategy for mining ℰℒ⊥ bases which is adaptable in the sense that it can bound the role depth of concepts depending on the local structure of the interpretation. Our strategy guarantees to capture all ℰℒ⊥ concept inclusions holding in the interpretation, not only the ones up to a fixed role depth. We also consider the case of confident ℰℒ⊥ bases, which requires that some proportion of the domain of the interpretation satisfies the base, instead of the whole domain. This case is useful to cope with noisy data.

interpretation, mmsc, vertex, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.13777

AI Access Foundation

13777

Journal of Artificial Intelligence Research

Country:

Europe > Germany > Saxony > Dresden (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
(9 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Description Logic (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.68)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.67)

Add feedback

On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence

Mai, Gengchen, Huang, Weiming, Sun, Jin, Song, Suhang, Mishra, Deepak, Liu, Ninghao, Gao, Song, Liu, Tianming, Cong, Gao, Hu, Yingjie, Cundy, Chris, Li, Ziyuan, Zhu, Rui, Lao, Ni

arXiv.org Artificial IntelligenceApr-13-2023

Large pre-trained models, also known as foundation models (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or even zero-shot learning. Despite their successes in language and vision tasks, we have yet seen an attempt to develop foundation models for geospatial artificial intelligence (GeoAI). In this work, we explore the promises and challenges of developing multimodal foundation models for GeoAI. We first investigate the potential of many existing FMs by testing their performances on seven tasks across multiple geospatial subdomains including Geospatial Semantics, Health Geography, Urban Geography, and Remote Sensing. Our results indicate that on several geospatial tasks that only involve text modality such as toponym recognition, location description recognition, and US state-level/county-level dementia time series forecasting, these task-agnostic LLMs can outperform task-specific fully-supervised models in a zero-shot or few-shot learning setting. However, on other geospatial tasks, especially tasks that involve multiple data modalities (e.g., POI-based urban function classification, street view image-based urban noise intensity classification, and remote sensing image scene classification), existing foundation models still underperform task-specific models. Based on these observations, we propose that one of the major challenges of developing a FM for GeoAI is to address the multimodality nature of geospatial tasks. After discussing the distinct challenges of each geospatial data modality, we suggest the possibility of a multimodal foundation model which can reason over various types of geospatial data through geospatial alignments. We conclude this paper by discussing the unique risks and challenges to develop such a model for GeoAI.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2304.06798

Country:

North America > United States > Washington (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States > California > Santa Barbara County (0.05)
(25 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Energy > Power Industry (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

PAD: Towards Principled Adversarial Malware Detection Against Evasion Attacks

Li, Deqiang, Cui, Shicheng, Li, Yun, Xu, Jia, Xiao, Fu, Xu, Shouhuai

arXiv.org Artificial IntelligenceApr-6-2023

Machine Learning (ML) techniques can facilitate the automation of malicious software (malware for short) detection, but suffer from evasion attacks. Many studies counter such attacks in heuristic manners, lacking theoretical guarantees and defense effectiveness. In this paper, we propose a new adversarial training framework, termed Principled Adversarial Malware Detection (PAD), which offers convergence guarantees for robust optimization methods. PAD lays on a learnable convex measurement that quantifies distribution-wise discrete perturbations to protect malware detectors from adversaries, whereby for smooth detectors, adversarial training can be performed with theoretical treatments. To promote defense effectiveness, we propose a new mixture of attacks to instantiate PAD to enhance deep neural network-based measurements and malware detectors. Experimental results on two Android malware datasets demonstrate: (i) the proposed method significantly outperforms the state-of-the-art defenses; (ii) it can harden ML-based malware detection against 27 evasion attacks with detection accuracies greater than 83.45%, at the price of suffering an accuracy decrease smaller than 2.16% in the absence of attacks; (iii) it matches or outperforms many anti-malware scanners in VirusTotal against realistic adversarial malware.

artificial intelligence, detector, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TDSC.2023.3265665

2302.11328

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(14 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Plug & Play Directed Evolution of Proteins with Gradient-based Discrete MCMC

Emami, Patrick, Perreault, Aidan, Law, Jeffrey, Biagioni, David, John, Peter C. St.

arXiv.org Artificial IntelligenceApr-6-2023

A long-standing goal of machine-learning-based protein engineering is to accelerate the discovery of novel mutations that improve the function of a known protein. We introduce a sampling framework for evolving proteins in silico that supports mixing and matching a variety of unsupervised models, such as protein language models, and supervised models that predict protein function from sequence. By composing these models, we aim to improve our ability to evaluate unseen mutations and constrain search to regions of sequence space likely to contain functional proteins. Our framework achieves this without any model fine-tuning or re-training by constructing a product of experts distribution directly in discrete protein space. Instead of resorting to brute force search or random sampling, which is typical of classic directed evolution, we introduce a fast MCMC sampler that uses gradients to propose promising mutations. We conduct in silico directed evolution experiments on wide fitness landscapes and across a range of different pre-trained unsupervised models, including a 650M parameter protein language model. Our results demonstrate an ability to efficiently discover variants with high evolutionary likelihood as well as estimated activity multiple mutations away from a wild type protein, suggesting our sampler provides a practical and effective new paradigm for machine-learning-based protein engineering.

machine learning, natural language, variant, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/2632-2153/accacd

2212.09925

Country:

Europe > France (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback