Paulus, Anselm
LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
Paulus, Anselm, Martius, Georg, Musil, Vít
Embedding parameterized optimization problems as layers into machine learning architectures serves as a powerful inductive bias. Training such architectures with stochastic gradient descent requires care, as degenerate derivatives of the embedded optimization problem often render the gradients uninformative. We propose Lagrangian Proximal Gradient Descent (LPGD), a flexible framework for training architectures with embedded optimization layers that seamlessly integrates into automatic differentiation libraries. LPGD efficiently computes meaningful replacements of the degenerate optimization layer derivatives by re-running the forward solver oracle on a perturbed input. LPGD captures various previously proposed methods as special cases.

Training such a parameterized optimization model is an instance of bi-level optimization (Gould et al., 2016), which is generally challenging. Whenever it is possible to propagate gradients through the optimization problem via an informative derivative of the solution mapping, the task is typically approached with standard stochastic gradient descent (GD) (Amos & Kolter, 2017a; Agrawal et al., 2019b). However, when the optimization problem has discrete solutions, the derivatives are typically degenerate, as small perturbations of the input do not affect the optimal solution. Previous works have proposed several methods to overcome this challenge, ranging from differentiable relaxations (Wang et al., 2019; Wilder et al., 2019a; Mandi & Guns, 2020; Djolonga & Krause, 2017) and stochastic smoothing (Berthet et al., 2020; Dalle et al., 2022), over proxy losses (Paulus et al., 2021), to finite-difference based techniques (Vlastelica et al., 2020).
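The core mechanism, computing a replacement derivative by re-running the forward solver on a perturbed input, can be sketched as a custom autograd function. The following is a minimal illustration and not the authors' implementation: `solver` is assumed to be any callable mapping a cost tensor to a solution tensor of the same shape, and `tau` is a hypothetical perturbation step size.

```python
import torch

class PerturbedSolverLayer(torch.autograd.Function):
    """Sketch: replace the degenerate derivative of an optimization layer
    by a finite difference of two solver calls (assumption-based example)."""

    @staticmethod
    def forward(ctx, cost, solver, tau):
        y = solver(cost)                      # forward solver oracle
        ctx.save_for_backward(cost, y)
        ctx.solver, ctx.tau = solver, tau
        return y

    @staticmethod
    def backward(ctx, grad_output):
        cost, y = ctx.saved_tensors
        # Re-run the solver on an input perturbed in the direction of the
        # incoming gradient; the solution difference acts as the gradient.
        y_perturbed = ctx.solver(cost + ctx.tau * grad_output)
        grad_cost = (y_perturbed - y) / ctx.tau
        return grad_cost, None, None          # no gradients for solver, tau

# Usage sketch: y = PerturbedSolverLayer.apply(cost, solver, 0.1)
```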
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
Paulus, Anselm, Zharmagambetov, Arman, Guo, Chuan, Amos, Brandon, Tian, Yuandong
While Large Language Models (LLMs) have recently achieved remarkable successes, they are vulnerable to certain jailbreaking attacks that lead to the generation of inappropriate or harmful content. Manual red-teaming requires finding adversarial prompts that cause such jailbreaking, e.g. by appending a suffix to a given instruction, which is inefficient and time-consuming. On the other hand, automatic adversarial prompt generation often leads to semantically meaningless attacks that can easily be detected by perplexity-based filters, may require gradient information from the TargetLLM, or do not scale well due to time-consuming discrete optimization processes over the token space. In this paper, we present a novel method that uses another LLM, called the AdvPrompter, to generate human-readable adversarial prompts in seconds, $\sim800\times$ faster than existing optimization-based approaches. We train the AdvPrompter using a novel algorithm that does not require access to the gradients of the TargetLLM. This process alternates between two steps: (1) generating high-quality target adversarial suffixes by optimizing the AdvPrompter predictions, and (2) low-rank fine-tuning of the AdvPrompter with the generated adversarial suffixes. The trained AdvPrompter generates suffixes that veil the input instruction without changing its meaning, such that the TargetLLM is lured into giving a harmful response. Experiments on popular open-source TargetLLMs show state-of-the-art results on the AdvBench dataset that also transfer to closed-source black-box LLM APIs. Further, we demonstrate that by fine-tuning on a synthetic dataset generated by AdvPrompter, LLMs can be made more robust against jailbreaking attacks while maintaining performance, i.e. high MMLU scores.
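The alternating scheme described in the abstract can be outlined as a single training round. This is a schematic sketch only; the callables `candidate_suffixes`, `jailbreak_score`, and `lora_finetune` are hypothetical placeholders to be supplied by the user, not the authors' API, and only the two-step structure follows the description above.

```python
def advprompter_round(advprompter, target_llm, instructions,
                      candidate_suffixes, jailbreak_score, lora_finetune):
    """Schematic sketch of one alternating training round (hypothetical helpers)."""
    # Step (1): generate high-quality target adversarial suffixes by searching
    # over AdvPrompter predictions, scored via TargetLLM outputs only
    # (no TargetLLM gradients are required).
    targets = []
    for instruction in instructions:
        candidates = candidate_suffixes(advprompter, instruction)
        best = max(candidates,
                   key=lambda s: jailbreak_score(target_llm, instruction, s))
        targets.append((instruction, best))

    # Step (2): low-rank (LoRA) fine-tuning of the AdvPrompter on the
    # generated (instruction, suffix) pairs.
    lora_finetune(advprompter, targets)
    return targets
```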
Backpropagation through Combinatorial Algorithms: Identity with Projection Works
Sahoo, Subham Sekhar, Paulus, Anselm, Vlastelica, Marin, Musil, Vít, Kuleshov, Volodymyr, Martius, Georg
Embedding discrete solvers as differentiable layers has given modern deep learning architectures combinatorial expressivity and discrete reasoning capabilities. The derivative of these solvers is zero or undefined; therefore, a meaningful replacement is crucial for effective gradient-based learning. Prior works rely on smoothing the solver with input perturbations, relaxing the solver to continuous problems, or interpolating the loss landscape with techniques that typically require additional solver calls, introduce extra hyper-parameters, or compromise performance. We propose a principled approach that exploits the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass, and we provide a theoretical justification. Our experiments demonstrate that this straightforward, hyper-parameter-free approach competes with previous, more complex methods across numerous tasks such as backpropagation through discrete samplers, deep graph matching, and image retrieval. Furthermore, we substitute the previously proposed problem-specific and label-dependent margin with a generic regularization procedure that prevents cost collapse and increases robustness.
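The backward-pass rule itself is compact enough to sketch. Below is a minimal, assumption-based illustration (not the reference implementation): `solver` is assumed to map a cost vector to a discrete solution of the same shape, and L2 normalization stands in for the generic projection that prevents cost collapse.

```python
import torch

class NegativeIdentitySolver(torch.autograd.Function):
    """Sketch: treat a discrete solver as the negative identity on the backward pass."""

    @staticmethod
    def forward(ctx, cost, solver):
        return solver(cost)                   # black-box discrete solver call

    @staticmethod
    def backward(ctx, grad_output):
        # Negative identity: pass the incoming gradient through with flipped sign.
        return -grad_output, None

def combinatorial_layer(cost, solver):
    # Project the costs (here: L2 normalization) before the solver call as a
    # generic regularization against cost collapse.
    cost = cost / (cost.norm(dim=-1, keepdim=True) + 1e-8)
    return NegativeIdentitySolver.apply(cost, solver)
```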
CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints
Paulus, Anselm, Rolínek, Michal, Musil, Vít, Amos, Brandon, Martius, Georg
Bridging logical and algorithmic reasoning with modern machine learning techniques is a fundamental challenge with potentially transformative impact. On the algorithmic side, many NP-hard problems can be expressed as integer programs, in which the constraints play the role of their "combinatorial specification." In this work, we aim to integrate integer programming solvers into neural network architectures as layers capable of learning both the cost terms and the constraints. The resulting end-to-end trainable architectures jointly extract features from raw data and solve a suitable (learned) combinatorial problem with state-of-the-art integer programming solvers. We demonstrate the potential of such layers with an extensive performance analysis on synthetic data and on a competitive computer vision keypoint matching benchmark.
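As a structural illustration of such a layer, the sketch below holds learnable constraint parameters and calls an off-the-shelf ILP solver in the forward pass. It is assumption-based and incomplete: `scipy.optimize.milp` merely stands in for a state-of-the-art solver, and the paper's gradient construction for costs and constraints (the actual learning signal) is omitted.

```python
import numpy as np
import torch
from scipy.optimize import Bounds, LinearConstraint, milp

class LearnableILPLayer(torch.nn.Module):
    """Sketch of an ILP layer with learnable constraints (forward pass only)."""

    def __init__(self, num_vars, num_constraints):
        super().__init__()
        # Learnable constraint set  A y <= b  (the "combinatorial specification").
        self.A = torch.nn.Parameter(torch.randn(num_constraints, num_vars))
        self.b = torch.nn.Parameter(torch.randn(num_constraints))

    def forward(self, cost):
        # Solve  min_y  cost^T y   s.t.  A y <= b,  y in {0, 1}^n.
        # Infeasibility handling and the backward pass are omitted in this sketch.
        res = milp(
            c=cost.detach().numpy(),
            constraints=[LinearConstraint(self.A.detach().numpy(),
                                          ub=self.b.detach().numpy())],
            integrality=np.ones(cost.numel()),
            bounds=Bounds(0.0, 1.0),
        )
        return torch.as_tensor(res.x, dtype=cost.dtype)
```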