Püschel, Markus
Learning DAGs and Root Causes from Time-Series Data
Misiakos, Panagiotis, Püschel, Markus
Many applications produce time-series data: multi-dimensional data measured in regular time steps. Examples include temperature measurements at different sites in meteorology [Yang et al., 2022], stock prices in finance [Kleinberg, 2013, Jiang and Shimizu, 2023], and brain data in medicine [Smith et al., 2011]. A key problem in analyzing time-series data is causal structure discovery, which aims to understand the generation mechanism of such data between nodes and across time [Assaad et al., 2022b, Runge et al., 2023, Gong et al., 2023, Hasan et al., 2023]. One common structural model associates time-series data with directed acyclic graphs (DAGs) that encode how the data in one time step is obtained from prior ones. Our work specifically focuses on learning these DAGs from time-series data [Sun et al., 2023, Gao et al., 2022, Pamfil et al., 2020]. This approach simplifies the broader problem of causal discovery by abstracting away the need for true causal relationships, which often require techniques like interventions. Despite this simplification, DAG learning from time series still poses a challenge due to the complexity of temporal dependencies and the high dimensionality of the data.
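As a rough illustration of the structural model described above, the sketch below generates toy time-series data in which each time step depends on an instantaneous DAG and on the previous step. The matrices W_inst and W_lag, the lag-1 structure, and all constants are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 5, 200                       # number of nodes, number of time steps

# Instantaneous DAG: strictly upper-triangular weights guarantee acyclicity.
W_inst = np.triu(rng.uniform(0.3, 0.8, (d, d)), k=1) * (rng.random((d, d)) < 0.4)
# One time-lagged weight matrix connecting step t-1 to step t.
W_lag = rng.uniform(0.2, 0.5, (d, d)) * (rng.random((d, d)) < 0.3)

X = np.zeros((T, d))
M = np.linalg.inv(np.eye(d) - W_inst)   # propagates instantaneous effects
for t in range(1, T):
    noise = 0.1 * rng.standard_normal(d)
    # x_t = x_t W_inst + x_{t-1} W_lag + noise, solved for x_t:
    X[t] = (X[t - 1] @ W_lag + noise) @ M

# A DAG-learning method would try to recover W_inst and W_lag from X alone.
```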
Causal Fourier Analysis on Directed Acyclic Graphs and Posets
Seifert, Bastian, Wendler, Chris, Püschel, Markus
We present a novel form of Fourier analysis, and associated signal processing concepts, for signals (or data) indexed by edge-weighted directed acyclic graphs (DAGs). This means that our Fourier basis yields an eigendecomposition of a suitable notion of shift and convolution operators that we define. DAGs are the common model to capture causal relationships between data values and in this case our proposed Fourier analysis relates data with its causes under a linearity assumption that we define. The definition of the Fourier transform requires the transitive closure of the weighted DAG for which several forms are possible depending on the interpretation of the edge weights. Examples include level of influence, distance, or pollution distribution. Our framework is different from prior GSP: it is specific to DAGs and leverages, and extends, the classical theory of Moebius inversion from combinatorics. For a prototypical application we consider DAGs modeling dynamic networks in which edges change over time. Specifically, we model the spread of an infection on such a DAG obtained from real-world contact tracing data and learn the infection signal from samples assuming sparsity in the Fourier domain.
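A minimal numerical illustration of the idea, assuming the simplest reading in which the data at each node is the weighted sum of its causes and the transitive closure sums edge weights over all directed paths; the matrix names are ours, and the paper's framework covers other forms of closure as well.

```python
import numpy as np

# Toy weighted DAG on 4 nodes: entry W[i, j] means i causally influences j.
W = np.array([[0, 0.5, 0.0, 0.0],
              [0, 0.0, 0.8, 0.3],
              [0, 0.0, 0.0, 0.4],
              [0, 0.0, 0.0, 0.0]])
n = W.shape[0]

# One possible transitive closure: sum of weights over all directed paths,
# including the empty path (the identity). Well-defined since W is nilpotent.
C = np.linalg.inv(np.eye(n) - W)

# Toy cause signal: only node 0 generates something.
causes = np.array([1.0, 0.0, 0.0, 0.0])

# Data = causes percolated through the DAG (the linearity assumption).
x = C.T @ causes

# The Fourier transform in this reading is a Moebius-style inversion
# of the closure, recovering the causes from the data.
recovered = np.linalg.inv(C.T) @ x
print(np.allclose(recovered, causes))   # True
```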
QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models
Pegolotti, Tommaso, Frantar, Elias, Alistarh, Dan, Püschel, Markus
Focusing specifically on generative inference, where the size of the weights is the main bottleneck, the currently best-performing method is GPTQ (Frantar et al., 2022), which achieves near-lossless quantization to 4-bit weights, and can even accurately support 2- and 3-bit weights by reducing the granularity to smaller weight groups, e.g., by jointly quantizing blocks of 64 weights using a shared scale and zero-point. Our approach is informed by the target architecture and a performance model, including both hardware characteristics and method-specific accuracy constraints. Results on CPU-based inference for LLaMA models show that our approach can lead to high performance and high accuracy, comparing favorably to the best existing open-source solution.
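To illustrate the weight format mentioned above (groups of weights sharing a scale and zero-point), here is a plain round-to-nearest sketch; it is not QIGen's generated kernels and omits GPTQ's error-correcting quantization procedure.

```python
import numpy as np

def quantize_blocks(w, bits=4, group=64):
    """Round-to-nearest quantization of a 1-D weight vector in groups that
    share a scale and zero-point. Illustrative only: GPTQ additionally
    corrects quantization error, and QIGen generates tuned kernels for this
    format rather than using plain NumPy."""
    qmax = 2 ** bits - 1
    w = w.reshape(-1, group)
    lo = w.min(axis=1, keepdims=True)
    hi = w.max(axis=1, keepdims=True)
    scale = (hi - lo) / qmax
    zero = np.round(-lo / scale)
    q = np.clip(np.round(w / scale) + zero, 0, qmax)   # stored as integers
    dequant = (q - zero) * scale                       # used at inference
    return q.astype(np.uint8), scale, zero, dequant.reshape(-1)

w = np.random.randn(1024).astype(np.float32)
q, scale, zero, w_hat = quantize_blocks(w)
print(np.abs(w - w_hat).max())   # per-group quantization error
```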
Learning DAGs from Data with Few Root Causes
Misiakos, Panagiotis, Wendler, Chris, Püschel, Markus
We present a novel perspective and algorithm for learning directed acyclic graphs (DAGs) from data generated by a linear structural equation model (SEM). First, we show that a linear SEM can be viewed as a linear transform that, in prior work, computes the data from a dense input vector of random-valued root causes (as we will call them) associated with the nodes. Instead, we consider the case of (approximately) few root causes and also introduce noise in the measurement of the data. Intuitively, this means that the DAG data is produced by few data-generating events whose effect percolates through the DAG. We prove identifiability in this new setting and show that the true DAG is the global minimizer of the $L^0$-norm of the vector of root causes. For data with few root causes, with and without noise, we show superior performance compared to prior DAG learning methods.
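A small sketch of the data model as we read it from the abstract: a sparse matrix of root causes is pushed through the closed-form solution of the linear SEM, and measurement noise is added afterwards. Dimensions, sparsity levels, and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 8, 100                          # nodes, samples

# Weighted DAG: strictly upper triangular => acyclic.
A = np.triu(rng.uniform(0.4, 0.9, (d, d)), k=1) * (rng.random((d, d)) < 0.3)

# Few root causes: a sparse input matrix C (most entries are zero).
C = rng.standard_normal((n, d)) * (rng.random((n, d)) < 0.1)

# Data: each root cause percolates through the DAG (linear SEM, closed form),
# with a small amount of measurement noise added afterwards.
X = C @ np.linalg.inv(np.eye(d) - A) + 0.01 * rng.standard_normal((n, d))

# The abstract's premise: the true DAG minimizes the L0-norm of the implied
# root causes C_hat = X (I - A); here we only check that C_hat is sparse.
C_hat = X @ (np.eye(d) - A)
print(np.mean(np.abs(C_hat) > 0.05), np.mean(C != 0))
```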
Precise Multi-Neuron Abstractions for Neural Network Certification
Müller, Mark Niklas, Makarchuk, Gleb, Singh, Gagandeep, Püschel, Markus, Vechev, Martin
Formal verification of neural networks is critical for their safe adoption in real-world applications. However, designing a verifier which can handle realistic networks in a precise manner remains an open and difficult challenge. In this paper, we take a major step in addressing this challenge and present a new framework, called PRIMA, that computes precise convex approximations of arbitrary non-linear activations. PRIMA is based on novel approximation algorithms that compute the convex hull of polytopes, leveraging concepts from computational geometry. The algorithms have polynomial complexity, yield fewer constraints, and minimize precision loss. We evaluate the effectiveness of PRIMA on challenging neural networks with ReLU, Sigmoid, and Tanh activations. Our results show that PRIMA is significantly more precise than the state-of-the-art, verifying robustness for up to 16%, 30%, and 34% more images than prior work on ReLU-, Sigmoid-, and Tanh-based networks, respectively.
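For context, the standard single-neuron ReLU relaxation below is the kind of convex approximation that multi-neuron methods refine; it is much weaker than PRIMA's joint convex hulls over groups of neurons and is shown only as a baseline illustration.

```python
def relu_triangle(l, u):
    """Single-neuron convex relaxation of y = max(x, 0) over x in [l, u].
    Returns linear lower/upper bounds as (slope, intercept) pairs. This is
    the classical baseline that multi-neuron methods such as PRIMA refine."""
    if u <= 0:                       # ReLU is identically 0 on [l, u]
        return (0.0, 0.0), (0.0, 0.0)
    if l >= 0:                       # ReLU is the identity on [l, u]
        return (1.0, 0.0), (1.0, 0.0)
    lam = u / (u - l)                # slope of the upper chord
    upper = (lam, -lam * l)          # y <= lam * (x - l)
    lower = (0.0, 0.0)               # one common sound choice: y >= 0
    return lower, upper

print(relu_triangle(-1.0, 2.0))
```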
Learning Set Functions that are Sparse in Non-Orthogonal Fourier Bases
Wendler, Chris, Amrollahi, Andisheh, Seifert, Bastian, Krause, Andreas, Püschel, Markus
Many applications of machine learning on discrete domains, such as learning preference functions in recommender systems or auctions, can be reduced to estimating a set function that is sparse in the Fourier domain. In this work, we present a new family of algorithms for learning Fourier-sparse set functions. They require at most $nk - k \log_2 k + k$ queries (set function evaluations), under mild conditions on the Fourier coefficients, where $n$ is the size of the ground set and $k$ the number of non-zero Fourier coefficients. In contrast to other work that focused on the orthogonal Walsh-Hadamard transform, our novel algorithms operate with recently introduced non-orthogonal Fourier transforms that offer different notions of Fourier-sparsity. These naturally arise when modeling, e.g., sets of items forming substitutes and complements. We demonstrate effectiveness on several real-world applications.
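One concrete notion of non-orthogonal Fourier-sparsity can be illustrated with the subset (zeta/Moebius) transform pair sketched below; this is an illustrative example in the spirit of the framework, not the paper's algorithms or its query model.

```python
import numpy as np
from itertools import combinations

n = 3
N = list(range(n))
subsets = [frozenset(c) for r in range(n + 1) for c in combinations(N, r)]

# A set function that is sparse in a non-orthogonal basis:
# s(A) = sum of coefficients of all basis subsets contained in A.
coeffs = {frozenset({0, 1}): 2.0, frozenset({2}): 1.0}
s = {A: sum(v for B, v in coeffs.items() if B <= A) for A in subsets}

def moebius(s):
    """Moebius inversion over the subset lattice: recovers the sparse
    coefficients from the set function values."""
    return {A: sum((-1) ** (len(A) - len(B)) * s[B]
                   for B in subsets if B <= A) for A in subsets}

shat = moebius(s)
print({A: v for A, v in shat.items() if abs(v) > 1e-9})
# Only the two nonzero coefficients survive: the function is 2-sparse.
```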
Neural Network Robustness Verification on GPUs
Müller, Christoph, Singh, Gagandeep, Püschel, Markus, Vechev, Martin
Certifying the robustness of neural networks against adversarial attacks is critical to their reliable adoption in real-world systems including autonomous driving and medical diagnosis. Unfortunately, state-of-the-art verifiers either do not scale to larger networks or are too imprecise to prove robustness, which limits their practical adoption. In this work, we introduce GPUPoly, a scalable verifier that can prove the robustness of significantly larger deep neural networks than possible with prior work. The key insight behind GPUPoly is the design of custom, sound polyhedra algorithms for neural network verification on a GPU. Our algorithms leverage the available GPU parallelism and the inherent sparsity of the underlying neural network verification task. GPUPoly scales to very large networks: for example, it can prove the robustness of a 1M neuron, 34-layer deep residual network in about 1 minute. We believe GPUPoly is a promising step towards the practical verification of large real-world networks.
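GPUPoly's actual algorithms are custom polyhedra transformers; as a much simpler illustration of a sound bound computation expressed as dense, GPU-friendly linear algebra, consider interval propagation through one affine layer followed by ReLU. This is far less precise than what GPUPoly implements and is shown only to convey the flavor of batched, parallelizable verification arithmetic.

```python
import numpy as np

def affine_relu_bounds(lo, hi, W, b):
    """Sound interval bounds for relu(W @ x + b) given element-wise input
    bounds lo <= x <= hi. Uses only dense matrix products, the kind of
    operation that maps well onto a GPU; much less precise than the
    polyhedra transformers GPUPoly actually uses."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    out_lo = W_pos @ lo + W_neg @ hi + b
    out_hi = W_pos @ hi + W_neg @ lo + b
    return np.maximum(out_lo, 0), np.maximum(out_hi, 0)

rng = np.random.default_rng(0)
W, b = rng.standard_normal((16, 8)), rng.standard_normal(16)
x = rng.standard_normal(8)
lo, hi = affine_relu_bounds(x - 0.1, x + 0.1, W, b)
print(np.all(lo <= hi))   # True: the bounds are consistent
```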
Powerset Convolutional Neural Networks
Wendler, Chris, Alistarh, Dan, Püschel, Markus
We present a novel class of convolutional neural networks (CNNs) for set functions, i.e., data indexed with the powerset of a finite set. The convolutions are derived as linear, shift-equivariant functions for various notions of shifts on set functions. The framework is fundamentally different from graph convolutions based on the Laplacian, as it provides not one but several basic shifts, one for each element in the ground set. Prototypical experiments with several set function classification tasks on synthetic datasets and on datasets derived from real-world hypergraphs demonstrate the potential of our new powerset CNNs.
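A rough sketch of one plausible shift on set functions and the convolution it induces; the paper defines several such models, and the exact definitions there may differ from this illustrative choice.

```python
import numpy as np
from itertools import combinations

n = 3
N = list(range(n))
subsets = [frozenset(c) for r in range(n + 1) for c in combinations(N, r)]
index = {A: i for i, A in enumerate(subsets)}

def powerset_conv(h, s):
    """One plausible powerset convolution, (h * s)(A) = sum_Q h(Q) s(A \\ Q),
    built from shifts of the form (T_q s)(A) = s(A \\ {q}), i.e., one shift
    per ground-set element. Illustrative only; the paper studies several
    shift models and builds CNN layers from the resulting convolutions."""
    out = np.zeros(len(subsets))
    for A in subsets:
        out[index[A]] = sum(h[index[Q]] * s[index[A - Q]] for Q in subsets)
    return out

s = np.arange(len(subsets), dtype=float)      # a toy set function
h = np.zeros(len(subsets))                    # a filter supported on {0}
h[index[frozenset({0}))]] = 1.0 if False else 1.0
print(powerset_conv(h, s))
```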
On Linear Learning with Manycore Processors
Wszola, Eliza, Dünner, Celestine, Jaggi, Martin, Püschel, Markus
A new generation of manycore processors is on the rise that offers dozens and more cores on a chip and, in a sense, fuses host processor and accelerator. In this paper we target the efficient training of generalized linear models on these machines. We propose a novel approach for achieving parallelism which we call Heterogeneous Tasks on Homogeneous Cores (HTHC). It divides the problem into multiple fundamentally different tasks, which themselves are parallelized. For evaluation, we design a detailed, architecture-cognizant implementation of our scheme on a recent 72-core Knights Landing processor that is adaptive to the cache, memory, and core structure. Experiments for Lasso and SVM with different data sets show a speedup of typically an order of magnitude compared to straightforward parallel implementations in C++.
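The two-task split can be sketched roughly as follows for Lasso: a scoring task picks promising coordinates and an update task runs coordinate descent on them. The selection heuristic, block size, and sequential alternation here are placeholders; HTHC runs the two task types concurrently on disjoint groups of cores with an architecture-aware implementation.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def hthc_style_lasso(A, y, lam, epochs=20, block=10):
    """Rough sketch of the two-task split: a 'scoring' task ranks
    coordinates, and an 'update' task runs coordinate descent on the chosen
    block. In HTHC the tasks run concurrently on disjoint core groups of the
    manycore chip; here they simply alternate."""
    n, d = A.shape
    x = np.zeros(d)
    r = y - A @ x                       # residual, maintained incrementally
    col_sq = (A ** 2).sum(axis=0)
    for _ in range(epochs):
        # Task B (scoring): rank coordinates by gradient magnitude.
        scores = np.abs(A.T @ r)
        top = np.argsort(-scores)[:block]
        # Task A (update): coordinate descent restricted to the block.
        for j in top:
            old = x[j]
            x[j] = soft_threshold(old + A[:, j] @ r / col_sq[j],
                                  lam / col_sq[j])
            r -= A[:, j] * (x[j] - old)
    return x

rng = np.random.default_rng(0)
A, y = rng.standard_normal((200, 50)), rng.standard_normal(200)
print(np.count_nonzero(hthc_style_lasso(A, y, lam=5.0)))
```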
Fast and Effective Robustness Certification
Singh, Gagandeep, Gehr, Timon, Mirman, Matthew, Püschel, Markus, Vechev, Martin
We present a new method and system, called DeepZ, for certifying neural network robustness based on abstract interpretation. Compared to state-of-the-art automated verifiers for neural networks, DeepZ: (i) handles ReLU, Tanh and Sigmoid activation functions, (ii) supports feedforward, convolutional, and residual architectures, (iii) is significantly more scalable and precise, and (iv) is sound with respect to floating point arithmetic. These benefits are due to carefully designed approximations tailored to the setting of neural networks. As an example, DeepZ achieves a verification accuracy of 97% on a large network with 88,500 hidden units under an $L_\infty$ attack with $\epsilon = 0.1$, with an average runtime of 133 seconds.
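DeepZ works in a zonotope abstract domain; the affine-layer transformer below is the standard exact step for zonotopes, while DeepZ's tailored activation approximations and floating-point soundness measures are the paper's contribution and are not reproduced here.

```python
import numpy as np

def zonotope_affine(center, gens, W, b):
    """Exact affine transformer for a zonotope {center + gens.T @ e, |e| <= 1}:
    both the center and every generator pass through the linear map."""
    return W @ center + b, gens @ W.T

def zonotope_bounds(center, gens):
    """Element-wise interval concretization of a zonotope."""
    radius = np.abs(gens).sum(axis=0)
    return center - radius, center + radius

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
center, gens = x, 0.1 * np.eye(8)          # L_inf ball of radius 0.1 around x
W, b = rng.standard_normal((4, 8)), rng.standard_normal(4)
c2, g2 = zonotope_affine(center, gens, W, b)
print(zonotope_bounds(c2, g2))             # sound output bounds of the layer
```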