AITopics | Overview

Collaborating Authors

Overview

Automatic differentiation in ML: Where we are and where we should be going

Bart van Merrienboer, Olivier Breuleux, Arnaud Bergeron, Pascal Lamblin

Neural Information Processing SystemsMay-24-2025, 07:22:44 GMT

We review the current state of automatic differentiation (AD) for array programming in machine learning (ML), including the different approaches such as operator overloading (OO) and source transformation (ST) used for AD, graph-based intermediate representations for programs, and source languages. Based on these insights, we introduce a new graph-based intermediate representation (IR) which specifically aims to efficiently support fully-general AD for array programming. Unlike existing dataflow programming representations in ML frameworks, our IR naturally supports function calls, higher-order functions and recursion, making ML models easier to implement. The ability to represent closures allows us to perform AD using ST without a tape, making the resulting derivative (adjoint) program amenable to ahead-of-time optimization using tools from functional language compilers, and enabling higher-order derivatives. Lastly, we introduce a proof of concept compiler toolchain called Myia which uses a subset of Python as a front end.

artificial intelligence, machine learning, programming language, (21 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.14)
Europe > Germany (0.14)

Genre:

Overview (0.68)
Research Report (0.48)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Encoding Time-Series Explanations through Self-Supervised Model Behavior Consistency

Neural Information Processing SystemsMay-24-2025, 06:08:17 GMT

Interpreting time series models is uniquely challenging because it requires identifying both the location of time series signals that drive model predictions and their matching to an interpretable temporal pattern. While explainers from other modalities can be applied to time series, their inductive biases do not transfer well to the inherently challenging interpretation of time series.

data mining, explanation, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Overview (0.67)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

63cb9921eecf51bfad27a99b2c53dd6d-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsMay-24-2025, 04:13:58 GMT

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia (0.67)
North America > United States > California (0.45)
North America > United States > Washington > King County > Seattle (0.13)

Genre:

Research Report > New Finding (1.00)
Overview (0.92)
Research Report > Experimental Study (0.67)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Law (1.00)
(6 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(6 more...)

Add feedback

63461de0b4cb760fc498e85b18a7fe81-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsMay-24-2025, 03:53:29 GMT

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Research Report (0.67)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback

631bb9434d718ea309af82566347d607-Paper-Conference.pdf

Neural Information Processing SystemsMay-24-2025, 03:47:53 GMT

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia (0.93)
North America > Canada (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Workflow (0.94)
Overview (0.68)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.93)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.92)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness

Neural Information Processing SystemsMay-24-2025, 00:47:45 GMT

Ensemble approaches for uncertainty estimation have recently been applied to the tasks of misclassification detection, out-of-distribution input detection and adversarial attack detection. Prior Networks have been proposed as an approach to efficiently emulate an ensemble of models for classification by parameterising a Dirichlet prior distribution over output distributions. These models have been shown to outperform alternative ensemble approaches, such as Monte-Carlo Dropout, on the task of out-of-distribution input detection. However, scaling Prior Networks to complex datasets with many classes is difficult using the training criteria originally proposed. This paper makes two contributions. First, we show that the appropriate training criterion for Prior Networks is the reverse KLdivergence between Dirichlet distributions. This addresses issues in the nature of the training data target distributions, enabling prior networks to be successfully trained on classification tasks with arbitrarily many classes, as well as improving out-of-distribution detection performance. Second, taking advantage of this new training criterion, this paper investigates using Prior Networks to detect adversarial attacks and proposes a generalized form of adversarial training. It is shown that the construction of successful adaptive whitebox attacks, which affect the prediction and evade detection, against Prior Networks trained on CIFAR-10 and CIFAR-100 using the proposed approach requires a greater amount of computational effort than against networks defended using standard adversarial training or MC-dropout.

artificial intelligence, machine learning, survey article, (21 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report (0.86)
Overview (0.54)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.80)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Deep Generative Model for Periodic Graphs

Neural Information Processing SystemsMay-24-2025, 00:27:35 GMT

Periodic graphs are graphs consisting of repetitive local structures, such as crystal nets and polygon mesh. Their generative modeling has great potential in real-world applications such as material design and graphics synthesis. Classical models either rely on domain-specific predefined generation principles (e.g., in crystal net design), or follow geometry-based prescribed rules. Recently, deep generative models have shown great promise in automatically generating general graphs. However, their advancement into periodic graphs has not been well explored due to several key challenges in 1) maintaining graph periodicity; 2) disentangling local and global patterns; and 3) efficiency in learning repetitive patterns. To address them, this paper proposes Periodical-Graph Disentangled Variational Auto-encoder (PGD-VAE), a new deep generative model for periodic graphs that can automatically learn, disentangle, and generate local and global graph patterns.

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England (0.14)

Genre: Overview (0.67)

Industry:

Information Technology (0.46)
Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.82)

Add feedback

Hierarchical Decision Making by Generating and Following Natural Language Instructions

Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis

Neural Information Processing SystemsMay-24-2025, 00:17:25 GMT

We explore using natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making. Rather than directly selecting micro-actions, our agent first generates a plan in natural language, which is then executed by a separate model. We introduce a challenging real-time strategy game environment in which the actions of a large number of units must be coordinated across long time scales. We gather a dataset of 76 thousand pairs of instructions and executions from human play, and train instructor and executor models. Experiments show that models generate intermediate plans in natural langauge significantly outperform models that directly imitate human actions. The compositional structure of language is conducive to learning generalizable action representations.

machine learning, natural language, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

Oceania > Australia (0.14)
North America > United States (0.14)
North America > Canada (0.14)
Europe > Sweden (0.14)

Genre: Overview (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.46)

Add feedback

StressID: a Multimodal Dataset for Stress Identification Michele Panariello 1 Bianca D'Alpaos

Neural Information Processing SystemsMay-24-2025, 00:14:32 GMT

StressID is a new dataset specifically designed for stress identification from unimodal and multimodal data. It contains videos of facial expressions, audio recordings, and physiological signals. The video and audio recordings are acquired using an RGB camera with an integrated microphone. The physiological data is composed of electrocardiography (ECG), electrodermal activity (EDA), and respiration signals that are recorded and monitored using a wearable device. This experimental setup ensures a synchronized and high-quality multimodal data collection.

artificial intelligence, machine learning, participant, (19 more...)

Neural Information Processing Systems

Country: