AITopics

2411.00446

Country:

Europe (1.00)
North America > United States (0.92)

Genre: Research Report (0.64)

Industry:

Energy (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Artificial IntelligenceOct-30-2024

Does equivariance matter at scale?

Brehmer, Johann, Behrends, Sönke, de Haan, Pim, Cohen, Taco

Given large data sets and sufficient compute, is it beneficial to design neural architectures for the structure and symmetries of each problem? Or is it more efficient to learn them from data? We study empirically how equivariant and non-equivariant networks scale with compute and training samples. Focusing on a benchmark problem of rigid-body interactions and on general-purpose transformer architectures, we perform a series of experiments, varying the model size, training steps, and dataset size. We find evidence for three conclusions. First, equivariance improves data efficiency, but training non-equivariant models with data augmentation can close this gap given sufficient epochs. Second, scaling with compute follows a power law, with equivariant models outperforming non-equivariant ones at each tested compute budget. Finally, the optimal allocation of a compute budget onto model size and training duration differs between equivariant and non-equivariant models.

artificial intelligence, machine learning, natural language, (17 more...)

2410.23179

Country: Europe > Spain (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Leisure & Entertainment (0.67)
Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

arXiv.org Artificial IntelligenceOct-16-2024

Continuous normalizing flows for lattice gauge theories

Gerdes, Mathis, de Haan, Pim, Bondesan, Roberto, Cheng, Miranda C. N.

Continuous normalizing flows are known to be highly expressive and flexible, which allows for easier incorporation of large symmetries and makes them a powerful tool for sampling in lattice field theories. Building on previous work, we present a general continuous normalizing flow architecture for matrix Lie groups that is equivariant under group transformations. We apply this to lattice gauge theories in two dimensions as a proof-of-principle and demonstrate competitive performance, showing its potential as a tool for future lattice sampling tasks.

artificial intelligence, gauge theory, machine learning, (17 more...)

2410.13161

Country: Europe (0.28)

Genre: Research Report (0.64)

Industry: Energy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

arXiv.org Machine LearningOct-10-2024

Noether's razor: Learning Conserved Quantities

van der Ouderaa, Tycho F. A., van der Wilk, Mark, de Haan, Pim

Symmetries have proven useful in machine learning models, improving generalisation and overall performance. At the same time, recent advancements in learning dynamical systems rely on modelling the underlying Hamiltonian to guarantee the conservation of energy. These approaches can be connected via a seminal result in mathematical physics: Noether's theorem, which states that symmetries in a dynamical system correspond to conserved quantities. This work uses Noether's theorem to parameterise symmetries as learnable conserved quantities. We then allow conserved quantities and associated symmetries to be learned directly from train data through approximate Bayesian model selection, jointly with the regular training procedure. As training objective, we derive a variational lower bound to the marginal likelihood. The objective automatically embodies an Occam's Razor effect that avoids collapse of conservation laws to the trivial constant, without the need to manually add and tune additional regularisers. We demonstrate a proof-ofprinciple on n-harmonic oscillators and n-body systems. We find that our method correctly identifies the correct conserved quantities and U(n) and SE(n) symmetry groups, improving overall performance and predictive accuracy on test data.

artificial intelligence, bayesian inference, machine learning, (14 more...)

2410.08087

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

arXiv.org Machine LearningJul-9-2024

Lorentz-Equivariant Geometric Algebra Transformers for High-Energy Physics

Spinner, Jonas, Bresó, Victor, de Haan, Pim, Plehn, Tilman, Thaler, Jesse, Brehmer, Johann

Extracting scientific understanding from particle-physics experiments requires solving diverse learning problems with high precision and good data efficiency. We propose the Lorentz Geometric Algebra Transformer (L-GATr), a new multi-purpose architecture for high-energy physics. L-GATr represents high-energy data in a geometric algebra over four-dimensional space-time and is equivariant under Lorentz transformations, the symmetry group of relativistic kinematics. At the same time, the architecture is a Transformer, which makes it versatile and scalable to large systems. L-GATr is first demonstrated on regression and classification tasks from particle physics. We then construct the first Lorentz-equivariant generative model: a continuous normalizing flow based on an L-GATr network, trained with Riemannian flow matching. Across our experiments, L-GATr is on par with or outperforms strong domain-specific baselines.

artificial intelligence, deep learning, machine learning, (18 more...)

2405.14806

Country:

North America > United States (0.46)
Europe > Germany (0.46)

Genre: Research Report (0.82)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

arXiv.org Artificial IntelligenceDec-20-2023

Learning Lattice Quantum Field Theories with Equivariant Continuous Flows

Gerdes, Mathis, de Haan, Pim, Rainone, Corrado, Bondesan, Roberto, Cheng, Miranda C. N.

We propose a novel machine learning method for sampling from the high-dimensional probability distributions of Lattice Field Theories, which is based on a single neural ODE layer and incorporates the full symmetries of the problem. We test our model on the $\phi^4$ theory, showing that it systematically outperforms previously proposed flow-based methods in sampling efficiency, and the improvement is especially pronounced for larger lattices. Furthermore, we demonstrate that our model can learn a continuous family of theories at once, and the results of learning can be transferred to larger lattices. Such generalizations further accentuate the advantages of machine learning methods.

artificial intelligence, lattice, machine learning, (16 more...)

doi: 10.21468/SciPostPhys.15.6.238

2207.00283

Country:

Europe > Netherlands (0.14)
Asia > Taiwan (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial IntelligenceDec-6-2023

FoMo Rewards: Can we cast foundation models as reward functions?

Lubana, Ekdeep Singh, Brehmer, Johann, de Haan, Pim, Cohen, Taco

We explore the viability of casting foundation models as generic reward functions for reinforcement learning. To this end, we propose a simple pipeline that interfaces an off-the-shelf vision model with a large language model. Specifically, given a trajectory of observations, we infer the likelihood of an instruction describing the task that the user wants an agent to perform. We show that this generic likelihood function exhibits the characteristics ideally expected from a reward function: it associates high values with the desired behaviour and lower values for several similar, but incorrect policies. Overall, our work opens the possibility of designing open-ended agents for interactive tasks via foundation models.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

2312.03881

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

arXiv.org Machine LearningNov-20-2023

Geometric Algebra Transformer

Brehmer, Johann, de Haan, Pim, Behrends, Sönke, Cohen, Taco

Problems involving geometric data arise in physics, chemistry, robotics, computer vision, and many other fields. Such data can take numerous forms, for instance points, direction vectors, translations, or rotations, but to date there is no single architecture that can be applied to such a wide variety of geometric types while respecting their symmetries. In this paper we introduce the Geometric Algebra Transformer (GATr), a general-purpose architecture for geometric data. GATr represents inputs, outputs, and hidden states in the projective geometric (or Clifford) algebra, which offers an efficient 16-dimensional vector-space representation of common geometric objects as well as operators acting on them. GATr is equivariant with respect to E(3), the symmetry group of 3D Euclidean space. As a Transformer, GATr is versatile, efficient, and scalable. We demonstrate GATr in problems from n-body modeling to wall-shear-stress estimation on large arterial meshes to robotic motion planning. GATr consistently outperforms both non-geometric and equivariant baselines in terms of error, data efficiency, and scalability.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2305.18415

Country: Europe (0.46)

Genre: Research Report (0.63)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

arXiv.org Artificial IntelligenceNov-8-2023

Euclidean, Projective, Conformal: Choosing a Geometric Algebra for Equivariant Transformers

de Haan, Pim, Cohen, Taco, Brehmer, Johann

The Geometric Algebra Transformer (GATr) is a versatile architecture for geometric deep learning based on projective geometric algebra. We generalize this architecture into a blueprint that allows one to construct a scalable transformer architecture given any geometric (or Clifford) algebra. We study versions of this architecture for Euclidean, projective, and conformal algebras, all of which are suited to represent 3D data, and evaluate them in theory and practice. The simplest Euclidean architecture is computationally cheap, but has a smaller symmetry group and is not as sample-efficient, while the projective model is not sufficiently expressive. Both the conformal algebra and an improved version of the projective algebra define powerful, performant architectures.

artificial intelligence, equivariant transformer, machine learning, (6 more...)

2311.04744

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

arXiv.org Machine LearningOct-19-2023

EDGI: Equivariant Diffusion for Planning with Embodied Agents

Brehmer, Johann, Bose, Joey, de Haan, Pim, Cohen, Taco

Embodied agents operate in a structured world, often solving tasks with spatial, temporal, and permutation symmetries. Most algorithms for planning and model-based reinforcement learning (MBRL) do not take this rich geometric structure into account, leading to sample inefficiency and poor generalization. We introduce the Equivariant Diffuser for Generating Interactions (EDGI), an algorithm for MBRL and planning that is equivariant with respect to the product of the spatial symmetry group SE(3), the discrete-time translation group Z, and the object permutation group Sn. EDGI follows the Diffuser framework (Janner et al., 2022) in treating both learning a world model and planning in it as a conditional generative modeling problem, training a diffusion model on an offline trajectory dataset. We introduce a new SE(3)xZxSn-equivariant diffusion model that supports multiple representations. We integrate this model in a planning loop, where conditioning and classifier guidance let us softly break the symmetry for specific tasks as needed. On object manipulation and navigation tasks, EDGI is substantially more sample efficient and generalizes better across the symmetry group than non-equivariant models.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2303.1241

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(2 more...)