AITopics | Gerken, Jan E.

Collaborating Authors

Gerken, Jan E.

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Chern Numbers of Topological Insulators with Gauge Equivariant Neural Networks

Huang, Longde, Balabanov, Oleksandr, Linander, Hampus, Granath, Mats, Persson, Daniel, Gerken, Jan E.

arXiv.org Artificial IntelligenceFeb-21-2025

Equivariant network architectures are a well-established tool for predicting invariant or equivariant quantities. However, almost all learning problems considered in this context feature a global symmetry, i.e. each point of the underlying space is transformed with the same group element, as opposed to a local ``gauge'' symmetry, where each point is transformed with a different group element, exponentially enlarging the size of the symmetry group. Gauge equivariant networks have so far mainly been applied to problems in quantum chromodynamics. Here, we introduce a novel application domain for gauge-equivariant networks in the theory of topological condensed matter physics. We use gauge equivariant networks to predict topological invariants (Chern numbers) of multiband topological insulators. The gauge symmetry of the network guarantees that the predicted quantity is a topological invariant. We introduce a novel gauge equivariant normalization layer to stabilize the training and prove a universal approximation theorem for our setup. We train on samples with trivial Chern number only but show that our models generalize to samples with non-trivial Chern number. We provide various ablations of our setup. Our code is available at https://github.com/sitronsea/GENet/tree/main.

artificial intelligence, chern number, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2502.15376

Country: Europe > Sweden (0.28)

Genre:

Research Report (0.64)
Overview (0.48)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Emergent Equivariance in Deep Ensembles

Gerken, Jan E., Kessel, Pan

arXiv.org Artificial IntelligenceJun-15-2024

We show that deep ensembles become equivariant for all inputs and at all training times by simply using data augmentation. Crucially, equivariance holds off-manifold and for any architecture in the infinite width limit. The equivariance is emergent in the sense that predictions of individual ensemble members are not equivariant but their collective prediction is. Neural tangent kernel theory is used to derive this result and we verify our theoretical insights using detailed numerical experiments.

artificial intelligence, ensemble, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2403.03103

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Equivariant Neural Tangent Kernels

Misof, Philipp, Kessel, Pan, Gerken, Jan E.

arXiv.org Artificial IntelligenceJun-10-2024

Equivariant neural networks have in recent years become an important technique for guiding architecture selection for neural networks with many applications in domains ranging from medical image analysis to quantum chemistry. In particular, as the most general linear equivariant layers with respect to the regular representation, group convolutions have been highly impactful in numerous applications. Although equivariant architectures have been studied extensively, much less is known about the training dynamics of equivariant neural networks. Concurrently, neural tangent kernels (NTKs) have emerged as a powerful tool to analytically understand the training dynamics of wide neural networks. In this work, we combine these two fields for the first time by giving explicit expressions for NTKs of group convolutional neural networks. In numerical experiments, we demonstrate superior performance for equivariant NTKs over non-equivariant NTKs on a classification task for medical images.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2406.06504

Country:

Europe > Sweden (0.14)
Europe > Switzerland (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

HEAL-SWIN: A Vision Transformer On The Sphere

Carlsson, Oscar, Gerken, Jan E., Linander, Hampus, Spieß, Heiner, Ohlsson, Fredrik, Petersson, Christoffer, Persson, Daniel

arXiv.org Artificial IntelligenceJul-14-2023

High-resolution wide-angle fisheye images are becoming more and more important for robotics applications such as autonomous driving. However, using ordinary convolutional neural networks or vision transformers on this data is problematic due to projection and distortion losses introduced when projecting to a rectangular grid on the plane. We introduce the HEAL-SWIN transformer, which combines the highly uniform Hierarchical Equal Area iso-Latitude Pixelation (HEALPix) grid used in astrophysics and cosmology with the Hierarchical Shifted-Window (SWIN) transformer to yield an efficient and flexible model capable of training on high-resolution, distortion-free spherical data. In HEAL-SWIN, the nested structure of the HEALPix grid is used to perform the patching and windowing operations of the SWIN transformer, resulting in a one-dimensional representation of the spherical data with minimal computational overhead. We demonstrate the superior performance of our model for semantic segmentation and depth regression tasks on both synthetic and real automotive datasets.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2307.07313

Country:

Europe > Sweden (0.30)
Europe > Germany (0.28)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Road (0.49)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback