Linander, Hampus
Bayesian Predictive Coding
Tschantz, Alexander, Koudahl, Magnus, Linander, Hampus, Da Costa, Lancelot, Heins, Conor, Beck, Jeff, Buckley, Christopher
Predictive coding (PC) is an influential theory of information processing in the brain, providing a biologically plausible alternative to backpropagation. It is motivated in terms of Bayesian inference, as hidden states and parameters are optimised via gradient descent on variational free energy. However, implementations of PC rely on maximum a posteriori (MAP) estimates of hidden states and maximum likelihood (ML) estimates of parameters, limiting their ability to quantify epistemic uncertainty. In this work, we investigate a Bayesian extension to PC that estimates a posterior distribution over network parameters. This approach, termed Bayesian Predictive Coding (BPC), preserves the locality of PC and results in closed-form Hebbian weight updates. Compared to PC, our BPC algorithm converges in fewer epochs in the full-batch setting and remains competitive in the mini-batch setting. Additionally, we demonstrate that BPC offers uncertainty quantification comparable to existing methods in Bayesian deep learning, while also improving convergence properties. Together, these results suggest that BPC provides a biologically plausible method for Bayesian learning in the brain, as well as an attractive approach to uncertainty quantification in deep learning.
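The closed-form Hebbian updates mentioned above can be illustrated with the textbook conjugate Gaussian update for a single linear layer. The sketch below is a minimal stand-in under a linear-Gaussian assumption, not the paper's BPC algorithm; the names `beta`, `Lambda0` and `M0` are illustrative.

```python
import numpy as np

# Conjugate posterior update for weights W in Y ~ N(X @ W, beta^{-1} I).
# Illustrates why such updates are local and Hebbian-like: the data term
# is a correlation of pre- and post-synaptic activity.

def bayesian_linear_update(X, Y, M0, Lambda0, beta=1.0):
    """X: (n, d_in) presynaptic activities, Y: (n, d_out) postsynaptic targets.
    M0: (d_in, d_out) prior mean, Lambda0: (d_in, d_in) prior precision."""
    Lambda_n = Lambda0 + beta * X.T @ X  # precision accumulates evidence
    # beta * X.T @ Y is a pre/post activity correlation, computable with
    # purely local information.
    M_n = np.linalg.solve(Lambda_n, Lambda0 @ M0 + beta * X.T @ Y)
    return M_n, Lambda_n

rng = np.random.default_rng(0)
X = rng.normal(size=(128, 8))
W_true = rng.normal(size=(8, 3))
Y = X @ W_true + 0.1 * rng.normal(size=(128, 3))
M_n, Lambda_n = bayesian_linear_update(X, Y, np.zeros((8, 3)), np.eye(8), beta=100.0)
print(np.abs(M_n - W_true).max())  # posterior mean approaches the true weights
```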
Learning Chern Numbers of Topological Insulators with Gauge Equivariant Neural Networks
Huang, Longde, Balabanov, Oleksandr, Linander, Hampus, Granath, Mats, Persson, Daniel, Gerken, Jan E.
Equivariant network architectures are a well-established tool for predicting invariant or equivariant quantities. However, almost all learning problems considered in this context feature a global symmetry, i.e. each point of the underlying space is transformed with the same group element, as opposed to a local "gauge" symmetry, where each point is transformed with a different group element, exponentially enlarging the size of the symmetry group. Gauge equivariant networks have so far mainly been applied to problems in quantum chromodynamics. Here, we introduce a novel application domain for gauge equivariant networks in the theory of topological condensed matter physics. We use gauge equivariant networks to predict topological invariants (Chern numbers) of multiband topological insulators. The gauge symmetry of the network guarantees that the predicted quantity is a topological invariant. We introduce a novel gauge equivariant normalization layer to stabilize the training and prove a universal approximation theorem for our setup. We train on samples with trivial Chern number only, but show that our models generalize to samples with non-trivial Chern number. We provide various ablations of our setup. Our code is available at https://github.com/sitronsea/GENet/tree/main.
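For context, the classical gauge-invariant way to compute the Chern numbers that the network learns to predict is the Fukui-Hatsugai-Suzuki lattice method. The sketch below applies it to the two-band Qi-Wu-Zhang model (an assumed example, not the paper's multiband data) and checks that a random per-point U(1) gauge transformation leaves the result unchanged.

```python
import numpy as np

# Gauge-invariant Chern number of the lowest band via the
# Fukui-Hatsugai-Suzuki plaquette method on a Brillouin-zone grid.
# Multiplying each eigenvector by an arbitrary phase (a local gauge
# transformation) cancels around every closed plaquette.

def qwz_hamiltonian(kx, ky, m=1.0):
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    return np.sin(kx) * sx + np.sin(ky) * sy + (m + np.cos(kx) + np.cos(ky)) * sz

def chern_number(n=24, m=1.0, random_gauge=False):
    ks = np.linspace(0, 2 * np.pi, n, endpoint=False)
    u = np.empty((n, n, 2), dtype=complex)
    for i, kx in enumerate(ks):
        for j, ky in enumerate(ks):
            _, vecs = np.linalg.eigh(qwz_hamiltonian(kx, ky, m))
            u[i, j] = vecs[:, 0]  # lowest band
    if random_gauge:  # local U(1) transformation, one phase per k-point
        u *= np.exp(2j * np.pi * np.random.rand(n, n))[..., None]
    link = lambda a, b: np.vdot(a, b) / abs(np.vdot(a, b))
    F = 0.0
    for i in range(n):
        for j in range(n):
            ii, jj = (i + 1) % n, (j + 1) % n
            plaq = (link(u[i, j], u[ii, j]) * link(u[ii, j], u[ii, jj])
                    * link(u[ii, jj], u[i, jj]) * link(u[i, jj], u[i, j]))
            F += np.angle(plaq)
    return round(F / (2 * np.pi))

print(chern_number(), chern_number(random_gauge=True))  # same integer either way
```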
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles
Balabanov, Oleksandr, Linander, Hampus
Fine-tuning large language models can improve task-specific performance, although a general understanding of what the fine-tuned model has learned and forgotten, and how far to trust its predictions, is still missing. We derive principled uncertainty quantification for fine-tuned LLMs with posterior approximations using computationally efficient low-rank adaptation (LoRA) ensembles. We analyze three common multiple-choice datasets using LoRA ensembles based on Mistral-7b, and draw quantitative and qualitative conclusions on their perceived complexity and model efficacy on the different target domains during and after fine-tuning. In particular, backed by the numerical experiments, we hypothesise about signals from entropic uncertainty measures for data domains that are inherently difficult for a given architecture to learn.
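The entropic uncertainty measures referred to here typically decompose the ensemble's predictive entropy into an expected (aleatoric) part and a mutual-information (epistemic) part. The sketch below shows that standard decomposition for one multiple-choice question; the member logits are random placeholders, not Mistral-7b outputs.

```python
import numpy as np

# Standard ensemble uncertainty decomposition:
#   total entropy = expected member entropy (aleatoric)
#                 + mutual information (epistemic, member disagreement)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy(p, axis=-1):
    return -(p * np.log(np.clip(p, 1e-12, None))).sum(axis=axis)

def ensemble_uncertainty(member_logits):
    """member_logits: (n_members, n_choices) for one multiple-choice question."""
    p = softmax(member_logits)       # per-member predictive distributions
    p_mean = p.mean(axis=0)          # ensemble predictive distribution
    total = entropy(p_mean)          # predictive (total) entropy
    aleatoric = entropy(p).mean()    # expected per-member entropy
    epistemic = total - aleatoric    # mutual information between answer and member
    return total, aleatoric, epistemic

logits = np.random.default_rng(1).normal(size=(8, 4))  # 8 LoRA members, 4 options
print(ensemble_uncertainty(logits))
```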
HEAL-SWIN: A Vision Transformer On The Sphere
Carlsson, Oscar, Gerken, Jan E., Linander, Hampus, Spieß, Heiner, Ohlsson, Fredrik, Petersson, Christoffer, Persson, Daniel
High-resolution wide-angle fisheye images are becoming increasingly important for robotics applications such as autonomous driving. However, using ordinary convolutional neural networks or vision transformers on this data is problematic due to the projection and distortion losses introduced when projecting onto a rectangular grid on the plane. We introduce the HEAL-SWIN transformer, which combines the highly uniform Hierarchical Equal Area iso-Latitude Pixelation (HEALPix) grid used in astrophysics and cosmology with the Hierarchical Shifted-Window (SWIN) transformer to yield an efficient and flexible model capable of training on high-resolution, distortion-free spherical data. In HEAL-SWIN, the nested structure of the HEALPix grid is used to perform the patching and windowing operations of the SWIN transformer, resulting in a one-dimensional representation of the spherical data with minimal computational overhead. We demonstrate the superior performance of our model on semantic segmentation and depth regression tasks on both synthetic and real automotive datasets.
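The key property exploited here is that, in HEALPix's nested indexing, every aligned block of 4^k consecutive pixels is a hierarchical super-pixel, so SWIN-style patching and windowing reduce to reshapes of the one-dimensional pixel array. A minimal sketch, with illustrative resolution and feature sizes rather than the paper's configuration:

```python
import numpy as np

# In nested HEALPix ordering, the 4 children of any pixel are contiguous,
# and so are the 4**k descendants at level k. Patching the sphere is
# therefore a pure reshape of the flat (1D) pixel array.

nside = 8                              # HEALPix resolution parameter (power of 2)
npix = 12 * nside**2                   # total number of pixels on the sphere
features = np.random.rand(npix, 16)    # per-pixel features, nested order (assumed)

# One subdivision level: groups of 4 sibling pixels, like 2x2 planar patches.
patches = features.reshape(npix // 4, 4, 16)

# Two levels: windows of 16 pixels each, again by pure reshape.
windows = features.reshape(npix // 16, 16, 16)
print(patches.shape, windows.shape)    # (192, 4, 16) (48, 16, 16)
```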
Real-time semantic segmentation on FPGAs for autonomous vehicles with hls4ml
Ghielmetti, Nicolò, Loncar, Vladimir, Pierini, Maurizio, Roed, Marcel, Summers, Sioni, Aarrestad, Thea, Petersson, Christoffer, Linander, Hampus, Ngadiuba, Jennifer, Lin, Kelvin, Harris, Philip
In this paper, we investigate how field-programmable gate arrays can serve as hardware accelerators for real-time semantic segmentation tasks relevant for autonomous driving. Considering compressed versions of the ENet convolutional neural network architecture, we demonstrate a fully on-chip deployment with a latency of 4.9 ms per image, using less than 30% of the available resources on a Xilinx ZCU102 evaluation board. The latency is reduced to 3 ms per image when the batch size is increased to ten, corresponding to the use case where the autonomous vehicle receives inputs from multiple cameras simultaneously. We show that, through aggressive filter reduction, heterogeneous quantization-aware training, and an optimized implementation of convolutional layers, the power consumption and resource utilization can be significantly reduced while maintaining accuracy on the Cityscapes dataset.
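Heterogeneous quantization-aware training of the kind described here is commonly done with QKeras, the quantization library used alongside hls4ml. The sketch below assigns different bit widths to different layers; the bit widths and layer shapes are illustrative assumptions, not the compressed ENet configuration from the paper.

```python
# Heterogeneous quantization-aware training sketch with QKeras:
# each layer gets its own fixed-point bit width, learned weights are
# quantized during training so the FPGA deployment sees no surprise.
from tensorflow.keras import Input, Model
from qkeras import QConv2D, QActivation, quantized_bits, quantized_relu

inp = Input((128, 128, 3))                       # illustrative input size
x = QConv2D(16, 3, padding="same",
            kernel_quantizer=quantized_bits(8, 0, alpha=1),
            bias_quantizer=quantized_bits(8, 0, alpha=1))(inp)
x = QActivation(quantized_relu(8))(x)
x = QConv2D(16, 3, padding="same",               # deeper layer: fewer bits
            kernel_quantizer=quantized_bits(4, 0, alpha=1),
            bias_quantizer=quantized_bits(4, 0, alpha=1))(x)
x = QActivation(quantized_relu(4))(x)
model = Model(inp, x)
model.summary()
```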
Fast convolutional neural networks on FPGAs with hls4ml
Aarrestad, Thea, Loncar, Vladimir, Pierini, Maurizio, Summers, Sioni, Ngadiuba, Jennifer, Petersson, Christoffer, Linander, Hampus, Iiyama, Yutaro, Di Guglielmo, Giuseppe, Duarte, Javier, Harris, Philip, Rankin, Dylan, Jindariani, Sergo, Pedro, Kevin, Tran, Nhan, Liu, Mia, Kreinar, Edward, Wu, Zhenbin, Hoang, Duc
The hls4ml library [1, 2] is open-source software designed to facilitate the deployment of machine learning (ML) models on field-programmable gate arrays (FPGAs), targeting low-latency and low-power edge applications. Taking a neural network model as input, hls4ml generates C/C++ code designed to be transpiled into FPGA firmware by processing it with a high-level synthesis (HLS) library. The development of hls4ml was historically driven by the need to integrate ML algorithms into the first stage of the real-time data processing of particle physics experiments operating at the CERN Large Hadron Collider (LHC). The LHC produces high-energy proton collisions (or events) every 25 ns, each consisting of about 1 MB of raw data. Since this throughput overwhelms the currently available processing and storage resources, the LHC experiments run a real-time event selection system, the so-called Level-1 trigger (L1T), to reduce the event rate from 40 MHz to 100 kHz [3-6]. Due to the size of the buffering system, the L1T system operates with a fixed latency of O(1 µs). While hls4ml excels as a tool to automatically generate low-latency ML firmware for L1T applications, it also offers interesting opportunities for edge-computing applications beyond particle physics whenever efficient, e.g.
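A minimal sketch of this flow with the public hls4ml Python API: convert a small trained Keras model into an HLS project that an HLS tool can then turn into firmware. The FPGA part string and precision defaults below are illustrative choices, not a recommendation from the paper.

```python
# hls4ml flow: Keras model -> per-layer config -> HLS project.
import hls4ml
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense

model = Sequential([Input(shape=(16,)),
                    Dense(32, activation="relu"),
                    Dense(5, activation="softmax")])

# Derive a per-layer configuration (precision, reuse factor) from the model.
config = hls4ml.utils.config_from_keras_model(model, granularity="name")

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir="hls_prj",
    part="xczu9eg-ffvb1156-2-e")  # assumed part string, e.g. the ZCU102 device

hls_model.compile()   # C simulation of the generated code
# hls_model.build()   # runs the full HLS synthesis (requires vendor tools)
```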