AITopics | chapter 3

Collaborating Authors

chapter 3 Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Ratio-based Loss Functions

Helgerth, Lena, Christmann, Andreas

arXiv.org Machine LearningMay-8-2026

Algorithms in machine learning and AI do critically depend on at least three key components: (i) the risk function, which is the expectation of the loss function, (ii) the function space, which is often called the hypothesis space, and (iii) the set of probability measures, which are allowed for the specified algorithm. This paper gives a survey of a certain class of loss functions, which we call ratio-based. In supervised learning, margin-based loss functions for classification tasks depending on the product of the output values $y_i$ and the predictions $f(x_i)$ as well as distance-based loss functions depending on the difference of $y_i$ and $f(x_i)$ for regression are common. Distance-based loss functions are in particular useful, if an additive model assumption seems plausible, i.e. the common signal plus noise assumption. However, in the literature, several loss functions proposed for regression purposes have a multiplicative error structure in mind and pay attention to relative errors, i.e. to the ratio of $y_i$ and $f(x_i)$. In this survey article, we systematically investigate such ratio-based loss functions and propose a few new losses, which may be interesting for future research. We concentrate on investigating general properties of ratio-based loss functions like continuity, Lipschitz-continuity, convexity, and differentiability, because these properties play a central role in most machine learning algorithms. Therefore, we do not focus on some specific machine learning algorithm to derive universal consistency, learning rates, or stability results. Instead, we want to enable future research in this direction.

artificial intelligence, machine learning, survey article, (19 more...)

arXiv.org Machine Learning

2605.05808

Country: Europe (0.28)

Genre: Overview (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Add feedback

Transformers for Tabular Data: A Training Perspective of Self-Attention via Optimal Transport

Candelieri, Antonio, Quadrio, Alessandro

arXiv.org Machine LearningDec-11-2025

This thesis examines self-attention training through the lens of Optimal Transport (OT) and develops an OT-based alternative for tabular classification. The study tracks intermediate projections of the self-attention layer during training and evaluates their evolution using discrete OT metrics, including Wasserstein distance, Monge gap, optimality, and efficiency. Experiments are conducted on classification tasks with two and three classes, as well as on a biomedical dataset. Results indicate that the final self-attention mapping often approximates the OT optimal coupling, yet the training trajectory remains inefficient. Pretraining the MLP section on synthetic data partially improves convergence but is sensitive to their initialization. To address these limitations, an OT-based algorithm is introduced: it generates class-specific dummy Gaussian distributions, computes an OT alignment with the data, and trains an MLP to generalize this mapping. The method achieves accuracy comparable to Transformers while reducing computational cost and scaling more efficiently under standardized inputs, though its performance depends on careful dummy-geometry design. All experiments and implementations are conducted in R.

ransformer, representation, self-attention, (15 more...)

arXiv.org Machine Learning

2512.0953

Country: Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Architectures for Building Agentic AI

Nowaczyk, Sławomir

arXiv.org Artificial IntelligenceDec-11-2025

This chapter argues that the reliability of agentic and generative AI is chiefly an architectural property. We define agentic systems as goal-directed, tool-using decision makers operating in closed loops, and show how reliability emerges from principled componentisation (goal manager, planner, tool-router, executor, memory, verifiers, safety monitor, telemetry), disciplined interfaces (schema-constrained, validated, least-privilege tool calls), and explicit control and assurance loops. Building on classical foundations, we propose a practical taxonomy-tool-using agents, memory-augmented agents, planning and self-improvement agents, multi-agent systems, and embodied or web agents - and analyse how each pattern reshapes the reliability envelope and failure modes. We distil design guidance on typed schemas, idempotency, permissioning, transactional semantics, memory provenance and hygiene, runtime governance (budgets, termination conditions), and simulate-before-actuate safeguards.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2512.09458

Country: Europe > Austria (0.28)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (0.68)
Health & Medicine > Consumer Health (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Benchmarking LLM Agents for Wealth-Management Workflows

Milsom, Rory

arXiv.org Artificial IntelligenceDec-3-2025

Modern work relies on an assortment of digital collaboration tools, yet routine processes continue to suffer from human error and delay. To address this gap, this dissertation extends TheAgentCompany with a finance-focused environment and investigates whether a general purpose LLM agent can complete representative wealth-management tasks both accurately and economically. This study introduces synthetic domain data, enriches colleague simulations, and prototypes an automatic task-generation pipeline. The study aims to create and assess an evaluation set that can meaningfully measure an agent's fitness for assistant-level wealth management work. We construct a benchmark of 12 task-pairs for wealth management assistants spanning retrieval, analysis, and synthesis/communication, with explicit acceptance criteria and deterministic graders. We seeded a set of new finance-specific data and introduced a high vs. low-autonomy variant of every task. The paper concluded that agents are limited less by mathematical reasoning and more so by end-to-end workflow reliability, and meaningfully affected by autonomy level, and that incorrect evaluation of models have hindered benchmarking.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2512.0223

Genre:

Workflow (1.00)
Research Report (0.82)

Industry: Banking & Finance > Financial Services (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

An unsupervised tour through the hidden pathways of deep neural networks

Doimo, Diego

arXiv.org Artificial IntelligenceOct-29-2025

The goal of this thesis is to improve our understanding of the internal mechanisms by which deep artificial neural networks create meaningful representations and are able to generalize. We focus on the challenge of characterizing the semantic content of the hidden representations with unsupervised learning tools, partially developed by us and described in this thesis, which allow harnessing the low-dimensional structure of the data. Chapter 2. introduces Gride, a method that allows estimating the intrinsic dimension of the data as an explicit function of the scale without performing any decimation of the data set. Our approach is based on rigorous distributional results that enable the quantification of uncertainty of the estimates. Moreover, our method is simple and computationally efficient since it relies only on the distances among nearest data points. In Chapter 3, we study the evolution of the probability density across the hidden layers in some state-of-the-art deep neural networks. We find that the initial layers generate a unimodal probability density getting rid of any structure irrelevant to classification. In subsequent layers, density peaks arise in a hierarchical fashion that mirrors the semantic hierarchy of the concepts. This process leaves a footprint in the probability density of the output layer, where the topography of the peaks allows reconstructing the semantic relationships of the categories. In Chapter 4, we study the problem of generalization in deep neural networks: adding parameters to a network that interpolates its training data will typically improve its generalization performance, at odds with the classical bias-variance trade-off. We show that wide neural networks learn redundant representations instead of overfitting to spurious correlation and that redundant neurons appear only if the network is regularized and the training error is zero.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Artificial Intelligence

2510.21582

Country: North America > United States (0.92)

Genre: Research Report > New Finding (0.92)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Money, muscles and anxiety: why the manosphere clicked with young men – a visual deep dive

The GuardianOct-21-2025, 07:09:16 GMT

You are on slide 3 of chapter 7. Use right arrow to continue. Alternatively, use the open square bracket key and close square bracket key to navigate, and disable left arrow and right arrow key navigation.

key and close square bracket, right arrow key navigation, square bracket key and close, (8 more...)

The Guardian

Country:

North America > United States (0.68)
Oceania > Australia (0.04)
Europe > United Kingdom > England > Leicestershire > Loughborough (0.04)
(6 more...)

Industry:

Leisure & Entertainment > Sports (1.00)
Media > News (0.93)
Education > Educational Setting (0.68)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Add feedback

Introduction to Regularization and Learning Methods for Inverse Problems

Bednarski, Danielle, Roith, Tim

arXiv.org Artificial IntelligenceAug-26-2025

These lecture notes evolve around mathematical concepts arising in inverse problems. We start by introducing inverse problems through examples such as differentiation, deconvolution, computed tomography and phase retrieval. This then leads us to the framework of well-posedness and first considerations regarding reconstruction and inversion approaches. The second chapter then first deals with classical regularization theory of inverse problems in Hilbert spaces. After introducing the pseudo-inverse, we review the concept of convergent regularization. Within this chapter we then proceed to ask the question of how to realize practical reconstruction algorithms. Here, we mainly focus on Tikhonov and sparsity promoting regularization in finite dimensional spaces. In the third chapter, we dive into modern deep-learning methods, which allow solving inverse problems in a data-dependent approach. The intersection between inverse problems and machine learning is a rapidly growing field and our exposition here restricts itself to a very limited selection of topics. Among them are learned regularization, fully-learned Bayesian estimation, post-processing strategies and plug-n-play methods.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.18178

Country:

Europe > Germany (0.45)
Europe > United Kingdom > England (0.27)

Genre:

Overview (0.87)
Instructional Material > Course Syllabus & Notes (0.68)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

A Deep Autoregressive Model for Dynamic Combinatorial Complexes

Tuna, Ata

arXiv.org Artificial IntelligenceMar-3-2025

We introduce DAMCC (Deep Autoregressive Model for Dynamic Combinatorial Complexes), the first deep learning model designed to generate dynamic combinatorial complexes (CCs). Unlike traditional graph-based models, CCs capture higher-order interactions, making them ideal for representing social networks, biological systems, and evolving infrastructures. While existing models primarily focus on static graphs, DAMCC addresses the challenge of modeling temporal dynamics and higher-order structures in dynamic networks. DAMCC employs an autoregressive framework to predict the evolution of CCs over time. Through comprehensive experiments on real-world and synthetic datasets, we demonstrate its ability to capture both temporal and higher-order dependencies. As the first model of its kind, DAMCC lays the foundation for future advancements in dynamic combinatorial complex modeling, with opportunities for improved scalability and efficiency on larger networks.

dataset, graph, matrix, (16 more...)

arXiv.org Artificial Intelligence

2503.01999

Country:

North America > United States (0.14)
Europe > United Kingdom > England (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

LLMs to Support a Domain Specific Knowledge Assistant

Lovin, Maria-Flavia

arXiv.org Artificial IntelligenceFeb-6-2025

This work presents a custom approach to developing a domain specific knowledge assistant for sustainability reporting using the International Financial Reporting Standards (IFRS). In this domain, there is no publicly available question-answer dataset, which has impeded the development of a high-quality chatbot to support companies with IFRS reporting. The two key contributions of this project therefore are: (1) A high-quality synthetic question-answer (QA) dataset based on IFRS sustainability standards, created using a novel generation and evaluation pipeline leveraging Large Language Models (LLMs). This comprises 1,063 diverse QA pairs that address a wide spectrum of potential user queries in sustainability reporting. Various LLM-based techniques are employed to create the dataset, including chain-of-thought reasoning and few-shot prompting. A custom evaluation framework is developed to assess question and answer quality across multiple dimensions, including faithfulness, relevance, and domain specificity. The dataset averages a score range of 8.16 out of 10 on these metrics. (2) Two architectures for question-answering in the sustainability reporting domain - a RAG pipeline and a fully LLM-based pipeline. The architectures are developed by experimenting, fine-tuning, and training on the QA dataset. The final pipelines feature an LLM fine-tuned on domain specific data and an industry classification component to improve the handling of complex queries. The RAG architecture achieves an accuracy of 85.32% on single-industry and 72.15% on cross-industry multiple-choice questions, outperforming the baseline approach by 4.67 and 19.21 percentage points, respectively. The LLM-based pipeline achieves an accuracy of 93.45% on single-industry and 80.30% on cross-industry multiple-choice questions, an improvement of 12.80 and 27.36 percentage points over the baseline, respectively.

community relations, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2502.04095

Country:

Asia (0.67)
North America > United States (0.27)

Genre:

Questionnaire & Opinion Survey (0.87)
Public Relations > Community Relations (0.75)
Research Report > New Finding (0.45)

Industry:

Transportation (1.00)
Law (1.00)
Energy > Renewable (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mathematics of Neural Networks (Lecture Notes Graduate Course)

Smets, Bart M. N.

arXiv.org Artificial IntelligenceMar-6-2024

These are the lecture notes that accompanied the course of the same name that I taught at the Eindhoven University of Technology from 2021 to 2023. The course is intended as an introduction to neural networks for mathematics students at the graduate level and aims to make mathematics students interested in further researching neural networks. It consists of two parts: first a general introduction to deep learning that focuses on introducing the field in a formal mathematical way. The second part provides an introduction to the theory of Lie groups and homogeneous spaces and how it can be applied to design neural networks with desirable geometric equivariances. The lecture notes were made to be as self-contained as possible so as to accessible for any student with a moderate mathematics background. The course also included coding tutorials and assignments in the form of a set of Jupyter notebooks that are publicly available at https://gitlab.com/bsmetsjr/mathematics_of_neural_networks.

activation function, neural network, operator, (17 more...)

arXiv.org Artificial Intelligence

2403.04807

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.24)
Europe > Netherlands > South Holland > Dordrecht (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
(2 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.53)
Health & Medicine > Diagnostic Medicine > Imaging (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback