AITopics | Müller, Thomas

Collaborating Authors

Müller, Thomas

Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators

Mahaut, Matéo, Aina, Laura, Czarnowska, Paula, Hardalov, Momchil, Müller, Thomas, Màrquez, Lluís

arXiv.org Artificial Intelligence, Jun-19-2024

Large Language Models (LLMs) tend to be unreliable in the factuality of their answers. To address this problem, NLP researchers have proposed a range of techniques to estimate LLMs' confidence over facts. However, due to the lack of a systematic comparison, it is not clear how the different methods compare to one another. To fill this gap, we present a survey and empirical comparison of estimators of factual confidence. We define an experimental framework allowing for fair comparison, covering both fact-verification and question answering. Our experiments across a series of LLMs indicate that trained hidden-state probes provide the most reliable confidence estimates, albeit at the expense of requiring access to weights and training data. We also conduct a deeper assessment of factual confidence by measuring the consistency of model behavior under meaning-preserving variations in the input. We find that the confidence of LLMs is often unstable across semantically equivalent inputs, suggesting that there is much room for improvement of the stability of models' parametric knowledge. Our code is available at https://github.com/amazon-science/factual-confidence-of-llms.

large language model, machine learning, natural language

arXiv: 2406.13415

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Government > Regional Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Labeled Morphological Segmentation with Semi-Markov Models

Cotterell, Ryan, Müller, Thomas, Fraser, Alexander, Schütze, Hinrich

arXiv.org Artificial Intelligence, Apr-13-2024

We present labeled morphological segmentation, an alternative view of morphological processing that unifies several tasks. From an annotation standpoint, we additionally introduce a new hierarchy of morphotactic tagsets. Finally, we develop Chipmunk, a discriminative morphological segmentation system that, contrary to previous work, explicitly models morphotactics. We show that Chipmunk yields improved performance on three tasks for all six languages: (i) morphological segmentation, (ii) stemming and (iii) morphological tag classification. On morphological segmentation, our method shows absolute improvements of 2 to 6 points F1 over the baseline.

machine learning, natural language, segmentation

doi: 10.18653/v1/K15-1017

arXiv: 2404.08997

Country:

Europe (1.00)
North America > United States > Ohio (0.14)
North America > United States > New York (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Towards interpretable quantum machine learning via single-photon quantum walks

Flamini, Fulvio, Krumm, Marius, Fiderer, Lukas J., Müller, Thomas, Briegel, Hans J.

arXiv.org Artificial Intelligence, Oct-16-2023

Variational quantum algorithms represent a promising approach to quantum machine learning where classical neural networks are replaced by parametrized quantum circuits. However, both approaches suffer from a clear limitation: a lack of interpretability. Here, we present a variational method to quantize projective simulation (PS), a reinforcement learning model aimed at interpretable artificial intelligence. Decision making in PS is modeled as a random walk on a graph describing the agent's memory. To implement the quantized model, we consider quantum walks of single photons in a lattice of tunable Mach-Zehnder interferometers trained via variational algorithms. Using an example from transfer learning, we show that the quantized PS model can exploit quantum interference to acquire capabilities beyond those of its classical counterpart. Finally, we discuss the role of quantum interference for training and tracing the decision-making process, paving the way for realizations of interpretable quantum learning agents.
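
The classical starting point named in this abstract, decision making as a random walk on a memory graph, can be illustrated with a small sketch. What follows is a toy two-layer agent (percept clips wired directly to action clips), assuming hop probabilities proportional to edge weights and an additive reward update with optional damping toward 1. The class name `TwoLayerPS`, the toy task, and all parameter values are illustrative assumptions, and nothing here models the single-photon quantum walk the paper proposes.

```python
import random


class TwoLayerPS:
    """Toy two-layer projective-simulation-style agent (illustrative only)."""

    def __init__(self, n_percepts, n_actions, damping=0.0, rng=None):
        # h[p][a] is the weight of the edge from percept clip p to action clip a.
        self.h = [[1.0] * n_actions for _ in range(n_percepts)]
        self.damping = damping
        self.rng = rng if rng is not None else random.Random()

    def probabilities(self, percept):
        # Hop probabilities are the normalized edge weights of this percept.
        row = self.h[percept]
        total = sum(row)
        return [w / total for w in row]

    def act(self, percept):
        # One hop of the random walk: percept clip -> action clip.
        actions = list(range(len(self.h[percept])))
        return self.rng.choices(actions, weights=self.h[percept])[0]

    def learn(self, percept, action, reward):
        # Damp every weight toward 1, then reinforce the traversed edge.
        for row in self.h:
            for a in range(len(row)):
                row[a] -= self.damping * (row[a] - 1.0)
        self.h[percept][action] += reward


def train(agent, steps, rng):
    # Hypothetical task: the rewarded action has the same index as the percept.
    for _ in range(steps):
        percept = rng.randrange(2)
        action = agent.act(percept)
        reward = 1.0 if action == percept else 0.0
        agent.learn(percept, action, reward)


rng = random.Random(0)
agent = TwoLayerPS(n_percepts=2, n_actions=2, rng=rng)
train(agent, steps=2000, rng=rng)
print(agent.probabilities(0), agent.probabilities(1))
```

Because every decision is a walk over explicit, inspectable edge weights, the learned policy can be read directly from `agent.h`, which is the sense of interpretability this sketch is meant to convey.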

artificial intelligence, machine learning, reinforcement learning

arXiv: 2301.13669

Country: Europe > Austria (0.14)

Genre: Research Report (0.70)

Industry:

Government (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Parallel Inversion of Neural Radiance Fields for Robust Pose Estimation

Lin, Yunzhi, Müller, Thomas, Tremblay, Jonathan, Wen, Bowen, Tyree, Stephen, Evans, Alex, Vela, Patricio A., Birchfield, Stan

arXiv.org Artificial Intelligence, Mar-10-2023

We present a parallelized optimization method based on fast Neural Radiance Fields (NeRF) for estimating the 6-DoF pose of a camera with respect to an object or scene. Given a single observed RGB image of the target, we can predict the translation and rotation of the camera by minimizing the residual between pixels rendered from a fast NeRF model and pixels in the observed image. We integrate a momentum-based camera extrinsic optimization procedure into Instant Neural Graphics Primitives, a recent exceptionally fast NeRF implementation. By introducing parallel Monte Carlo sampling into the pose estimation task, our method overcomes local minima and improves efficiency in a more extensive search space. We also show the importance of adopting a more robust pixel-based loss function to reduce error. Experiments demonstrate that our method can achieve improved generalization and robustness on both synthetic and real-world benchmarks.

artificial intelligence, machine learning, pose estimation

arXiv: 2210.10108

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.65)

MATE: Multi-view Attention for Table Transformer Efficiency

Eisenschlos, Julian Martin, Gor, Maharshi, Müller, Thomas, Cohen, William W.

arXiv.org Artificial Intelligence, Sep-9-2021

This work presents a sparse-attention Transformer architecture for modeling documents that contain large tables. Tables are ubiquitous on the web, and are rich in information. However, more than 20% of relational tables on the web have 20 or more rows (Cafarella et al., 2008), and these large tables present a challenge for current Transformer models, which are typically limited to 512 tokens. Here we propose MATE, a novel Transformer architecture designed to model the structure of web tables. MATE uses sparse attention in a way that allows heads to efficiently attend to either rows or columns in a table. This architecture scales linearly with respect to speed and memory, and can handle documents containing more than 8000 tokens with current accelerators. MATE also has a more appropriate inductive bias for tabular data, and sets a new state-of-the-art for three table reasoning datasets. For HybridQA (Chen et al., 2020b), a dataset that involves large documents containing tables, we improve the best prior result by 19 points.
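
The row-or-column attention pattern described in this abstract can be pictured with a small mask-building sketch. It assumes one token per table cell and that each head is restricted to tokens sharing either a row index or a column index. The helper names `same_group_mask` and `allowed` and the toy 2x3 table are hypothetical. The sketch builds dense masks purely to show the pattern, so it does not reproduce the linear scaling the paper reports, and it leaves out how non-table text tokens are treated.

```python
def same_group_mask(group_ids):
    """Return mask where mask[i][j] is True if token i may attend to token j."""
    n = len(group_ids)
    return [[group_ids[i] == group_ids[j] for j in range(n)] for i in range(n)]


def allowed(mask, i):
    """List the token positions that token i is allowed to attend to."""
    return [j for j, ok in enumerate(mask[i]) if ok]


# A 2x3 table flattened row by row, one token per cell (hypothetical input).
row_ids = [0, 0, 0, 1, 1, 1]
col_ids = [0, 1, 2, 0, 1, 2]

row_head_mask = same_group_mask(row_ids)  # used by "row" heads
col_head_mask = same_group_mask(col_ids)  # used by "column" heads

print(allowed(row_head_mask, 4))  # positions in the same row as token 4
print(allowed(col_head_mask, 4))  # positions in the same column as token 4
```

Splitting heads between the two masks lets information flow along rows in some heads and along columns in others, which is the inductive bias for tabular data that the abstract refers to.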

artificial intelligence, computational linguistics, neural network

arXiv: 2109.04312

Country:

Asia (0.93)
Europe (0.93)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Industry:

Automobiles & Trucks (0.93)
Leisure & Entertainment > Sports > Motorsports > Formula One (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Collective defense of honeybee colonies: experimental results and theoretical modeling

López-Incera, Andrea, Nouvian, Morgane, Ried, Katja, Müller, Thomas, Briegel, Hans J.

arXiv.org Artificial Intelligence, Oct-14-2020

Social insect colonies routinely face large vertebrate predators, against which they need to mount a collective defense. To do so, honeybees use an alarm pheromone that recruits nearby bees into mass stinging of the perceived threat. This alarm pheromone is carried directly on the stinger, hence its concentration builds up during the course of the attack. Here, we investigate how individual bees react to different alarm pheromone concentrations, and how this evolved response-pattern leads to better coordination at the group level. We first present an individual dose-response curve to the alarm pheromone, obtained experimentally. Second, we apply Projective Simulation to model each bee as an artificial learning agent that relies on the pheromone concentration to decide whether to sting or not. If the emergent collective performance benefits the colony, the individual reactions that led to it are enhanced via reinforcement learning, thus emulating natural selection. Predators are modeled in a realistic way so that the effect of factors such as their resistance, their killing rate or their frequency of attacks can be studied. We are able to reproduce the experimentally measured response-pattern of real bees, and to identify the main selection pressures that shaped it. Finally, we apply the model to a case study: by tuning the parameters to represent the environmental conditions of European or African bees, we can predict the difference in aggressiveness observed between these two subspecies.

artificial intelligence, health & medicine, predator

arXiv: 2010.07326

Country:

North America > United States (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Understanding tables with intermediate pre-training

Eisenschlos, Julian Martin, Krichene, Syrine, Müller, Thomas

arXiv.org Artificial Intelligence, Oct-5-2020

Table entailment, the binary classification task of finding if a sentence is supported or refuted by the content of a table, requires parsing language and table structure as well as numerical and discrete reasoning. While there is extensive work on textual entailment, table entailment is less well studied. We adapt TAPAS (Herzig et al., 2020), a table-based BERT model, to recognize entailment. Motivated by the benefits of data augmentation, we create a balanced dataset of millions of automatically created training examples which are learned in an intermediate step prior to fine-tuning. This new data is not only useful for table entailment, but also for SQA (Iyyer et al., 2017), a sequential table QA task. To be able to use long examples as input of BERT models, we evaluate table pruning techniques as a pre-processing step to drastically improve the training and prediction efficiency at a moderate drop in accuracy. The different methods set the new state-of-the-art on the TabFact (Chen et al., 2020) and SQA datasets.

computational linguistics, deep learning, neural network

arXiv: 2010.00571

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Missouri > Jackson County > Kansas City (0.14)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Control Variates

Müller, Thomas, Rousselle, Fabrice, Novák, Jan, Keller, Alexander

arXiv.org Machine Learning, Sep-4-2020

We propose neural control variates (NCV) for unbiased variance reduction in parametric Monte Carlo integration. So far, the core challenge of applying the method of control variates has been finding a good approximation of the integrand that is cheap to integrate. We show that a set of neural networks can face that challenge: a normalizing flow that approximates the shape of the integrand and another neural network that infers the solution of the integral equation. We also propose to leverage a neural importance sampler to estimate the difference between the original integrand and the learned control variate. To optimize the resulting parametric estimator, we derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice. When applied to light transport simulation, neural control variates are capable of matching the state-of-the-art performance of other unbiased approaches, while providing means to develop more performant, practical solutions. Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
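
The classical method of control variates that this abstract builds on can be sketched on a one-dimensional integral. The example below estimates the integral of exp(x) over [0, 1] using the hand-picked control variate g(x) = 1 + x, whose integral is known to be 1.5. The paper instead learns the control variate with neural networks, so the fixed g, the function names, and the sample counts here are illustrative assumptions only.

```python
import math
import random


def plain_estimate(f, n, rng):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]."""
    return sum(f(rng.random()) for _ in range(n)) / n


def control_variate_estimate(f, g, g_integral, n, rng):
    """Estimate the integral of f as G + E[f(X) - g(X)], X uniform on [0, 1]."""
    residual = sum(f(x) - g(x) for x in (rng.random() for _ in range(n))) / n
    return g_integral + residual


def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


f = math.exp                # integrand; the exact integral is e - 1
g = lambda x: 1.0 + x       # hand-picked control variate (assumption)
G = 1.5                     # exact integral of g over [0, 1]

rng = random.Random(0)
xs = [rng.random() for _ in range(10_000)]
var_plain = sample_variance([f(x) for x in xs])       # per-sample variance of f
var_cv = sample_variance([f(x) - g(x) for x in xs])   # per-sample variance of f - g

plain = plain_estimate(f, 10_000, rng)
estimate = control_variate_estimate(f, g, G, 10_000, rng)
print(plain, estimate, var_plain, var_cv)
```

The estimator stays unbiased for any g with a known integral. The variance drops only when g tracks the shape of f closely, which is why finding a good, cheaply integrable approximation is described above as the core challenge.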

artificial intelligence, control variate, neural network

arXiv: 2006.01524

Country:

North America > United States > California (0.28)
North America > United States > Oklahoma > Beaver County (0.25)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Graphics (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

How a minimal learning agent can infer the existence of unobserved variables in a complex environment

Ried, Katja, Eva, Benjamin, Müller, Thomas, Briegel, Hans J.

arXiv.org Artificial Intelligence, Oct-15-2019

According to a mainstream position in contemporary cognitive science and philosophy, the use of abstract compositional concepts is both a necessary and a sufficient condition for the presence of genuine thought. In this article, we show how the ability to develop and utilise abstract conceptual structures can be achieved by a particular kind of learning agent. More specifically, we provide and motivate a concrete operational definition of what it means for these agents to be in possession of abstract concepts, before presenting an explicit example of a minimal architecture that supports this capability. We then proceed to demonstrate how the existence of abstract conceptual structures can be operationally useful in the process of employing previously acquired knowledge in the face of new experiences, thereby vindicating the natural conjecture that the cognitive functions of abstraction and generalisation are closely related.

Keywords: concept formation, projective simulation, reinforcement learning, transparent artificial intelligence, theory formation, explainable artificial intelligence (XAI)

agent, attention deficit hyperactivity disorder, neural network

arXiv: 1910.06985

Country:

North America > United States (0.28)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)
Health & Medicine > Therapeutic Area > Neurology (0.34)
Health & Medicine > Consumer Health (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Importance Sampling

Müller, Thomas, McWilliams, Brian, Rousselle, Fabrice, Gross, Markus, Novák, Jan

arXiv.org Machine Learning, Aug-11-2018

We propose to use deep neural networks for generating samples in Monte Carlo integration. Our work is based on non-linear independent component analysis, which we extend in numerous ways to improve performance and enable its application to integration problems. First, we introduce piecewise-polynomial coupling transforms that greatly increase the modeling power of individual coupling layers. Second, we propose to preprocess the inputs of neural networks using one-blob encoding, which stimulates localization of computation and improves inference. Third, we derive a gradient-descent-based optimization for the KL and the χ² divergence for the specific application of Monte Carlo integration with stochastic estimates of the target distribution. Our approach enables fast and accurate inference and efficient sample generation independent of the dimensionality of the integration domain. We demonstrate the benefits of our approach for generating natural images and in two applications to light-transport simulation. First, we show how to learn joint path-sampling densities in primary sample space and how to importance sample multi-dimensional path prefixes thereof. Second, we use our technique to extract conditional directional densities driven by the triple product of the rendering equation, and leverage them for path guiding. In all applications, our approach yields performance on par with or higher than that of competing techniques at equal sample count.
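
The basic importance-sampling estimator that this line of work extends can be sketched with a fixed, analytically invertible density in place of a learned one. The example below integrates f(x) = 3x² over [0, 1] (exact value 1) by drawing samples from the density p(x) = 2x and averaging f(x) / p(x). The choice of f and p, the function names, and the sample count are illustrative assumptions, and no neural network or coupling layer is involved.

```python
import math
import random


def importance_sampling_estimate(f, pdf, sampler, n, rng):
    """Estimate the integral of f as the mean of f(x) / pdf(x), with x ~ pdf."""
    total = 0.0
    for _ in range(n):
        x = sampler(rng)
        total += f(x) / pdf(x)
    return total / n


f = lambda x: 3.0 * x * x   # integrand on [0, 1]; the exact integral is 1
pdf = lambda x: 2.0 * x     # hand-picked sampling density (assumption)


def sampler(rng):
    # Inverse-CDF sampling: the CDF of pdf is x**2, so x = sqrt(u).
    # 1 - random() lies in (0, 1], which avoids x == 0 where pdf(x) == 0.
    return math.sqrt(1.0 - rng.random())


rng = random.Random(0)
estimate = importance_sampling_estimate(f, pdf, sampler, 10_000, rng)
print(estimate)
```

The closer the sampling density is to being proportional to the integrand, the lower the variance of the estimator, which is the motivation for learning the density rather than fixing it by hand.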

coupling layer, deep learning, neural network

arXiv: 1808.03856

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)