AITopics | Wattenhofer, Roger

Collaborating Authors

Wattenhofer, Roger

Siamese SIREN: Audio Compression with Implicit Neural Representations

Lanzendörfer, Luca A., Wattenhofer, Roger

arXiv.org Artificial Intelligence Jun-22-2023

Implicit Neural Representations (INRs) have emerged as a promising method for representing diverse data modalities, including 3D shapes, images, and audio. While recent research has demonstrated successful applications of INRs in image and 3D shape compression, their potential for audio compression remains largely unexplored. Motivated by this, we present a preliminary investigation into the use of INRs for audio compression. Our study introduces Siamese SIREN, a novel approach based on the popular SIREN architecture. Our experimental results indicate that Siamese SIREN achieves superior audio reconstruction fidelity while utilizing fewer network parameters compared to previous INR architectures.
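For readers unfamiliar with the SIREN building block named in the abstract, the following is a minimal sketch of a sine-activated MLP that maps a time coordinate to an amplitude. It is not the Siamese SIREN architecture from the paper: the function names, the layer sizes, the zero biases, and the use of the commonly cited frequency factor `w0 = 30` are all assumptions made for illustration, and no training loop is included.

```python
import math
import random


def init_siren_layer(n_in, n_out, w0=30.0, first=False, rng=None):
    """Create weights for one layer using the usual SIREN uniform bounds.

    Biases are set to zero here, which is a simplification for this sketch.
    """
    rng = rng or random.Random(0)
    bound = 1.0 / n_in if first else math.sqrt(6.0 / n_in) / w0
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def siren_layer(x, weights, biases, w0=30.0):
    """Apply sin(w0 * (W x + b)) to the input vector x."""
    return [math.sin(w0 * (sum(w * xi for w, xi in zip(row, x)) + b))
            for row, b in zip(weights, biases)]


def make_siren(hidden=16, depth=3, w0=30.0, seed=0):
    """Build a 1 -> hidden -> ... -> 1 network as a list of (weights, biases)."""
    rng = random.Random(seed)
    layers = [init_siren_layer(1, hidden, w0, first=True, rng=rng)]
    for _ in range(depth - 1):
        layers.append(init_siren_layer(hidden, hidden, w0, rng=rng))
    layers.append(init_siren_layer(hidden, 1, w0, rng=rng))  # linear output
    return layers


def siren_forward(t, layers, w0=30.0):
    """Map a scalar time coordinate t to a scalar amplitude."""
    h = [t]
    for weights, biases in layers[:-1]:
        h = siren_layer(h, weights, biases, w0)
    weights, biases = layers[-1]
    # The last layer is linear: no sine activation on the output.
    return sum(w * hi for w, hi in zip(weights[0], h)) + biases[0]


if __name__ == "__main__":
    net = make_siren()
    print(siren_forward(0.25, net))
```

In an implicit representation of audio, the network weights themselves are the encoded signal, so a smaller network corresponds to a smaller representation.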

artificial intelligence, machine learning, siamese siren, (13 more...)

arXiv.org Artificial Intelligence

2306.12957

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Research Report > New Finding (0.86)
Research Report > Promising Solution (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Abstract Visual Reasoning Enabled by Language

Camposampiero, Giacomo, Houmard, Loic, Estermann, Benjamin, Mathys, Joël, Wattenhofer, Roger

arXiv.org Artificial Intelligence Jun-22-2023

While artificial intelligence (AI) models have achieved human or even superhuman performance in many well-defined applications, they still struggle to show signs of broad and flexible intelligence. The Abstraction and Reasoning Corpus (ARC), a visual intelligence benchmark introduced by François Chollet, aims to assess how close AI systems are to human-like cognitive abilities. Most current approaches rely on carefully handcrafted domain-specific program searches to brute-force solutions for the tasks present in ARC. In this work, we propose a general learning-based framework for solving ARC. It is centered on transforming tasks from the vision to the language domain. This composition of language and vision allows for pre-trained models to be leveraged at each stage, enabling a shift from handcrafted priors towards the learned priors of the models. While not yet beating state-of-the-art models on ARC, we demonstrate the potential of our approach, for instance, by solving some ARC tasks that have not been solved previously.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2303.04091

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)

Examining the Emergence of Deductive Reasoning in Generative Language Models

Belcak, Peter, Lanzendörfer, Luca A., Wattenhofer, Roger

arXiv.org Artificial Intelligence May-31-2023

We conduct a preliminary inquiry into the ability of generative transformer models to deductively reason from premises provided. We observe notable differences in the performance of models coming from different training setups and find that the deductive reasoning ability increases with scale. Further, we discover that the performance generally does not decrease with the length of the deductive chain needed to reach the conclusion, with the exception of OpenAI GPT-3 and GPT-3.5 models. Our study considers a wide variety of transformer-decoder models, ranging from 117 million to 175 billion parameters in size.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2306.01009

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.79)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Cascaded Beam Search: Plug-and-Play Terminology-Forcing For Neural Machine Translation

Odermatt, Frédéric, Egressy, Béni, Wattenhofer, Roger

arXiv.org Artificial Intelligence May-23-2023

This paper presents a plug-and-play approach for translation with terminology constraints. Terminology constraints are an important aspect of many modern translation pipelines. In both specialized domains and newly emerging domains (such as the COVID-19 pandemic), accurate translation of technical terms is crucial. Recent approaches often train models to copy terminologies from the input into the output sentence by feeding the target terminology along with the input. But this requires expensive training whenever the underlying language model is changed or the system should specialize to a new domain. We propose Cascade Beam Search, a plug-and-play terminology-forcing approach that requires no training. Cascade Beam Search has two parts: 1) logit manipulation to increase the probability of target terminologies and 2) a cascading beam setup based on grid beam search, where beams are grouped by the number of terminologies they contain. We evaluate the performance of our approach by competing against the top submissions of the WMT21 terminology translation task. Our plug-and-play approach performs on par with the winning submissions without using a domain-specific language model and with no additional training.
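The two parts described in the abstract, logit manipulation and grouping beams by the number of terminologies they contain, can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the helper names, the additive `bonus`, and the representation of hypotheses as `(token_list, score)` pairs are all hypothetical.

```python
from collections import defaultdict


def boost_term_logits(logits, term_token_ids, bonus=2.0):
    """Return a copy of logits with a constant added to terminology tokens."""
    boosted = list(logits)
    for tok in term_token_ids:
        boosted[tok] += bonus
    return boosted


def count_terms(tokens, terms):
    """Count how many terminology entries (token tuples) occur in tokens."""
    def contains(seq, sub):
        n = len(sub)
        return any(tuple(seq[i:i + n]) == tuple(sub)
                   for i in range(len(seq) - n + 1))
    return sum(contains(tokens, term) for term in terms)


def group_beams(hypotheses, terms, beam_size):
    """Bucket hypotheses by satisfied-term count; keep the best per bucket."""
    buckets = defaultdict(list)
    for tokens, score in hypotheses:
        buckets[count_terms(tokens, terms)].append((tokens, score))
    return {
        k: sorted(v, key=lambda h: h[1], reverse=True)[:beam_size]
        for k, v in buckets.items()
    }


if __name__ == "__main__":
    hyps = [([1, 2, 3], -1.0), ([4, 5], -0.5), ([2, 3, 9], -2.0)]
    print(group_beams(hyps, terms=[(2, 3)], beam_size=1))
```

Keeping a separate bucket per term count means a hypothesis that already contains a required term is not pruned merely because an unconstrained hypothesis has a higher score.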

artificial intelligence, natural language, terminology, (15 more...)

arXiv.org Artificial Intelligence

2305.14538

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Pennsylvania (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Automating Rigid Origami Design

Geiger, Jeremia, Martinkus, Karolis, Richter, Oliver, Wattenhofer, Roger

arXiv.org Artificial Intelligence Apr-28-2023

Rigid origami has shown potential in a large diversity of practical applications. However, current rigid origami crease pattern design mostly relies on known tessellations. This strongly limits the diversity and novelty of patterns that can be created. In this work, we build upon the recently developed principle of three units method to formulate rigid origami design as a discrete optimization problem, the rigid origami game. Our implementation allows for a simple definition of diverse objectives and thereby expands the potential of rigid origami further to optimized, application-specific crease patterns. We showcase the flexibility of our formulation through the use of a diverse set of search methods in several illustrative case studies. We are not only able to construct various patterns that approximate given target shapes, but also to specify abstract, function-based rewards which result in novel, foldable and functional designs for everyday objects.

artificial intelligence, crease pattern, survey article, (17 more...)

arXiv.org Artificial Intelligence

2211.13219

Country: Asia > Japan (0.14)

Genre: Research Report > Promising Solution (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Discovering Graph Generation Algorithms

Babiac, Mihai, Martinkus, Karolis, Wattenhofer, Roger

arXiv.org Artificial Intelligence Apr-25-2023

We provide a novel approach to construct generative models for graphs. Instead of using traditional probabilistic models or deep generative models, we propose to find an algorithm that generates the data. We achieve this using evolutionary search and a powerful fitness function, implemented by a randomly initialized graph neural network. This brings certain advantages over current deep generative models, for instance, a higher potential for out-of-training-distribution generalization and direct interpretability, as the final graph generative process is expressed as a Python function. We show that this approach can be competitive with deep generative models and under some circumstances can even find the true graph generative process, and as such perfectly generalize.

Generating new samples of graphs similar to a given set of graphs is a long-standing problem, which initially was tackled with various statistical models, such as the Erdős–Rényi model (Erdős & Rényi, 1959; Holland et al., 1983; Eldridge et al., 2017). While such models lend themselves well to formal analysis, they do not closely fit real-world graph distributions. More recently, deep generative models have proven to fit graph distributions well (You et al., 2018; Liao et al., 2020; Simonovsky & Komodakis, 2018; Martinkus et al., 2022; Haefeli et al., 2022; Vignac et al., 2022). However, similar to other deep models, they are not interpretable and struggle to generalize to graph sizes outside of the training distribution. In this work, we propose an alternative approach.
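To make concrete what a graph generative process expressed as a Python function can look like, here is a hand-written sampler for the Erdős–Rényi G(n, p) model mentioned in the text. It is only an illustration of the output format: it was not produced by the authors' evolutionary search, and the function name and edge-list representation are assumptions.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Sample an undirected G(n, p) graph as a list of edges (u, v), u < v.

    Each of the n * (n - 1) / 2 possible edges is included independently
    with probability p.
    """
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


if __name__ == "__main__":
    print(erdos_renyi(6, 0.5, seed=0))
```

Because the generator is ordinary code, it can be read directly and called with any `n`, including graph sizes never seen in a training set.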

artificial intelligence, machine learning, node, (20 more...)

arXiv.org Artificial Intelligence

2304.12895

Country:

Europe > Czechia (0.14)
North America > United States (0.14)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.85)

DAVA: Disentangling Adversarial Variational Autoencoder

Estermann, Benjamin, Wattenhofer, Roger

arXiv.org Artificial Intelligence Mar-2-2023

The use of well-disentangled representations offers many advantages for downstream tasks, e.g. an increased sample efficiency, or better interpretability. However, the quality of disentangled interpretations is often highly dependent on the choice of dataset-specific hyperparameters, in particular the regularization strength. To address this issue, we introduce DAVA, a novel training procedure for variational auto-encoders. DAVA completely alleviates the problem of hyperparameter selection. We compare DAVA to models with optimal hyperparameters. Without any hyperparameter tuning, DAVA is competitive on a diverse range of commonly used datasets. Underlying DAVA, we discover a necessary condition for unsupervised disentanglement, which we call PIPE. We demonstrate the ability of PIPE to positively predict the performance of downstream models in abstract reasoning. We also thoroughly investigate correlations with existing supervised and unsupervised metrics. The code is available at https://github.com/besterma/dava.

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2303.01384

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Agent-based Graph Neural Networks

Martinkus, Karolis, Papp, Pál András, Schesch, Benedikt, Wattenhofer, Roger

arXiv.org Artificial Intelligence Feb-27-2023

We present a novel graph neural network we call AgentNet, which is designed specifically for graph-level tasks. AgentNet is inspired by sublinear algorithms, featuring a computational complexity that is independent of the graph size. The architecture of AgentNet differs fundamentally from the architectures of traditional graph neural networks. In AgentNet, some trained neural agents intelligently walk the graph, and then collectively decide on the output. We provide an extensive theoretical analysis of AgentNet: We show that the agents can learn to systematically explore their neighborhood and that AgentNet can distinguish some structures that are even indistinguishable by 2-WL. Moreover, AgentNet is able to separate any two graphs which are sufficiently different in terms of subgraphs. We confirm these theoretical results with synthetic experiments on hard-to-distinguish graphs and real-world graph classification tasks. In both cases, we compare favorably not only to standard GNNs but also to computationally more expensive GNN extensions.

agent, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2206.11010

Country:

Europe (0.46)
Oceania > Australia (0.14)
North America > United States (0.14)
North America > Canada (0.14)

Genre: Research Report (0.81)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Electrode Clustering and Bandpass Analysis of EEG Data for Gaze Estimation

Kastrati, Ard, Plomecka, Martyna Beata, Küchler, Joël, Langer, Nicolas, Wattenhofer, Roger

arXiv.org Artificial Intelligence Feb-19-2023

In this study, we validate the findings of previously published papers, showing the feasibility of an Electroencephalography (EEG) based gaze estimation. Moreover, we extend previous research by demonstrating that with only a slight drop in model performance, we can significantly reduce the number of electrodes, indicating that a high-density, expensive EEG cap is not necessary for the purposes of EEG-based eye tracking. Using data-driven approaches, we establish which electrode clusters impact gaze estimation and how the different types of EEG data preprocessing affect the models' performance. Finally, we also inspect which recorded frequencies are most important for the defined tasks.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2302.12710

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

FedHQL: Federated Heterogeneous Q-Learning

Fan, Flint Xiaofeng, Ma, Yining, Dai, Zhongxiang, Tan, Cheston, Low, Bryan Kian Hsiang, Wattenhofer, Roger

arXiv.org Artificial Intelligence Jan-26-2023

Federated Reinforcement Learning (FedRL) encourages distributed agents to learn collectively from each other's experience to improve their performance without exchanging their raw trajectories. The existing work on FedRL assumes that all participating agents are homogeneous, which requires all agents to share the same policy parameterization (e.g., network architectures and training configurations). However, in real-world applications, agents are often in disagreement about the architecture and the parameters, possibly also because of disparate computational budgets. Because homogeneity is not given in practice, we introduce the problem setting of Federated Reinforcement Learning with Heterogeneous And bLack-box agEnts (FedRL-HALE). We present the unique challenges this new setting poses and propose the Federated Heterogeneous Q-Learning (FedHQL) algorithm that principally addresses these challenges. We empirically demonstrate the efficacy of FedHQL in boosting the sample efficiency of heterogeneous agents with distinct policy parameterization using standard RL tasks.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2301.11135

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)
