schilling
Uppaal Coshy: Automatic Synthesis of Compact Shields for Hybrid Systems
Brorholt, Asger Horn, Høeg-Petersen, Andreas Holck, Jensen, Peter Gjøl, Larsen, Kim Guldstrand, Mikučionis, Marius, Schilling, Christian, Wąsowski, Andrzej
We present Uppaal Coshy, a tool for automatic synthesis of a safety strategy -- or shield -- for Markov decision processes over continuous state spaces and complex hybrid dynamics. The general methodology is to partition the state space and then solve a two-player safety game, which entails a number of algorithmically hard problems such as reachability for hybrid systems. The general philosophy of Uppaal Coshy is to approximate hard-to-obtain solutions using simulations. Our implementation is fully automatic and supports the expressive formalism of Uppaal models, which encompass stochastic hybrid automata. The precision of our partition-based approach benefits from using finer grids, which however are not efficient to store. We include an algorithm called Caap to efficiently compute a compact representation of a shield in the form of a decision tree, which yields significant reductions.
- Europe > Denmark > North Jutland > Aalborg (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
Author-Specific Linguistic Patterns Unveiled: A Deep Learning Study on Word Class Distributions
Krauss, Patrick, Schilling, Achim
Deep learning methods have been increasingly applied to computational linguistics to uncover patterns in text data. This study investigates author-specific word class distributions using part-of-speech (POS) tagging and bigram analysis. By leveraging deep neural networks, we classify literary authors based on POS tag vectors and bigram frequency matrices derived from their works. We employ fully connected and convolutional neural network architectures to explore the efficacy of unigram and bigram-based representations. Our results demonstrate that while unigram features achieve moderate classification accuracy, bigram-based models significantly improve performance, suggesting that sequential word class patterns are more distinctive of authorial style. Multi-dimensional scaling (MDS) visualizations reveal meaningful clustering of authors' works, supporting the hypothesis that stylistic nuances can be captured through computational methods. These findings highlight the potential of deep learning and linguistic feature analysis for author profiling and literary studies.
Exploring Narrative Clustering in Large Language Models: A Layerwise Analysis of BERT
Banerjee, Awritrojit, Schilling, Achim, Krauss, Patrick
This study investigates the internal mechanisms of BERT, a transformer-based large language model, with a focus on its ability to cluster narrative content and authorial style across its layers. Using a dataset of narratives developed via GPT-4, featuring diverse semantic content and stylistic variations, we analyze BERT's layerwise activations to uncover patterns of localized neural processing. Through dimensionality reduction techniques such as Principal Component Analysis (PCA) and Multidimensional Scaling (MDS), we reveal that BERT exhibits strong clustering based on narrative content in its later layers, with progressively compact and distinct clusters. While strong stylistic clustering might occur when narratives are rephrased into different text types (e.g., fables, sci-fi, kids' stories), minimal clustering is observed for authorial style specific to individual writers. These findings highlight BERT's prioritization of semantic content over stylistic features, offering insights into its representational capabilities and processing hierarchy. This study contributes to understanding how transformer models like BERT encode linguistic information, paving the way for future interdisciplinary research in artificial intelligence and cognitive neuroscience.
AI thought knee X-rays could tell if you drink beer and eat refried beans
Some artificial intelligence models are struggling to learn the old principle, "Correlation does not equal causation." And while that's not a reason to abandon AI tools, a recent study should remind programmers that even reliable versions of the technology are still prone to bouts of weirdness--like claiming knee X-rays can prove someone drinks beer or eats refried beans. Artificial intelligence models do much more than generate (occasionally accurate) text responses and (somewhat) realistic videos. Truly well-made tools are already helping medical researchers parse troves of datasets to discover new breakthroughs, accurately forecast weather patterns, and assess environmental conservation efforts. But according to a study published in the journal Scientific Reports, algorithmic "shortcut learning" continues to pose a problem of generating results that are simultaneously highly accurate and misinformed.
Analysis of Argument Structure Constructions in the Large Language Model BERT
Ramezani, Pegah, Schilling, Achim, Krauss, Patrick
This study investigates how BERT processes and represents Argument Structure Constructions (ASCs), extending previous LSTM analyses. Using a dataset of 2000 sentences across four ASC types (transitive, ditransitive, caused-motion, resultative), we analyzed BERT's token embeddings across 12 layers. Visualizations with MDS and t-SNE and clustering quantified by Generalized Discrimination Value (GDV) were used. Feedforward classifiers (probes) predicted construction categories from embeddings. CLS token embeddings clustered best in layers 2-4, decreased in intermediate layers, and slightly increased in final layers. DET and SUBJ embeddings showed consistent clustering in intermediate layers, VERB embeddings increased in clustering from layer 1 to 12, and OBJ embeddings peaked in layer 10. Probe accuracies indicated low construction information in layer 1, with over 90 percent accuracy from layer 2 onward, revealing latent construction information beyond GDV clustering. Fisher Discriminant Ratio (FDR) analysis of attention weights showed OBJ tokens were crucial for differentiating ASCs, followed by VERB and DET tokens. SUBJ, CLS, and SEP tokens had insignificant FDR scores. This study highlights BERT's layered processing of linguistic constructions and its differences from LSTMs. Future research will compare these findings with neuroimaging data to understand the neural correlates of ASC processing. This research underscores neural language models' potential to mirror linguistic processing in the human brain, offering insights into the computational and neural mechanisms underlying language understanding.
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (4 more...)
Analysis of Argument Structure Constructions in a Deep Recurrent Language Model
Ramezani, Pegah, Schilling, Achim, Krauss, Patrick
Understanding how language and linguistic constructions are processed in the brain is a fundamental question in cognitive computational neuroscience. In this study, we explore the representation and processing of Argument Structure Constructions (ASCs) in a recurrent neural language model. We trained a Long Short-Term Memory (LSTM) network on a custom-made dataset consisting of 2000 sentences, generated using GPT-4, representing four distinct ASCs: transitive, ditransitive, caused-motion, and resultative constructions. We analyzed the internal activations of the LSTM model's hidden layers using Multidimensional Scaling (MDS) and t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the sentence representations. The Generalized Discrimination Value (GDV) was calculated to quantify the degree of clustering within these representations. Our results show that sentence representations form distinct clusters corresponding to the four ASCs across all hidden layers, with the most pronounced clustering observed in the last hidden layer before the output layer. This indicates that even a relatively simple, brain-constrained recurrent neural network can effectively differentiate between various construction types. These findings are consistent with previous studies demonstrating the emergence of word class and syntax rule representations in recurrent language models trained on next word prediction tasks. In future work, we aim to validate these results using larger language models and compare them with neuroimaging data obtained during continuous speech perception. This study highlights the potential of recurrent neural language models to mirror linguistic processing in the human brain, providing valuable insights into the computational and neural mechanisms underlying language understanding.
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > Los Angeles County > Pasadena (0.04)
- (2 more...)
Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to test BERT
Krauss, Patrick, Hösch, Jannik, Metzner, Claus, Maier, Andreas, Uhrig, Peter, Schilling, Achim
The ability to transmit and receive complex information via language is unique to humans and is the basis of traditions, culture and versatile social interactions. Through the disruptive introduction of transformer based large language models (LLMs) humans are not the only entity to "understand" and produce language any more. In the present study, we have performed the first steps to use LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing. Thus, we have used ChatGPT to generate seven different stylistic variations of ten different narratives (Aesop's fables). We used these stories as input for the open source LLM BERT and have analyzed the activation patterns of the hidden units of BERT using multi-dimensional scaling and cluster analysis. We found that the activation vectors of the hidden units cluster according to stylistic variations in earlier layers of BERT (1) than narrative content (4-5). Despite the fact that BERT consists of 12 identical building blocks that are stacked and trained on large text corpora, the different layers perform different tasks. This is a very useful model of the human brain, where self-similar structures, i.e. different areas of the cerebral cortex, can have different functions and are therefore well suited to processing language in a very efficient way. The proposed approach has the potential to open the black box of LLMs on the one hand, and might be a further step to unravel the neural processes underlying human language processing and cognition in general.
- Asia > Middle East (0.14)
- North America > Mexico > Gulf of Mexico (0.14)
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.05)
Synthesis of Hierarchical Controllers Based on Deep Reinforcement Learning Policies
Delgrange, Florent, Avni, Guy, Lukina, Anna, Schilling, Christian, Nowé, Ann, Pérez, Guillermo A.
We propose a novel approach to the problem of controller design for environments modeled as Markov decision processes (MDPs). Specifically, we consider a hierarchical MDP a graph with each vertex populated by an MDP called a "room." We first apply deep reinforcement learning (DRL) to obtain low-level policies for each room, scaling to large rooms of unknown structure. We then apply reactive synthesis to obtain a high-level planner that chooses which low-level policy to execute in each room. The central challenge in synthesizing the planner is the need for modeling rooms. We address this challenge by developing a DRL procedure to train concise "latent" policies together with PAC guarantees on their performance. Unlike previous approaches, ours circumvents a model distillation step. Our approach combats sparse rewards in DRL and enables reusability of low-level policies. We demonstrate feasibility in a case study involving agent navigation amid moving obstacles.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > United States > Arizona > Maricopa County > Phoenix (0.04)
- (11 more...)
- Research Report > Promising Solution (0.34)
- Overview > Innovation (0.34)
The inverse problem for neural networks
Forets, Marcelo, Schilling, Christian
We study the problem of computing the preimage of a set under a neural network with piecewise-affine activation functions. We recall an old result that the preimage of a polyhedral set is again a union of polyhedral sets and can be effectively computed. We show several applications of computing the preimage for analysis and interpretability of neural networks.
- South America > Uruguay (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > Denmark > North Jutland > Aalborg (0.04)
Conceptual Cognitive Maps Formation with Neural Successor Networks and Word Embeddings
Stoewer, Paul, Schilling, Achim, Maier, Andreas, Krauss, Patrick
The human brain possesses the extraordinary capability to contextualize the information it receives from our environment. The entorhinal-hippocampal plays a critical role in this function, as it is deeply engaged in memory processing and constructing cognitive maps using place and grid cells. Comprehending and leveraging this ability could significantly augment the field of artificial intelligence. The multi-scale successor representation serves as a good model for the functionality of place and grid cells and has already shown promise in this role. Here, we introduce a model that employs successor representations and neural networks, along with word embedding vectors, to construct a cognitive map of three separate concepts. The network adeptly learns two different scaled maps and situates new information in proximity to related pre-existing representations. The dispersion of information across the cognitive map varies according to its scale - either being heavily concentrated, resulting in the formation of the three concepts, or spread evenly throughout the map. We suggest that our model could potentially improve current AI models by providing multi-modal context information to any input, based on a similarity metric for the input and pre-existing knowledge representations.
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)