maier
Exploring Narrative Clustering in Large Language Models: A Layerwise Analysis of BERT
Banerjee, Awritrojit, Schilling, Achim, Krauss, Patrick
This study investigates the internal mechanisms of BERT, a transformer-based large language model, with a focus on its ability to cluster narrative content and authorial style across its layers. Using a dataset of narratives developed via GPT-4, featuring diverse semantic content and stylistic variations, we analyze BERT's layerwise activations to uncover patterns of localized neural processing. Through dimensionality reduction techniques such as Principal Component Analysis (PCA) and Multidimensional Scaling (MDS), we reveal that BERT exhibits strong clustering based on narrative content in its later layers, with progressively compact and distinct clusters. While strong stylistic clustering might occur when narratives are rephrased into different text types (e.g., fables, sci-fi, kids' stories), minimal clustering is observed for authorial style specific to individual writers. These findings highlight BERT's prioritization of semantic content over stylistic features, offering insights into its representational capabilities and processing hierarchy. This study contributes to understanding how transformer models like BERT encode linguistic information, paving the way for future interdisciplinary research in artificial intelligence and cognitive neuroscience.
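The "progressively compact and distinct clusters" described in the abstract above can be made concrete with a simple per-layer score. The sketch below is not the authors' method, which relies on PCA and MDS projections. It substitutes a plain separation score (mean between-label distance minus mean within-label distance), and the function names and toy activation vectors are hypothetical, chosen only for illustration.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def separation_score(vectors, labels):
    """Mean between-label distance minus mean within-label distance.

    Larger values mean that points sharing a label sit closer together
    relative to points with different labels.
    """
    within, between = [], []
    for i, j in combinations(range(len(vectors)), 2):
        d = euclidean(vectors[i], vectors[j])
        (within if labels[i] == labels[j] else between).append(d)
    if not within or not between:
        raise ValueError("need same-label and different-label pairs")
    return sum(between) / len(between) - sum(within) / len(within)


# Hypothetical toy "activations" for two narratives (A and B), not real BERT output.
labels = ["A", "A", "B", "B"]
early_layer = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
late_layer = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]

scores = {
    "early": separation_score(early_layer, labels),
    "late": separation_score(late_layer, labels),
}
print(scores)
```

In this toy data the later layer scores higher, which is the pattern the abstract reports for narrative content. With real data, each vector would be one narrative's activation at a given layer.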
Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to test BERT
Krauss, Patrick, Hösch, Jannik, Metzner, Claus, Maier, Andreas, Uhrig, Peter, Schilling, Achim
The ability to transmit and receive complex information via language is unique to humans and is the basis of traditions, culture and versatile social interactions. With the disruptive introduction of transformer-based large language models (LLMs), humans are no longer the only entity to "understand" and produce language. In the present study, we have taken the first steps toward using LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain processes language. To this end, we used ChatGPT to generate seven different stylistic variations of ten different narratives (Aesop's fables). We used these stories as input for the open-source LLM BERT and analyzed the activation patterns of BERT's hidden units using multi-dimensional scaling and cluster analysis. We found that the activation vectors of the hidden units cluster according to stylistic variation in earlier layers of BERT (layer 1) than according to narrative content (layers 4-5). Although BERT consists of 12 identical building blocks that are stacked and trained on large text corpora, the different layers perform different tasks. This makes it a very useful model of the human brain, where self-similar structures, i.e. different areas of the cerebral cortex, can have different functions and are therefore well suited to processing language very efficiently. The proposed approach has the potential to open the black box of LLMs on the one hand, and might be a further step toward unraveling the neural processes underlying human language processing and cognition in general.
Girls of Steel Showcase Projects for U.S. Rep. Mike Doyle
It's two weeks until the competition, and 17-year-old Ella Maier is ecstatic her robot can finally do a pull-up. "Oh, that's so exciting," the Girls of Steel member said, as her robot latched on to a bar at the team's practice facility and hoisted itself to the second rung. "I'm in charge of that subsystem, and I'm really pleased it works. There's always a fear that it might not perform. There are no guarantees on this stuff, ever."
Limited Angle Tomography for Transmission X-Ray Microscopy Using Deep Learning
Huang, Yixing, Wang, Shengxiang, Guan, Yong, Maier, Andreas
In transmission X-ray microscopy (TXM) systems, the rotation of a scanned sample might be restricted to a limited angular range to avoid collisions with other system parts or high attenuation at certain tilting angles. Image reconstruction from such limited angle data suffers from artifacts due to missing data. In this work, deep learning is applied to limited angle reconstruction in TXMs for the first time. Because it is challenging to obtain sufficient real data for training, training a deep neural network from synthetic data is investigated. In particular, the U-Net, the state-of-the-art neural network in biomedical imaging, is trained on synthetic ellipsoid data and multi-category data to reduce artifacts in filtered back-projection (FBP) reconstruction images. The proposed method is evaluated on synthetic data and real scanned chlorella data in $100^\circ$ limited angle tomography. For synthetic test data, the U-Net significantly reduces the root-mean-square error (RMSE) from $2.55 \times 10^{-3}\,\mu\mathrm{m}^{-1}$ in the FBP reconstruction to $1.21 \times 10^{-3}\,\mu\mathrm{m}^{-1}$ in the U-Net reconstruction, and also improves the structural similarity (SSIM) index from 0.625 to 0.920. With penalized weighted least-squares denoising of measured projections, the RMSE and SSIM are further improved to $1.16 \times 10^{-3}\,\mu\mathrm{m}^{-1}$ and 0.932, respectively. For real test data, the proposed method remarkably improves the 3-D visualization of the subcellular structures in the chlorella cell, which indicates its important value for nano-scale imaging in biology, nanoscience and materials science.
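For readers unfamiliar with the RMSE figures quoted in the abstract above, the metric can be sketched as follows. This is a minimal illustration under assumptions, not the authors' evaluation code. The function name `rmse` and the sample values are hypothetical, and a real evaluation would compare full reconstructed volumes against a ground-truth volume.

```python
import math


def rmse(reference, estimate):
    """Root-mean-square error between two equally sized sequences of values."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    if not reference:
        raise ValueError("inputs must be non-empty")
    squared_errors = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values standing in for voxel intensities (illustrative only).
reference = [0.0, 1.0, 2.0, 3.0]
estimate = [0.0, 1.0, 2.0, 5.0]
print(rmse(reference, estimate))  # prints 1.0
```

A lower RMSE means the reconstruction is closer to the reference on average, which is the sense in which the abstract reports an improvement from FBP to U-Net.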
Keep The Robot In The Cage--How Effective (And Safe) Are Co-Bots?
Manufacturing robots are breaking free of their cages. Sensor technology and artificial intelligence have now progressed to the stage where collaborative robots (co-bots) can work safely alongside humans in a wide variety of applications, including supermarkets, farms, and hospitals. But manufacturing is where co-bots will really help to boost productivity far beyond that of humans and robots working separately. As the market expands, however, businesses must prepare appropriately to get the most out of co-bots and avoid costly, if not dangerous, mistakes. Bringing robots into close quarters with humans is a huge technological advance that will allow us to achieve new levels of intricacy and minimize the risk of injury, most notably in manufacturing.
UK must prepare for fourth industrial revolution, says report
Advanced digital technology could give UK manufacturing a huge boost and create hundreds of thousands of jobs, a new report claims. The independent review, chaired by the head of Siemens UK, highlights the benefits of robotics, 3D printing and artificial intelligence. But Juergen Maier said the UK needed "greater ambition" to take advantage of such technology. His report calls for a commission to help business adopt the advances. The report, Made Smarter, brought together executives from companies such as Rolls Royce, GKN and IBM, with representatives from small firms as well as academics from the universities of Newcastle and Cambridge.
Robotics, AI and 3D printing could close UK's productivity gap
The future has already arrived in a small factory in Worcester, according to the man hired by Theresa May to put Britain at the forefront of the next industrial revolution. Juergen Maier, the chief executive of Siemens UK, believes new technologies including robotics, artificial intelligence and additive manufacturing, or 3D printing, can deliver greater productivity and create more highly paid jobs. But failing to crack the next revolution will come at a high price: falling living standards. The work being done in Worcester, and places like it, will be crucial if Britain is to be successful outside the EU, Maier says. The starting gun has been fired in this global race and Britain needs to get ahead.
How much security can you turn over to AI?
It's not always easy to know when you're under attack, or when your security has already been breached. If you're capable of detecting a breach, you might find it in as few as 10 days, but survey after survey finds that breaches that are detected by someone outside the business typically take over 100 days to find. For one thing, between ecommerce, company websites, email, mobile users and overseas divisions, your company is doing business 24/7; however, your IT security team probably works business hours. That's one way 60 percent of attackers are able to compromise an organization in minutes, according to Verizon's 2015 Data Breach Investigations Report. But only a third of businesses can detect a breach within a few days.