Norway
The Good Robot podcast: Machine vision with Jill Walker Rettberg
Hosted by Eleanor Drage and Kerry McInerney, The Good Robot is a podcast that explores the many complex intersections between gender, feminism and technology. In this episode, we talk to Jill Walker Rettberg, Professor of Digital Culture at the University of Bergen in Norway. In this wide-ranging conversation, we talk about machine vision's origins in polished volcanic glass, whether or not we'll actually have self-driving cars, and that famous photoshopped Mother's Day photo released by Kate Middleton in March 2024. Jill Walker Rettberg is Professor of Digital Culture and Co-Director of the Center for Digital Narrative (CDN), a Norwegian Center of Research Excellence that has received a 15 million grant from the Norwegian Research Council (2023-2033). She is also Principal Investigator of the ERC project Machine Vision in Everyday Life: Playful Interactions with Visual Technologies in Digital Art, Games, Narratives and Social Media (2018-2024), and of the ERC Advanced grant project AI Stories: Narrative Archetypes for Artificial Intelligence (2024-2029).
The 200 Android vs. the 1,000 iPhone: How our digital divide keeps growing
On one screen, an urban professional in Oslo taps through ultra-secure banking apps, relies on an AI-powered personal assistant, and streams media seamlessly over high-speed 5G using their iPhone. On the other screen, a farmer in Malawi scrolls through a modest Android phone -- likely costing less than a week's wages -- just to read the news, check tomorrow's weather, and send WhatsApp messages over a patchy mobile connection. These very different experiences highlight the divide between the Global North and the Global South. These terms refer not only to geographic locations but also to the world's wealthiest and most industrialized regions -- such as Europe, North America, and parts of East Asia -- and economically developing nations across much of Africa, Latin America, South Asia, and Oceania. Technology symbolizes innovation, convenience, and seamless connectivity in the Global North.
Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis
Salehi, Pegah, Sheshkal, Sajad Amouei, Thambawita, Vajira, Gautam, Sushant, Sabet, Saeed S., Johansen, Dag, Riegler, Michael A., Halvorsen, Pål
The application of AI in education has gained widespread attention for its potential to enhance learning experiences across disciplines, including psychology [1, 2]. In the context of investigative interviewing, especially when questioning suspected child victims, AI offers a promising alternative to traditional training approaches. These conventional methods, often delivered through short workshops, fail to provide the hands-on practice, feedback, and continuous engagement needed for interviewers to master best practices in questioning child victims [3, 4]. Research has shown that while best practices recommend open-ended questions and discourage leading or suggestive queries [5, 6], many interviewers still struggle to implement these techniques effectively during real-world investigations [7]. The adoption of AI-powered child avatars provides a valuable solution, enabling Child Protective Services (CPS) workers to engage in realistic practice sessions without the ethical dilemmas associated with using real children, while simultaneously offering personalized feedback on their performance [8]. Our current system leverages advanced AI techniques within a structured virtual environment to train professionals in investigative interviewing. Specifically, this system integrates the Unity Engine to generate virtual avatars. Despite the potential advantages of our AI-based training system, its effectiveness largely depends on the perceived realism and fidelity of the virtual avatars used in these simulations [9]. Based on our findings, we observed that avatars generated using Generative Adversarial Networks (GANs) demonstrated higher levels of realism compared to those created with the Unity Engine in several key aspects [10].
Are nuclear masks all you need for improved out-of-domain generalisation? A closer look at cancer classification in histopathology
Tomar, Dhananjay, Binder, Alexander, Kleppe, Andreas
Domain generalisation in computational histopathology is challenging because the images are substantially affected by differences among hospitals due to factors like fixation and staining of tissue and imaging equipment. We hypothesise that focusing on nuclei can improve the out-of-domain (OOD) generalisation in cancer detection. We propose a simple approach to improve OOD generalisation for cancer detection by focusing on nuclear morphology and organisation, as these are domain-invariant features critical in cancer detection. Our approach integrates original images with nuclear segmentation masks during training, encouraging the model to prioritise nuclei and their spatial arrangement. Going beyond mere data augmentation, we introduce a regularisation technique that aligns the representations of masks and original images. We show, using multiple datasets, that our method improves OOD generalisation and also leads to increased robustness to image corruptions and adversarial attacks. The source code is available at https://github.com/undercutspiky/SFL/
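The regularisation idea in the abstract above, aligning the representations of an image and its nuclear mask, can be illustrated with a minimal sketch. The abstract does not say which distance or weighting the authors use, so the mean squared difference, the `weight` parameter, and the function names `alignment_penalty` and `combined_loss` are all assumptions made for illustration, not the paper's method.

```python
def alignment_penalty(image_features, mask_features):
    """Mean squared difference between two equal-length feature vectors.

    Assumption: a squared-error distance; the paper may use another measure.
    """
    if len(image_features) != len(mask_features):
        raise ValueError("feature vectors must have the same length")
    if not image_features:
        raise ValueError("feature vectors must not be empty")
    squared = [(a - b) ** 2 for a, b in zip(image_features, mask_features)]
    return sum(squared) / len(squared)


def combined_loss(loss_on_image, loss_on_mask,
                  image_features, mask_features, weight=1.0):
    """Classification losses on both inputs plus a weighted alignment term.

    `weight` is a hypothetical hyperparameter that trades off classification
    accuracy against agreement between the two representations.
    """
    penalty = alignment_penalty(image_features, mask_features)
    return loss_on_image + loss_on_mask + weight * penalty


if __name__ == "__main__":
    # Toy numbers only: features would normally come from a shared encoder.
    print(combined_loss(0.5, 0.25, [0.0, 0.0], [1.0, 3.0], weight=0.1))
```

The penalty is zero when the encoder maps the image and its mask to the same features, which is the behaviour such a regulariser would encourage.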
Artificial intelligence to improve clinical coding practice in Scandinavia: a crossover randomized controlled trial
Chomutare, Taridzo, Svenning, Therese Olsen, Hernández, Miguel Ángel Tejedor, Ngo, Phuong Dinh, Budrionis, Andrius, Markljung, Kaisa, Hind, Lill Irene, Torsvik, Torbjørn, Mikalsen, Karl Øyvind, Babic, Aleksandar, Dalianis, Hercules
International Statistical Classification of Diseases and Related Health Problems codes, tenth revision (ICD-10) [1] play an important role in healthcare. All hospitals in Scandinavia record their activity by summarizing patient encounters into ICD-10 codes. Clinical coding directly affects how health institutions function on a daily basis because they are partially reimbursed based on the codes they report. The same codes are used to measure both volume and quality of care, thereby providing an important foundation of knowledge for decision makers at all levels in the healthcare service. Clinical coding is a highly complex and challenging task that requires a deep understanding of both medical terminology and intricate clinical documentation. Coders must accurately translate detailed patient records into standardized codes while navigating inherently complex medical language, which makes this task prone to errors and inconsistencies.
Leader-Follower 3D Formation for Underwater Robots
Ni, Di, Ko, Hungtang, Nagpal, Radhika
The schooling behavior of fish is hypothesized to confer many survival benefits, including foraging success, safety from predators, and energy savings through hydrodynamic interactions when swimming in formation. Underwater robot collectives may be able to achieve similar benefits in future applications, e.g. using formation control to achieve efficient spatial sampling for environmental monitoring. Although many theoretical algorithms exist for multi-robot formation control, they have not been tested in the underwater domain due to the fundamental challenges in underwater communication. Here we introduce a leader-follower strategy for underwater formation control that allows us to realize complex 3D formations, using purely vision-based perception and a reactive control algorithm with low computational cost. We use a physical platform, BlueSwarm, to demonstrate for the first time an experimental realization of inline, side-by-side, and staggered swimming 3D formations. More complex formations are studied in a physics-based simulator, providing new insights into the convergence and stability of formations given underwater inertial/drag conditions. Our findings lay the groundwork for future applications of underwater robot swarms in aquatic environments with minimal communication.
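To make the leader-follower idea concrete, here is a minimal kinematic sketch in which a follower reactively steers toward a fixed 3D offset from its leader. It is not the BlueSwarm controller: it assumes the follower already knows the leader's position (the paper estimates it from vision), it ignores inertia and drag, and the proportional `gain` and the function names are hypothetical.

```python
import math


def follower_step(follower, leader, offset, gain=0.5):
    """Move the follower a fraction `gain` of the way to its formation slot.

    The slot is the leader's position plus a fixed 3D offset, e.g.
    (-1, 0, 0) for an inline formation one unit behind the leader.
    """
    target = [l + o for l, o in zip(leader, offset)]
    return [f + gain * (t - f) for f, t in zip(follower, target)]


def simulate(follower, leader, offset, steps=50, gain=0.5):
    """Apply the reactive rule repeatedly for a stationary leader."""
    for _ in range(steps):
        follower = follower_step(follower, leader, offset, gain)
    return follower


def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    leader = [0.0, 0.0, 0.0]
    inline_offset = [-1.0, 0.0, 0.0]  # one unit directly behind the leader
    final = simulate([5.0, -3.0, 2.0], leader, inline_offset)
    print(final, distance(final, [-1.0, 0.0, 0.0]))
```

With a gain between 0 and 1 the position error shrinks by a constant factor each step, so the follower converges to its slot. Real underwater dynamics add inertia and drag, which is why the paper studies convergence and stability in a physics-based simulator.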
'I am valued here': the extraordinary film that recreates a disabled boy's rich digital life
The night after their son Mats died aged just 25, Trude and Robert Steen sat on the sofa in their living room in Oslo with their daughter Mia. "Everything was a blur," remembers Trude of that day 10 years ago. "Then Robert said, 'Maybe we should reach out to Mats' friends in World of Warcraft.'" Mats was born with Duchenne muscular dystrophy, a progressive condition that causes the muscles to weaken gradually. He was diagnosed aged four and started using a wheelchair at 10.
Sony announces PlayStation The Concert, a world tour starting in 2025
As a big soundtrack fan, I love any occasion in which musicians perform them live in concert. So, I'm excited that Sony has created PlayStation The Concert, a world tour featuring the scores from titles like The Last of Us, God of War, Ghost of Tsushima and Horizon. Previous video game concerts have included The Legend of Zelda: Symphony of the Goddesses, which ran from 2012 to 2017. The announcement coincides with the 30th anniversary of PlayStation, with the production meant to reflect "30 years of making games that have not only captivated players but are celebrated for their breathtaking and immersive soundtracks too," Sid Shuman, senior director of Sony Interactive Entertainment Content Communications, stated in the release. The tour will start on April 15, 2025 in Dublin before traveling to cities around Europe like Paris, Oslo, London and Budapest.
A generative model of the hippocampal formation trained with theta driven local learning rules
Advances in generative models have recently revolutionised machine learning. Meanwhile, in neuroscience, generative models have long been thought fundamental to animal intelligence. Understanding the biological mechanisms that support these processes promises to shed light on the relationship between biological and artificial intelligence. In animals, the hippocampal formation is thought to learn and use a generative model to support its role in spatial and non-spatial memory. Here we introduce a biologically plausible model of the hippocampal formation tantamount to a Helmholtz machine that we apply to a temporal stream of inputs.
Collective variables of neural networks: empirical time evolution and scaling laws
Tovey, Samuel, Krippendorf, Sven, Spannowsky, Michael, Nikolaou, Konstantin, Holm, Christian
This work presents a novel means for understanding learning dynamics and scaling relations in neural networks. We show that certain measures on the spectrum of the empirical neural tangent kernel, specifically entropy and trace, yield insight into the representations learned by a neural network and how these can be improved through architecture scaling. These results are demonstrated first on test cases before being shown on more complex networks, including transformers, auto-encoders, graph neural networks, and reinforcement learning studies. In testing on a wide range of architectures, we highlight the universal nature of training dynamics and further discuss how it can be used to understand the mechanisms behind learning in neural networks. We identify two such dominant mechanisms present throughout machine learning training. The first, information compression, is seen through a reduction in the entropy of the NTK spectrum during training, and occurs predominantly in small neural networks. The second, coined structure formation, is seen through an increasing entropy and thus, the creation of structure in the neural network representations beyond the prior established by the network at initialization. Due to the ubiquity of the latter in deep neural network architectures and its flexibility in the creation of feature-rich representations, we argue that this form of evolution of the network's entropy be considered the onset of a deep learning regime.
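The two collective variables named in the abstract, the entropy and the trace of the empirical NTK spectrum, can be sketched for the simplest possible case. The example below assumes a linear model f(x) = w·x, whose empirical NTK is just the Gram matrix of the inputs, and uses the Shannon entropy of the trace-normalised eigenvalues. The paper's exact normalisation may differ, and the function names are hypothetical.

```python
import numpy as np


def linear_model_ntk(inputs):
    """Empirical NTK of f(x) = w . x: the gradient w.r.t. w is x itself,
    so the kernel entry K[i, j] is the dot product of inputs i and j."""
    inputs = np.asarray(inputs, dtype=float)
    return inputs @ inputs.T


def spectrum_entropy_and_trace(kernel, eps=1e-12):
    """Return (entropy, trace) of a symmetric positive semi-definite kernel.

    Assumption: entropy is -sum(p * log p) with p the eigenvalues divided
    by their sum; near-zero eigenvalues are dropped to avoid log(0).
    """
    eigenvalues = np.linalg.eigvalsh(kernel)
    eigenvalues = np.clip(eigenvalues, 0.0, None)  # remove round-off negatives
    trace = float(eigenvalues.sum())
    if trace <= 0.0:
        raise ValueError("kernel has no positive eigenvalues")
    probs = eigenvalues / trace
    probs = probs[probs > eps]
    entropy = float(-(probs * np.log(probs)).sum())
    return entropy, trace


if __name__ == "__main__":
    # Three orthogonal inputs: the spectrum is flat, so entropy is maximal.
    print(spectrum_entropy_and_trace(linear_model_ntk(np.eye(3))))
    # Two collinear inputs: a single non-zero eigenvalue, so entropy is zero.
    print(spectrum_entropy_and_trace(linear_model_ntk([[1.0, 0.0], [2.0, 0.0]])))
```

Tracking these two numbers over training steps would give the kind of time series the abstract describes: falling entropy for what the authors call information compression, rising entropy for structure formation. For a nonlinear network the kernel would be built from per-example parameter gradients instead of raw inputs.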