glasgow
Neuron Block Dynamics for XOR Classification with Zero-Margin
Braun, Guillaume, Imaizumi, Masaaki
The ability of neural networks to learn useful features through stochastic gradient descent (SGD) is a cornerstone of their success. Most theoretical analyses focus on regression or on classification tasks with a positive margin, where worst-case gradient bounds suffice. In contrast, we study zero-margin nonlinear classification by analyzing the Gaussian XOR problem, where inputs are Gaussian and the XOR decision boundary determines labels. In this setting, a non-negligible fraction of data lies arbitrarily close to the boundary, breaking standard margin-based arguments. Building on Glasgow's (2024) analysis, we extend the study of training dynamics from discrete to Gaussian inputs and develop a framework for the dynamics of neuron blocks. We show that neurons cluster into four directions and that block-level signals evolve coherently, a phenomenon essential in the Gaussian setting where individual neuron signals vary significantly. Leveraging this block perspective, we analyze generalization without relying on margin assumptions, adopting an average-case view that distinguishes regions of reliable prediction from regions of persistent error. Numerical experiments confirm the predicted two-phase block dynamics and demonstrate their robustness beyond the Gaussian setting.
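The zero-margin claim in the abstract above can be checked numerically with a small sketch. It assumes the common formulation in which inputs are standard Gaussian and the label is the sign of the product of the first two coordinates; the abstract only says that the XOR decision boundary determines labels, so this labeling rule and all function names here are illustrative assumptions, not the authors' code.

```python
import math
import random


def sample_gaussian_xor(n, d=2, seed=0):
    """Draw n points x ~ N(0, I_d) with assumed label y = sign(x[0] * x[1])."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y = 1 if x[0] * x[1] > 0 else -1
        data.append((x, y))
    return data


def distance_to_boundary(x):
    """Distance from x to the set {x[0] * x[1] = 0} (union of two hyperplanes)."""
    return min(abs(x[0]), abs(x[1]))


def fraction_within(data, eps):
    """Empirical fraction of points closer than eps to the decision boundary."""
    return sum(distance_to_boundary(x) < eps for x, _ in data) / len(data)


def exact_fraction_within(eps):
    """Closed form: 1 - P(|z| >= eps)^2 for a standard normal z."""
    p = math.erf(eps / math.sqrt(2.0))  # P(|z| < eps)
    return 1.0 - (1.0 - p) ** 2


if __name__ == "__main__":
    data = sample_gaussian_xor(20000)
    for eps in (0.5, 0.1, 0.01):
        print(eps, fraction_within(data, eps), exact_fraction_within(eps))
```

Under this assumption the fraction of points within distance eps of the boundary is strictly positive for every eps > 0 and shrinks roughly linearly in eps, which is why no fixed positive margin holds.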
The major UK city that will get driverless trains in 2026
A major UK city is set to get driverless trains next year as part of its rail modernisation project. In 2023, new trains were launched in Glasgow as part of the full-scale upgrade to improve the city's subway after more than 30 years. The renovations have continued and now, the Strathclyde Partnership for Transport (SPT) has announced Unattended Train Operation will be introduced to Glasgow. The modernisation project is in its 'final stages,' Time Out reports, and the driverless subway trains are expected to be brought in next year.
Into the Wild: When Robots Are Not Welcome
Ashkenazi, Shaul, Skantze, Gabriel, Stuart-Smith, Jane, Foster, Mary Ellen
Social robots are increasingly being deployed in public spaces, where they face not only technological difficulties and unexpected user utterances, but also objections from stakeholders who may not be comfortable with introducing a robot into those spaces. We describe our difficulties with deploying a social robot in two different public settings: 1) Student services center; 2) Refugees and asylum seekers drop-in service. Although this is a failure report, in each use case we eventually managed to earn the trust of the staff and form a relationship with them, allowing us to deploy our robot and conduct our studies. We have developed a multilingual robot system (Figure 1) described in [1] for two different use cases: 1) Supporting newly arrived international students in a UK university, answering frequently asked questions; 2) Supporting refugees and asylum seekers with navigating bureaucratic processes. Like most current public-space robot deployments, our field studies involved adding a robot to an existing workplace, with stakeholders including management, visitors, as well as front-line workers who should all be consulted to develop the details of the system to be deployed.
The Future of Skill: What Is It to Be Skilled at Work?
Niklasson, Axel, Rintel, Sean, Makri, Stephann, Taylor, Alex
In this short paper, we introduce work that is aiming to purposefully venture into this mesh of questions from a different starting point. Interjecting into the conversation, we want to ask: 'What is it to be skilled at work?' Building on work from scholars like Tim Ingold, and strands of longstanding research in workplace studies and CSCW, our interest is in turning the attention to the active work of 'being good', or 'being skilled', at what we as workers do. As we see it, skill provides a counterpoint to the version of intelligence that appears to be easily blackboxed in systems like Slack, and that ultimately reduces much of what people do to work well together. To put it slightly differently, skill - as we will argue below - gives us a way into thinking about work as a much more entangled endeavour, unfolding through multiple and interweaving sets of practices, places, tools and collaborations. In this vein, designing for the future of work seems to be about much more than where work is done or how we might bolt on discrete containers of intelligence. More fruitful would be attending to how we succeed in threading so many entities together to do our jobs well - in 'coming to be skilled'.
Learning Gaze-aware Compositional GAN
Aranjuelo, Nerea, Huang, Siyu, Arganda-Carreras, Ignacio, Unzueta, Luis, Otaegui, Oihana, Pfister, Hanspeter, Wei, Donglai
Gaze-annotated facial data is crucial for training deep neural networks (DNNs) for gaze estimation. However, obtaining these data is labor-intensive and requires specialized equipment due to the challenge of accurately annotating the gaze direction of a subject. In this work, we present a generative framework to create annotated gaze data by leveraging the benefits of labeled and unlabeled data sources. We propose a Gaze-aware Compositional GAN that learns to generate annotated facial images from a limited labeled dataset. Then we transfer this model to an unlabeled data domain to take advantage of the diversity it provides. Experiments demonstrate our approach's effectiveness in generating within-domain image augmentations in the ETH-XGaze dataset and cross-domain augmentations in the CelebAMask-HQ dataset domain for gaze estimation DNN training. We also show additional applications of our work, which include facial image editing and gaze redirection.
A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos
Ozdel, Suleyman, Rong, Yao, Albaba, Berat Mert, Kuo, Yen-Ling, Wang, Xi, Kasneci, Enkelejda
Eye-tracking applications that utilize the human gaze in video understanding tasks have become increasingly important. To effectively automate the process of video analysis based on eye-tracking data, it is important to accurately replicate human gaze behavior. However, this task presents significant challenges due to the inherent complexity and ambiguity of human gaze patterns. In this work, we introduce a novel method for simulating human gaze behavior. Our approach uses a transformer-based reinforcement learning algorithm to train an agent that acts as a human observer, with the primary role of watching videos and simulating human gaze behavior. We employed an eye-tracking dataset gathered from videos generated by the VirtualHome simulator, with a primary focus on activity recognition. Our experimental results demonstrate the effectiveness of our gaze prediction method by highlighting its capability to replicate human gaze behavior and its applicability for downstream tasks where real human-gaze is used as input.
Zero-Shot Segmentation of Eye Features Using the Segment Anything Model (SAM)
Maquiling, Virmarie, Byrne, Sean Anthony, Niehorster, Diederick C., Nyström, Marcus, Kasneci, Enkelejda
The advent of foundation models signals a new era in artificial intelligence. The Segment Anything Model (SAM) is the first foundation model for image segmentation. In this study, we evaluate SAM's ability to segment features from eye images recorded in virtual reality setups. The increasing requirement for annotated eye-image datasets presents a significant opportunity for SAM to redefine the landscape of data annotation in gaze estimation. Our investigation centers on SAM's zero-shot learning abilities and the effectiveness of prompts like bounding boxes or point clicks. Our results are consistent with studies in other domains, demonstrating that SAM's segmentation effectiveness can be on-par with specialized models depending on the feature, with prompts improving its performance, evidenced by an IoU of 93.34% for pupil segmentation in one dataset. Foundation models like SAM could revolutionize gaze estimation by enabling quick and easy image segmentation, reducing reliance on specialized models and extensive manual annotation.
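The IoU (intersection over union) figure quoted in the abstract above is the standard overlap score between a predicted and a reference segmentation mask. A minimal sketch of that metric on binary masks follows; the function name and the toy masks are illustrative, and the convention of returning 1.0 for two empty masks is an assumption.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks (lists of rows)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 3x3 example: the prediction covers one pixel more than the reference.
predicted = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
reference = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]
print(iou(predicted, reference))
```

In the toy example three pixels are shared and four are covered by at least one mask, so the score is 3/4.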
AIx Speed: Playback Speed Optimization Using Listening Comprehension of Speech Recognition Models
Kawamura, Kazuki, Rekimoto, Jun
Since humans can comprehend audio and video at speeds faster than the original recording, we often listen to or watch such content at higher playback speeds to save time. To further exploit this capability, systems have been developed that automatically adjust the playback speed according to the user's condition and the type of content, assisting more efficient comprehension of time-series content. However, there is still room for these systems to extend human speed-listening ability by generating speech whose playback speed is optimized over even finer time units. In this study, we determine whether humans can hear the optimized speech and propose a system that automatically adjusts playback speed at units as small as phonemes while ensuring speech intelligibility. The system uses the speech recognizer score as a proxy for how well a human can hear a certain unit of speech and maximizes the playback speed to the extent that a human can still hear it. This method can be used to produce fast but intelligible speech. In the evaluation experiment, we compared speech played back at a constant fast speed with the flexibly sped-up speech generated by the proposed method in a blind test and confirmed that the proposed method produced speech that was easier to listen to.
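The selection rule described in the abstract above (speed up each unit as far as a recognizer score allows) can be sketched as a per-segment search over candidate speeds. Everything below is a hypothetical illustration: `choose_speeds`, `toy_score`, the candidate speeds, and the threshold are assumptions, and the toy scorer stands in for a real speech recognizer, which the abstract does not specify in detail.

```python
def choose_speeds(segments, score_fn, candidate_speeds, threshold):
    """For each segment, pick the fastest candidate speed whose score stays
    at or above the threshold; fall back to the slowest speed otherwise."""
    chosen = []
    for segment in segments:
        acceptable = [s for s in candidate_speeds
                      if score_fn(segment, s) >= threshold]
        chosen.append(max(acceptable) if acceptable else min(candidate_speeds))
    return chosen


def toy_score(segment, speed):
    """Hypothetical stand-in for a recognizer confidence: it drops linearly
    with speed, and drops faster for segments marked as harder."""
    return max(0.0, 1.0 - segment["difficulty"] * (speed - 1.0))


segments = [{"difficulty": 0.1}, {"difficulty": 0.25}, {"difficulty": 1.0}]
speeds = choose_speeds(segments, toy_score, [1.0, 1.5, 2.0, 2.5, 3.0], 0.7)
print(speeds)
```

With this toy scorer, easy segments are played at the highest candidate speed while hard ones stay near normal speed, which is the qualitative behavior the abstract describes.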
Fully Convolutional Generative Machine Learning Method for Accelerating Non-Equilibrium Greens Function Simulations
Aleksandrov, Preslav, Rezaei, Ali, Xeni, Nikolas, Dutta, Tapas, Asenov, Asen, Georgiev, Vihar
This work describes a novel simulation approach that combines machine learning and device modelling simulations. The device simulations are based on the quantum mechanical non-equilibrium Greens function (NEGF) approach and the machine learning method is an extension to a convolutional generative network. We have named our new simulation approach ML-NEGF and we have implemented it in our in-house simulator called NESS (nano-electronics simulations software). The reported results demonstrate the improved convergence speed of the ML-NEGF method in comparison to the standard NEGF approach. The trained ML model effectively learns the underlying physics of nano-sheet transistor behaviour, resulting in faster convergence of the coupled Poisson-NEGF simulations. Quantitatively, our ML-NEGF approach achieves an average convergence acceleration of 60%, substantially reducing the computational time while maintaining the same accuracy.