Netherlands
Beyond Euclidean: Dual-Space Representation Learning for Weakly Supervised Video Violence Detection
While numerous Video Violence Detection (VVD) methods have focused on representation learning in Euclidean space, they struggle to learn sufficiently discriminative features, which makes them weak at recognizing normal events that are visually similar to violent events (i.e., ambiguous violence). In contrast, hyperbolic representation learning, renowned for its ability to model hierarchical and complex relationships between events, has the potential to amplify the discrimination between visually similar events. Inspired by this, we develop a novel Dual-Space Representation Learning (DSRL) method for weakly supervised VVD that utilizes the strengths of both Euclidean and hyperbolic geometries, capturing the visual features of events while also exploring the intrinsic relations between events, thereby enhancing the discriminative capacity of the features. DSRL employs a novel information aggregation strategy to progressively learn event context in hyperbolic space, selecting aggregation nodes through layer-sensitive hyperbolic association degrees constrained by hyperbolic Dirichlet energy. Furthermore, DSRL attempts to break the cyber-balkanization of different spaces, utilizing cross-space attention to facilitate information interactions between Euclidean and hyperbolic space and so capture more discriminative features for final violence detection. Comprehensive experiments demonstrate the effectiveness of our proposed DSRL.
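As a rough illustration of why hyperbolic geometry can separate points that look close in Euclidean terms, the sketch below computes the standard geodesic distance on the unit Poincaré ball. It is not the DSRL method: the abstract does not say which hyperbolic model is used, so the Poincaré ball, the function name `poincare_distance`, and the sample points are all assumptions chosen for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    Illustrative helper (hypothetical name), using the textbook formula
    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Two pairs with the same Euclidean gap (0.1) at different radii.
near_origin = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.85, 0.0), (0.95, 0.0))

print(f"near origin:   {near_origin:.4f}")
print(f"near boundary: {near_boundary:.4f}")
```

The same Euclidean gap yields a much larger hyperbolic distance near the boundary of the ball, which is the general property that hyperbolic embeddings exploit to pull apart visually similar samples.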
The surprising benefits of video games
There are plenty of negative stereotypes about games and gamers. And it's true that focusing on gaming to the detriment of all else will have negative effects--there's a reason that the World Health Organization recognizes video game addiction as a mental health condition. In the 50 years since Atari unleashed Pong on the world, there's been plenty of research on the effects of video games on our brains, and it's not all bad. Here are a few of the potential benefits of gaming, according to research. A research review published in American Psychologist in 2013 by Isabela Granic, Adam Lobel, and Rutger C. M. E. Engels at Radboud University in Nijmegen, the Netherlands, looked at decades of research and highlighted the various benefits found in gaming.
A new AI translation system for headphones clones multiple voices simultaneously
"There are so many smart people across the world, and the language barrier prevents them from having the confidence to communicate," says Shyam Gollakota, a professor at the University of Washington, who worked on the project. "My mom has such incredible ideas when she's speaking in Telugu, but it's so hard for her to communicate with people in the US when she visits from India. We think this kind of system could be transformative for people like her." While there are plenty of other live AI translation systems out there, such as the one running on Meta's Ray-Ban smart glasses, they focus on a single speaker, not multiple people speaking at once, and deliver robotic-sounding automated translations. The new system is designed to work with existing, off-the-shelf noise-canceling headphones that have microphones, plugged into a laptop powered by Apple's M2 silicon chip, which can support neural networks.
Optimizing Power Grid Topologies with Reinforcement Learning: A Survey of Methods and Challenges
van der Sar, Erica, Zocca, Alessandro, Bhulai, Sandjai
Power grid operation is becoming increasingly complex due to the rising integration of renewable energy sources and the need for more adaptive control strategies. Reinforcement Learning (RL) has emerged as a promising approach to power network control (PNC), offering the potential to enhance decision-making in dynamic and uncertain environments. The Learning To Run a Power Network (L2RPN) competitions have played a key role in accelerating research by providing standardized benchmarks and problem formulations, leading to rapid advancements in RL-based methods. This survey provides a comprehensive and structured overview of RL applications for power grid topology optimization, categorizing existing techniques, highlighting key design choices, and identifying gaps in current research. Additionally, we present a comparative numerical study evaluating the impact of commonly applied RL-based methods, offering insights into their practical effectiveness. By consolidating existing research and outlining open challenges, this survey aims to provide a foundation for future advancements in RL-driven power grid optimization.
In pictures: Prayers and reflection mark Eid celebrations around the world
Muslims around the world have begun celebrating Eid al-Fitr, one of the biggest celebrations in the Islamic calendar. Eid al-Fitr - which means "festival of the breaking of the fast" - is celebrated at the end of Ramadan, a month of fasting for many adults, as well as spiritual reflection and prayer.
Here in Moscow, worshippers are seen preparing for prayer. (Reuters)
Hundreds took part in prayers at Tononoka grounds, in Mombasa, Kenya. (Reuters)
Prayers were also observed at a stadium in Port Sudan in the east of the country. (Getty Images)
Little children joined adults at the Moskee Essalam in Rotterdam, Netherlands. (Getty Images)
Gifts are handed out to Muslim children in Lviv, Ukraine, as Russia's war on the country continues. (Getty Images)
Palestinians in Jabaliya in the northern Gaza Strip pray amidst the rubble of a mosque destroyed in the current war between Israel and Hamas. (Reuters)
Families gather at al-Aqsa mosque in Jerusalem - the third holiest site in Islam. (Getty Images)
A boy yawns during prayers at a stadium in Qatar. (Reuters)
Muslims greet each other at Martim Moniz Square in Lisbon, Portugal. (EPA)
Women worshippers gather in Burgess Park, London, for an outdoor prayer. (Getty Images)
There were also worshippers gathered outside Plebiscito Square in Naples, Italy. (EPA)
Some women took pictures after attending prayers at the Hagia Sophia Grand Mosque in Istanbul, Turkey. (Reuters)
Afghan refugees pray at a mosque on the outskirts of Peshawar, Pakistan. (Getty Images)
VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction
Chen Li
Although 3D Gaussian Splatting has been widely studied because of its realistic and efficient novel-view synthesis, it is still challenging to extract a high-quality surface from the point-based representation. Previous works improve the surface by incorporating geometric priors from an off-the-shelf normal estimator. However, there are two main limitations: 1) Supervising normals rendered from 3D Gaussians effectively updates the rotation parameter but is less effective for other geometric parameters; 2) The inconsistency of predicted normal maps across multiple views may lead to severe reconstruction artifacts. In this paper, we propose a Depth-Normal regularizer that directly couples the normal with the other geometric parameters, so that normal regularization updates all of the geometric parameters. We further propose a confidence term to mitigate inconsistencies of normal predictions across multiple views. Moreover, we introduce a densification and splitting strategy to regularize the size and distribution of 3D Gaussians for more accurate surface modeling. Experiments show that, compared with Gaussian-based baselines, our approach obtains better reconstruction quality and maintains competitive appearance quality with faster training and 100+ FPS rendering.
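To make the coupling between depth and normals concrete, here is a minimal sketch of how a surface normal can be derived from a depth map by finite differences. This is a generic textbook construction, not the regularizer proposed in the paper: the orthographic camera with unit pixel spacing, the function name `normal_from_depth`, and the toy plane are all assumptions made for illustration.

```python
import math


def normal_from_depth(depth, row, col):
    """Unit surface normal at an interior pixel of a depth map.

    Illustrative helper (hypothetical name). Assumes an orthographic camera
    with unit pixel spacing, so the surface is z = depth[row][col] and its
    normal is proportional to (-dz/dx, -dz/dy, 1).
    """
    # Central differences along the column (x) and row (y) directions.
    dz_dx = (depth[row][col + 1] - depth[row][col - 1]) / 2.0
    dz_dy = (depth[row + 1][col] - depth[row - 1][col]) / 2.0
    length = math.sqrt(dz_dx ** 2 + dz_dy ** 2 + 1.0)
    return (-dz_dx / length, -dz_dy / length, 1.0 / length)


# Toy depth map of a plane tilted along x: z = 0.5 * x + 2.0.
plane = [[0.5 * x + 2.0 for x in range(3)] for _ in range(3)]
normal = normal_from_depth(plane, 1, 1)
print(normal)
```

Because a normal computed this way depends on the depth values themselves, a loss placed on it sends gradients to whatever parameters determine depth, which is the general intuition behind tying normal supervision to more than just rotation.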
Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs
Research in auditory, visual, and audiovisual speech recognition (ASR, VSR, and AVSR, respectively) has traditionally been conducted independently. Even recent self-supervised studies addressing two or all three tasks simultaneously tend to yield separate models, leading to disjoint inference pipelines with increased memory requirements and redundancies. This paper proposes unified training strategies for these systems. We demonstrate that training a single model for all three tasks enhances VSR and AVSR performance, overcoming typical optimisation challenges when training from scratch. Moreover, we introduce a greedy pseudo-labelling approach to more effectively leverage unlabelled samples, addressing shortcomings in related self-supervised methods. Finally, we develop a self-supervised pretraining method within our framework, proving its effectiveness alongside our semi-supervised approach. Despite using a single model for all tasks, our unified approach achieves state-of-the-art performance compared to recent methods on LRS3 and LRS2 for ASR, VSR, and AVSR, as well as on the newly released WildVSR dataset. Code and models are available at https://github.com/
Towards Understanding Evolving Patterns in Sequential Data
In many machine learning tasks, data is inherently sequential. Most existing algorithms learn from sequential data in an auto-regressive manner, predicting the next unseen data point based on the observed sequence, and thereby implicitly assume that the data contains an evolving pattern that can be leveraged. However, identifying and assessing evolving patterns in learning tasks relies heavily on human expertise and lacks a standardized quantitative measure. In this paper, we show that such a measure enables us to determine the suitability of employing sequential models, measure the temporal order of time series data, and conduct feature/data selection, which can benefit a variety of learning tasks: time-series forecasting, classification tasks with temporal distribution shift, video prediction, etc.
FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training
The field of novel view synthesis from images has seen rapid advancements with the introduction of Neural Radiance Fields (NeRF) and more recently with 3D Gaussian Splatting. Gaussian Splatting became widely adopted due to its efficiency and ability to render novel views accurately. While Gaussian Splatting performs well when a sufficient number of training images is available, its unstructured explicit representation tends to overfit in scenarios with sparse input images, resulting in poor rendering performance. To address this, we present a 3D Gaussian-based novel view synthesis method using sparse input images that can accurately render the scene from viewpoints not covered by the training images. We propose a multi-stage training scheme with matching-based consistency constraints imposed on the novel views without relying on pre-trained depth estimation or diffusion models. This is achieved by using the matches of the available training images to supervise the generation of the novel views sampled between the training frames with color, geometry, and semantic losses. In addition, we introduce a locality-preserving regularization for 3D Gaussians which removes rendering artifacts by preserving the local color structure of the scene. Evaluation on synthetic and real-world datasets demonstrates competitive or superior performance of our method in few-shot novel view synthesis compared to existing state-of-the-art methods.