Castile and León
A Multivariate Bernoulli-Based Sampling Method for Multi-Label Data with Application to Meta-Research
Chung, Simon, Vorland, Colby J., Maney, Donna L., Brown, Andrew W.
Datasets may contain observations with multiple labels. If the labels are not mutually exclusive, and if the labels vary greatly in frequency, obtaining a sample that includes sufficient observations with scarcer labels to make inferences about those labels, and which deviates from the population frequencies in a known manner, creates challenges. In this paper, we consider a multivariate Bernoulli distribution as our underlying distribution of a multi-label problem. We present a novel sampling algorithm that takes label dependencies into account. It uses observed label frequencies to estimate multivariate Bernoulli distribution parameters and calculate weights for each label combination. This approach ensures the weighted sampling acquires target distribution characteristics while accounting for label dependencies. We applied this approach to a sample of research articles from Web of Science labeled with 64 biomedical topic categories. We aimed to preserve category frequency order, reduce frequency differences between most and least common categories, and account for category dependencies. This approach produced a more balanced sub-sample, enhancing the representation of minority categories.
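The reweighting idea in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes the joint distribution over label combinations is estimated by simple counting, and it swaps in iterative proportional fitting (raking) to move the label marginals toward hypothetical target frequencies while keeping the observed dependency structure. The function names, the target values, and the use of sampling with replacement are all assumptions made for illustration.

```python
import random
from collections import Counter


def combination_weights(rows, target, iters=200):
    """Return one sampling weight per observed label combination.

    rows   : list of tuples of 0/1 labels, one tuple per observation
    target : desired marginal frequency for each label (hypothetical values)
    """
    n = len(rows)
    # Empirical joint pmf over the observed label combinations.
    p = {combo: count / n for combo, count in Counter(rows).items()}
    q = dict(p)
    # Raking: rescale combinations so each label marginal matches its target.
    for _ in range(iters):
        for j, t in enumerate(target):
            m = sum(v for combo, v in q.items() if combo[j] == 1)
            if m <= 0.0 or m >= 1.0:
                raise ValueError("each label must be present in some rows and absent in others")
            for combo in q:
                q[combo] *= t / m if combo[j] == 1 else (1 - t) / (1 - m)
    # Weight = adjusted probability / observed probability.
    return {combo: q[combo] / p[combo] for combo in p}


def weighted_subsample(rows, target, k, seed=None):
    """Draw k observations (with replacement) using the combination weights."""
    w = combination_weights(rows, target)
    rng = random.Random(seed)
    return rng.choices(rows, weights=[w[r] for r in rows], k=k)
```

For example, with two labels observed at frequencies 0.8 and 0.3, targets of 0.6 and 0.4 keep the frequency order while narrowing the gap between the common and the scarce label.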
Watch: Polar bears occupy abandoned Soviet-era research station
Drone footage has captured a group of polar bears living inside an abandoned research station on Russia's Kolyuchin Island. Travel blogger, Vadim Makhorov, shared video that shows several bears inside the scattered building, looking through windows and walking around the island. A bear could be seen trying to catch the blogger's drone as it approached. The Kolyuchin weather station was abandoned in the early 1990s, after the collapse of the Soviet Union.
Watch: Moment freediver sets new world record with breath-defying 126m plunge
Russian freediver Alexey Molchanov plunged 126m (413ft) in a single breath to set a new world record at the AIDA Freediving World Championships in Limassol, Cyprus. He descended deep below the Mediterranean Sea with nothing but a headlight, two fins and a rope as a guide, in a feat considered one of the most technically challenging freedive categories. Mr Molchanov broke his own 2024 world record of 125m, during which he held his breath for a staggering four minutes and 32 seconds.
Notre-Dame's iconic towers reopen six years after fire
The iconic Medieval towers of Notre-Dame Cathedral in Paris have reopened to the public, six years after a massive fire ravaged parts of the historical landmark and forced its closure. The central part of the cathedral was reopened in December 2024, but it has taken longer for Notre-Dame's twin towers to be accessible once again for visitors. A huge restoration project has taken place over the past few years to bring the cathedral back to its former glory after parts of it were substantially damaged during 2019's fire. French President Emmanuel Macron on Friday reopened the newly-restored towers to the public.
Watch: 'Looks like a toy, but it's real': BBC examines a downed Russian drone
At least three Russian drones were shot down in Poland's airspace during attacks on Ukraine, the Polish prime minister said on Wednesday. The BBC's Sarah Rainsford has been looking at the exact type of Russian drone that flew into Poland, which is proving a massive challenge for Ukraine's territorial defence forces.
Inside Kyiv government building hit by missile strike
Ukraine's main government building in Kyiv was hit on Sunday for the first time since Russia's full-scale invasion of the country, officials said. The BBC's Sarah Rainsford visited the scene, where she observed a huge amount of damage.
A XAI-based Framework for Frequency Subband Characterization of Cough Spectrograms in Chronic Respiratory Disease
Amado-Caballero, Patricia, San-José-Revuelta, Luis M., Wang, Xinheng, Garmendia-Leiza, José Ramón, Alberola-López, Carlos, Casaseca-de-la-Higuera, Pablo
This paper presents an explainable artificial intelligence (XAI)-based framework for the spectral analysis of cough sounds associated with chronic respiratory diseases, with a particular focus on Chronic Obstructive Pulmonary Disease (COPD). A Convolutional Neural Network (CNN) is trained on time-frequency representations of cough signals, and occlusion maps are used to identify diagnostically relevant regions within the spectrograms. These highlighted areas are subsequently decomposed into five frequency subbands, enabling targeted spectral feature extraction and analysis. The results reveal that spectral patterns differ across subbands and disease groups, uncovering complementary and compensatory trends across the frequency spectrum. Notably, the approach distinguishes COPD from other respiratory conditions, and chronic from non-chronic patient groups, based on interpretable spectral markers. These findings provide insight into the underlying pathophysiological characteristics of cough acoustics and demonstrate the value of frequency-resolved, XAI-enhanced analysis for biomedical signal interpretation and translational respiratory disease diagnostics.
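The two steps named in this abstract, occlusion mapping followed by a split into five frequency subbands, can be sketched in a few lines. This is only an illustration under stated assumptions: the trained CNN is replaced by an arbitrary `score` callable, the spectrogram is a plain list of rows (frequency bins) by columns (time frames), and the subbands are taken to be equal-width, which the abstract does not specify. The names `occlusion_map` and `subband_relevance` are hypothetical.

```python
def occlusion_map(spec, score, patch=2, fill=0.0):
    """Relevance map: how much the score drops when each patch is masked.

    spec  : list of rows (frequency bins), each a list of time-frame values
    score : callable mapping a spectrogram to a number (stand-in for a CNN output)
    """
    n_f, n_t = len(spec), len(spec[0])
    base = score(spec)
    heat = [[0.0] * n_t for _ in range(n_f)]
    for f0 in range(0, n_f, patch):
        for t0 in range(0, n_t, patch):
            # Copy the spectrogram and mask one patch.
            occluded = [row[:] for row in spec]
            cells = [(f, t)
                     for f in range(f0, min(f0 + patch, n_f))
                     for t in range(t0, min(t0 + patch, n_t))]
            for f, t in cells:
                occluded[f][t] = fill
            drop = base - score(occluded)
            # Assign the score drop to every cell of the masked patch.
            for f, t in cells:
                heat[f][t] = drop
    return heat


def subband_relevance(heat, n_bands=5):
    """Sum the relevance map within equal-width frequency subbands (an assumption)."""
    n_f = len(heat)
    edges = [round(i * n_f / n_bands) for i in range(n_bands + 1)]
    return [sum(sum(row) for row in heat[lo:hi])
            for lo, hi in zip(edges, edges[1:])]
```

Per-subband features (for instance, energy or centroid within each band) would then be computed only over the regions the map marks as relevant.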
Reliable Programmatic Weak Supervision with Confidence Intervals for Label Probabilities
Álvarez, Verónica, Mazuelas, Santiago, An, Steven, Dasgupta, Sanjoy
Abstract: The accurate labeling of datasets is often both costly and time-consuming. Given an unlabeled dataset, programmatic weak supervision obtains probabilistic predictions for the labels by leveraging multiple weak labeling functions (LFs) that provide rough guesses for labels. Weak LFs commonly provide guesses with assorted types and unknown interdependences that can result in unreliable predictions. This paper presents a methodology for programmatic weak supervision that can provide confidence intervals for label probabilities and obtain more reliable predictions. In particular, the methods proposed use uncertainty sets of distributions that encapsulate the information provided by LFs with unrestricted behavior and typology. Experiments on multiple benchmark datasets show the improvement of the presented methods over the state-of-the-art and the practicality of the confidence intervals presented.
For many machine learning applications, the accurate labeling of datasets is both costly and time-consuming [1]-[4]. Given an unlabeled dataset, methods for programmatic weak supervision aim to leverage multiple weak labeling functions (LFs) to provide accurate labels [5], [6]. Since common LFs only provide rough guesses for labels, programmatic weak supervision methods use the outputs of multiple LFs to obtain probabilistic predictions for the label of each instance [7]-[13]. These predictions can then be used to create a fully supervised dataset composed of the instances corresponding to high-confidence predictions, e.g., a label with a large enough predicted probability is regarded as the actual
Manuscript received September 30, 2024; accepted August 4, 2025.
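The general pipeline described in the introduction above (several weak labeling functions, probabilistic predictions per instance, then keeping only high-confidence predictions) can be sketched as follows. This is a plain vote-counting baseline chosen for illustration, not the method proposed in the paper, and it produces no confidence intervals. The three labeling functions, the spam-style labels (1 and 0), and the 0.9 threshold are all hypothetical.

```python
from collections import Counter

ABSTAIN = None  # a labeling function may decline to guess


# Hypothetical weak labeling functions: each returns 1, 0, or ABSTAIN.
def lf_contains_free(text):
    return 1 if "free" in text.lower() else ABSTAIN


def lf_has_greeting(text):
    return 0 if text.lower().startswith("hi") else ABSTAIN


def lf_many_exclaims(text):
    return 1 if text.count("!") >= 2 else ABSTAIN


def vote_probabilities(text, lfs, n_classes=2):
    """Turn LF votes into a probability per class by normalizing vote counts."""
    votes = Counter(v for v in (lf(text) for lf in lfs) if v is not ABSTAIN)
    total = sum(votes.values())
    if total == 0:
        # No LF fired: fall back to a uniform prediction.
        return [1.0 / n_classes] * n_classes
    return [votes[c] / total for c in range(n_classes)]


def confident_labels(texts, lfs, threshold=0.9):
    """Keep only instances whose top predicted probability reaches the threshold."""
    kept = []
    for t in texts:
        probs = vote_probabilities(t, lfs)
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            kept.append((t, best))
    return kept
```

Vote counting treats every labeling function as equally reliable and independent, which is exactly the kind of assumption the abstract identifies as a source of unreliable predictions.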