AITopics

Masked Autoencoders (MAEs) learn rich semantic representations in audio classification through an efficient self-supervised reconstruction task. However, general-purpose models fail to generalize well when applied directly to fine-grained audio domains. Specifically, bird-sound classification requires distinguishing subtle inter-species differences and managing high intra-species acoustic variability, revealing the performance limitations of general-domain Audio-MAEs. This work demonstrates that bridging this domain gap requires full-pipeline adaptation, not just domain-specific pretraining data. We systematically revisit and adapt the pretraining recipe, fine-tuning methods, and frozen feature utilization to bird sounds using BirdSet, a large-scale bioacoustic dataset comparable to AudioSet. Our resulting Bird-MAE achieves new state-of-the-art results in BirdSet's multi-label classification benchmark. Additionally, we introduce the parameter-efficient prototypical probing, enhancing the utility of frozen MAE representations and closely approaching fine-tuning performance in low-resource settings. Bird-MAE's prototypical probes outperform linear probing by up to 37 percentage points in mean average precision and narrow the gap to fine-tuning across BirdSet downstream tasks. Bird-MAE also demonstrates robust few-shot capabilities with prototypical probing in our newly established few-shot benchmark on BirdSet, highlighting the potential of tailored self-supervised learning pipelines for fine-grained audio domains.

artificial intelligence, deep learning, representation, (17 more...)

2504.1288

Country:

South America > Colombia (0.14)
North America > United States > Nevada (0.04)
South America > Venezuela (0.04)
(11 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Jones, Andrew, Whiteley, Nick

Generalisation and benign over-fitting for linear regression onto random functional covariates

arXiv.org Machine Learning Aug-20-2025

We study theoretical predictive performance of ridge and ridge-less least-squares regression when covariate vectors arise from evaluating $p$ random, mean-square continuous functions over a latent metric space at $n$ random and unobserved locations, subject to additive noise. This leads us away from the standard assumption of i.i.d. data to a setting in which the $n$ covariate vectors are exchangeable but not independent in general. Under an assumption of independence across dimensions, $4$-th order moment, and other regularity conditions, we obtain probabilistic bounds on a notion of predictive excess risk adapted to our random functional covariate setting, making use of recent results of Barzilai and Shamir. We derive convergence rates in regimes where $p$ grows suitably fast relative to $n$, illustrating interplay between ingredients of the model in determining convergence behaviour and the role of additive covariate noise in benign-overfitting.
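As a rough illustration of the two estimators named in the abstract, the sketch below fits ridge and ridge-less (minimum-norm) least squares in the overparameterised regime $p > n$ using the dual form $\hat\beta = X^\top (X X^\top + \lambda I)^{-1} y$. It uses i.i.d. Gaussian toy covariates, which is an assumption made for brevity and is not the random functional covariate model studied in the paper; the helper names `solve`, `ridge_dual`, and `predict` are hypothetical.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_dual(X, y, lam):
    """Return beta = X^T (X X^T + lam I)^{-1} y.

    With lam = 0 and X X^T invertible this is the minimum-norm
    interpolator, i.e. the ridge-less least-squares solution.
    """
    n, p = len(X), len(X[0])
    gram = [
        [sum(X[i][k] * X[j][k] for k in range(p)) + (lam if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    alpha = solve(gram, y)
    return [sum(alpha[i] * X[i][k] for i in range(n)) for k in range(p)]


def predict(X, beta):
    """Linear predictions X beta."""
    return [sum(x_k * b_k for x_k, b_k in zip(row, beta)) for row in X]


# Toy data (assumption): i.i.d. Gaussian covariates with p > n.
random.seed(0)
n, p = 5, 20
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]

beta_ridgeless = ridge_dual(X, y, 0.0)  # interpolates the training data
beta_ridge = ridge_dual(X, y, 1.0)      # shrunk towards zero
```

In this toy setting the ridge-less fit reproduces the training responses exactly, while any positive penalty shrinks the coefficient norm; whether such interpolation is benign depends on the covariate model, which is the question the paper addresses.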

artificial intelligence, machine learning, regression, (18 more...)

arXiv.org Machine Learning

2508.13895

Country:

Europe (0.14)
South America (0.04)
Oceania (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.82)

Vehicle detection from GSV imagery: Predicting travel behaviour for cycling and motorcycling using Computer Vision

Kokka, Kyriaki, Goel, Rahul, Abbas, Ali, Nice, Kerry A., Martial, Luca, Labib, SM, Ke, Rihuan, Schönlieb, Carola Bibiane, Woodcock, James

Transportation influences health by shaping exposure to physical activity, air pollution and injury risk. Comparative data on cycling and motorcycling behaviours is scarce, particularly at a global scale. Street view imagery, such as Google Street View (GSV), combined with computer vision, is a valuable resource for efficiently capturing travel behaviour data. This study demonstrates a novel approach using deep learning on street view images to estimate cycling and motorcycling levels across diverse cities worldwide. We utilized data from 185 global cities. The data on mode shares of cycling and motorcycling were estimated using travel surveys or censuses. We used GSV images to detect cycles and motorcycles in sampled locations, using 8000 images per city. The YOLOv4 model, fine-tuned using images from six cities, achieved a mean average precision of 89% for detecting cycles and motorcycles. A global prediction model was developed using beta regression with city-level mode shares as outcome, with log transformed explanatory variables of counts of GSV-detected images with cycles and motorcycles, while controlling for population density. We found strong correlations between GSV motorcycle counts and motorcycle mode share (0.78) and moderate correlations between GSV cycle counts and cycling mode share (0.51). Beta regression models predicted mode shares with $R^2$ values of 0.614 for cycling and 0.612 for motorcycling, achieving median absolute errors (MDAE) of 1.3% and 1.4%, respectively. Scatterplots demonstrated consistent prediction accuracy, though cities like Utrecht and Cali were outliers. The model was applied to 60 cities globally for which we didn't have recent mode share data. We provided estimates for some cities in the Middle East, Latin America and East Asia. With computer vision, GSV images capture travel modes and activity, providing insights alongside traditional data sources.
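To make the modelling step in the abstract more concrete, here is a minimal sketch of how a beta regression's mean could map log-transformed detection counts to a mode share in (0, 1), together with the median absolute error used for evaluation. The logit link, the `+ 1` inside the logarithm, the function names, and every coefficient and data value are assumptions for illustration only; none of them are taken from the study.

```python
import math
from statistics import median


def predicted_share(count, density, b0, b1, b2):
    """Mean mode share under an assumed logit link.

    count   : number of GSV images with a detected cycle or motorcycle
    density : population density (must be positive)
    b0..b2  : hypothetical regression coefficients
    """
    # log(count + 1) keeps the transform defined for zero counts (assumption).
    eta = b0 + b1 * math.log(count + 1) + b2 * math.log(density)
    return 1.0 / (1.0 + math.exp(-eta))


def median_absolute_error(observed, predicted):
    """Median of the absolute differences between observed and predicted shares."""
    return median(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical city: 120 images with detections, 3500 people per square km.
share = predicted_share(120, 3500.0, b0=-4.0, b1=0.5, b2=-0.1)

# Hypothetical observed versus predicted mode shares for three cities.
mdae = median_absolute_error([0.10, 0.20, 0.30], [0.12, 0.20, 0.25])
```

The logit link guarantees that predictions stay strictly between 0 and 1, which is the reason a beta regression suits proportion outcomes such as mode shares.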

artificial intelligence, deep learning, machine learning, (19 more...)

2508.12794

Country:

South America (1.00)
North America (1.00)
Africa (1.00)
(2 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)

Lourenço, Vítor N., Paes, Aline, Weyde, Tillman

Exploring Content and Social Connections of Fake News with Explainable Text and Graph Learning

The global spread of misinformation and concerns about content trustworthiness have driven the development of automated fact-checking systems. Since false information often exploits social media dynamics such as "likes" and user networks to amplify its reach, effective solutions must go beyond content analysis to incorporate these factors. Moreover, simply labelling content as false can be ineffective or even reinforce biases such as automation and confirmation bias. This paper proposes an explainable framework that combines content, social media, and graph-based features to enhance fact-checking. It integrates a misinformation classifier with explainability techniques to deliver complete and interpretable insights supporting classification decisions. Experiments demonstrate that multimodal information improves performance over single modalities, with evaluations conducted on datasets in English, Spanish, and Portuguese. Additionally, the framework's explanations were assessed for interpretability, trustworthiness, and robustness with a novel protocol, showing that it effectively generates human-understandable justifications for its predictions. The code and experiments are available at https://github.com/MeLLL-UFF/mu2X/ .

explanation, machine learning, natural language, (19 more...)

2508.1004

Country: South America > Brazil > Rio de Janeiro (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Capdevila, Martin, Turek, Esteban Villa, Fernandez, Ellen Karina Chumbe, Galvez, Luis Felipe Polo, Marroquin, Andrea, Quesada, Rebeca Vargas, Crew, Johanna, Galarraga, Nicole Vallejo, Rodriguez, Christopher, Gutierrez, Diego, Datla, Radhi

Crossing Borders Without Crossing Boundaries: How Sociolinguistic Awareness Can Optimize User Engagement with Localized Spanish AI Models Across Hispanophone Countries

Large language models are, by definition, based on language. In an effort to underscore the critical need for regional localized models, this paper examines primary differences between variants of written Spanish across Latin America and Spain, with an in-depth sociocultural and linguistic contextualization therein. We argue that these differences effectively constitute significant gaps in the quotidian use of Spanish among dialectal groups by creating sociolinguistic dissonances, to the extent that locale-sensitive AI models would play a pivotal role in bridging these divides. In doing so, this approach informs better and more efficient localization strategies that also serve to more adequately meet inclusivity goals, while securing sustainable active daily user growth in a major low-risk investment geographic area. Therefore, implementing at least the proposed five sub variants of Spanish addresses two lines of action: to foment user trust and reliance on AI language models while also demonstrating a level of cultural, historical, and sociolinguistic awareness that reflects positively on any internationalization strategy.

artificial intelligence, crossing border, natural language, (16 more...)

2505.09902

Country:

South America (1.00)
North America > United States > Florida (0.28)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

da Silva, Diego Correa, Boaventura, Denis Robson Dantas, Oliveira, Mayki dos Santos, da Silva, Eduardo Ferreira, Pires, Joel Machado, Durão, Frederico Araújo

Understanding Distribution Structure on Calibrated Recommendation Systems

Traditional recommender systems aim to generate a recommendation list comprising the most relevant or similar items to the user's profile. These approaches can create recommendation lists that omit item genres from the less prominent areas of a user's profile, thereby undermining the user's experience. To solve this problem, the calibrated recommendation system provides a guarantee of including less representative areas in the recommended list. The calibrated context works with three distributions. The first is from the user's profile, the second is from the candidate items, and the last is from the recommendation list. These distributions are G-dimensional, where G is the total number of genres in the system. This high dimensionality requires a different evaluation method, considering that traditional recommenders operate in a one-dimensional data space. In this sense, we implement fifteen models that help to understand how these distributions are structured. We evaluate the users' patterns in three datasets from the movie domain. The results indicate that the models of outlier detection provide a better understanding of the structures. The calibrated system creates recommendation lists that act similarly to traditional recommendation lists, allowing users to change their groups of preferences to the same degree.

Commonly, traditional recommender systems generate recommendations with miscalibration [1]. Miscalibration means that the recommendation lists do not follow the user preferences distribution, instead suggesting items from the user's dominant area of interest. It creates an overspecialized recommendation list in which the items from the less dominant area are overwhelmed. This effect puts the user in a filter bubble or an echo chamber problem [2]. For instance, when a specific area dominates the recommended list, the user likely has few other options to interact with, aside from items within that dominant area. Then, the subsequent lists are recommended, with the dominant area becoming more overspecialized. In recent years, calibrated recommendation systems have attracted attention [3]-[8] from the recommender system community to overcome this issue. This type of system demonstrates the capacity to improve several objectives, such as diversity [3], control of popularity bias [4], item coverage [5], precision [6], and the reduction of miscalibration [7]. To illustrate how calibrated recommendation works, consider a scenario: if a user's preferences distribution indicates

Corresponding author is Diego Corrêa da Silva.
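The G-dimensional genre distributions and the idea of miscalibration described above can be sketched as follows. The sketch builds a genre distribution for a user profile and for two recommendation lists, then scores each list with a smoothed Kullback-Leibler divergence. Using KL divergence as the miscalibration measure is an assumption borrowed from common practice in the calibration literature and is not stated in this excerpt; the function names, genres, and item lists are hypothetical.

```python
import math

GENRES = ["action", "drama", "comedy"]  # G = 3 in this toy example


def genre_distribution(items, genres):
    """G-dimensional distribution: each item spreads unit weight over its genres."""
    totals = {g: 0.0 for g in genres}
    for item_genres in items:
        for g in item_genres:
            totals[g] += 1.0 / len(item_genres)
    n = len(items)
    return [totals[g] / n for g in genres]


def kl_miscalibration(p, q, alpha=0.01):
    """KL(p || q~), where q~ = (1 - alpha) * q + alpha * p avoids division by zero."""
    total = 0.0
    for p_g, q_g in zip(p, q):
        if p_g > 0:
            q_smooth = (1 - alpha) * q_g + alpha * p_g
            total += p_g * math.log(p_g / q_smooth)
    return total


# Hypothetical profile: 70% action, 20% drama, 10% comedy.
profile = [["action"]] * 7 + [["drama"]] * 2 + [["comedy"]]
narrow_list = [["action"]] * 5                               # dominant genre only
balanced_list = [["action"]] * 3 + [["drama"], ["comedy"]]   # covers minor genres

p = genre_distribution(profile, GENRES)
q_narrow = genre_distribution(narrow_list, GENRES)
q_balanced = genre_distribution(balanced_list, GENRES)

score_narrow = kl_miscalibration(p, q_narrow)
score_balanced = kl_miscalibration(p, q_balanced)
```

A list that recommends only the dominant genre scores as more miscalibrated than one that also covers the user's minor genres, which mirrors the overspecialization effect described in the text.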

algorithm, artificial intelligence, machine learning, (16 more...)

2508.13568

Country: South America > Brazil (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Media > Film (0.68)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Neural Information Processing Systems Aug-19-2025, 23:49:00 GMT

Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

Zhao Song, David Woodruff, Peilin Zhong

Nevertheless, we show that under certain minimal and realistic distributional settings, it is possible to obtain a $(1+\epsilon)$-approximation with a nearly linear running time and $\mathrm{poly}(k/\epsilon) + O(k \log n)$ columns. Namely, we show that if the input matrix $A$ has the form $A = B + E$, where $B$ is an arbitrary rank-$k$ matrix, and $E$ is a matrix with i.i.d.

algorithm, matrix, probability, (16 more...)

Neural Information Processing Systems

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Neural Information Processing Systems Aug-19-2025, 23:11:58 GMT

Metamers of neural networks reveal divergence from human perceptual systems

Jenelle Feather, Alex Durango, Ray Gonzalez, Josh McDermott

Neural Information Processing Systems http://nips.cc/

metamer, model metamer, representation, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)