South America
Explaining Deep Network Classification of Matrices: A Case Study on Monotonicity
Farina, Leandro, Korotov, Sergey
This work demonstrates a methodology for using deep learning to discover simple, practical criteria for classifying matrices based on abstract algebraic properties. By combining a high-performance neural network with explainable AI (XAI) techniques, we can distill a model's learned strategy into human-interpretable rules. We apply this approach to the challenging case of monotone matrices, defined by the condition that their inverses are entrywise nonnegative. Despite their simple definition, no easy characterization in terms of the matrix elements or derived parameters is known. Here, we present, to the best of our knowledge, the first systematic machine-learning approach for deriving a practical criterion that distinguishes monotone from non-monotone matrices. After building a labelled dataset of randomly generated monotone and non-monotone matrices with entries drawn uniformly from $(-1,1)$, we employ deep neural networks to classify the matrices as monotone or non-monotone, using both their entries and a comprehensive set of matrix features. Using saliency methods such as integrated gradients, we identify, among all features, two matrix parameters that alone provide sufficient information to classify the matrices with $95\%$ accuracy: the absolute values of the two lowest-order coefficients, $c_0$ and $c_1$, of the matrix's characteristic polynomial. A data-driven study of 18,000 random $7\times7$ matrices shows that the monotone class obeys $\lvert c_{0}/c_{1}\rvert\le0.18$ with probability $>99.98\%$; because $\lvert c_{0}/c_{1}\rvert = 1/\mathrm{tr}(A^{-1})$ for monotone $A$, this is equivalent to the simple bound $\mathrm{tr}(A^{-1})\ge5.7$.
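The identity $\lvert c_{0}/c_{1}\rvert = 1/\mathrm{tr}(A^{-1})$ quoted in the abstract can be checked on a small example. The sketch below is not the authors' code: it assumes the convention $\det(\lambda I - A) = \sum_k c_k \lambda^k$ and uses the Faddeev-LeVerrier recursion (a technique chosen here for illustration) to obtain both the coefficients and the inverse in exact rational arithmetic. The function names and the $3\times3$ test matrix are hypothetical.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def faddeev_leverrier(A):
    """Return (coeffs, inverse) for a nonsingular square matrix A.

    coeffs[k] is the coefficient of lambda**k in det(lambda*I - A)
    (sign convention assumed for this sketch).
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(0)] * n + [Fraction(1)]  # leading coefficient is 1
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # M_k = A M_{k-1} + c_{n-k+1} I
        M = [[AM[i][j] + (coeffs[n - k + 1] if i == j else 0)
              for j in range(n)] for i in range(n)]
        AMk = matmul(A, M)
        # c_{n-k} = -tr(A M_k) / k
        coeffs[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    if coeffs[0] == 0:
        raise ValueError("matrix is singular")
    # After the last step, A M_n = -c_0 I, so A^{-1} = -M_n / c_0.
    inverse = [[-M[i][j] / coeffs[0] for j in range(n)] for i in range(n)]
    return coeffs, inverse


def is_monotone(A):
    """A is monotone if its inverse is entrywise nonnegative."""
    _, inverse = faddeev_leverrier(A)
    return all(x >= 0 for row in inverse for x in row)


# Hypothetical example: a small tridiagonal matrix that is monotone.
A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
coeffs, inverse = faddeev_leverrier(A)
trace_inv = sum(inverse[i][i] for i in range(3))
ratio = abs(coeffs[0] / coeffs[1])
```

For this matrix the ratio and the reciprocal trace of the inverse coincide exactly, which is the relation the abstract relies on; the threshold values 0.18 and 5.7 come from the paper's data-driven study and are not reproduced by this sketch.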
Google's Newest AI Model Acts like a Satellite to Track Climate Change
Google's newest AI model is going to scour the Earth and, ideally, help it out. The mission is to find out once and for all, in fine detail, what we are doing to our planet. Crucially, once the model has supposedly done this it will also, apparently, explain where we might be able to best put things in place to help our world. AlphaEarth Foundations, an offshoot of Google's DeepMind AI model, aims to leverage machine learning and all the gobs and gobs of data that Google has absorbed about our planet over the last two decades, in order to understand how specific areas are changing over time. The model uses a system called "embeddings" that takes terabytes of data collected from satellites every day, analyzes it, and compresses it down to save storage space.
ChatGPT conversations lack "legal privilege"
Published on 30 Jul 2025.
AIhub monthly digest: July 2025 – RoboCup round-up, ICML in Vancouver, and leveraging feedback in human-robot interactions
Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we take a trip around some of the RoboCup leagues, check in at ICML, learn about the NASA onboard AI research platform, and explore feedback in human-robot interactions. This month saw the running of RoboCup 2025, with the event taking place in Salvador, Brazil, from 15-21 July. Ahead of kick-off, we spoke to the general chair Marco Simões and caught up with Ana Patrícia Magalhães, lead organizer for RoboCupJunior, to find out more about their plans for the week. You can find out what the participants got up to in our two round-ups from social media: "#RoboCup2025: social media round-up 1" and "#RoboCup2025: social media round-up part 2". If you missed the action, recordings of the livestreams are available.
OCSVM-Guided Representation Learning for Unsupervised Anomaly Detection
Pinon, Nicolas, Lartizien, Carole
Unsupervised anomaly detection (UAD) aims to detect anomalies without labeled data, a necessity in many machine learning applications where anomalous samples are rare or not available. Most state-of-the-art methods fall into two categories: reconstruction-based approaches, which often reconstruct anomalies too well, and decoupled representation learning with density estimators, which can suffer from suboptimal feature spaces. While some recent methods attempt to couple feature learning and anomaly detection, they often rely on surrogate objectives, restrict kernel choices, or introduce approximations that limit their expressiveness and robustness. To address this challenge, we propose a novel method that tightly couples representation learning with an analytically solvable one-class SVM (OCSVM), through a custom loss formulation that directly aligns latent features with the OCSVM decision boundary. The model is evaluated on two tasks: a new benchmark based on MNIST-C, and a challenging brain MRI subtle lesion detection task. Unlike most methods, which focus on large, hyperintense lesions at the image level, our approach succeeds in targeting small, non-hyperintense lesions, and we evaluate voxel-wise metrics, addressing a more clinically relevant scenario. Both experiments evaluate a form of robustness to domain shifts, including corruption types in MNIST-C and scanner/age variations in MRI. Results demonstrate the performance and robustness of our proposed model, highlighting its potential for general UAD and real-world medical imaging applications. The source code is available at https://github.com/Nicolas-Pinon/uad_ocsvm_guided_repr_learning
iLSU-T: an Open Dataset for Uruguayan Sign Language Translation
Stassi, Ariel E., Boria, Yanina, Di Martino, J. Matías, Randall, Gregory
Automatic sign language translation has gained particular interest in the computer vision and computational linguistics communities in recent years. Given the particularities of each country's sign language, machine translation requires local data to develop new techniques and adapt existing ones. This work presents iLSU-T, an open dataset of interpreted Uruguayan Sign Language RGB videos with audio and text transcriptions. This type of multimodal and curated data is paramount for developing novel approaches to understand or generate tools for sign language processing. iLSU-T comprises more than 185 hours of interpreted sign language videos from public TV broadcasting. It covers diverse topics and includes the participation of 18 professional sign language interpreters. A series of experiments using three state-of-the-art translation algorithms is presented. The aim is to establish a baseline for this dataset and evaluate its usefulness and the proposed pipeline for data processing. The experiments highlight the need for more localized datasets for sign language translation and understanding, which are critical for developing novel tools to improve accessibility and inclusion of all individuals. Our data and code can be accessed.
Automated Semantic Analysis with LLM and RAG for Pharmaceutical Package Inserts
The production of digital documents has been growing rapidly in academic, business, and health environments, presenting new challenges in the efficient extraction and analysis of unstructured information. This work investigates the use of RAG (Retrieval-Augmented Generation) architectures combined with Large Language Models (LLMs) to automate the analysis of documents in PDF format. The proposal integrates embedding-based vector search, semantic data extraction, and the generation of contextualized natural language responses. To validate the approach, we conducted experiments with drug package inserts extracted from official public sources. The semantic queries applied were evaluated using metrics such as accuracy, completeness, response speed, and consistency. The results indicate that the combination of RAG with LLMs offers significant gains in intelligent information retrieval and interpretation of unstructured technical texts.
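The retrieval step of a pipeline like this can be sketched as follows. This is not the paper's implementation: the bag-of-words `embed` function is a toy stand-in for a learned embedding model, the sample chunks are invented for illustration, and the call to an actual LLM is left out, so only the vector search and prompt assembly are shown.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (hypothetical stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    """Assemble a prompt that grounds the answer in the retrieved chunks."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return ("Answer using only the context below.\n"
            f"Context:\n{context}\nQuestion: {query}")


# Invented package-insert fragments, for illustration only.
chunks = [
    "dosage adults take one tablet every eight hours",
    "storage keep the package at room temperature away from light",
    "contraindications do not use during pregnancy",
]
query = "what is the dosage for adults"
top = retrieve(query, chunks, k=1)
prompt = build_prompt(query, chunks, k=1)
```

In a real system the chunks would come from parsed PDF text and the prompt would be passed to an LLM; both steps are outside the scope of this sketch.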
Data-Driven and Participatory Approaches toward Neuro-Inclusive AI
Biased data representation in AI marginalizes up to 75 million autistic people worldwide through medical applications viewing autism as a deficit of neurotypical social skills rather than an aspect of human diversity, and this perspective is grounded in research questioning the humanity of autistic people. Turing defined artificial intelligence as the ability to mimic human communication, and as AI development increasingly focuses on human-like agents, this benchmark remains popular. In contrast, we define Neuro-Inclusive AI as datasets and systems that move away from mimicking humanness as a benchmark for machine intelligence. Then, we explore the origins, prevalence, and impact of anti-autistic biases in current research. Our work finds that 90% of human-like AI agents exclude autistic perspectives, and AI creators continue to believe ethical considerations are beyond the scope of their work. To improve the autistic representation in data, we conduct empirical experiments with annotators and LLMs, finding that binary labeling schemes sufficiently capture the nuances of labeling anti-autistic hate speech. Our benchmark, AUTALIC, can be used to evaluate or fine-tune models, and was developed to serve as a foundation for more neuro-inclusive future work.
Hierarchy-of-Visual-Words: a Learning-based Approach for Trademark Image Retrieval
Lourenço, Vítor N., Silva, Gabriela G., Fernandes, Leandro A. F.
From the background, the procedure extracts the holes' shapes and associates them with the component shapes' list (lines 7 and 8). The foreground shapes are used in the next iterations (lines 5 and 9) until all component shapes have been extracted from the initial binary trademark image. Shape feature extraction consists of building a feature vector for each component shape of a given trademark image (Figs. 1 (d) and (k)). These 29-dimensional feature vectors combine region-based and contour-based descriptors. A shape's region is described by the 25 moments of the Zernike polynomials (ZM) of order $p$ from 0 to 8: $Z_{p,q} = \frac{p+1}{\pi} \sum_{\rho} \sum_{\theta} V_{p,q}(\rho,\theta)\, I(\rho,\theta)$, (1) where $\rho = \sqrt{x^{2} + y^{2}}$ is the length of the vector from the origin to pixel $(x,y)$, $\theta$ is the angle between the vector defining $\rho$ and the $x$-axis in the counterclockwise direction, and $V_{p,q}(\rho,\theta)$ is a Zernike polynomial of order $p$ with repetition $q$ that forms a complete set over the interior of the unit disk inscribing the component shape: $V_{p,q}(\rho,\theta) = R_{p,q}(\rho) \exp(i q \theta)$.
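As a rough illustration of Eq. (1), the sketch below evaluates Zernike moments of a small square image mapped onto the unit disk. It is not the authors' code: the radial polynomial $R_{p,q}$ is taken from the standard textbook definition (this excerpt does not give it), the pixel-to-disk mapping is an assumption, and the sum follows Eq. (1) as printed. The list `ORDERS` enumerates the pairs $(p,q)$ with $0 \le q \le p \le 8$ and $p-q$ even, which yields the 25 moments mentioned above.

```python
import cmath
import math


def radial(p, q, rho):
    """Standard Zernike radial polynomial R_{p,q}(rho); requires p - |q| even."""
    q = abs(q)
    return sum(
        (-1) ** s * math.factorial(p - s)
        / (math.factorial(s)
           * math.factorial((p + q) // 2 - s)
           * math.factorial((p - q) // 2 - s))
        * rho ** (p - 2 * s)
        for s in range((p - q) // 2 + 1)
    )


def zernike_moment(image, p, q):
    """Z_{p,q} of a square image (list of rows), following Eq. (1) as printed."""
    n = len(image)
    total = 0j
    for row in range(n):
        for col in range(n):
            # Map pixel centres to [-1, 1] x [-1, 1] (assumed mapping).
            x = (2 * col + 1 - n) / n
            y = (n - 1 - 2 * row) / n
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue  # keep only pixels inside the unit disk
            theta = math.atan2(y, x)
            v = radial(p, q, rho) * cmath.exp(1j * q * theta)
            total += v * image[row][col]
    return (p + 1) / math.pi * total


# All (p, q) with 0 <= q <= p <= 8 and p - q even.
ORDERS = [(p, q) for p in range(9) for q in range(p + 1) if (p - q) % 2 == 0]

# Hypothetical input: an 8x8 image filled with ones.
image = [[1] * 8 for _ in range(8)]
features = [abs(zernike_moment(image, p, q)) for p, q in ORDERS]
```

Taking the magnitude of each moment, as done for `features`, is a common way to obtain rotation-invariant descriptors; whether the paper does exactly this is not stated in the excerpt.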
Culinary Crossroads: A RAG Framework for Enhancing Diversity in Cross-Cultural Recipe Adaptation
Hu, Tianyi, Morales-Garzón, Andrea, Zheng, Jingyi, Maistro, Maria, Hershcovich, Daniel
In cross-cultural recipe adaptation, the goal is not only to ensure cultural appropriateness and retain the original dish's essence, but also to provide diverse options for various dietary needs and preferences. Retrieval Augmented Generation (RAG) is a promising approach, combining the retrieval of real recipes from the target cuisine for cultural adaptability with large language models (LLMs) for relevance. However, it remains unclear whether RAG can generate diverse adaptation results. Our analysis shows that RAG tends to overly rely on a limited portion of the context across generations, failing to produce diverse outputs even when provided with varied contextual inputs. This reveals a key limitation of RAG in creative tasks with multiple valid answers: it fails to leverage contextual diversity for generating varied responses. To address this issue, we propose CARRIAGE, a plug-and-play RAG framework for cross-cultural recipe adaptation that enhances diversity in both retrieval and context organization. To our knowledge, this is the first RAG framework that explicitly aims to generate highly diverse outputs to accommodate multiple user preferences. Our experiments show that CARRIAGE achieves Pareto efficiency in terms of diversity and quality of recipe adaptation compared to closed-book LLMs.