AITopics

doi: 10.1007/s10462-024-10724-3

2507.11902

Country:

South America > Brazil (0.46)
North America > Canada (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Bezerra, Wesley dos Reis, Bezerra, Lais Machado, Westphall, Carlos Becker

Challenges in GenAI and Authentication: a scoping review

arXiv.org Artificial Intelligence, Jul-17-2025

Authentication and authenticity have been a security challenge since the beginning of information sharing, especially in the context of digital information. With the advancement of generative artificial intelligence, these challenges have evolved, demanding a more up-to-date analysis of their impacts on society and system security. This work presents a scoping review that analyzed 88 documents from the IEEE Xplore, Scopus, and ACM databases, analyzing the resulting portfolio through six guiding questions focusing on the most relevant work, challenges, attack surfaces, threats, proposed solutions, and gaps. Finally, the portfolio articles are analyzed through this guiding research lens and also receive individualized analysis. The results consistently outline the challenges, gaps, and threats related to images, text, audio, and video, thereby supporting new research in the areas of authentication and generative artificial intelligence.

artificial intelligence, machine learning, natural language, (17 more...)

2507.11775

Country: South America > Brazil (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
(2 more...)

van der Linden, Putri A., Timans, Alexander, Tailor, Dharmesh, Bekkers, Erik J.

On Equivariant Model Selection through the Lens of Uncertainty

arXiv.org Machine Learning, Jul-16-2025

Equivariant models leverage prior knowledge on symmetries to improve predictive performance, but misspecified architectural constraints can harm it instead. While work has explored learning or relaxing constraints, selecting among pretrained models with varying symmetry biases remains challenging. We examine this model selection task from an uncertainty-aware perspective, comparing frequentist (via Conformal Prediction), Bayesian (via the marginal likelihood), and calibration-based measures to naive error-based evaluation. We find that uncertainty metrics generally align with predictive performance, but Bayesian model evidence does so inconsistently. We attribute this to a mismatch in Bayesian and geometric notions of model complexity for the employed last-layer Laplace approximation, and discuss possible remedies. Our findings point towards the potential of uncertainty in guiding symmetry-aware model selection.
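As a rough illustration of how a conformal measure can be turned into a model-comparison number, the sketch below computes split conformal prediction sets for a classifier and reports their average size. It is not the authors' procedure: the nonconformity score (one minus the probability of the true class), the use of mean set size as the comparison criterion, and the function names are all assumptions made for this example.

```python
import math


def conformal_set_sizes(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction for classification.

    Nonconformity score (an assumption of this sketch): 1 - p(true class).
    Returns the prediction-set size for each test row.
    """
    n = len(cal_labels)
    scores = sorted(1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels))
    # Finite-sample-corrected rank of the calibration quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points the threshold is infinite,
    # so every class is kept in every set.
    qhat = scores[k - 1] if k <= n else math.inf
    # A class enters the set when its score does not exceed the threshold.
    return [sum(1 for p in row if 1.0 - p <= qhat) for row in test_probs]


def mean_set_size(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Average prediction-set size; smaller means sharper uncertainty."""
    sizes = conformal_set_sizes(cal_probs, cal_labels, test_probs, alpha)
    return sum(sizes) / len(sizes)
```

Under this sketch, two pretrained models evaluated on the same calibration and test split at the same `alpha` could be ranked by `mean_set_size`, with the smaller value suggesting the better-suited model.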

artificial intelligence, machine learning, neural information processing system, (13 more...)

2506.18629

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Rocha, Vanderson, Kreutz, Diego, Canto, Gabriel, Bragança, Hendrio, Feitosa, Eduardo

MH-FSF: A Unified Framework for Overcoming Benchmarking and Reproducibility Limitations in Feature Selection Evaluation

arXiv.org Artificial Intelligence, Jul-16-2025

Vanderson Rocha (Federal University of Amazonas, UFAM), Diego Kreutz (Federal University of Pampa, UNIPAMPA), Gabriel Canto (UFAM), Hendrio Bragança (UFAM), Eduardo Feitosa (UFAM).

Abstract: Feature selection is vital for building effective predictive models, as it reduces dimensionality and emphasizes key features. However, current research often suffers from limited benchmarking and reliance on proprietary datasets. This severely hinders reproducibility and can negatively impact overall performance. To address these limitations, we introduce the MH-FSF framework, a comprehensive, modular, and extensible platform designed to facilitate the reproduction and implementation of feature selection methods. Developed through collaborative research, MH-FSF provides implementations of 17 methods (11 classical, 6 domain-specific) and enables systematic evaluation on 10 publicly available Android malware datasets. Our results reveal performance variations across both balanced and imbalanced datasets, highlighting the critical need for data preprocessing and selection criteria that account for these asymmetries. We demonstrate the importance of a unified platform for comparing diverse feature selection techniques, fostering methodological consistency and rigor. By providing this framework, we aim to significantly broaden the existing literature and pave the way for new research directions in feature selection, particularly within the context of Android malware detection.

Introduction: Feature selection is crucial for constructing effective predictive models. By identifying and focusing on the most relevant feature subsets, it reduces data dimensionality, leading to improved model accuracy and significantly decreased computational overhead during training [1].

artificial intelligence, feature selection method, machine learning, (7 more...)

2507.10591

Country: South America > Brazil (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

World's largest known turtle nesting site found in the Amazon

Researchers from the University of Florida have uncovered the largest known nesting site for the threatened giant South American river turtle (Podocnemis expansa). How did they find over 41,000 nesting reptiles? The turtles were found gathered along the Amazon's Guaporé River between Brazil and Bolivia. This innovative use of drones opens up new avenues for conservationists, as detailed in a study recently published in the Journal of Applied Ecology.

artificial intelligence, turtle, university, (13 more...)

Popular Science

Country:

South America > Brazil (0.27)
South America > Bolivia (0.27)

Genre: Research Report > New Finding (0.36)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.51)

AIHub, Jul-15-2025, 14:26:05 GMT

Tackling the 3D Simulation League: an interview with Klaus Dorer and Stefan Glaser

A screenshot from the new simulator that will be trialled for a special challenge at RoboCup2025. The annual RoboCup event, where teams gather from across the globe to take part in competitions across a number of leagues, will this year take place in Brazil, from 15-21 July. In advance of kick-off, we spoke to two members of the RoboCup Soccer 3D Simulation League: Executive Committee Member Klaus Dorer, and Stefan Glaser, who is on the Maintenance Committee and who has recently been developing a new simulator for the League. Could you start by just giving us a quick introduction to the Simulation League? Klaus Dorer: There are two Simulation Leagues in Soccer: the 2D Simulation League and the 3D Simulation League. The 2D Simulation League, as the name suggests, is a flat league where the players and ball are simulated with simplified physics and the main focus is on team strategy.

artificial intelligence, simulation league, simulator, (12 more...)

AIHub

Country: South America > Brazil (0.25)

Genre: Personal > Interview (0.40)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Soccer Robots (0.76)

Özeren, Enes, Liu, Yihong, Schütze, Hinrich

HYPEROFA: Expanding LLM Vocabulary to New Languages via Hypernetwork-Based Embedding Initialization

arXiv.org Artificial Intelligence, Jul-15-2025

Many pre-trained language models (PLMs) exhibit suboptimal performance on mid- and low-resource languages, largely due to limited exposure to these languages during pre-training. A common strategy to address this is to introduce new tokens specific to the target languages, initialize their embeddings, and apply continual pre-training on target-language data. Among such methods, OFA (Liu et al., 2024a) proposes a similarity-based subword embedding initialization heuristic that is both effective and efficient. However, OFA restricts target-language token embeddings to be convex combinations of a fixed number of source-language embeddings, which may limit expressiveness. To overcome this limitation, we propose HYPEROFA, a hypernetwork-based approach for more adaptive token embedding initialization. The hypernetwork is trained to map from an external multilingual word vector space to the PLM's token embedding space using source-language tokens. Once trained, it can generate flexible embeddings for target-language tokens, serving as a good starting point for continual pre-training. Experiments demonstrate that HYPEROFA consistently outperforms the random initialization baseline and matches or exceeds the performance of OFA in both continual pre-training convergence and downstream task performance. We make the code publicly available.
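To make the "convex combination of a fixed number of source-language embeddings" constraint concrete, here is a minimal sketch of a similarity-based initialization. The abstract does not specify how the weights are computed, so the cosine similarity, the softmax weighting, the `temperature` parameter, and the function names below are assumptions for illustration, not OFA's actual heuristic.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def convex_init(target_vec, source_vecs, source_embs, k=2, temperature=1.0):
    """Initialize a new token embedding from its k nearest source tokens.

    target_vec / source_vecs live in an external word-vector space;
    source_embs are the model's existing token embeddings (same order).
    The softmax weights are non-negative and sum to 1, so the result is a
    convex combination of the k selected source embeddings.
    """
    sims = [cosine(target_vec, s) for s in source_vecs]
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    exps = [math.exp(sims[i] / temperature) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(source_embs[0])
    return [
        sum(w * source_embs[i][d] for w, i in zip(weights, top))
        for d in range(dim)
    ]
```

Because the weights are non-negative and sum to 1, every coordinate of the result stays between the smallest and largest corresponding coordinate of the selected source embeddings. That confinement is the expressiveness limit the abstract says a hypernetwork can relax.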

large language model, latn, machine learning, (21 more...)

2504.21018

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
(20 more...)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.82)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

Dudley, Carson, Magdaleno, Reiden, Harding, Christopher, Eisenberg, Marisa

Simulation as Supervision: Mechanistic Pretraining for Scientific Discovery

arXiv.org Machine Learning, Jul-15-2025

Scientific modeling faces a core limitation: mechanistic models offer interpretability but collapse under real-world complexity, while machine learning models are flexible but require large labeled datasets, cannot infer unobservable quantities, and operate as black boxes. We introduce Simulation-Grounded Neural Networks (SGNNs), a general framework that uses mechanistic simulations as training data for neural networks. SGNNs are pretrained on synthetic corpora spanning diverse model structures, parameter regimes, stochasticity, and observational artifacts. We evaluated SGNNs across scientific disciplines and modeling tasks, and found that SGNNs achieved state-of-the-art results across settings: for prediction tasks, they nearly tripled COVID-19 forecasting skill versus CDC baselines, reduced chemical yield prediction error by one third, and maintained accuracy in ecological forecasting where task-specific models failed. For inference tasks, SGNNs also accurately classified the source of information spread in simulated social networks and enabled supervised learning for unobservable targets, such as estimating COVID-19 transmissibility more accurately than traditional methods even in early outbreaks. Finally, SGNNs enable back-to-simulation attribution, a new form of mechanistic interpretability. Given real-world input, SGNNs retrieve simulations based on what the model has learned to see as most similar, revealing which underlying dynamics the model believes are active. This provides process-level insight -- what the model thinks is happening -- not just which features mattered. SGNNs unify scientific theory with deep learning flexibility and unlock a new modeling paradigm -- transforming simulations from rigid, post hoc tools into flexible sources of supervision, enabling robust, interpretable inference even when ground truth is missing.

artificial intelligence, machine learning, simulation, (18 more...)

2507.08977

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Michigan (0.05)
North America > United States > New York (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Public Health (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Ceccon, Marina, Cornacchia, Giandomenico, Pezze, Davide Dalle, Fabris, Alessandro, Susto, Gian Antonio

Underrepresentation, Label Bias, and Proxies: Towards Data Bias Profiles for the EU AI Act and Beyond

arXiv.org Machine Learning, Jul-15-2025

Undesirable biases encoded in the data are key drivers of algorithmic discrimination. Their importance is widely recognized in the algorithmic fairness literature, as well as legislation and standards on anti-discrimination in AI. Despite this recognition, data biases remain understudied, hindering the development of computational best practices for their detection and mitigation. In this work, we present three common data biases and study their individual and joint effect on algorithmic discrimination across a variety of datasets, models, and fairness measures. We find that underrepresentation of vulnerable populations in training sets is less conducive to discrimination than conventionally affirmed, while combinations of proxies and label bias can be far more critical. Consequently, we develop dedicated mechanisms to detect specific types of bias, and combine them into a preliminary construct we refer to as the Data Bias Profile (DBP). This initial formulation serves as a proof of concept for how different bias signals can be systematically documented. Through a case study with popular fairness datasets, we demonstrate the effectiveness of the DBP in predicting the risk of discriminatory outcomes and the utility of fairness-enhancing interventions. Overall, this article bridges algorithmic fairness research and anti-discrimination policy through a data-centric lens.

data mining, disadvantaged group, machine learning, (16 more...)

doi: 10.1016/j.eswa.2025.128266

2507.08866

Country:

Europe > Austria > Vienna (0.14)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(19 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Health & Medicine > Therapeutic Area > Dermatology (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.92)

Dai, Yan, Golrezaei, Negin, Jaillet, Patrick

Incentive-Aware Dynamic Resource Allocation under Long-Term Cost Constraints

arXiv.org Machine Learning, Jul-15-2025

Motivated by applications such as cloud platforms allocating GPUs to users or governments deploying mobile health units across competing regions, we study the dynamic allocation of a reusable resource to strategic agents with private valuations. Our objective is to simultaneously (i) maximize social welfare, (ii) satisfy multi-dimensional long-term cost constraints, and (iii) incentivize truthful reporting. We begin by numerically evaluating primal-dual methods widely used in constrained online optimization and find them to be highly fragile in strategic settings -- agents can easily manipulate their reports to distort future dual updates for future gain. To address this vulnerability, we develop an incentive-aware framework that makes primal-dual methods robust to strategic behavior. Our design combines epoch-based lazy updates -- where dual variables remain fixed within each epoch -- with randomized exploration rounds that extract approximately truthful signals for learning. Leveraging carefully designed online learning subroutines that can be of independent interest for dual updates, our mechanism achieves $\tilde{\mathcal{O}}(\sqrt{T})$ social welfare regret, satisfies all cost constraints, and ensures incentive alignment. This matches the performance of non-strategic allocation approaches while being robust to strategic agents.
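The idea of epoch-based lazy updates, where the dual variable stays fixed within each epoch, can be sketched with a toy primal-dual allocation loop. This is only an illustration under simplifying assumptions: a single cost constraint with per-round budget `rho`, a plain projected subgradient step, and no exploration rounds or payments, so it carries none of the incentive or regret guarantees claimed in the abstract. All names are hypothetical.

```python
def lazy_primal_dual(reports, costs, rho, epoch_len, step):
    """Allocate one resource per round using a dual price frozen per epoch.

    reports[t][i]: value reported by agent i in round t.
    costs[t][i]:   cost incurred if agent i is served in round t.
    rho:           target average cost per round (long-term constraint).
    Returns (allocations, lambdas); allocations[t] is an agent index or None.
    """
    lam = 0.0          # dual variable (price on cost), fixed within an epoch
    epoch_cost = 0.0   # cost accumulated in the current epoch
    allocations, lambdas = [], []
    for t, (vals, cs) in enumerate(zip(reports, costs)):
        lambdas.append(lam)
        # Primal step: serve the agent with the best price-adjusted value.
        best = max(range(len(vals)), key=lambda i: vals[i] - lam * cs[i])
        if vals[best] - lam * cs[best] > 0:
            allocations.append(best)
            epoch_cost += cs[best]
        else:
            allocations.append(None)
        # Lazy dual step: update the price only at the end of an epoch.
        if (t + 1) % epoch_len == 0:
            lam = max(0.0, lam + step * (epoch_cost / epoch_len - rho))
            epoch_cost = 0.0
    return allocations, lambdas
```

Freezing `lam` within an epoch means a report in one round cannot move the price used in the remaining rounds of that epoch. That gives an intuition for why lazy updates reduce the room for manipulation, though the paper's mechanism involves more than this.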

artificial intelligence, incentive-aware dynamic resource allocation, machine learning, (12 more...)

2507.09473

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.81)

Industry:

Information Technology > Services (0.87)
Health & Medicine (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)