South America
The Morgan-Pitman Test of Equality of Variances and its Application to Machine Learning Model Evaluation and Selection
Arratia, Argimiro, Cabaña, Alejandra, Mordecki, Ernesto, Rovira-Parra, Gerard
Model selection in non-linear models often prioritizes performance metrics over statistical tests, limiting the ability to account for sampling variability. We propose the use of a statistical test to assess the equality of variances in forecasting errors. The test builds upon the classic Morgan-Pitman approach, incorporating enhancements to ensure robustness against data with heavy-tailed distributions or outliers with high variance, plus a strategy to make residuals from machine learning models statistically independent. Through a series of simulations and real-world data applications, we demonstrate the test's effectiveness and practical utility, offering a reliable tool for model evaluation and selection in diverse contexts.
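For readers unfamiliar with the baseline, the classic Morgan-Pitman test can be sketched as follows. It relies on the identity Cov(X+Y, X-Y) = Var(X) - Var(Y), so testing equal variances of two paired error series reduces to testing for zero correlation between their sums and differences. This is only a minimal sketch of the textbook version, not the robust variant the authors propose. The function name `morgan_pitman` and the normal approximation used for the p-value are assumptions made for illustration; an exact p-value would use a Student t distribution with n-2 degrees of freedom.

```python
import math
import random
from statistics import NormalDist, fmean


def morgan_pitman(errors_a, errors_b):
    """Classic Morgan-Pitman statistic for two paired error series.

    Returns (r, t, p_approx), where r is the Pearson correlation between
    the pairwise sums and differences, t is the usual correlation
    t-statistic, and p_approx is a two-sided p-value from a large-sample
    normal approximation (an assumption of this sketch).
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("the two error series must be paired")
    n = len(errors_a)
    if n < 3:
        raise ValueError("need at least three paired observations")

    sums = [a + b for a, b in zip(errors_a, errors_b)]
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean_s, mean_d = fmean(sums), fmean(diffs)

    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(sums, diffs))
    ss_s = sum((s - mean_s) ** 2 for s in sums)
    ss_d = sum((d - mean_d) ** 2 for d in diffs)
    if ss_s == 0 or ss_d == 0:
        raise ValueError("sums or differences are constant")

    r = cov / math.sqrt(ss_s * ss_d)
    # r > 0 suggests Var(errors_a) > Var(errors_b); r < 0 the reverse.
    t = math.inf if abs(r) >= 1 else r * math.sqrt((n - 2) / (1 - r * r))
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return r, t, p_approx


if __name__ == "__main__":
    random.seed(0)
    # Synthetic forecasting errors: model A is noisier than model B.
    a = [random.gauss(0, 3) for _ in range(500)]
    b = [random.gauss(0, 1) for _ in range(500)]
    r, t, p = morgan_pitman(a, b)
    print(f"r={r:.3f}, t={t:.2f}, approximate p={p:.4f}")
```

A small p-value with positive r would point to the first model having the larger error variance. The classic test assumes roughly bivariate normal, independent pairs, which is the limitation the abstract says the proposed enhancements address.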
PeruMedQA: Benchmarking Large Language Models (LLMs) on Peruvian Medical Exams -- Dataset Construction and Evaluation
Carrillo-Larco, Rodrigo M., Melgarejo, Jesus Lovón, Castillo-Cara, Manuel, Bravo-Rocca, Gusseppe
BACKGROUND: Medical large language models (LLMs) have demonstrated remarkable performance in answering medical examinations. However, the extent to which this high performance is transferable to medical questions in Spanish and from a Latin American country remains unexplored. This knowledge is crucial as LLM-based medical applications gain traction in Latin America. AIMS: To build a dataset of questions from medical examinations taken by Peruvian physicians pursuing specialty training; to fine-tune an LLM on this dataset; and to evaluate and compare the accuracy of vanilla LLMs and the fine-tuned LLM. METHODS: We curated PeruMedQA, a multiple-choice question-answering (MCQA) dataset containing 8,380 questions spanning 12 medical domains (2018-2025). We selected eight medical LLMs, including medgemma-4b-it and medgemma-27b-text-it, and developed zero-shot task-specific prompts to answer the questions appropriately. We employed parameter-efficient fine-tuning (PEFT) and low-rank adaptation (LoRA) to fine-tune medgemma-4b-it using all questions except those from 2025 (test set). RESULTS: medgemma-27b-text-it outperformed all other models, achieving a proportion of correct answers exceeding 90% in several instances. LLMs with <10 billion parameters exhibited <60% of correct answers, while some exams yielded results <50%. The fine-tuned version of medgemma-4b-it outperformed all LLMs with <10 billion parameters and rivaled an LLM with 70 billion parameters across various examinations. CONCLUSIONS: For medical AI applications and research that require knowledge bases from Spanish-speaking countries and those exhibiting similar epidemiological profiles to Peru's, interested parties should utilize medgemma-27b-text-it or a fine-tuned version of medgemma-4b-it.
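As a rough illustration of why LoRA counts as parameter-efficient, the sketch below compares the number of trainable values in a full update of one weight matrix against a rank-r update of the form B·A. The layer dimensions and rank are hypothetical values chosen for the example; they are not taken from the medgemma configuration or from the paper.

```python
def full_update_params(d_out, d_in):
    """Trainable values when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_update_params(d_out, d_in, rank):
    """Trainable values for a low-rank update B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in); the original
    weight matrix stays frozen.
    """
    return rank * (d_out + d_in)


if __name__ == "__main__":
    # Hypothetical layer size and rank, for illustration only.
    d_out, d_in, rank = 4096, 4096, 8
    full = full_update_params(d_out, d_in)
    lora = lora_update_params(d_out, d_in, rank)
    print(f"full update: {full} values")
    print(f"LoRA update: {lora} values ({100 * lora / full:.2f}% of full)")
```

Under these assumed dimensions the low-rank update trains well under 1% of the values that a full update of the same layer would.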
A Mixed User-Centered Approach to Enable Augmented Intelligence in Intelligent Tutoring Systems: The Case of MathAIde app
Guerino, Guilherme, Rodrigues, Luiz, Bianchini, Luana, Alves, Mariana, Marinho, Marcelo, Veloso, Thomaz, Macario, Valmir, Dermeval, Diego, Vieira, Thales, Bittencourt, Ig, Isotani, Seiji
This study explores the integration of Augmented Intelligence (AuI) in Intelligent Tutoring Systems (ITS) to address challenges in Artificial Intelligence in Education (AIED), including teacher involvement, AI reliability, and resource accessibility. We present MathAIde, an ITS that uses computer vision and AI to correct mathematics exercises from student work photos and provide feedback. The system was designed through a collaborative process involving brainstorming with teachers, high-fidelity prototyping, A/B testing, and a real-world case study. Findings emphasize the importance of a teacher-centered, user-driven approach, where AI suggests remediation alternatives while teachers retain decision-making. Results highlight efficiency, usability, and adoption potential in classroom contexts, particularly in resource-limited environments. The study contributes practical insights into designing ITSs that balance user needs and technological feasibility, while advancing AIED research by demonstrating the effectiveness of a mixed-methods, user-centered approach to implementing AuI in educational technologies.
Agent-based Simulation for Drone Charging in an Internet of Things Environment System
Grando, Leonardo, Leite, José Roberto Emiliano, Ursini, Edson Luiz
This paper presents an agent-based simulation model for coordinating battery recharging in drone swarms, focusing on applications in Internet of Things (IoT) and Industry 4.0 environments. The paper describes the simulation methodology, system architecture, and implementation of the proposed model. One practical use case is explored: Smart Farming, highlighting how autonomous coordination strategies can optimize battery usage and mission efficiency in large-scale drone deployments. This work uses a machine learning technique to analyze the output of the sensitivity analysis of the agent-based simulation. Drones have become important tools within the Internet of Things and can be used in agribusiness, disaster response, logistics, and other applications.
How AI Is Upending Politics, Tech, the Media, and More
At WIRED's AI Power Summit on Monday, industry executives and officials discussed the impact artificial intelligence is having on every corner of society--and where it goes from here. In an increasingly divided world, one thing that everyone seems to agree on is that artificial intelligence is a hugely disruptive--and sometimes downright destructive--phenomenon. At WIRED's AI Power Summit in New York on Monday, leaders from the worlds of tech, politics, and the media came together to discuss how AI is transforming their intertwined worlds. The Summit included voices from the AI industry, a current US senator, a former Trump administration official, and publishers including WIRED's parent company, Condé Nast.
Looted mummy skull mislabeled for over a century
The roughly 350-year-old elongated skull is not Incan at all. While re-evaluating a centuries-old elongated skull originally theorized to belong to an Incan "tribal chief," a team of archaeologists in Switzerland came across a startling discovery. The mummified cranium instead likely belonged to a respected member of a separate indigenous community that lived as subjects within the Incan empire. Their published findings are now helping recontextualize the remains while also addressing the field's longstanding ethical issues.
UK fighters to defend Polish skies after Russian drone incursion
Fighter jets from the UK will join Nato allies in defending Polish airspace after last week's incursion of Russian drones, the defence secretary has confirmed. RAF Typhoon jets will fly air defence missions over Poland as part of the military alliance's mission to bolster the eastern flank. Other allies including Denmark, Germany and France are already taking part - a jet from the latter was scrambled earlier on Monday in response to another potential incursion by Russian drones. Nato said that alert was quickly over. Tensions have risen across Europe since Poland accused Russia of the incident, which saw 19 drones enter its territory.
Hundreds of Google AI Workers Were Fired Amid Fight Over Working Conditions
Over 200 contractors who work on improving Google's AI products, including Gemini and AI Overviews, have been laid off, sources say. More than 200 contractors who worked on evaluating and improving Google's AI products have been laid off without warning in at least two rounds of layoffs last month. The move comes amid an ongoing fight over pay and working conditions, according to workers who spoke to WIRED. In the past few years, Google has outsourced its AI rating work--which includes evaluating, editing, or rewriting the Gemini chatbot's response to make it sound more human and "intelligent"--to thousands of contractors employed by Hitachi-owned GlobalLogic and other outsourcing companies.
USA Today Enters Its Gen AI Era With a Chatbot
DeeperDive, a new tool that converses with readers, is an effort to beat the AI industry at its own game. The publishing company behind USA Today and 220 other publications is today rolling out a chatbot-like tool called DeeperDive that can converse with readers, summarize insights from its journalism, and suggest new content from across its sites. "Visitors now have a trusted AI answer engine on our platform for anything they want to engage with, anything they want to ask," Mike Reed, CEO of Gannett and the USA Today Network, said at the WIRED AI Power Summit in New York, an event that brought together voices from the tech industry, politics, and the world of media. "And it is performing really great." Most publishers have a fraught relationship with AI, as the chatbots that trained on their content are now summarizing it and eating the traffic that search engines used to send them.
Belarus and Russia's show of firepower appears to be a message to Europe
In a large field 45 miles (72km) from Belarus' capital Minsk, a battle is raging. There are giant explosions as Sukhoi-34 bombers drop guided bombs. Helicopter gunships join the attack, while surveillance drones sweep overhead to view the damage. Together with other international media we've been brought to the Borisovsky training ground where Belarusian and Russian forces are taking part in joint manoeuvres. Military attachés, too, from a variety of embassies are observing the drill from a viewing platform.