 South America


LLM-Assisted Iterative Evolution with Swarm Intelligence Toward SuperBrain

arXiv.org Artificial Intelligence

We propose a novel SuperBrain framework for collective intelligence, grounded in the co-evolution of large language models (LLMs) and human users. Unlike static prompt engineering or isolated agent simulations, our approach emphasizes a dynamic pathway from Subclass Brain to Superclass Brain: (1) A Subclass Brain arises from persistent, personalized interaction between a user and an LLM, forming a cognitive dyad with adaptive learning memory. (2) Through GA-assisted forward-backward evolution, these dyads iteratively refine prompts and task performance. (3) Multiple Subclass Brains coordinate via Swarm Intelligence, optimizing across multi-objective fitness landscapes and exchanging distilled heuristics. (4) Their standardized behaviors and cognitive signatures integrate into a Superclass Brain, an emergent meta-intelligence capable of abstraction, generalization and self-improvement. We outline the theoretical constructs, present initial implementations (e.g., UAV scheduling, KU/KI keyword filtering) and propose a registry for cross-dyad knowledge consolidation. This work provides both a conceptual foundation and an architectural roadmap toward scalable, explainable and ethically aligned collective AI.


Multiple LLM Agents Debate for Equitable Cultural Alignment

arXiv.org Artificial Intelligence

Large Language Models (LLMs) need to adapt their predictions to diverse cultural contexts to benefit diverse communities across the world. While previous efforts have focused on single-LLM, single-turn approaches, we propose to exploit the complementary strengths of multiple LLMs to promote cultural adaptability. We introduce a Multi-Agent Debate framework, where two LLM-based agents debate over a cultural scenario and collaboratively reach a final decision. We propose two variants: one where the LLM agents exclusively debate and another where they dynamically choose between self-reflection and debate during their turns. We evaluate these approaches on 7 open-weight LLMs (and 21 LLM combinations) using the NormAd-ETI benchmark for social etiquette norms in 75 countries. Experiments show that debate improves both overall accuracy and cultural group parity over single-LLM baselines. Notably, multi-agent debate enables relatively small LLMs (7-9B) to achieve accuracies comparable to those of a much larger model (27B parameters).


Exploring Quantum Machine Learning for Weather Forecasting

arXiv.org Artificial Intelligence

Weather forecasting plays a crucial role in supporting strategic decisions across various sectors, including agriculture, renewable energy production, and disaster management. However, the inherently dynamic and chaotic behavior of the atmosphere presents significant challenges to conventional predictive models. Introducing quantum computing simulation techniques to forecasting problems is a promising alternative for overcoming these challenges. In this context, this work explores the emerging intersection between quantum machine learning (QML) and climate forecasting. We present the implementation of a Quantum Neural Network (QNN) trained on real meteorological data from NASA's Prediction of Worldwide Energy Resources (POWER) database. The results show that the QNN has the potential to outperform a classical Recurrent Neural Network (RNN) in terms of accuracy and adaptability to abrupt data shifts, particularly in wind speed prediction. Despite observed nonlinearities and architectural sensitivities, the QNN demonstrated robustness in handling temporal variability and faster convergence in temperature prediction. These findings highlight the potential of quantum models in short- and medium-term climate prediction, while also revealing key challenges and future directions for optimization and broader applicability.


Centralized vs. Federated Learning for Educational Data Mining: A Comparative Study on Student Performance Prediction with SAEB Microdata

arXiv.org Artificial Intelligence

The application of data mining and artificial intelligence in education offers unprecedented potential for personalizing learning and early identification of at-risk students. However, the practical use of these techniques faces a significant barrier in privacy legislation, such as Brazil's General Data Protection Law (LGPD), which restricts the centralization of sensitive student data. To resolve this challenge, privacy-preserving computational approaches are required. The present study evaluates the feasibility and effectiveness of Federated Learning, specifically the FedProx algorithm, to predict student performance using microdata from the Brazilian Basic Education Assessment System (SAEB). A Deep Neural Network (DNN) model was trained in a federated manner, simulating a scenario with 50 schools, and its performance was rigorously benchmarked against a centralized eXtreme Gradient Boosting (XGBoost) model. The analysis, conducted on a universe of over two million student records, revealed that the centralized model achieved an accuracy of 63.96%. Remarkably, the federated model reached a peak accuracy of 61.23%, demonstrating a marginal performance loss in exchange for a robust privacy guarantee. The results indicate that Federated Learning is a viable and effective solution for building collaborative predictive models in the Brazilian educational context, in alignment with the requirements of the LGPD.
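To make the FedProx idea concrete, here is a minimal sketch, not the study's implementation. FedProx adds a proximal term (mu / 2) * (w - w_global)^2 to each client's local loss so that local models do not drift far from the global one. The toy 1-D linear model, the function names (`local_fedprox_update`, `fedprox_round`), and all hyperparameter values below are assumptions chosen for illustration; the study itself trained a DNN across 50 simulated schools.

```python
def local_fedprox_update(w_global, data, mu, lr=0.05, epochs=20):
    """One client's local training for a toy 1-D model y ~ w * x.

    Minimises mean squared error plus the FedProx proximal term
    (mu / 2) * (w - w_global) ** 2 by plain gradient descent.
    """
    w = w_global
    for _ in range(epochs):
        # Gradient of the mean squared error on this client's data.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        # Gradient of the proximal term, pulling w back toward w_global.
        grad += mu * (w - w_global)
        w -= lr * grad
    return w


def fedprox_round(w_global, clients, mu):
    """One communication round: local updates, then a size-weighted average."""
    total = sum(len(d) for d in clients)
    updates = [local_fedprox_update(w_global, d, mu) for d in clients]
    return sum(w * len(d) for w, d in zip(updates, clients)) / total


if __name__ == "__main__":
    # Hypothetical clients whose data follow different slopes (non-IID).
    clients = [
        [(0.5, 0.5), (1.0, 1.0), (-1.0, -1.0)],   # slope 1
        [(0.5, 1.5), (1.0, 3.0), (-1.0, -3.0)],   # slope 3
    ]
    w = 0.0
    for _ in range(30):
        w = fedprox_round(w, clients, mu=0.1)
    print(w)
```

Setting `mu=0` reduces the local step to ordinary gradient descent, as in FedAvg; larger values keep each client closer to the shared model at the cost of slower local fitting.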


Navigating the growing field of research on AI for software testing -- the taxonomy for AI-augmented software testing and an ontology-driven literature survey

arXiv.org Artificial Intelligence

In industry, software testing is the primary method to verify and validate the functionality, performance, security, usability, and so on, of software-based systems. Test automation has gained increasing attention in industry over the last decade, following decades of intense research into test automation and model-based testing. However, designing, developing, maintaining and evolving test automation is a considerable effort. Meanwhile, AI's breakthroughs in many engineering fields are opening up new perspectives for software testing, for both manual and automated testing. This paper reviews recent research on AI augmentation in software test automation, from no automation to full automation. It also discusses new forms of testing made possible by AI. Based on this, the newly developed taxonomy, ai4st, is presented and used to classify recent research and identify open research questions.


Onion CEO Ben Collins Hasn't Given Up on Print--or Buying Infowars

WIRED

A year after relaunching The Onion as a newspaper, Collins visits to talk about why "going into something and not ruining it is bravery." Ben Collins made a big bet. A year ago, just a few months after he'd been named CEO of The Onion, he relaunched its print edition. Once a favorite on university campuses, The Onion hadn't published a physical issue since 2013. Common wisdom said that readership, and advertising dollars, just weren't there for newspapers. But Collins, a fan of the satirical paper since childhood, thought "that's dumb." Readers celebrated The Onion's relaunch and the ability to read all of its bitingly funny headlines on a single broadsheet. Collins wouldn't give exact numbers on how many people are currently subscribed to the print edition but did say they should be enough to keep its writers' room humming (a few weeks after we taped this episode, the Wall Street Journal reported that The Onion now boasts more than 53,000 paying subscribers). On this episode, I spoke with Collins about his hopes for The Onion, the future of journalism, and his Balatro addiction. KATIE DRUMMOND: Do you have a recent favorite Onion headline? Can I look it up for you? "Ghislaine Maxwell Can't Help but Notice Interview Room Covered in Plastic Sheeting." The staff churns out like 15 a day that are great. I sit there, and I still don't know how they do it. When I say they throw away eight or nine of the best sentences I would ever write every day, I mean that sincerely.


Latam-GPT: The Free, Open Source, and Collaborative AI of Latin America

WIRED

Latam-GPT is a new large language model being developed in and for Latin America. The project, led by the nonprofit Chilean National Center for Artificial Intelligence (CENIA), aims to help the region achieve technological independence by developing an open source AI model trained on Latin American languages and contexts. "This work cannot be undertaken by just one group or one country in Latin America: It is a challenge that requires everyone's participation," says Álvaro Soto, director of CENIA, in an interview with WIRED en Español. "Latam-GPT is a project that seeks to create an open, free, and, above all, collaborative AI model. We've been working for two years with a very bottom-up process, bringing together citizens from different countries who want to collaborate. Recently, it has also seen some more top-down initiatives, with governments taking an interest and beginning to participate in the project."


The Rosario Dataset v2: Multimodal Dataset for Agricultural Robotics

arXiv.org Artificial Intelligence

World population will grow by a third by 2050, directly impacting global food demand (Fukase and Martin, 2020). In this context, the agricultural industry should increase its production to satisfy such demand. The use of autonomous robots to carry out agricultural tasks such as seeding, harvesting, weed removal, and pest control, among others, is an attractive solution since it can improve production time in a sustainable manner, reducing environmental impact and pollution. However, deploying autonomous robots in agricultural fields is challenging due to the rough terrain, natural light variations, perceptual aliasing, GNSS-denied areas, and the long-term robot operation required to carry out the desired applications. All these challenges cause robot localization methods to fail or perform poorly, making them impractical for real agricultural tasks, as evidenced in Cremona et al. (2022, 2023); Soncini et al. (2024); Cox et al. (2023); Bai et al. (2023); Ait et al. (2023). In the last decade, there has been a growing trend towards the creation and public availability of agricultural datasets, enabling researchers to test new techniques and develop more sophisticated algorithms to address these challenges, such as Pire et al. (2019); Kragh et al. (2017); Tanco et al. (2024). However, none of them is properly curated for evaluating multi-modal SLAM algorithms. Effective multi-modal SLAM evaluation imposes specific requirements such as hardware-synchronized sensors, 6-DOF ground truth, and trajectories with loops to test loop closure algorithms. In this work, we present a multi-modal dataset recorded by the weed-removing robot developed at CIFASIS (CONICET - UNR) in a soybean field.


HSFN: Hierarchical Selection for Fake News Detection building Heterogeneous Ensemble

arXiv.org Artificial Intelligence

Psychological biases, such as confirmation bias, make individuals particularly vulnerable to believing and spreading fake news on social media, leading to significant consequences in domains such as public health and politics. Machine learning-based fact-checking systems have been widely studied to mitigate this problem. Among them, ensemble methods are particularly effective in combining multiple classifiers to improve robustness. However, their performance heavily depends on the diversity of the constituent classifiers; selecting genuinely diverse models remains a key challenge, especially when models tend to learn redundant patterns. In this work, we propose a novel automatic classifier selection approach that prioritizes diversity while also accounting for performance. The method first computes pairwise diversity between classifiers and applies hierarchical clustering to organize them into groups at different levels of granularity. A HierarchySelect then explores these hierarchical levels to select one pool of classifiers per level, each representing a distinct intra-pool diversity. From these, the most diverse pool is identified and selected for ensemble construction. The selection process incorporates an evaluation metric reflecting each classifier's performance to ensure the ensemble also generalises well. We conduct experiments with 40 heterogeneous classifiers across six datasets from different application domains and with varying numbers of classes. Our method is compared against the Elbow heuristic and state-of-the-art baselines. Results show that our approach achieves the highest accuracy on two of six datasets. The implementation details are available on the project's repository: https://github.com/SaraBCoutinho/HSFN.
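The selection pipeline described above (pairwise diversity, hierarchical clustering, one pool per level, pick the most diverse pool) can be sketched roughly as follows. This is an illustrative reading, not the authors' code: the disagreement measure, average linkage, picking the most accurate classifier from each cluster, and every function name here are assumptions. The authors' actual implementation is in the repository linked in the abstract.

```python
from itertools import combinations


def disagreement(a, b):
    """Assumed diversity measure: fraction of samples where predictions differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_levels(names, dist):
    """Average-linkage agglomerative clustering.

    Yields the partition at each level, from singletons down to two clusters.
    """
    clusters = [[n] for n in names]
    yield [list(c) for c in clusters]
    while len(clusters) > 2:
        def linkage(pair):
            i, j = pair
            ds = [dist[frozenset((a, b))] for a in clusters[i] for b in clusters[j]]
            return sum(ds) / len(ds)
        # Merge the two closest (least diverse) clusters.
        i, j = min(combinations(range(len(clusters)), 2), key=linkage)
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        yield [list(c) for c in clusters]


def pool_diversity(pool, dist):
    """Mean pairwise diversity inside a pool (needs at least two members)."""
    pairs = list(combinations(pool, 2))
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def select_pool(preds, accuracy):
    """Build one pool per hierarchy level and return the most diverse one.

    preds maps classifier name -> list of predictions on a validation set;
    accuracy maps classifier name -> validation score (assumed tie-breaker
    for choosing a representative within each cluster).
    """
    names = sorted(preds)
    dist = {frozenset((a, b)): disagreement(preds[a], preds[b])
            for a, b in combinations(names, 2)}
    best_pool, best_div = None, -1.0
    for partition in cluster_levels(names, dist):
        # One representative per cluster: the best-performing member.
        pool = [max(c, key=lambda n: accuracy[n]) for c in partition]
        div = pool_diversity(pool, dist)
        if div > best_div:
            best_pool, best_div = sorted(pool), div
    return best_pool, best_div
```

A call such as `select_pool({"a": [0, 0, 0, 0], "b": [0, 0, 0, 1], "c": [1, 1, 1, 1], "d": [1, 1, 1, 0]}, {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6})` would collapse the near-duplicate pairs and keep one representative from each group.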


BLUEX Revisited: Enhancing Benchmark Coverage with Automatic Captioning

arXiv.org Artificial Intelligence

With the growing capabilities of Large Language Models (LLMs), there is an increasing need for robust evaluation methods, especially in multilingual and non-English contexts. We present an updated version of the BLUEX dataset, now including 2024-2025 exams and automatically generated image captions using state-of-the-art models, enhancing its relevance for data contamination studies in LLM pretraining. Captioning strategies increase accessibility to text-only models by more than 40%, producing 1,422 usable questions, more than doubling the number in the original BLUEX. We evaluated commercial and open-source LLMs and their ability to leverage visual context through captions.