AITopics

2508.21589

Country: Europe > Austria (0.28)

Genre: Research Report (0.83)

Industry:

Education (0.67)
Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Advanced Geothermal System (AGS) (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Oct-23-2025

AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks

Wang, Fali, Liu, Hui, Dai, Zhenwei, Zeng, Jingying, Zhang, Zhiwei, Wu, Zongyu, Luo, Chen, Li, Zhen, Tang, Xianfeng, He, Qi, Wang, Suhang

Test-time scaling (TTS) enhances the performance of large language models (LLMs) by allocating additional compute resources during inference. However, existing research primarily investigates TTS in single-stage tasks, whereas many real-world problems are multi-stage complex tasks composed of a sequence of heterogeneous subtasks, each requiring an LLM with specific capabilities. Therefore, we study a novel problem: test-time compute-optimal scaling in multi-stage complex tasks, aiming to select suitable models and allocate budgets per subtask to maximize overall performance. TTS in multi-stage tasks introduces two fundamental challenges: (i) The combinatorial search space of model and budget allocations, combined with the high cost of inference, makes brute-force search impractical. (ii) The optimal model and budget allocations across subtasks are interdependent, increasing the complexity of the compute-optimal search. To address this gap, we conduct extensive pilot experiments on four tasks across six datasets, deriving three empirical insights characterizing the behavior of LLMs in multi-stage complex tasks. Informed by these insights, we propose AgentTTS, an LLM-agent-based framework that autonomously searches for compute-optimal allocations through iterative feedback-driven interactions with the execution environment. Experimental results demonstrate that AgentTTS significantly outperforms traditional and other LLM-based baselines in search efficiency, and shows improved robustness to varying training set sizes and enhanced interpretability.

large language model, machine learning, natural language, (18 more...)

2508.0089

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Industry:

Information Technology (0.46)
Energy (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Potfer, Marius, Perchet, Vianney

Comparing Uniform Price and Discriminatory Multi-Unit Auctions through Regret Minimization

arXiv.org Machine Learning, Oct-23-2025

Repeated multi-unit auctions, where a seller allocates multiple identical items over many rounds, are common mechanisms in electricity markets and treasury auctions. We compare the two predominant formats: uniform-price and discriminatory auctions, focusing on the perspective of a single bidder learning to bid against stochastic adversaries. We characterize the learning difficulty in each format, showing that the regret scales similarly for both auction formats under both full-information and bandit feedback, as $\tilde{\Theta}(\sqrt{T})$ and $\tilde{\Theta}(T^{2/3})$, respectively. However, analysis beyond worst-case regret reveals structural differences: uniform-price auctions may admit faster learning rates, with regret scaling as $\tilde{\Theta}(\sqrt{T})$ in settings where discriminatory auctions remain at $\tilde{\Theta}(T^{2/3})$. Finally, we provide a specific analysis for auctions in which the other participants are symmetric and have unit-demand, and show that in these instances, a similar regret rate separation appears.
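For readers unfamiliar with the two formats, the following is a minimal sketch of how payments differ in a single round. It is an illustration only, not code from the paper: the function names, the example bids, and the choice of the lowest winning bid as the uniform clearing price are all assumptions (some definitions use the highest losing bid instead).

```python
def allocate(bids, num_items):
    """Return (winning, losing) unit bids; the num_items highest prices win.

    bids is a list of (bidder, price) pairs, one pair per unit demanded.
    Ties are broken by input order (Python's sort is stable).
    """
    ranked = sorted(bids, key=lambda bid: -bid[1])
    return ranked[:num_items], ranked[num_items:]


def payments(bids, num_items, auction_format):
    """Total payment per bidder under the chosen (hypothetical) format name."""
    winners, _ = allocate(bids, num_items)
    totals = {}
    if auction_format == "discriminatory":
        # Pay-as-bid: each winning unit is charged its own bid.
        for bidder, price in winners:
            totals[bidder] = totals.get(bidder, 0.0) + price
    elif auction_format == "uniform":
        # Assumption: the clearing price is the lowest winning bid.
        clearing_price = winners[-1][1]
        for bidder, _ in winners:
            totals[bidder] = totals.get(bidder, 0.0) + clearing_price
    else:
        raise ValueError(f"unknown auction format: {auction_format}")
    return totals


# Illustrative bids: a learner bidding on two units against one rival.
example_bids = [
    ("learner", 9.0), ("learner", 6.0),
    ("rival", 8.0), ("rival", 5.0), ("rival", 3.0),
]
discriminatory_totals = payments(example_bids, 3, "discriminatory")
uniform_totals = payments(example_bids, 3, "uniform")
```

With three items on sale, the bids of 9, 8 and 6 win in both formats. The learner pays its own bids (9 + 6) in the discriminatory format, but twice the common clearing price of 6 in the uniform-price format.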

artificial intelligence, auction, machine learning, (19 more...)

arXiv.org Machine Learning

2510.19591

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.40)

Industry: Energy (0.54)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

WIRED, Oct-22-2025, 10:31:00 GMT

Essential Gear for an Emergency Kit--for Cars or Go-Bags

What Should Be in Your Emergency Kit Before Disaster Strikes? We consulted preparedness experts and WIRED's team of testers on the essential bug-out gear to keep your family safe during an unplanned exit. You never know when you're going to have to bug out on short notice.

amazon, courtesy, wired, (16 more...)

WIRED

Country:

North America > United States > Oregon (0.04)
North America > United States > California (0.04)
Europe > Slovakia (0.04)
Europe > Czechia (0.04)

Industry:

Health & Medicine (1.00)
Information Technology (0.69)
Government > Regional Government > North America Government > United States Government (0.69)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
Information Technology > Communications > Mobile (0.47)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

WIRED, Oct-22-2025, 04:01:00 GMT

New Report Finds Efforts to Slow Climate Change Are Working--Just Not Fast Enough

By virtually every key metric, efforts to fight climate change are going too slowly, according to findings by a coalition of climate groups. In some cases, things are moving in the wrong direction. An eroded iceberg is seen floating near Horseshoe Island, Antarctica. In the 10 years since the signing of the Paris Agreement, the backbone of international climate action, humanity has made impressive progress. Renewable energy is increasingly cheap and reliable, while electric vehicles are becoming better every year.

climate change, indicator, slow climate change, (14 more...)

WIRED

Country:

Antarctica (0.24)
Asia > China (0.05)
South America > Brazil (0.04)
(9 more...)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Law (1.00)
(5 more...)

Technology:

Information Technology > Communications (0.47)
Information Technology > Artificial Intelligence > Natural Language (0.31)

Joint Optimization of Cooperation Efficiency and Communication Covertness for Target Detection with AUVs

Zhang, Xueyao, Yang, Bo, Yu, Zhiwen, Cao, Xuelin, Xiang, Wei, Guo, Bin, Wang, Liang, Lau, Billy Pik Lik, Alexandropoulos, George C., Luo, Jun, Debbah, Mérouane, Han, Zhu, Yuen, Chau

This paper investigates underwater cooperative target detection using autonomous underwater vehicles (AUVs), with a focus on the critical trade-off between cooperation efficiency and communication covertness. To tackle this challenge, we first formulate a joint trajectory and power control optimization problem, and then present an innovative hierarchical action management framework to solve it. According to the hierarchical formulation, at the macro level, the master AUV models the agent selection process as a Markov decision process and deploys the proximal policy optimization algorithm for strategic task allocation. At the micro level, each selected agent's decentralized decision-making is modeled as a partially observable Markov decision process, and a multi-agent proximal policy optimization algorithm is used to dynamically adjust its trajectory and transmission power based on its local observations. Under the centralized training and decentralized execution paradigm, our target detection framework enables adaptive covert cooperation while satisfying both energy and mobility constraints. By comprehensively modeling the considered system, the involved signals and tasks, as well as energy consumption, theoretical insights and practical solutions for the efficient and secure operation of multiple AUVs are provided, offering significant implications for the execution of underwater covert communication tasks.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

2510.18225

Country:

North America > Canada (0.46)
Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

Zhang, Chenxu, Huang, Fuxiang, Zhang, Lei

Automated urban waterlogging assessment and early warning through a mixture of foundation models

With climate change intensifying, urban waterlogging poses an increasingly severe threat to global public safety and infrastructure. However, existing monitoring approaches rely heavily on manual reporting and fail to provide timely and comprehensive assessments. In this study, we present Urban Waterlogging Assessment (UWAssess), a foundation model-driven framework that automatically identifies waterlogged areas in surveillance images and generates structured assessment reports. To address the scarcity of labeled data, we design a semi-supervised fine-tuning strategy and a chain-of-thought (CoT) prompting strategy to unleash the potential of the foundation model for data-scarce downstream tasks. Evaluations on challenging visual benchmarks demonstrate substantial improvements in perception performance. GPT-based evaluations confirm the ability of UWAssess to generate reliable textual reports that accurately describe waterlogging extent, depth, risk and impact. This dual capability enables a shift of waterlogging monitoring from perception to generation, while the collaborative framework of multiple foundation models lays the groundwork for intelligent and scalable systems, supporting urban management, disaster response and climate resilience.

large language model, machine learning, natural language, (20 more...)

2510.18425

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (0.93)
Transportation > Ground > Road (0.46)
Transportation > Infrastructure & Services (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora

Shen, Yingli, Lai, Wen, Wang, Shuo, Gao, Ge, Luo, Kangyang, Fraser, Alexander, Sun, Maosong

Continued pretraining and instruction tuning on large-scale multilingual data have proven to be effective in scaling large language models (LLMs) to low-resource languages. However, the unaligned nature of such data limits its ability to effectively capture cross-lingual semantics. In contrast, multi-way parallel data, where identical content is aligned across multiple languages, provides stronger cross-lingual consistency and offers greater potential for improving multilingual performance. In this paper, we introduce a large-scale, high-quality multi-way parallel corpus, TED2025, based on TED Talks. The corpus spans 113 languages, with up to 50 languages aligned in parallel, ensuring extensive multilingual coverage. Using this dataset, we investigate best practices for leveraging multi-way parallel data to enhance LLMs, including strategies for continued pretraining, instruction tuning, and the analysis of key influencing factors. Experiments on six multilingual benchmarks show that models trained on multi-way parallel data consistently outperform those trained on unaligned multilingual data.

computational linguistic, large language model, natural language, (13 more...)

2505.14045

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Energy (0.92)
Education (0.68)
Government (0.67)
Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Zollicoffer, Geigh, Vu, Minh, Bhattarai, Manish

MTRE: Multi-Token Reliability Estimation for Hallucination Detection in VLMs

Vision-language models (VLMs) now rival human performance on many multimodal tasks, yet they still hallucinate objects or generate unsafe text. Current hallucination detectors, e.g., single-token linear probing (LP) and PTrue, typically analyze only the logit of the first generated token or just its highest-scoring component, overlooking richer signals embedded within earlier token distributions. We demonstrate that analyzing the complete sequence of early logits potentially provides substantially more diagnostic information. We emphasize that hallucinations may only emerge after several tokens, as subtle inconsistencies accumulate over time. By analyzing the Kullback-Leibler (KL) divergence between logits corresponding to hallucinated and non-hallucinated tokens, we underscore the importance of incorporating later-token logits to more accurately capture the reliability dynamics of VLMs. In response, we introduce Multi-Token Reliability Estimation (MTRE), a lightweight, white-box method that aggregates logits from the first ten tokens using multi-token log-likelihood ratios and self-attention. Despite the challenges posed by large vocabulary sizes and long logit sequences, MTRE remains efficient and tractable. Across MAD-Bench, MM-SafetyBench, MathVista, and four compositional-geometry benchmarks, MTRE achieves a 9.4% gain in accuracy and a 14.8% gain in AUROC over standard detection methods, establishing a new state of the art in hallucination detection for open-source VLMs.
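The KL comparison mentioned in the abstract can be pictured with a small sketch. This is not the MTRE method or the authors' code: it only assumes that each token position has a vector of raw logits, converts the logits to probabilities with a softmax, and reports the KL divergence per position. All function names and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities, shifting by the max for stability."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary.

    eps is a small smoothing constant that avoids log(0).
    """
    return sum(
        p_i * math.log((p_i + eps) / (q_i + eps))
        for p_i, q_i in zip(p, q)
    )


def per_position_kl(logits_a, logits_b):
    """KL divergence at each token position between two logit sequences."""
    return [
        kl_divergence(softmax(a), softmax(b))
        for a, b in zip(logits_a, logits_b)
    ]


# Toy sequences over a 3-token vocabulary: identical at position 0,
# diverging at position 1.
sequence_a = [[2.0, 0.5, 0.1], [1.0, 1.0, 1.0]]
sequence_b = [[2.0, 0.5, 0.1], [3.0, 0.0, 0.0]]
divergences = per_position_kl(sequence_a, sequence_b)
```

In this toy input the first position has zero divergence and the second does not, which mirrors the abstract's point that a signal can be absent at the first token and only show up at later positions.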

artificial intelligence, machine learning, natural language, (19 more...)

2505.11741

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Energy (0.67)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Vision (0.88)
(3 more...)

van der Plas, Thijs L, Law, Stephen, Pocock, Michael JO

Predicting butterfly species presence from satellite imagery using soft contrastive regularisation

The growing demand for scalable biodiversity monitoring methods has fuelled interest in remote sensing data, due to its widespread availability and extensive coverage. Traditionally, the application of remote sensing to biodiversity research has focused on mapping and monitoring habitats, but with increasing availability of large-scale citizen-science wildlife observation data, recent methods have started to explore predicting multi-species presence directly from satellite images. This paper presents a new data set for predicting butterfly species presence from satellite data in the United Kingdom. We experimentally optimise a ResNet-based model to predict multi-species presence from 4-band satellite images, and find that this model especially outperforms the mean rate baseline for locations with high species biodiversity. To improve performance, we develop a soft, supervised contrastive regularisation loss that is tailored to probabilistic labels (such as species-presence data), and demonstrate that this improves prediction accuracy. In summary, our new data set and contrastive regularisation method contribute to the open challenge of accurately predicting species biodiversity from remote sensing data, which is key for efficient biodiversity monitoring.

artificial intelligence, contrastive learning, machine learning, (13 more...)

2505.09306

Country: Europe > United Kingdom (1.00)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)