AITopics

Abdi, Abdulhakim M., Wang, Fan

Forest tree species classification and entropy-derived uncertainty mapping using extreme gradient boosting and Sentinel-1/2 data

arXiv.org Machine LearningSep-24-2025

We present a wall-to - wall map of dominant tree species in Swedish forests accompanied by pixel - level uncertainty estimates. The tree species classification is based on spatiotemporal metrics derived from Sentinel-1 and Sentinel - 2 satellite data, combined with field observations from the Swedish National Forest Inventory and auxiliary data on geomorphometry and canopy height. We apply an extreme gradient boosting model with Bayesian optimization to relate field observations to satellite-derived features and generate the final species map. Classification uncertainty is quantified using Shannon's entropy of the predicted class probabilities, which provide a spatially explicit measure of model confidence. The final model achieved an overall accuracy of 85% (F1 score = 0.82, Matthews correlation coefficient = 0.81), and mapped species distributions showed strong agreement with official forest statistics (r = 0.96). V ariable importance analysis revealed that the most influential predictors were optical bands from Sentinel - 2, particularly those acquired in spring and summer. This study provides scalable, interpretable, and policy-relevant method for tree species mapping with integrated uncertainty that are well-suited to meet emerging legislative and environmental goals.

classification, sentinel, tree species, (14 more...)

arXiv.org Machine Learning

2509.18228

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Norway (0.05)
Europe > Sweden > Skåne County (0.04)
(8 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Energy (0.71)
Government > Regional Government > Europe Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

MSCoRe: A Benchmark for Multi-Stage Collaborative Reasoning in LLM Agents

Lei, Yuzhen, Xie, Hongbin, Zhao, Jiaxing, Liu, Shuangxue, Song, Xuan

Large Language Models (LLMs) have excelled in question-answering (QA) tasks within single domains. However, their reasoning and coordination capabilities in complex, multi-stage scenarios remain underexplored. Existing benchmarks typically focus on isolated tasks or narrow domains, overlooking models' abilities for multi-stage collaboration and optimization without explicit external guidance. To bridge this gap, we propose \textbf{MSCoRe}, a novel benchmark comprising 126696 domain-specific QA instances spanning scenarios in automotive, pharmaceutical, electronics, and energy sectors. The dataset is created using a structured three-phase pipeline: dynamic sampling, iterative question-answer generation, and a multi-level quality assessment to ensure data quality. Tasks are further categorized into three difficulty levels according to stage coverage and complexity. With MSCoRe, we have conducted a comprehensive evaluation of various state-of-the-art LLM agents. The commercial models performed best across all tasks and scenarios, but a notable gap in ROUGE scores remains between simple and complex tasks. We also tested the models' robustness and found that their performance is negatively affected by noisy data. MSCoRe provides a valuable new resource for the community to evaluate and improve multi-stage reasoning in LLM agents. The code and data are available at https://github.com/D3E0-source/MSCoRE.

benchmark, large language model, machine learning, (18 more...)

2509.17628

Genre: Research Report (0.82)

Industry:

Energy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Dynamical Alignment: A Principle for Adaptive Neural Computation

Chen, Xia

The computational capabilities of a neural network are widely assumed to be determined by its static architecture. Here we challenge this view by establishing that a fixed neural structure can operate in fundamentally different computational modes, driven not by its structure but by the temporal dynamics of its input signals. We term this principle 'Dynamical Alignment'. Applying this principle offers a novel resolution to the long-standing paradox of why brain-inspired spiking neural networks (SNNs) underperform. By encoding static input into controllable dynamical trajectories, we uncover a bimodal optimization landscape with a critical phase transition governed by phase space volume dynamics. A 'dissipative' mode, driven by contracting dynamics, achieves superior energy efficiency through sparse temporal codes. In contrast, an 'expansive' mode, driven by expanding dynamics, unlocks the representational power required for SNNs to match or even exceed their artificial neural network counterparts on diverse tasks, including classification, reinforcement learning, and cognitive integration. We find this computational advantage emerges from a timescale alignment between input dynamics and neuronal integration. This principle, in turn, offers a unified, computable perspective on long-observed dualities in neuroscience, from stability-plasticity dilemma to segregation-integration dynamic. It demonstrates that computation in both biological and artificial systems can be dynamically sculpted by 'software' on fixed 'hardware', pointing toward a potential paradigm shift for AI research: away from designing complex static architectures and toward mastering adaptive, dynamic computation principles.

artificial intelligence, dynamical alignment, machine learning, (18 more...)

2508.10064

Country:

North America (0.28)
Europe (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Energy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Chen, Chi-Sheng, Chen, Samuel Yen-Chi

Q-DPTS: Quantum Differentially Private Time Series Forecasting via Variational Quantum Circuits

Time series forecasting is vital in domains where data sensitivity is paramount, such as finance and energy systems. While Differential Privacy (DP) provides theoretical guarantees to protect individual data contributions, its integration especially via DP-SGD often impairs model performance due to injected noise. In this paper, we propose Q-DPTS, a hybrid quantum-classical framework for Quantum Differentially Private Time Series Forecasting. Q-DPTS combines Variational Quantum Circuits (VQCs) with per-sample gradient clipping and Gaussian noise injection, ensuring rigorous $(ε, δ)$-differential privacy. The expressiveness of quantum models enables improved robustness against the utility loss induced by DP mechanisms. We evaluate Q-DPTS on the ETT (Electricity Transformer Temperature) dataset, a standard benchmark for long-term time series forecasting. Our approach is compared against both classical and quantum baselines, including LSTM, QASA, QRWKV, and QLSTM. Results demonstrate that Q-DPTS consistently achieves lower prediction error under the same privacy budget, indicating a favorable privacy-utility trade-off. This work presents one of the first explorations into quantum-enhanced differentially private forecasting, offering promising directions for secure and accurate time series modeling in privacy-critical scenarios.

artificial intelligence, data mining, machine learning, (18 more...)

2508.05036

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (0.48)
Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Intra-DP: A High Performance Collaborative Inference System for Mobile Edge Computing

Sun, Zekai, Guan, Xiuxian, Lin, Zheng, Fang, Zihan, Cai, Xiangming, Chen, Zhe, Liu, Fangming, Cui, Heming, Xiong, Jie, Ni, Wei, Yuen, Chau

Deploying deep neural networks (DNNs) on resource-constrained mobile devices presents significant challenges, particularly in achieving real-time performance while simultaneously coping with limited computational resources and battery life. While Mobile Edge Computing (MEC) offers collaborative inference with GPU servers as a promising solution, existing approaches primarily rely on layer-wise model partitioning and undergo significant transmission bottlenecks caused by the sequential execution of DNN operations. To address this challenge, we present Intra-DP, a high-performance collaborative inference system optimized for DNN inference on MEC. Intra DP employs a novel parallel computing technique based on local operators (i.e., operators whose minimum unit input is not the entire input tensor, such as the convolution kernel). By decomposing their computations (operations) into several independent sub-operations and overlapping the computation and transmission of different sub-operations through parallel execution, Intra-DP mitigates transmission bottlenecks in MEC, achieving fast and energy-efficient inference. The evaluation demonstrates that Intra-DP reduces per-inference latency by up to 50% and energy consumption by up to 75% compared to state-of-the-art baselines, without sacrificing accuracy.

artificial intelligence, cloud computing, machine learning, (17 more...)

2507.05829

Country: Asia > China (0.93)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Energy (0.67)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Communications > Mobile (1.00)
(5 more...)

Can Global XAI Methods Reveal Injected Bias in LLMs? SHAP vs Rule Extraction vs RuleSHAP

Sovrano, Francesco

Large language models (LLMs) can amplify misinformation, undermining societal goals like the UN SDGs. We study three documented drivers of misinformation (valence framing, information overload, and oversimplification) which are often shaped by one's default beliefs. Building on evidence that LLMs encode such defaults (e.g., "joy is positive," "math is complex") and can act as "bags of heuristics," we ask: can general belief-driven heuristics behind misinformative behaviour be recovered from LLMs as clear rules? A key obstacle is that global rule-extraction methods in explainable AI (XAI) are built for numerical inputs/outputs, not text. We address this by eliciting global LLM beliefs and mapping them to numerical scores via statistically reliable abstractions, thereby enabling off-the-shelf global XAI to detect belief-related heuristics in LLMs. To obtain ground truth, we hard-code bias-inducing nonlinear heuristics of increasing complexity (univariate, conjunctive, nonconvex) into popular LLMs (ChatGPT and Llama) via system instructions. This way, we find that RuleFit under-detects non-univariate biases, while global SHAP better approximates conjunctive ones but does not yield actionable rules. To bridge this gap, we propose RuleSHAP, a rule-extraction algorithm that couples global SHAP-value aggregations with rule induction to better capture non-univariate bias, improving heuristics detection over RuleFit by +94% (MRR@1) on average. Our results provide a practical pathway for revealing belief-driven biases in LLMs.

large language model, machine learning, natural language, (20 more...)

2505.11189

Country:

North America > United States (1.00)
Europe (0.92)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Delavande, Julien, Pierrard, Regis, Luccioni, Sasha

Video Killed the Energy Budget: Characterizing the Latency and Power Regimes of Open Text-to-Video Models

Recent advances in text-to-video (T2V) generation have enabled the creation of high-fidelity, temporally coherent clips from natural language prompts. Yet these systems come with significant computational costs, and their energy demands remain poorly understood. In this paper, we present a systematic study of the latency and energy consumption of state-of-the-art open-source T2V models. We first develop a compute-bound analytical model that predicts scaling laws with respect to spatial resolution, temporal length, and denoising steps. We then validate these predictions through fine-grained experiments on WAN2.1-T2V, showing quadratic growth with spatial and temporal dimensions, and linear scaling with the number of denoising steps. Finally, we extend our analysis to six diverse T2V models, comparing their runtime and energy profiles under default settings. Our results provide both a benchmark reference and practical insights for designing and deploying more sustainable generative video systems.

machine learning, natural language, resolution, (20 more...)

2509.19222

Genre: Research Report > New Finding (0.34)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Cuevas, Alejandro, Dash, Saloni, Nayak, Bharat Kumar, Vann, Dan, Daepp, Madeleine I. G.

Anecdoctoring: Automated Red-Teaming Across Language and Place

Disinformation is among the top risks of generative artificial intelligence (AI) misuse. Global adoption of generative AI necessitates red-teaming evaluations (i.e., systematic adversarial probing) that are robust across diverse languages and cultures, but red-teaming datasets are commonly US- and English-centric. To address this gap, we propose "anecdoctoring", a novel red-teaming approach that automatically generates adversarial prompts across languages and cultures. We collect misinformation claims from fact-checking websites in three languages (English, Spanish, and Hindi) and two geographies (US and India). We then cluster individual claims into broader narratives and characterize the resulting clusters with knowledge graphs, with which we augment an attacker LLM. Our method produces higher attack success rates and offers interpretability benefits relative to few-shot prompting. Results underscore the need for disinformation mitigations that scale globally and are grounded in real-world adversarial misuse.

large language model, machine learning, natural language, (18 more...)

2509.19143

Country:

North America > United States (1.00)
Asia (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.69)

Memory in Large Language Models: Mechanisms, Evaluation and Evolution

Zhang, Dianxing, Li, Wendong, Song, Kani, Lu, Jiaye, Li, Gang, Yang, Liuchun, Li, Sheng

Under a unified operational definition, we define LLM memory as a persistent state written during pretraining, finetuning, or inference that can later be addressed and that stably influences outputs. We propose a four-part taxonomy (parametric, contextual, external, procedural/episodic) and a memory quadruple (location, persistence, write/access path, controllability). We link mechanism, evaluation, and governance via the chain write -> read -> inhibit/update. To avoid distorted comparisons across heterogeneous setups, we adopt a three-setting protocol (parametric only, offline retrieval, online retrieval) that decouples capability from information availability on the same data and timeline. On this basis we build a layered evaluation: parametric (closed-book recall, edit differential, memorization/privacy), contextual (position curves and the mid-sequence drop), external (answer correctness vs snippet attribution/faithfulness), and procedural/episodic (cross-session consistency and timeline replay, E MARS+). The framework integrates temporal governance and leakage auditing (freshness hits, outdated answers, refusal slices) and uncertainty reporting via inter-rater agreement plus paired tests with multiple-comparison correction. For updating and forgetting, we present DMM Gov: coordinating DAPT/TAPT, PEFT, model editing (ROME, MEND, MEMIT, SERAC), and RAG to form an auditable loop covering admission thresholds, rollout, monitoring, rollback, and change audits, with specs for timeliness, conflict handling, and long-horizon consistency. Finally, we give four testable propositions: minimum identifiability; a minimal evaluation card; causally constrained editing with verifiable forgetting; and when retrieval with small-window replay outperforms ultra-long-context reading. This yields a reproducible, comparable, and governable coordinate system for research and deployment.

large language model, machine learning, natural language, (20 more...)

2509.18868

Country:

Europe (0.67)
Asia > Middle East (0.27)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)