AITopics

We present the Hierarchical AI-Meteorologist, an LLM-agent system that generates explainable weather reports using a hierarchical forecast reasoning and weather keyword generation. Unlike standard approaches that treat forecasts as flat time series, our framework performs multi-scale reasoning across hourly, 6-hour, and daily aggregations to capture both short-term dynamics and long-term trends. Its core reasoning agent converts structured meteorological inputs into coherent narratives while simultaneously extracting a few keywords effectively summarizing the dominant meteorological events. These keywords serve as semantic anchors for validating consistency, temporal coherence and factual alignment of the generated reports. Using OpenWeather and Meteostat data, we demonstrate that hierarchical context and keyword-based validation substantially improve interpretability and robustness of LLM-generated weather narratives, offering a reproducible framework for semantic evaluation of automated meteorological reporting and advancing agent-based scientific reasoning.

artificial intelligence, large language model, natural language, (18 more...)

2511.23387

Country:

North America > United States (0.48)
Europe > Russia (0.48)
Asia > Philippines > Luzon > National Capital Region > City of Manila (0.15)

Genre: Research Report (0.40)

Industry: Government > Regional Government > North America Government > United States Government (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.66)

Learning to Predict Aboveground Biomass from RGB Images with 3D Synthetic Scenes

Zuffi, Silvia

Forests play a critical role in global ecosystems by supporting biodiversity and mitigating climate change via carbon sequestration. Accurate aboveground biomass (AGB) estimation is essential for assessing carbon storage and wildfire fuel loads, yet traditional methods rely on labor-intensive field measurements or remote sensing approaches with significant limitations in dense vegetation. In this work, we propose a novel learning-based method for estimating AGB from a single ground-based RGB image. We frame this as a dense prediction task, introducing AGB density maps, where each pixel represents tree biomass normalized by the plot area and each tree's image area. We leverage the recently introduced synthetic 3D SPREAD dataset, which provides realistic forest scenes with per-image tree attributes (height, trunk and canopy diameter) and instance segmentation masks. Using these assets, we compute AGB via allometric equations and train a model to predict AGB density maps, integrating them to recover the AGB estimate for the captured scene. Our approach achieves a median AGB estimation error of 1.22 kg/m^2 on held-out SPREAD data and 1.94 kg/m^2 on a real-image dataset. To our knowledge, this is the first method to estimate aboveground biomass directly from a single RGB image, opening up the possibility for a scalable, interpretable, and cost-effective solution for forest monitoring, while also enabling broader participation through citizen science initiatives.

artificial intelligence, density map, machine learning, (13 more...)

2511.23249

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.68)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Communications (0.88)

Heteroscedastic Neural Networks for Path Loss Prediction with Link-Specific Uncertainty

Ethier, Jonathan

Traditional and modern machine learning-based path loss models typically assume a constant prediction variance. We propose a neural network that jointly predicts the mean and link-specific variance by minimizing a Gaussian negative log-likelihood, enabling heteroscedastic uncertainty estimates. We compare shared, partially shared, and independent-parameter architectures using accuracy, calibration, and sharpness metrics on blind test sets from large public RF drive-test datasets. The shared-parameter architecture performs best, achieving an RMSE of 7.4 dB, 95.1 percent coverage for 95 percent prediction intervals, and a mean interval width of 29.6 dB. These uncertainty estimates further support link-specific coverage margins, improve RF planning and interference analyses, and provide effective self-diagnostics of model weaknesses.

architecture, artificial intelligence, machine learning, (15 more...)

2511.23243

Country: Europe > United Kingdom (0.47)

Genre: Research Report (1.00)

Industry: Government (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Filho, Sebastião Alves de Jesus, Bernardo, Gustavo Di Giovanni, Gabriel, Paulo Henrique Ribeiro, Zarpelão, Bruno Bogaz, Miani, Rodrigo Sanches

Identification of Malicious Posts on the Dark Web Using Supervised Machine Learning

Given the constant growth and increasing sophistication of cyberattacks, cybersecurity can no longer rely solely on traditional defense techniques and tools. Proactive detection of cyber threats has become essential to help security teams identify potential risks and implement effective mitigation measures. Cyber Threat Intelligence (CTI) plays a key role by providing security analysts with evidence-based knowledge about cyber threats. CTI information can be extracted using various techniques and data sources; however, machine learning has proven promising. As for data sources, social networks and online discussion forums are commonly explored. In this study, we apply text mining techniques and machine learning to data collected from Dark Web forums in Brazilian Portuguese to identify malicious posts. Our contributions include the creation of three original datasets, a novel multi-stage labeling process combining indicators of compromise (IoCs), contextual keywords, and manual analysis, and a comprehensive evaluation of text representations and classifiers. To our knowledge, this is the first study to focus specifically on Brazilian Portuguese content in this domain. The best-performing model, using LightGBM and TF-IDF, was able to detect relevant posts with high accuracy. We also applied topic modeling to validate the model's outputs on unlabeled data, confirming its robustness in real-world scenarios.

artificial intelligence, machine learning, natural language, (19 more...)

2511.23183

Country: South America > Brazil (0.68)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.88)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.93)
(2 more...)

Are LLMs Good Safety Agents or a Propaganda Engine?

Yadav, Neemesh, Ortu, Francesco, Liu, Jiarui, Yook, Joeun, Schölkopf, Bernhard, Mihalcea, Rada, Cazzaniga, Alberto, Jin, Zhijing

Large Language Models (LLMs) are trained to refuse to respond to harmful content. However, systematic analyses of whether this behavior is truly a reflection of its safety policies or an indication of political censorship, that is practiced globally by countries, is lacking. Differentiating between safety influenced refusals or politically motivated censorship is hard and unclear. For this purpose we introduce PSP, a dataset built specifically to probe the refusal behaviors in LLMs from an explicitly political context. PSP is built by formatting existing censored content from two data sources, openly available on the internet: sensitive prompts in China generalized to multiple countries, and tweets that have been censored in various countries. We study: 1) impact of political sensitivity in seven LLMs through data-driven (making PSP implicit) and representation-level approaches (erasing the concept of politics); and, 2) vulnerability of models on PSP through prompt injection attacks (PIAs). Associating censorship with refusals on content with masked implicit intent, we find that most LLMs perform some form of censorship. We conclude with summarizing major attributes that can cause a shift in refusal distributions across models and contexts of different countries.

large language model, machine learning, natural language, (18 more...)

2511.23174

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Papasotiropoulos, Georgios, Pishbin, Zein

Fairness in the Multi-Secretary Problem

This paper bridges two perspectives: it studies the multi-secretary problem through the fairness lens of social choice, and examines multi-winner elections from the viewpoint of online decision making. After identifying the limitations of the prominent proportionality notion of Extended Justified Representation (EJR) in the online domain, the work proposes a set of mechanisms that merge techniques from online algorithms with rules from social choice -- such as the Method of Equal Shares and the Nash Rule -- and supports them through both theoretical analysis and extensive experimental evaluation.

artificial intelligence, machine learning, proceedings, (14 more...)

2511.23097

Country: Europe (0.46)

Genre: Research Report (0.63)

Industry: Government > Voting & Elections (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Mahmudlu, Mikayil, Karakuş, Oktay, Arkadaş, Hasan

What If They Took the Shot? A Hierarchical Bayesian Framework for Counterfactual Expected Goals

This study develops a hierarchical Bayesian framework that integrates expert domain knowledge to quantify player-specific effects in expected goals (xG) estimation, addressing a limitation of standard models that treat all players as identical finishers. Using 9,970 shots from StatsBomb's 2015-16 data and Football Manager 2017 ratings, we combine Bayesian logistic regression with informed priors to stabilise player-level estimates, especially for players with few shots. The hierarchical model reduces posterior uncertainty relative to weak priors and achieves strong external validity: hierarchical and baseline predictions correlate at R2 = 0.75, while an XGBoost benchmark validated against StatsBomb xG reaches R2 = 0.833. The model uncovers interpretable specialisation profiles, including one-on-one finishing (Aguero, Suarez, Belotti, Immobile, Martial), long-range shooting (Pogba), and first-touch execution (Insigne, Salah, Gameiro). It also identifies latent ability in underperforming players such as Immobile and Belotti. The framework supports counterfactual "what-if" analysis by reallocating shots between players under identical contexts. Case studies show that Sansone would generate +2.2 xG from Berardi's chances, driven largely by high-pressure situations, while Vardy-Giroud substitutions reveal strong asymmetry: replacing Vardy with Giroud results in a large decline (about -7 xG), whereas the reverse substitution has only a small effect (about -1 xG). This work provides an uncertainty-aware tool for player evaluation, recruitment, and tactical planning, and offers a general approach for domains where individual skill and contextual factors jointly shape performance.

artificial intelligence, giroud, machine learning, (18 more...)

2511.23072

Country: Europe > United Kingdom > England (0.15)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Government > Military (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.85)

Rony, Sidharth, Patman, Jack

Standard Occupation Classifier -- A Natural Language Processing Approach

Standard Occupational Classifiers (SOC) are systems used to categorize and classify different types of jobs and occupations based on their similarities in terms of job duties, skills, and qualifications. Integrating these facets with Big Data from job advertisement offers the prospect to investigate labour demand that is specific to various occupations. This project investigates the use of recent developments in natural language processing to construct a classifier capable of assigning an occupation code to a given job advertisement. We develop various classifiers for both UK ONS SOC and US O*NET SOC, using different Language Models. We find that an ensemble model, which combines Google BERT and a Neural Network classifier while considering job title, description, and skills, achieved the highest prediction accuracy. Specifically, the ensemble model exhibited a classification accuracy of up to 61% for the lower (or fourth) tier of SOC, and 72% for the third tier of SOC. This model could provide up to date, accurate information on the evolution of the labour market using job advertisements.

classification, large language model, machine learning, (18 more...)

2511.23057

Country:

Europe > United Kingdom (0.46)
North America > United States (0.46)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Marketing (0.93)
Government > Regional Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Game-Theoretic Approach for Adversarial Information Fusion in Distributed Sensor Networks

Kallas, Kassem

Every day we share our personal information through digital systems which are constantly exposed to threats. For this reason, security-oriented disciplines of signal processing have received increasing attention in the last decades: multimedia forensics, digital watermarking, biometrics, network monitoring, steganography and steganalysis are just a few examples. Even though each of these fields has its own peculiarities, they all have to deal with a common problem: the presence of one or more adversaries aiming at making the system fail. Adversarial Signal Processing lays the basis of a general theory that takes into account the impact that the presence of an adversary has on the design of effective signal processing tools. By focusing on the application side of Adversarial Signal Processing, namely adversarial information fusion in distributed sensor networks, and adopting a game-theoretic approach, this thesis contributes to the above mission by addressing four issues. First, we address decision fusion in distributed sensor networks by developing a novel soft isolation defense scheme that protect the network from adversaries, specifically, Byzantines. Second, we develop an optimum decision fusion strategy in the presence of Byzantines. In the next step, we propose a technique to reduce the complexity of the optimum fusion by relying on a novel near-optimum message passing algorithm based on factor graphs. Finally, we introduce a defense mechanism to protect decentralized networks running consensus algorithm against data falsification attacks.

artificial intelligence, detection and information fusion, machine learning, (17 more...)

2511.23026

Country:

North America > United States > California (1.00)
Europe (1.00)
Asia > Middle East (0.67)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Telecommunications (1.00)
Leisure & Entertainment > Games (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
(2 more...)

Wang, Xiaoxuan, Stadler, Rolf

A Modular Framework for Rapidly Building Intrusion Predictors

Abstract-- We study automated intrusion prediction in an IT system using statistical learning methods. The focus is on developing online attack predictors that detect attacks in real time and identify the current stage of the attack. While such predictors have been proposed in the recent literature, these works typically rely on constructing a monolithic predictor tailored to a specific attack type and scenario. Given that hundreds of attack types are cataloged in the MITRE framework, training a separate monolithic predictor for each of them is infeasible. In this paper, we propose a modular framework for rapidly assembling online attack predictors from reusable components. Using public datasets for training and evaluation, we provide many examples of modular predictors and show how an effective predictor can be dynamically assembled during training from a network of modular components. Traditional intrusion detection systems (IDS), such as Snort [1] or Suricata [2], rely on rule-based configurations that are manually crafted and maintained by domain experts. The growing complexity and rapid evolution of IT systems make the maintenance of these rules increasingly challenging and time-consuming. As a response, research efforts into automated cyberdefence have started, based on the idea that attack patterns can be dynamically learned. The rules are no longer defined by humans, but automatically inferred from observing systems under attack. Over the last decade, various approaches have been proposed for automated cyberdefence, most of them based on statistical learning, e.g., [3], [4], [5], [6]. We follow this direction in the paper. We are specifically interested in predicting the stage of an ongoing attack in real time, based on current and earlier observations of an IT system.

artificial intelligence, machine learning, predictor, (16 more...)

2511.23

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.94)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)