AITopics

Digital twin (DT)-driven deep reinforcement learning (DRL) has emerged as a promising paradigm for wireless network optimization, offering a safe and efficient training environment for policy exploration. However, in theory, existing methods cannot always guarantee the real-world performance of DT-trained policies before actual deployment, due to the absence of a universal metric for assessing a DT's ability to support reliable DRL training that is transferable to physical networks. In this paper, we propose the DT bisimulation metric (DT-BSM), a novel metric based on the Wasserstein distance, to quantify the discrepancy between the Markov decision processes (MDPs) in the DT and in the corresponding real-world wireless network environment. We prove that for any DT-trained policy, the sub-optimality of its performance (regret) in real-world deployment is bounded by a weighted sum of the DT-BSM and its sub-optimality within the MDP in the DT. Then, a modified DT-BSM based on the total variation distance is also introduced to avoid the prohibitive computational complexity of the Wasserstein distance in large-scale wireless network scenarios. Further, to tackle the challenge of obtaining accurate transition probabilities of the real-world MDP for the DT-BSM calculation, we propose an empirical DT-BSM method based on statistical sampling. We prove that the empirical DT-BSM always converges to the desired theoretical one, and quantitatively establish the relationship between the required sample size and the target level of approximation accuracy.

Index Terms: Digital twin, Markov decision process (MDP), deep reinforcement learning (DRL), transfer learning, bisimulation metric.

The long-term evolution of cellular networks, marked by growing scale, density, and heterogeneity, substantially increases the difficulty of wireless network optimization [1]. Deep reinforcement learning (DRL) has emerged as a promising solution for tackling extensive state and action spaces and nonconvex optimization problems. It has been successfully applied to various network optimization tasks, such as admission control [2], resource allocation [3], node selection [4], and task offloading [5] in wireless networks.

Z. Tao, W. Xu, and X. You are with the National Mobile Communications Research Lab, Southeast University, Nanjing 210096, China, and also with the Pervasive Communication Research Center, Purple Mountain Laboratories, Nanjing 211111, China (email: {zhenyu tao, wxu, xhyu}@seu.edu.cn).

To overcome these issues, the concept of digital twin (DT) has been introduced [7].
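The sampling idea behind the empirical, total-variation-based variant can be illustrated with a small sketch. This is not the paper's DT-BSM. It only shows one plausible ingredient: estimating next-state distributions from samples for a single (state, action) pair, and comparing the DT against the real network with the total variation distance. The state labels, sample lists, and function names are hypothetical.

```python
from collections import Counter


def empirical_distribution(samples):
    """Estimate a discrete next-state distribution from observed samples."""
    counts = Counter(samples)
    total = len(samples)
    return {state: n / total for state, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)


# Hypothetical next-state samples for one fixed (state, action) pair.
dt_samples = ["s1", "s1", "s2", "s2"]      # drawn from the digital twin
real_samples = ["s1", "s2", "s2", "s2"]    # drawn from the real network

tv = total_variation(
    empirical_distribution(dt_samples),
    empirical_distribution(real_samples),
)
print(f"Empirical TV distance: {tv:.2f}")
```

With more samples per (state, action) pair, the empirical distributions approach the true transition probabilities. That is the intuition behind relating sample size to approximation accuracy.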

dt -bsm, inequality, transition probability, (15 more...)

2502.17983

Country:

Asia > China > Jiangsu Province > Nanjing (0.44)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Telecommunications (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Liu, Qianying, Wang, Katrina Qiyao, Cheng, Fei, Kurohashi, Sadao

Assessing Large Language Models in Agentic Multilingual National Bias

Large Language Models have garnered significant attention for their capabilities in multilingual natural language processing, while studies on risks associated with cross biases are limited to immediate context preferences. Cross-language disparities in reasoning-based recommendations remain largely unexplored, with a lack of even descriptive analysis. This study is the first to address this gap. We test LLMs' applicability and capability in providing personalized advice across three key scenarios: university applications, travel, and relocation. We investigate multilingual bias in state-of-the-art LLMs by analyzing their responses to decision-making tasks across multiple languages. We quantify bias in model-generated scores and assess the impact of demographic factors and reasoning strategies (e.g., Chain-of-Thought prompting) on bias patterns. Our findings reveal that local language bias is prevalent across different tasks, with GPT-4 and Sonnet reducing bias for English-speaking countries compared to GPT-3.5 but failing to achieve robust multilingual alignment, highlighting broader implications for multilingual AI agents and applications such as education.

computational linguistic, language group, recommendation, (15 more...)

2502.17945

Country:

Europe > Germany (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Singapore (0.04)
(21 more...)

Genre: Research Report > New Finding (0.88)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Subudhi, Sonalika, Pati, Alok Kumar, Bose, Sephali, Sahoo, Subhasmita, Pattanaik, Avipsa, Acharya, Biswa Mohan

Integrating Boosted learning with Differential Evolution (DE) Optimizer: A Prediction of Groundwater Quality Risk Assessment in Odisha

Groundwater is being degraded by human activities, such as rapid industrialization, urbanization, over-extraction, and contamination from agricultural and urban sources. Among the different contaminants, heavy metals like cadmium (Cd), chromium (Cr), arsenic (As), and lead (Pb) pose serious dangers when present in high concentrations in groundwater. Long-term exposure to these toxic elements may lead to neurological disorders, kidney failure, and various types of cancer. To address these issues, this study developed a machine learning-based predictive model to evaluate the Groundwater Quality Index (GWQI) and identify the main contaminants affecting water quality. This was achieved with a hybrid machine learning model, LCBoost Fusion. Model development involved data preprocessing, hyperparameter tuning using Differential Evolution (DE) optimization, and evaluation through cross-validation. The LCBoost Fusion model outperforms the individual models (CatBoost and LightGBM), achieving low RMSE (0.6829), MSE (0.5102), and MAE (0.3147) and a high R$^2$ score of 0.9809. Feature importance analysis highlights Potassium (K), Fluoride (F) and Total Hardness (TH) as the most influential indicators of groundwater contamination. This research successfully demonstrates the application of machine learning in assessing groundwater quality risks in Odisha. The proposed LCBoost Fusion model offers a reliable and efficient approach for real-time groundwater monitoring and risk mitigation. These findings will help environmental organizations and policymakers identify target locations for sustainable groundwater management. Future work will focus on using remote sensing data and developing an interactive decision-making system for groundwater quality assessment.
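For readers unfamiliar with the reported error measures, the following minimal sketch computes RMSE, MSE, MAE, and R$^2$ using their standard textbook definitions. The input values are made up for illustration, and the authors' evaluation protocol (for example, how cross-validation folds are aggregated) may differ.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical index values, not data from the study.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(metrics)
```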

lcboost fusion, prediction, quality indicator, (13 more...)

2502.17929

Country:

North America > United States (0.69)
Africa > South Africa (0.04)
Oceania > New Zealand (0.04)
(14 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.88)

Industry:

Water & Waste Management > Water Management > Water Supplies & Services (1.00)
Law (1.00)
Health & Medicine (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)

Kheya, Tahsin Alamgir, Bouadjenek, Mohamed Reda, Aryal, Sunil

Unmasking Gender Bias in Recommendation Systems and Enhancing Category-Aware Fairness

Recommendation systems are now an integral part of our daily lives. We rely on them for tasks such as discovering new movies, finding friends on social media, and connecting job seekers with relevant opportunities. Given their vital role, we must ensure these recommendations are free from societal stereotypes. Therefore, evaluating and addressing such biases in recommendation systems is crucial. Previous work evaluating the fairness of recommended items fails to capture certain nuances as they mainly focus on comparing performance metrics for different sensitive groups. In this paper, we introduce a set of comprehensive metrics for quantifying gender bias in recommendations. Specifically, we show the importance of evaluating fairness on a more granular level, which can be achieved using our metrics to capture gender bias using categories of recommended items like genres for movies. Furthermore, we show that employing a category-aware fairness metric as a regularization term along with the main recommendation loss during training can help effectively minimize bias in the models' output. We experiment on three real-world datasets, using five baseline models alongside two popular fairness-aware models, to show the effectiveness of our metrics in evaluating gender bias. Our metrics help provide an enhanced insight into bias in recommended items compared to previous metrics. Additionally, our results demonstrate how incorporating our regularization term significantly improves the fairness in recommendations for different categories without substantial degradation in overall recommendation performance.

fairness, proceedings, recommendation, (12 more...)

doi: 10.1145/3696410.3714528

2502.17921

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.07)
(21 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Leisure & Entertainment (1.00)
Information Technology (0.93)
Media > Film (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

Krupp, Lars, Geißler, Daniel, Lukowicz, Paul, Karolus, Jakob

Towards Sustainable Web Agents: A Plea for Transparency and Dedicated Metrics for Energy Consumption

Improvements in the area of large language models have shifted towards the construction of models capable of using external tools and interpreting their outputs. These so-called web agents have the ability to interact autonomously with the internet. This allows them to become powerful daily assistants handling time-consuming, repetitive tasks while supporting users in their daily activities. While web agent research is thriving, the sustainability aspect of this research direction remains largely unexplored. We provide an initial exploration of the energy and CO2 cost associated with web agents. Our results show how different philosophies in web agent creation can severely impact the associated expended energy. We highlight lacking transparency regarding the disclosure of model parameters and processes used for some web agents as a limiting factor when estimating energy consumption. As such, our work advocates a change in thinking when evaluating web agents, warranting dedicated metrics for energy consumption and sustainability.

agent, energy consumption, mindact, (13 more...)

2502.17903

Country:

North America > United States (1.00)
Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.05)
Oceania > Australia (0.04)
(5 more...)

Genre: Research Report > New Finding (0.54)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Schreyer, W. Max, Anderson, Christopher, Thompson, Reid F.

Generalization is not a universal guarantee: Estimating similarity to training data with an ensemble out-of-distribution metric

Failure of machine learning models to generalize to new data is a core problem limiting the reliability of AI systems, partly due to the lack of simple and robust methods for comparing new data to the original training dataset. We propose a standardized approach for assessing data similarity in a model-agnostic manner by constructing a supervised autoencoder for generalizability estimation (SAGE). We compare points in a low-dimensional embedded latent space, defining empirical probability measures for k-Nearest Neighbors (kNN) distance, reconstruction of inputs and task-based performance. As proof of concept for classification tasks, we use MNIST and CIFAR-10 to demonstrate how an ensemble output probability score can separate deformed images from a mixture of typical test examples, and how this SAGE score is robust to transformations of increasing severity. As further proof of concept, we extend this approach to a regression task using non-imaging data (UCI Abalone). In all cases, we show that out-of-the-box model performance increases after SAGE score filtering, even when applied to data from the model's own training and test datasets. Our out-of-distribution scoring method can be introduced during several steps of model construction and assessment, leading to future improvements in responsible deep learning implementation.

1 Background

The presence of generalization gaps, where machine learning performance degrades when a trained model encounters previously-unseen data, represents a critical ongoing challenge in the implementation of AI systems.
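One of the components mentioned above, the kNN distance in an embedded space, can be sketched as follows. This is not the SAGE score itself. It assumes embeddings are already available as plain coordinate tuples and uses the mean distance to the k nearest training points, which is one common choice and not necessarily the authors' definition. The points and function name are hypothetical.

```python
import math


def knn_distance(point, reference, k=3):
    """Mean Euclidean distance from `point` to its k nearest reference points."""
    distances = sorted(math.dist(point, r) for r in reference)
    return sum(distances[:k]) / k


# Hypothetical 2-D latent embeddings of training examples.
train_embeddings = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

near_score = knn_distance((0.5, 0.5), train_embeddings)    # inside the training cloud
far_score = knn_distance((10.0, 10.0), train_embeddings)   # far from all training points

print(f"near: {near_score:.3f}, far: {far_score:.3f}")
```

A larger distance suggests the new point is less similar to the training data, which is the signal an out-of-distribution filter would threshold on.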

dataset, reconstruction error, sage score, (16 more...)

2502.16329

Country:

North America > United States > Oregon > Multnomah County > Portland (0.05)
Oceania > Australia > Tasmania (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Indian Ocean > Bass Strait (0.04)

Genre: Research Report (0.82)

Industry:

Transportation (1.00)
Health & Medicine (1.00)
Government > Military (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Madureira, Brielen, Schlangen, David

Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge

Cognitively plausible visual dialogue models should keep a mental scoreboard of shared established facts in the dialogue context. We propose a theory-based evaluation method for investigating to what degree models pretrained on the VisDial dataset incrementally build representations that appropriately do scorekeeping. Our conclusion is that the ability to make the distinction between shared and privately known statements along the dialogue is moderately present in the analysed models, but not always incrementally consistent, which may partially be due to the limited need for grounding interactions in the original task.

computational linguistic, dialogue, proposition, (14 more...)

2204.0697

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Germany > Brandenburg > Potsdam (0.04)
(10 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Kim, Dangchan, Lim, Chae Young

Differentially private synthesis of Spatial Point Processes

arXiv.org Machine Learning, Feb-25-2025

This paper proposes a method to generate synthetic data for spatial point patterns within the differential privacy (DP) framework. Specifically, we define a differentially private Poisson point synthesizer (PPS) and Cox point synthesizer (CPS) to generate synthetic point patterns with the concept of the $\alpha$-neighborhood that relaxes the original definition of DP. We present three example models to construct a differentially private PPS and CPS, providing sufficient conditions on their parameters to ensure the DP given a specified privacy budget. In addition, we demonstrate that the synthesizers can be applied to point patterns on the linear network. Simulation experiments demonstrate that the proposed approaches effectively maintain the privacy and utility of synthetic data.

differentially private synthesis, point synthesizer, synthesizer, (15 more...)

2502.18198

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Modeling & Simulation (0.88)

Farahbakhsh, Ehsan, Goel, Dakshi, Pimparkar, Dhiraj, Muller, R. Dietmar, Chandra, Rohitash

Convolutional neural networks for mineral prospecting through alteration mapping with remote sensing data

arXiv.org Artificial Intelligence, Feb-24-2025

Traditional geological mapping methods, which rely on field observations and rock sample analysis, are inefficient for continuous spatial mapping of geological features such as alteration zones. Deep learning models such as convolutional neural networks (CNNs) have ushered in a transformative era in remote sensing data analysis. CNNs excel in automatically extracting features from image data for classification and regression problems. CNNs have the ability to pinpoint specific mineralogical changes attributed to mineralisation processes by discerning subtle features within remote sensing data. Our methodology involves model training using two distinct sets of training samples generated through ground truth data and a fully automated approach through selective principal component analysis (PCA). We also compare CNNs with conventional machine learning models, including k-nearest neighbours, support vector machines, and multilayer perceptron. Our findings indicate that training with a ground truth-based dataset produces more reliable alteration maps. Additionally, we find that CNNs perform slightly better when compared to conventional machine learning models, which further demonstrates the ability of CNNs to capture spatial patterns in remote sensing data effectively. We find that Landsat 9 surpasses Landsat 8 in mapping iron oxide areas when employing the CNNs model trained with ground truth data obtained by field surveys. We also observe that using ASTER data with the CNNs model trained on the ground truth-based dataset produces the most accurate maps for two other important types of alteration zones, argillic and propylitic. This underscores the utility of CNNs in enhancing the efficiency and precision of geological mapping, particularly in discerning subtle alterations indicative of mineralisation processes, especially those associated with critical metal resources.

Introduction

Geological maps are traditionally crafted through ground surveys and founded on field observations. They frequently incur inevitable errors due to the lack of spatial continuity of the field observations, thus yielding inaccurate representations (Campbell et al., 2005). Recognising these limitations, geologists have been prompted to seek innovative approaches and efficient methodologies to accurately map geological features, particularly alteration zones (Kesler, 2007; McCuaig et al., 2010). The utilisation of remote sensing data for alteration mapping emerges as a pivotal technique in regional mineral exploration, enabling the precise spatial identification of alteration zones associated with mineralisation processes (Mohamed et al., 2021).

artificial intelligence, machine learning, survey article, (18 more...)

2502.18533

Country:

North America > United States (0.68)
Oceania > Australia > New South Wales (0.14)
Europe (0.14)
Asia > India (0.14)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Materials > Metals & Mining (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)
Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Koenig, Michael, Rauch, Jakob, Woerter, Martin

Real-time Monitoring of Economic Shocks using Company Websites

arXiv.org Artificial Intelligence, Feb-24-2025

Understanding the effects of economic shocks on firms is critical for analyzing economic growth and resilience. We introduce a Web-Based Affectedness Indicator (WAI), a general-purpose tool for real-time monitoring of economic disruptions across diverse contexts. By leveraging Large Language Model (LLM) assisted classification and information extraction on texts from over five million company websites, WAI quantifies the degree and nature of firms' responses to external shocks. Using the COVID-19 pandemic as a specific application, we show that WAI is highly correlated with pandemic containment measures and reliably predicts firm performance. Unlike traditional data sources, WAI provides timely firm-level information across industries and geographies worldwide that would otherwise be unavailable due to institutional and data availability constraints. This methodology offers significant potential for monitoring and mitigating the impact of technological, political, financial, health or environmental crises, and represents a transformative tool for adaptive policy-making and economic resilience.

Economic shocks, whether driven by public health crises, technological disruptions, geopolitical conflicts, or climate events, pose significant challenges to businesses and policymakers alike. Timely and accurate monitoring of these shocks is critical for crafting effective responses and enhancing economic resilience. However, traditional methods for measuring the impacts of such disruptions, such as surveys and administrative data, are often limited by costs, time lags, and coverage. In this study, we introduce the Web-Based Affectedness Indicator (WAI), a scalable and cost-effective tool for real-time monitoring of economic disruptions at the firm level. By analyzing textual data from millions of company websites, WAI provides granular insights into how firms experience and respond to external shocks. This methodology overcomes traditional limitations by leveraging ubiquitous online content and state-of-the-art natural language processing (NLP) models to generate a dynamic and comprehensive view of economic affectedness. WAI can provide information on a wide range of challenges, including supply chain disruptions, financial crises, and climate-related shocks.

large language model, machine learning, real time system, (21 more...)

2502.17161

Country:

Europe > United Kingdom (0.14)
Asia > Indonesia (0.14)
South America > Colombia (0.14)
(71 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Banking & Finance > Economy (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.96)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)