Government
High-Resolution Water Sampling via a Solar-Powered Autonomous Surface Vehicle
Mamani, Misael, Fernandez, Mariel, Luna, Grace, Limachi, Steffani, Apaza, Leonel, Montes-Dávalos, Carolina, Herrera, Marcelo, Salcedo, Edwin
Accurate water quality assessment requires spatially resolved sampling, yet most unmanned surface vehicles (USVs) can collect only a limited number of samples or rely on single-point sensors with poor representativeness. This work presents a solar-powered, fully autonomous USV featuring a novel syringe-based sampling architecture capable of acquiring 72 discrete, contamination-minimized water samples per mission. The vehicle incorporates a ROS 2 autonomy stack with GPS-RTK navigation, LiDAR and stereo-vision obstacle detection, Nav2-based mission planning, and long-range LoRa supervision, enabling dependable execution of sampling routes in unstructured environments. The platform integrates a behavior-tree autonomy architecture adapted from Nav2, enabling mission-level reasoning and perception-aware navigation. A modular 6x12 sampling system, controlled by distributed micro-ROS nodes, provides deterministic actuation, fault isolation, and rapid module replacement, achieving spatial coverage beyond previously reported USV-based samplers. Field trials in Achocalla Lagoon (La Paz, Bolivia) demonstrated 87% waypoint accuracy, stable autonomous navigation, and accurate physicochemical measurements (temperature, pH, conductivity, total dissolved solids) comparable to manually collected references. These results demonstrate that the platform enables reliable high-resolution sampling and autonomous mission execution, providing a scalable solution for aquatic monitoring in remote environments.
Ethics Readiness of Artificial Intelligence: A Practical Evaluation Method
Adomaitis, Laurynas, Israel-Jost, Vincent, Grinbaum, Alexei
In the governance of emerging technologies, ethical guidance has often relied on so-called soft law instruments--codes of conduct, guidelines, or frameworks--designed to promote responsible behavior without imposing binding legal constraints. This is partly due to the difficulty of imposing harmonized regulations across the EU, especially in a global context characterized by strong reservations expressed by other international actors, e.g. the United States of America, with regard to the regulation of artificial intelligence (AI) that "unduly burdens AI innovation" (Kratsios, Sacks, and Rubio 2025). Another reason is related to the principle, upheld in several member states such as Germany, that protects scientific freedom by constitutional law. Nevertheless, the recent trajectory of technological regulation in the European Union shows that soft law can evolve into hard law: this has been the case, notably, with the adoption of the AI Act (European Commission 2022; Terpan 2015).
The Gender Code: Gendering the Global Governance of Artificial Intelligence
This paper examines how international AI governance frameworks address gender issues and gender-based harms. The analysis covers binding regulations, such as the EU AI Act; soft law instruments, like the UNESCO Recommendations on AI Ethics; and global initiatives, such as the Global Partnership on AI (GPAI). These instruments reveal emerging trends, including the integration of gender concerns into broader human rights frameworks, a shift toward explicit gender-related provisions, and a growing emphasis on inclusivity and diversity. Yet some critical gaps persist, including inconsistent treatment of gender across governance documents, limited engagement with intersectionality, and a lack of robust enforcement mechanisms. In response, this paper argues that effective AI governance must be intersectional, enforceable, and inclusive. This is key to moving beyond tokenism toward meaningful equity and preventing reinforcement of existing inequalities. The study contributes to ethical AI debates by highlighting the importance of gender-sensitive governance in building a just technological future.
Comparative Analysis of Hash-based Malware Clustering via K-Means
Thein, Aink Acrie Soe, Pitropakis, Nikolaos, Papadopoulos, Pavlos, Grierson, Sam, Jan, Sana Ullah
With the adoption of multiple digital devices in everyday life, the cyber-attack surface has increased. Adversaries are continuously exploring new avenues to exploit them and deploy malware. On the other hand, detection approaches typically employ hashing-based algorithms such as SSDeep, TLSH, and IMPHash to capture structural and behavioural similarities among binaries. This work focuses on the analysis and evaluation of these techniques for clustering malware samples using the K-means algorithm. More specifically, we experimented with established malware families and traits and found that TLSH and IMPHash produce more distinct, semantically meaningful clusters, whereas SSDeep is more efficient for broader classification tasks. The findings of this work can guide the development of more robust threat-detection mechanisms and adaptive security mechanisms.
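To make the clustering step concrete, here is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It assumes each sample has already been turned into a numeric feature vector; the abstract does not say how vectors are derived from SSDeep, TLSH, or IMPHash values, so the toy 2-D points below are purely illustrative. The function names and the deterministic initialisation are hypothetical choices, not the authors' setup.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iterations=20):
    """Cluster `points` into `k` groups with Lloyd's algorithm.

    Assumes 1 <= k <= len(points). The first k points seed the
    centroids, which keeps the sketch deterministic (an assumption;
    real pipelines usually use random or k-means++ seeding).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy feature vectors standing in for hash-derived features (illustrative only).
toy_points = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 9.0]]
toy_labels, toy_centroids = kmeans(toy_points, k=2)
```

In this toy run the first and third points share one cluster and the second and fourth share the other; how well real malware families separate depends entirely on the hash-to-vector step, which is outside this sketch.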
Source Coverage and Citation Bias in LLM-based vs. Traditional Search Engines
Zhang, Peixian, Ye, Qiming, Peng, Zifan, Garimella, Kiran, Tyson, Gareth
LLM-based Search Engines (LLM-SEs) introduce a new paradigm for information seeking. Unlike Traditional Search Engines (TSEs) (e.g., Google), these systems summarize results, often providing limited citation transparency. The implications of this shift remain largely unexplored, yet raise key questions regarding trust and transparency. In this paper, we present a large-scale empirical study of LLM-SEs, analyzing 55,936 queries and the corresponding search results across six LLM-SEs and two TSEs. We confirm that LLM-SEs cite domain resources with greater diversity than TSEs. Indeed, 37% of domains are unique to LLM-SEs. However, certain risks still persist: LLM-SEs do not outperform TSEs in credibility, political neutrality and safety metrics. Finally, to understand the selection criteria of LLM-SEs, we perform a feature-based analysis to identify key factors influencing source choice. Our findings provide actionable insights for end users, website owners, and developers.
Cauchy-Schwarz Fairness Regularizer
Liu, Yezi, Chen, Hanning, Huang, Wenjun, Ni, Yang, Imani, Mohsen
Group fairness in machine learning is often enforced by adding a regularizer that reduces the dependence between model predictions and sensitive attributes. However, existing regularizers are built on heterogeneous distance measures and design choices, which makes their behavior hard to reason about and their performance inconsistent across tasks. This raises a basic question: what properties make a good fairness regularizer? We address this question by first organizing existing in-process methods into three families: (i) matching prediction statistics across sensitive groups, (ii) aligning latent representations, and (iii) directly minimizing dependence between predictions and sensitive attributes. Through this lens, we identify desirable properties of the underlying distance measure, including tight generalization bounds, robustness to scale differences, and the ability to handle arbitrary prediction distributions. Motivated by these properties, we propose a Cauchy-Schwarz (CS) fairness regularizer that penalizes the empirical CS divergence between prediction distributions conditioned on sensitive groups. Under a Gaussian comparison, we show that CS divergence yields a tighter bound than Kullback-Leibler divergence, Maximum Mean Discrepancy, and the mean disparity used in Demographic Parity, and we discuss how these advantages translate to a distribution-free, kernel-based estimator that naturally extends to multiple sensitive attributes. Extensive experiments on four tabular benchmarks and one image dataset demonstrate that the proposed CS regularizer consistently improves Demographic Parity and Equal Opportunity metrics while maintaining competitive accuracy, and achieves a more stable utility-fairness trade-off across hyperparameter settings compared to prior regularizers.
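For orientation, the Cauchy-Schwarz divergence between two densities p and q is D_CS(p, q) = -log( (∫ p q)^2 / (∫ p^2 · ∫ q^2) ), which is non-negative and equals zero when p = q. The sketch below shows one common kernel-based plug-in estimate of this quantity for scalar predictions from two sensitive groups. It is an illustration under assumptions, not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, and the function names are all hypothetical choices.

```python
import math


def mean_kernel(xs, ys, sigma):
    """Average Gaussian kernel value over all pairs (x, y)."""
    total = 0.0
    for x in xs:
        for y in ys:
            total += math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))
    return total / (len(xs) * len(ys))


def cs_divergence(preds_a, preds_b, sigma=0.1):
    """Plug-in estimate of the Cauchy-Schwarz divergence.

    preds_a, preds_b: model predictions for two sensitive groups.
    The three kernel means estimate, up to a shared constant that
    cancels in the ratio, the integrals of p*q, p^2 and q^2.
    """
    cross = mean_kernel(preds_a, preds_b, sigma)
    self_a = mean_kernel(preds_a, preds_a, sigma)
    self_b = mean_kernel(preds_b, preds_b, sigma)
    return -2.0 * math.log(cross) + math.log(self_a) + math.log(self_b)


# Toy predictions: one group scored low, the other scored high.
group_a = [0.1, 0.2, 0.3]
group_b = [0.7, 0.8, 0.9]
gap = cs_divergence(group_a, group_b)        # clearly positive
no_gap = cs_divergence(group_a, group_a)     # zero for identical samples
```

Used as a regularizer, a term like this would be added to the task loss with a weight, so that training pushes the two groups' prediction distributions together; a differentiable framework would be needed in practice, which this plain-Python sketch does not provide.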
CourtPressGER: A German Court Decision to Press Release Summarization Dataset
Nagl, Sebastian, Elganayni, Mohamed, Pospisil, Melanie, Grabmair, Matthias
Official court press releases from Germany's highest courts present and explain judicial rulings to the public, as well as to expert audiences. Prior NLP efforts emphasize technical headnotes, ignoring citizen-oriented communication needs. We introduce CourtPressGER, a 6.4k dataset of triples: rulings, human-drafted press releases, and synthetic prompts for LLMs to generate comparable releases. This benchmark trains and evaluates LLMs in generating accurate, readable summaries from long judicial texts. We benchmark small and large LLMs using reference-based metrics, factual-consistency checks, LLM-as-judge, and expert ranking. Large LLMs produce high-quality drafts with minimal hierarchical performance loss; smaller models require hierarchical setups for long judgments. Initial benchmarks show varying model performance, with human-drafted releases ranking highest.
Black-Box Behavioral Distillation Breaks Safety Alignment in Medical LLMs
As medical large language models (LLMs) become increasingly integrated into clinical workflows, concerns around alignment robustness and safety are escalating. Prior work on model extraction has focused on classification models or memorization leakage, leaving the vulnerability of safety-aligned generative medical LLMs underexplored. We present a black-box distillation attack that replicates the domain-specific reasoning of safety-aligned medical LLMs using only output-level access. By issuing 48,000 instruction queries to Meditron-7B and collecting 25,000 benign instruction-response pairs, we fine-tune a LLaMA3 8B surrogate via parameter-efficient LoRA under a zero-alignment supervision setting, requiring no access to model weights, safety filters, or training data. With a cost of $12, the surrogate achieves strong fidelity on benign inputs while producing unsafe completions for 86% of adversarial prompts, far exceeding both Meditron-7B (66%) and the untuned base model (46%). This reveals a pronounced functional-ethical gap: task utility transfers, while alignment collapses. To analyze this collapse, we develop a dynamic adversarial evaluation framework combining Generative Query (GQ)-based harmful prompt generation, verifier filtering, category-wise failure analysis, and adaptive Random Search (RS) jailbreak attacks. We also propose a layered defense system as a prototype detector for real-time alignment drift in black-box deployments. Our findings show that benign-only black-box distillation exposes a practical and under-recognized threat: adversaries can cheaply replicate medical LLM capabilities while stripping safety mechanisms, underscoring the need for extraction-aware safety monitoring.
Towards Resilient Transportation: A Conditional Transformer for Accident-Informed Traffic Forecasting
Wang, Hongjun, Yong, Jiawei, Wang, Jiawei, Fukushima, Shintaro, Jiang, Renhe
Traffic prediction remains a key challenge in spatio-temporal data mining, despite progress in deep learning. Accurate forecasting is hindered by the complex influence of external factors such as traffic accidents and regulations, often overlooked by existing models due to limited data integration. To address these limitations, we present two enriched traffic datasets from Tokyo and California, incorporating traffic accident and regulation data. Leveraging these datasets, we propose ConFormer (Conditional Transformer), a novel framework that integrates graph propagation with a guided normalization layer. This design dynamically adjusts spatial and temporal node relationships based on historical patterns, enhancing predictive accuracy. Our model surpasses the state-of-the-art STAEFormer in both predictive performance and efficiency, achieving lower computational costs and reduced parameter demands. Extensive evaluations demonstrate that ConFormer consistently outperforms mainstream spatio-temporal baselines across multiple metrics, underscoring its potential to advance traffic prediction research.
A Granular Framework for Construction Material Price Forecasting: Econometric and Machine-Learning Approaches
Lyu, Boge, Yin, Qianye, Tommelein, Iris Denise, Liu, Hanyang, Ranka, Karnamohit, Yeluripati, Karthik, Shi, Junzhe
This study develops a forecasting framework that leverages the Construction Specifications Institute (CSI) MasterFormat as the target data structure, enabling predictions at the six-digit section level and supporting detailed cost projections across a wide spectrum of building materials. To enhance predictive accuracy, the framework integrates explanatory variables such as raw material prices, commodity indexes, and macroeconomic indicators. Four time-series models, Long Short-Term Memory (LSTM), Autoregressive Integrated Moving Average (ARIMA), Vector Error Correction Model (VECM), and Chronos-Bolt, were evaluated under both baseline configurations (using CSI data only) and extended versions with explanatory variables. Results demonstrate that incorporating explanatory variables significantly improves predictive performance across all models. Among the tested approaches, the LSTM model consistently achieved the highest accuracy, with RMSE values as low as 1.390 and MAPE values of 0.957, representing improvements of up to 59% over the traditional statistical time-series model, ARIMA. Validation across multiple CSI divisions confirmed the framework's scalability, while Division 06 (Wood, Plastics, and Composites) is presented in detail as a demonstration case. This research offers a robust methodology that enables owners and contractors to improve budgeting practices and achieve more reliable cost estimation at the Definitive level.
INTRODUCTION
1.1 Motivation
The construction industry continues to demonstrate steady long-term growth, with global activity projected to reach US$9.8 trillion by 2026 [1]. Major upcoming programs in the United States, such as the Los Angeles 2028 Olympics and TSMC's fabrication facility in Arizona [2] [3], highlight the scale of high-value projects in the near future.
However, volatility in construction material prices has emerged as a critical challenge, creating significant uncertainty for contractors in project planning, budgeting, and cost management. Price fluctuations, driven by raw material costs, macroeconomic conditions such as inflation and interest rates, and supply-demand imbalances, have amplified risks of cost overruns and delays [4] [5] [6] [7] [8]. Traditional econometric methods (i.e., multiple regression analysis) and modern econometric methods (i.e., univariate and multivariate time series methods) have faced limitations in effectively capturing the high-frequency volatility observed in construction material prices [9]. These models often struggle to handle the complexity of input data and exhibit limited predictive accuracy in real-world applications.
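For reference, the two accuracy metrics quoted in the abstract, RMSE and MAPE, have standard definitions that can be sketched in a few lines of Python. The function names and toy price series are hypothetical, and expressing MAPE as a percentage is an assumption here; the text does not state which scaling the reported values use.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length series."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided
    by the corresponding actual value.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Toy price index values (illustrative only, not data from the study).
actual_prices = [100.0, 200.0]
forecast_prices = [110.0, 190.0]
toy_rmse = rmse(actual_prices, forecast_prices)
toy_mape = mape(actual_prices, forecast_prices)
```

RMSE is in the units of the forecast series and penalises large misses more heavily, while MAPE is scale-free, which makes it easier to compare across materials with very different price levels.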