Energy
SwiftTS: A Swift Selection Framework for Time Series Pre-trained Models via Multi-task Meta-Learning
Zhang, Tengxue, Ouyang, Biao, Shu, Yang, Chen, Xinyang, Guo, Chenjuan, Yang, Bin
Pre-trained models exhibit strong generalization to various downstream tasks. However, given the numerous models available in the model hub, identifying the most suitable one by individually fine-tuning each candidate is time-consuming. In this paper, we propose SwiftTS, a swift selection framework for time series pre-trained models. To avoid expensive forward propagation through all candidates, SwiftTS adopts a learning-guided approach that leverages historical dataset-model performance pairs across diverse horizons to predict model performance on unseen datasets. It employs a lightweight dual-encoder architecture that embeds time series and candidate models with rich characteristics, computing patchwise compatibility scores between data and model embeddings for efficient selection. To further enhance generalization across datasets and horizons, we introduce a horizon-adaptive expert composition module that dynamically adjusts expert weights, together with a transferable cross-task learning strategy that uses cross-dataset and cross-horizon task sampling to improve out-of-distribution (OOD) robustness. Extensive experiments on 14 downstream datasets and 8 pre-trained models demonstrate that SwiftTS achieves state-of-the-art performance in time series pre-trained model selection.
Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
Jiang, Liwei, Chai, Yuanjun, Li, Margaret, Liu, Mickel, Fok, Raymond, Dziri, Nouha, Tsvetkov, Yulia, Sap, Maarten, Albalak, Alon, Choi, Yejin
Language models (LMs) often struggle to generate diverse, human-like creative content, raising concerns about the long-term homogenization of human thought through repeated exposure to similar outputs. Yet scalable methods for evaluating LM output diversity remain limited, especially beyond narrow tasks such as random number or name generation, or beyond repeated sampling from a single model. We introduce Infinity-Chat, a large-scale dataset of 26K diverse, real-world, open-ended user queries that admit a wide range of plausible answers with no single ground truth. We introduce the first comprehensive taxonomy for characterizing the full spectrum of open-ended prompts posed to LMs, comprising 6 top-level categories (e.g., brainstorm & ideation) that further break down into 17 subcategories. Using Infinity-Chat, we present a large-scale study of mode collapse in LMs, revealing a pronounced Artificial Hivemind effect in open-ended generation of LMs, characterized by (1) intra-model repetition, where a single model consistently generates similar responses, and more so (2) inter-model homogeneity, where different models produce strikingly similar outputs. Infinity-Chat also includes 31,250 human annotations, across absolute ratings and pairwise preferences, with 25 independent human annotations per example. This enables studying collective and individual-specific human preferences in response to open-ended queries. Our findings show that LMs, reward models, and LM judges are less well calibrated to human ratings on model generations that elicit differing idiosyncratic annotator preferences, despite maintaining comparable overall quality. Overall, Infinity-Chat presents the first large-scale resource for systematically studying real-world open-ended queries to LMs, revealing critical insights to guide future research for mitigating long-term AI safety risks posed by the Artificial Hivemind.
PASS-Enhanced MEC: Joint Optimization of Task Offloading and Uplink PASS Beamforming
Hu, Zhaoming, Zhong, Ruikang, Mu, Xidong, Li, Dengao, Liu, Yuanwei
A pinching-antenna system (PASS)-enhanced mobile edge computing (MEC) architecture is investigated to improve the task offloading efficiency and latency performance in dynamic wireless environments. By leveraging dielectric waveguides and flexibly adjustable pinching antennas, PASS establishes short-distance line-of-sight (LoS) links while effectively mitigating the significant path loss and potential signal blockage, making it a promising solution for high-frequency MEC systems. We formulate a network latency minimization problem to jointly optimize uplink PASS beamforming and task offloading. The resulting problem is modeled as a Markov decision process (MDP) and solved via a deep reinforcement learning (DRL) method. To address the instability introduced by the $\max$ operator in the objective function, we propose a load balancing-aware proximal policy optimization (LBPPO) algorithm. LBPPO incorporates both node-level and waveguide-level load balancing information into the policy design, maintaining computational and transmission delay equilibrium, respectively. Simulation results demonstrate that the proposed PASS-enhanced MEC with adaptive uplink PASS beamforming exhibits stronger convergence capability than fixed-PA baselines and conventional MIMO-assisted MEC, especially in scenarios with a large number of UEs or high transmit power.
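To illustrate why a $\max$-type latency objective rewards load balancing, the following is a minimal sketch, not the paper's system model. It assumes (hypothetically) that nodes process their assigned load in parallel, so the overall latency is the delay of the slowest node; the function name `network_latency` and all numbers are illustrative only.

```python
def network_latency(loads, rates):
    """Latency of a parallel system: the slowest node dominates (max operator).

    loads: work assigned to each node (hypothetical units)
    rates: processing rate of each node (same units per second)
    """
    return max(load / rate for load, rate in zip(loads, rates))


# Three hypothetical nodes; the third is twice as fast as the others.
rates = [2.0, 2.0, 4.0]

# Both assignments distribute the same total load of 8 units.
unbalanced = [6.0, 1.0, 1.0]  # most work sent to one slow node
balanced = [2.0, 2.0, 4.0]    # work split in proportion to node rates

latency_unbalanced = network_latency(unbalanced, rates)
latency_balanced = network_latency(balanced, rates)

print(latency_unbalanced, latency_balanced)
```

Because only the slowest node determines the objective, small changes in the assignment can switch which node is the bottleneck, which is one plausible reading of the instability the abstract attributes to the $\max$ operator.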
Hazard-Responsive Digital Twin for Climate-Driven Urban Resilience and Equity
Complex events such as wildfires, floods, and heatwaves are no longer isolated phenomena but interlinked hazards that propagate through interconnected infrastructure networks. When one system fails, others that depend on it often cascade toward collapse, producing widespread disruption and social inequity. Recent crises including the 2023 Vermont flooding, the 2024 Texas winter freeze, and the 2025 Southern California wildfire illustrate how climate-amplified events can simultaneously strain energy, water, communication, and transportation systems. Traditional risk assessments, which often treat hazards as discrete and static events, are insufficient to capture the evolving and compounding nature of modern disasters. Digital Twin (DT) technology offers a promising avenue for improving situational awareness and decision-making under such conditions. Originally introduced for aerospace engineering and later adopted across industrial sectors, DTs create real-time virtual counterparts of physical systems using sensor data, predictive modeling, and feedback control (Grieves & Vickers, 2018; Tao et al., 2019). Within the built environment, DTs have been applied to asset monitoring, predictive maintenance, and urban system management (Errandonea et al., 2020; Fogli, 2019; Fuller et al., 2020). However, most conventional DTs rely on stable connectivity, complete datasets, and deterministic control assumptions that do not hold during crises characterized by cascading failures and data disruption. To address these challenges, the concept of the Risk-Informed Digital Twin (RDT) integrates probabilistic modeling, uncertainty quantification, and decision support within the DT architecture (Pignatta & Alibrandi, 2022; Zio & Miqueles, 2024).
Diffuse to Detect: A Generalizable Framework for Anomaly Detection with Diffusion Models Applications to UAVs and Beyond
Gong, Mingze, Du, Juan, You, Jianbang
Anomaly detection in complex, high-dimensional data, such as UAV sensor readings, is essential for operational safety but challenging for existing methods due to their limited sensitivity, scalability, and inability to capture intricate dependencies. We propose the Diffuse to Detect (DTD) framework, a novel approach that innovatively adapts diffusion models for anomaly detection, diverging from their conventional use in generative tasks with high inference time. By comparison, DTD employs a single-step diffusion process to predict noise patterns, enabling rapid and precise identification of anomalies without reconstruction errors. This approach is grounded in robust theoretical foundations that link noise prediction to the data distribution's score function, ensuring reliable deviation detection. By integrating Graph Neural Networks to model sensor relationships as dynamic graphs, DTD effectively captures spatial (inter-sensor) and temporal anomalies. Its two-branch architecture, with parametric neural network-based energy scoring for scalability and nonparametric statistical methods for interpretability, provides flexible trade-offs between computational efficiency and transparency. Extensive evaluations on UAV sensor data, multivariate time series, and images demonstrate DTD's superior performance over existing methods, underscoring its generality across diverse data modalities. This versatility, combined with its adaptability, positions DTD as a transformative solution for safety-critical applications, including industrial monitoring and beyond.
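The link between noise prediction and the score function mentioned above can be made concrete with the standard relation from denoising diffusion models. The notation below ($\bar\alpha_t$, $\epsilon_\theta$, $q_t$) follows the common DDPM convention and is an assumption here; the abstract does not state which parameterization DTD uses.

```latex
% Forward noising step (standard DDPM convention, assumed here):
x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1 - \bar\alpha_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% Score of the Gaussian conditional q(x_t \mid x_0):
\nabla_{x_t} \log q(x_t \mid x_0)
  = -\frac{x_t - \sqrt{\bar\alpha_t}\, x_0}{1 - \bar\alpha_t}
  = -\frac{\epsilon}{\sqrt{1 - \bar\alpha_t}}

% Hence a trained noise predictor approximates a rescaled marginal score:
\epsilon_\theta(x_t, t) \approx -\sqrt{1 - \bar\alpha_t}\; \nabla_{x_t} \log q_t(x_t)
```

Under this relation, a single evaluation of the noise predictor gives an estimate of how the log-density of the noised data changes around an input, which is the kind of signal a single-step detector could score without running a full reverse diffusion chain.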
A Review of End-to-End Precipitation Prediction Using Remote Sensing Data: from Divination to Machine Learning
Precipitation prediction has undergone a profound transformation -- from early symbolic and empirical methods rooted in divination and observation, to modern technologies based on atmospheric physics and artificial intelligence. This review traces the historical and technological evolution of precipitation forecasting, presenting a survey of end-to-end precipitation prediction technologies that spans ancient practices, the foundations of meteorological science, the rise of numerical weather prediction (NWP), and the emergence of machine learning (ML) and deep learning (DL) models. We first explore traditional and indigenous forecasting methods, then describe the development of physical modeling and statistical frameworks that underpin contemporary operational forecasting. Particular emphasis is placed on recent advances in neural network-based approaches, including automated deep learning, interpretability-driven design, and hybrid physical-data models. By synthesizing research across multiple eras and paradigms, this review not only depicts the history of end-to-end precipitation prediction but also outlines future directions in next-generation forecasting systems.
Logical GANs: Adversarial Learning through Ehrenfeucht Fraisse Games
Modern generative models excel at producing realistic samples--images, text, molecules--but often lack guarantees about structural properties. A protein generator may produce plausible sequences that violate stability constraints; a network topology generator may create graphs that fail connectivity requirements; a molecule generator may output structures violating chemical valence rules. Standard GAN discriminators provide a global "real vs. fake" signal, but they cannot pinpoint which specific structural constraint failed or guarantee that generated samples satisfy formal specifications. Meanwhile, mathematical logic has developed precise tools for reasoning about structural properties. Ehrenfeucht-Fraïssé (EF) games [1, 2] characterize when two structures are indistinguishable by logical formulas up to a given complexity (quantifier depth k). First-order (FO) and monadic second-order (MSO) logics can express rich structural properties--connectivity, bipartiteness, planarity, acyclicity--that are crucial in applications but invisible to standard discriminators.
Critical Insights into Leading Conversational AI Models
Kohli, Urja, Singh, Aditi, Sharma, Arun
Large Language Models (LLMs) are changing the way businesses use software, the way people live their lives and the way industries work. Companies like Google, High-Flyer, Anthropic, OpenAI and Meta are making better LLMs. So, it's crucial to look at how each model is different in terms of performance, moral behaviour and usability, as these differences are based on the different ideas that built them. This study compares five top LLMs: Google's Gemini, High-Flyer's DeepSeek, Anthropic's Claude, OpenAI's GPT models and Meta's LLaMA. It does so by analysing three important factors: Performance and Accuracy, Ethics and Bias Mitigation and Usability and Integration. It was found that Claude has good moral reasoning, while Gemini is better at multimodal capabilities and has strong ethical frameworks. DeepSeek is great at reasoning based on facts, LLaMA is good for open applications and ChatGPT delivers balanced performance with a focus on usage. It was concluded that these models differ in how well they work, how easy they are to use and how they treat people ethically, so each model should be used in a way that makes the most of its strengths.
Atlas Urban Index: A VLM-Based Approach for Spatially and Temporally Calibrated Urban Development Monitoring
Chander, Mithul, Ranga, Sai Pragnya, Mayekar, Prathamesh
We introduce the {\em Atlas Urban Index} (AUI), a metric for measuring urban development computed using Sentinel-2 \citep{spoto2012sentinel2} satellite imagery. Existing approaches, such as the {\em Normalized Difference Built-up Index} (NDBI), often struggle to accurately capture urban development due to factors like atmospheric noise, seasonal variation, and cloud cover. These limitations hinder large-scale monitoring of human development and urbanization. To address these challenges, we propose an approach that leverages {\em Vision-Language Models} (VLMs) to provide a development score for regions. Specifically, we collect a time series of Sentinel-2 images for each region. Then, we further process the images within fixed time windows to get an image with minimal cloud cover, which serves as the representative image for that time window. To ensure consistent scoring, we adopt two strategies: (i) providing the VLM with a curated set of reference images representing different levels of urbanization, and (ii) supplying the most recent past image to both anchor temporal consistency and mitigate cloud-related noise in the current image. Together, these components enable AUI to overcome the challenges of traditional urbanization indices and produce more reliable and stable development scores. Our qualitative experiments on Bangalore suggest that AUI outperforms standard indices such as NDBI.
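For context on the NDBI baseline named above, the index is conventionally defined per pixel as (SWIR - NIR) / (SWIR + NIR). The sketch below computes it on plain Python lists. The function names, the zero-denominator convention, and the sample reflectance values are illustrative assumptions; with Sentinel-2 data the SWIR and NIR inputs are commonly taken from bands B11 and B8, which the abstract itself does not specify.

```python
def ndbi(swir, nir):
    """Normalized Difference Built-up Index for one pixel.

    swir, nir: surface reflectance in the short-wave infrared and
    near-infrared bands. Result lies in [-1, 1]; higher values
    suggest built-up surfaces.
    """
    total = swir + nir
    if total == 0:
        return 0.0  # convention chosen for this sketch when both bands are zero
    return (swir - nir) / total


def ndbi_grid(swir_band, nir_band):
    """Apply ndbi element-wise to two equally shaped 2-D lists."""
    return [
        [ndbi(s, n) for s, n in zip(swir_row, nir_row)]
        for swir_row, nir_row in zip(swir_band, nir_band)
    ]


# Hypothetical 2x2 reflectance tiles (values are made up for illustration).
swir_tile = [[0.75, 0.25], [0.5, 0.0]]
nir_tile = [[0.25, 0.75], [0.5, 0.0]]

scores = ndbi_grid(swir_tile, nir_tile)
print(scores)
```

Because the index depends only on two band values per pixel, haze or thin cloud that shifts either band changes the score directly, which is consistent with the sensitivity to atmospheric noise and cloud cover described in the abstract.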
A Closed-Loop Personalized Learning Agent Integrating Neural Cognitive Diagnosis, Bounded-Ability Adaptive Testing, and LLM-Driven Feedback
Wang, Zhifeng, Zheng, Xinyue, Zeng, Chunyan
As information technology advances, education is moving from one-size-fits-all instruction toward personalized learning. However, most methods handle modeling, item selection, and feedback in isolation rather than as a closed loop. This leads to coarse or opaque student models, assumption-bound adaptivity that ignores diagnostic posteriors, and generic, non-actionable feedback. To address these limitations, this paper presents an end-to-end personalized learning agent, EduLoop-Agent, which integrates a Neural Cognitive Diagnosis model (NCD), a Bounded-Ability Estimation Computerized Adaptive Testing strategy (BECAT), and large language models (LLMs). The NCD module provides fine-grained estimates of students' mastery at the knowledge-point level; BECAT dynamically selects subsequent items to maximize relevance and learning efficiency; and LLMs convert diagnostic signals into structured, actionable feedback. Together, these components form a closed-loop framework of ``Diagnosis--Recommendation--Feedback.'' Experiments on the ASSISTments dataset show that the NCD module achieves strong performance on response prediction while yielding interpretable mastery assessments. The adaptive recommendation strategy improves item relevance and personalization, and the LLM-based feedback offers targeted study guidance aligned with identified weaknesses. Overall, the results indicate that the proposed design is effective and practically deployable, providing a feasible pathway to generating individualized learning trajectories in intelligent education.