Energy
Q-Learning-Based Time-Critical Data Aggregation Scheduling in IoT
Vo, Van-Vi, Nguyen, Tien-Dung, Le, Duc-Tai, Choo, Hyunseung
Time-critical data aggregation in Internet of Things (IoT) networks demands efficient, collision-free scheduling to minimize latency for applications like smart cities and industrial automation. Traditional heuristic methods, with two-phase tree construction and scheduling, often suffer from high computational overhead and suboptimal delays due to their static nature. To address this, we propose a novel Q-learning framework that unifies aggregation tree construction and scheduling, modeling the process as a Markov Decision Process (MDP) with hashed states for scalability. By leveraging a reward function that promotes large, interference-free batch transmissions, our approach dynamically learns optimal scheduling policies. Simulations on static networks with up to 300 nodes demonstrate up to 10.87% lower latency compared to a state-of-the-art heuristic algorithm, highlighting its robustness for delay-sensitive IoT applications. This framework enables timely insights in IoT environments, paving the way for scalable, low-latency data aggregation.
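The abstract does not give the authors' state encoding, reward, or hyperparameters, so the following is only a minimal sketch of tabular Q-learning over hashed states. The state layout (set of already-scheduled nodes plus time slot), the batch-size reward, and the names `state_key`, `q_update`, and `choose_action` are illustrative assumptions, not the paper's implementation.

```python
import hashlib
import random
from collections import defaultdict


def state_key(scheduled, time_slot):
    """Hash an (already-scheduled nodes, time slot) state into a short key.

    Assumption: this state layout is hypothetical; the abstract only says
    that states are hashed for scalability.
    """
    raw = ",".join(map(str, sorted(scheduled))) + "|" + str(time_slot)
    return hashlib.sha1(raw.encode()).hexdigest()[:16]


def q_update(q, s, a, reward, s_next, next_actions, alpha=0.1, gamma=0.9):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_b Q(s',b) - Q(s,a))."""
    best_next = max((q[(s_next, b)] for b in next_actions), default=0.0)
    q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
    return q[(s, a)]


def choose_action(q, s, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy choice among the currently feasible actions."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(s, a)])


# Q-table keyed by (hashed state, action); unseen entries default to 0.0.
q = defaultdict(float)
s0 = state_key(set(), 0)
s1 = state_key({3, 5}, 1)

# An action here is a batch of senders transmitting in the same slot.
# Hypothetical reward: the size of the collision-free batch (2 nodes).
q_update(q, s0, (3, 5), reward=2.0, s_next=s1, next_actions=[(1,), (2,)])
```

Keying the table by a fixed-length hash keeps each entry small even when the underlying state (a set of nodes) grows with the network, which is one plausible reading of "hashed states for scalability".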
XAI-on-RAN: Explainable, AI-native, and GPU-Accelerated RAN Towards 6G
Basaran, Osman Tugay, Dressler, Falko
Artificial intelligence (AI)-native radio access networks (RANs) will serve vertical industries with stringent requirements: smart grids, autonomous vehicles, remote healthcare, industrial automation, etc. To meet these requirements, modern 5G/6G designs increasingly leverage AI for network optimization, but the opacity of AI decisions poses risks in mission-critical domains. These use cases are often delivered via non-public networks (NPNs) or dedicated network slices, where reliability and safety are vital. In this paper, we motivate the need for transparent and trustworthy AI in high-stakes communications (e.g., healthcare, industrial automation, and robotics) by drawing on the 3rd Generation Partnership Project (3GPP)'s vision for non-public networks. We design a mathematical framework to model the trade-offs between transparency (explanation fidelity and fairness), latency, and graphics processing unit (GPU) utilization in deploying explainable AI (XAI) models. Empirical evaluations demonstrate that our proposed hybrid XAI model, xAI-Native, consistently surpasses conventional baseline models in performance.
CascadedViT: Cascaded Chunk-FeedForward and Cascaded Group Attention Vision Transformer
Sivakumar, Srivathsan, Qureshi, Faisal Z.
Vision Transformers (ViTs) have demonstrated remarkable performance across a range of computer vision tasks; however, their high computational, memory, and energy demands hinder deployment on resource-constrained platforms. In this paper, we propose Cascaded-ViT (CViT), a lightweight and compute-efficient vision transformer architecture featuring a novel feedforward network design called Cascaded-Chunk Feed Forward Network (CCFFN). By splitting input features, CCFFN improves parameter and FLOP efficiency without sacrificing accuracy. Experiments on ImageNet-1K show that our CViT-XL model achieves 75.5% Top-1 accuracy while reducing FLOPs by 15% and energy consumption by 3.3% compared to EfficientViT-M5. Across various model sizes, the CViT family consistently exhibits the lowest energy consumption, making it suitable for deployment on battery-constrained devices such as mobile phones and drones. Furthermore, when evaluated using a new metric called Accuracy-Per-FLOP (APF), which quantifies compute efficiency relative to accuracy, CViT models consistently achieve top-ranking efficiency. Particularly, CViT-L is 2.2% more accurate than EfficientViT-M2 while having comparable APF scores.
California rattled by rapid succession of earthquakes with shaking felt hundreds of miles from epicenter
California was shaken early Monday as a series of earthquakes struck in quick succession, raising concern in the seismically active region. At least seven tremors have been reported, ranging in magnitude from 1.1 to 4.1, with the epicenter near The Geysers.
In Northern Scotland, the Neolithic Age Never Ended
Megalithic monuments in the otherworldly Orkney Islands remain a fundamental part of the landscape. Sheep linger at the Stones of Stenness, the remnants of a ceremonial circle. The Stones of Stenness, a brood of lichen-encrusted megaliths in the far north of the British Isles, could be mistaken for a latter-day work of land art, one with ominous overtones. The stones stand between two lochs on the largest of the Orkney Islands, off the northeastern tip of mainland Scotland. Three colossal planks of sandstone, ranging in height from fifteen feet nine inches to eighteen feet eight inches, rise from the grass, along with a smaller stone that has the bent shape of a boomerang. In contrast to the rectilinear blocks at Stonehenge, the Stenness megaliths are thin slabs with angled upper edges, like upside-down guillotine blades. Remnants of a ceremonial circle, they are placed twenty or more feet apart, creating a chasm of negative space. The monoliths in "2001: A Space Odyssey" inevitably come to mind. Given that the stones were erected five thousand years ago by a culture that left no trace of its belief system, it is unwise to project modern aesthetics onto them. Still, they can be seen only with living eyes. During a recent visit to Orkney, I kept returning to Stenness, at all hours and in all weather. On drizzly days, with skies hanging low, the stones resemble ladders to nowhere. In bright sun, hidden colors emerge: streaks of blue against gray; white and green spatters of lichen; yellowish stains indicating the presence of limonite, an iron ore. Pockmarks and brittle edges show the abrading action of millennia of wind and rain. I watched as tourists approached the stones and hesitantly touched them, as if afraid. When I put my own hands on the rock, I felt no obvious emanations, though I did not feel nothing. One evening, I leaned on a fence as the sun went down, the horizon glowing orange against a cobalt sky.
A Framework for Adaptive Stabilisation of Nonlinear Stochastic Systems
Siriya, Seth, Zhu, Jingge, Nešić, Dragan, Pu, Ye
We consider the adaptive control problem for discrete-time, nonlinear stochastic systems with linearly parameterised uncertainty. Assuming access to a parameterised family of controllers that can stabilise the system in a bounded set within an informative region of the state space when the parameter is well-chosen, we propose a certainty equivalence learning-based adaptive control strategy, and subsequently derive stability bounds on the closed-loop system that hold for some probabilities. We then show that if the entire state space is informative, and the family of controllers is globally stabilising with appropriately chosen parameters, high probability stability guarantees can be derived.
Platonic Representations for Poverty Mapping: Unified Vision-Language Codes or Agent-Induced Novelty?
Murugaboopathy, Satiyabooshan, Jerzak, Connor T., Daoud, Adel
We investigate whether socio-economic indicators like household wealth leave recoverable imprints in satellite imagery (capturing physical features) and Internet-sourced text (reflecting historical/economic narratives). Using Demographic and Health Survey (DHS) data from African neighborhoods, we pair Landsat images with LLM-generated textual descriptions conditioned on location/year and text retrieved by an AI search agent from web sources. We develop a multimodal framework predicting household wealth (International Wealth Index) through five pipelines: (i) vision model on satellite images, (ii) LLM using only location/year, (iii) AI agent searching/synthesizing web text, (iv) joint image-text encoder, (v) ensemble of all signals. Our framework yields three contributions. First, fusing vision and agent/LLM text outperforms vision-only baselines in wealth prediction (e.g., R-squared of 0.77 vs. 0.63 on out-of-sample splits), with LLM-internal knowledge proving more effective than agent-retrieved text, improving robustness to out-of-country and out-of-time generalization. Second, we find partial representational convergence: fused embeddings from vision/language modalities correlate moderately (median cosine similarity of 0.60 after alignment), suggesting a shared latent code of material well-being while retaining complementary details, consistent with the Platonic Representation Hypothesis. Although LLM-only text outperforms agent-retrieved data, challenging our Agent-Induced Novelty Hypothesis, modest gains from combining agent data in some splits weakly support the notion that agent-gathered information introduces unique representational structures not fully captured by static LLM knowledge. Third, we release a large-scale multimodal dataset comprising more than 60,000 DHS clusters linked to satellite images, LLM-generated descriptions, and agent-retrieved texts.
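The reported "median cosine similarity" can be illustrated with a short sketch. It assumes the vision and language embeddings are already aligned into a common space and paired row by row; the alignment step itself is not described in the abstract and is not shown. The function names and toy vectors are illustrative only.

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def median_pairwise_cosine(vision, text):
    """Median of cosine similarities over paired (vision, text) embeddings."""
    return median(cosine(u, v) for u, v in zip(vision, text))


# Toy, made-up embeddings: three clusters, two dimensions each.
vision_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_emb = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
m = median_pairwise_cosine(vision_emb, text_emb)
```

With these toy vectors the three similarities are 1, 0, and about 0.707, so the median is about 0.707. A median, unlike a mean, is not pulled down by a few clusters whose two modalities disagree strongly.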
REMSA: An LLM Agent for Foundation Model Selection in Remote Sensing
Chen, Binger, Bök, Tacettin Emre, Rasti, Behnood, Markl, Volker, Demir, Begüm
Foundation Models (FMs) are increasingly used in remote sensing (RS) for tasks such as environmental monitoring, disaster assessment, and land-use mapping. These models include unimodal vision encoders trained on a single data modality and multimodal architectures trained on combinations of SAR, multispectral, hyperspectral, and image-text data. They support diverse RS tasks including semantic segmentation, image classification, change detection, and visual question answering. However, selecting an appropriate remote sensing foundation model (RSFM) remains difficult due to scattered documentation, heterogeneous formats, and varied deployment constraints. We introduce the RSFM Database (RS-FMD), a structured resource covering over 150 RSFMs spanning multiple data modalities, resolutions, and learning paradigms. Built on RS-FMD, we present REMSA, the first LLM-based agent for automated RSFM selection from natural language queries. REMSA interprets user requirements, resolves missing constraints, ranks candidate models using in-context learning, and provides transparent justifications. We also propose a benchmark of 75 expert-verified RS query scenarios, producing 900 configurations under an expert-centered evaluation protocol. REMSA outperforms several baselines, including naive agents, dense retrieval, and unstructured RAG-based LLMs. It operates entirely on publicly available metadata and does not access private or sensitive data.
Sparse Mixture-of-Experts for Multi-Channel Imaging: Are All Channel Interactions Required?
Yun, Sukwon, Yao, Heming, Hoeckendorf, Burkhard, Richmond, David, Regev, Aviv, Littman, Russell
Vision Transformers (ViTs) have become the backbone of vision foundation models, yet their optimization for multi-channel domains - such as cell painting or satellite imagery - remains underexplored. A key challenge in these domains is capturing interactions between channels, as each channel carries different information. While existing works have shown efficacy by treating each channel independently during tokenization, this approach naturally introduces a major computational bottleneck in the attention block - channel-wise comparisons lead to a quadratic growth in attention, resulting in excessive FLOPs and high training cost. In this work, we shift focus from efficacy to the overlooked efficiency challenge in cross-channel attention and ask: "Is it necessary to model all channel interactions?" Inspired by the philosophy of Sparse Mixture-of-Experts (MoE), we propose MoE-ViT, a Mixture-of-Experts architecture for multi-channel images in ViTs, which treats each channel as an expert and employs a lightweight router to select only the most relevant experts per patch for attention. Proof-of-concept experiments on real-world datasets - JUMP-CP and So2Sat - demonstrate that MoE-ViT achieves substantial efficiency gains without sacrificing, and in some cases enhancing, performance, making it a practical and attractive backbone for multi-channel imaging.
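The efficiency argument can be made concrete with a rough sketch of per-patch top-k channel routing. The router in the paper is learned; here the per-channel scores are simply given, and `route_channels`, `attention_pairs`, and all numbers are illustrative assumptions rather than the MoE-ViT implementation.

```python
def route_channels(scores, k):
    """Return the indices of the k highest-scoring channels for one patch.

    `scores` stands in for the output of a router (assumed, not learned here).
    Indices are returned in ascending order.
    """
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return sorted(ranked[:k])


def attention_pairs(channels_per_patch, num_patches):
    """Count query-key pairs when every (channel, patch) token attends to all others."""
    tokens = channels_per_patch * num_patches
    return tokens * tokens


# One patch with four channels: keep the two most relevant.
selected = route_channels([0.1, 0.9, 0.3, 0.7], k=2)

# Hypothetical sizes: 8 channels, 196 patches, 2 channels kept per patch.
full = attention_pairs(8, 196)
sparse = attention_pairs(2, 196)
ratio = full // sparse
```

Because attention cost grows with the square of the token count, keeping 2 of 8 channels per patch cuts the number of pairwise comparisons by a factor of 16 in this toy count, before accounting for the cost of the router itself.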
Convergence and stability of Q-learning in Hierarchical Reinforcement Learning
Manenti, Massimiliano, Iannelli, Andrea
Decision-making architectures have played a central role for decades [1] both in engineering and other domains, e.g., guidance, navigation and control of Apollo missions [2], chemical plants [3], smart grids [4], unmanned aerial vehicles [5], recommender systems [6], and algorithms [7]. Moreover, architectures are ubiquitous in nature, e.g., diversity in the nervous system enables humans to have fast and accurate sensorimotor control [8]. Reinforcement Learning (RL) is a framework in which an agent learns to make sequential decisions through interaction with an environment in order to maximize cumulative reward [9]. Decision-making architectures have also been proposed and studied in RL. Hierarchical Reinforcement Learning (HRL) is a subfield of RL that deals with hierarchical structures for decision-making agents. Prospective advantages include improved long-term credit assignment, continual learning, interpretability, and the integration of preexisting policies [10], [11].