woodstock
UniExtreme: A Universal Foundation Model for Extreme Weather Forecasting
Ni, Hang, Zhang, Weijia, Liu, Hao
Recent advancements in deep learning have led to the development of Foundation Models (FMs) for weather forecasting, yet their ability to predict extreme weather events remains limited. Existing approaches either focus on general weather conditions or specialize in specific-type extremes, neglecting the real-world atmospheric patterns of diversified extreme events. In this work, we identify two key characteristics of extreme events: (1) the spectral disparity against normal weather regimes, and (2) the hierarchical drivers and geographic blending of diverse extremes. Along this line, we propose UniExtreme, a universal extreme weather forecasting foundation model that integrates (1) an Adaptive Frequency Modulation (AFM) module that captures region-wise spectral differences between normal and extreme weather, through learnable Beta-distribution filters and multi-granularity spectral aggregation, and (2) an Event Prior Augmentation (EPA) module which incorporates region-specific extreme event priors to resolve hierarchical extreme diversity and composite extreme schema, via a dual-level memory fusion network. Extensive experiments demonstrate that UniExtreme outperforms state-of-the-art baselines in both extreme and general weather forecasting, showcasing superior adaptability across diverse extreme scenarios.
- Asia > China > Hong Kong (0.40)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
An Index-based Approach for Efficient and Effective Web Content Extraction
Chen, Yihan, Xu, Benfeng, Wang, Xiaorui, Mao, Zhendong
As web agents (e.g., Deep Research) routinely consume massive volumes of web pages to gather and analyze information, LLM context management -- under large token budgets and low signal density -- emerges as a foundational, high-importance, and technically challenging problem for agentic and RAG pipelines. Existing solutions for extracting relevant content are inadequate: generative extraction models suffer from high latency, rule-based heuristics lack adaptability, and chunk-and-rerank methods are blind to webpage structure. To overcome these issues, we introduce Index-based Web Content Extraction to reframe the extraction process from slow, token-by-token generation into a highly efficient, discriminative task of index prediction, achieving both effectiveness and efficiency. We partition HTML into structure-aware, addressable segments, and extract only the positional indices of content relevant to a given query. This method decouples extraction latency from content length, enabling rapid, query-relevant extraction. We first evaluate our method as a post-retrieval processing component within an RAG QA system and find that it improves QA accuracy. Then we directly measure its match rate with the target content in two scenarios: main content extraction (ME) and query-relevant extraction (QE). Experimental results show that our method outperforms existing works in both accuracy and speed, effectively bridging the gap between LLMs and the vast webpages.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Anhui Province > Hefei (0.04)
- North America > United States > Montana > Roosevelt County (0.04)
- (5 more...)
- Information Technology > Communications > Web (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Invisible Load: Uncovering the Challenges of Neurodivergent Women in Software Engineering
Zaib, Munazza, Wang, Wei, Hidellaarachchi, Dulaji, Siddiqui, Isma Farah
Neurodivergent women in Software Engineering (SE) encounter distinctive challenges at the intersection of gender bias and neurological differences. To the best of our knowledge, no prior work in SE research has systematically examined this group, despite increasing recognition of neurodiversity in the workplace. Underdiagnosis, masking, and male-centric workplace cultures continue to exacerbate barriers that contribute to stress, burnout, and attrition. In response, we propose a hybrid methodological approach that integrates InclusiveMag's inclusivity framework with the GenderMag walkthrough process, tailored to the context of neurodivergent women in SE. The overarching design unfolds across three stages, scoping through literature review, deriving personas and analytic processes, and applying the method in collaborative workshops. We present a targeted literature review that synthesize challenges into cognitive, social, organizational, structural and career progression challenges neurodivergent women face in SE, including how under/late diagnosis and masking intensify exclusion. These findings lay the groundwork for subsequent stages that will develop and apply inclusive analytic methods to support actionable change.
- Oceania > Australia > Victoria > Melbourne (0.05)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (2 more...)
- Research Report (0.82)
- Overview (0.55)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology > Autism (0.54)
Label Forensics: Interpreting Hard Labels in Black-Box Text Classifier
Du, Mengyao, Yang, Gang, Fang, Han, Yin, Quanjun, Chang, Ee-chien
The widespread adoption of natural language processing techniques has led to an unprecedented growth of text classifiers across the modern web. Yet many of these models circulate with their internal semantics undocumented or even intentionally withheld. Such opaque classifiers, which may expose only hard-label outputs, can operate in unregulated web environments or be repurposed for unknown intents, raising legitimate forensic and auditing concerns. In this paper, we position ourselves as investigators and work to infer the semantic concept each label encodes in an undocumented black-box classifier. Specifically, we introduce label forensics, a black-box framework that reconstructs a label's semantic meaning. Concretely, we represent a label by a sentence embedding distribution from which any sample reliably reflects the concept the classifier has implicitly learned for that label. We believe this distribution should maintain two key properties: precise, with samples consistently classified into the target label, and general, covering the label's broad semantic space. To realize this, we design a semantic neighborhood sampler and an iterative optimization procedure to select representative seed sentences that jointly maximize label consistency and distributional coverage. The final output, an optimized seed sentence set combined with the sampler, constitutes the empirical distribution representing the label's semantics. Experiments on multiple black-box classifiers achieve an average label consistency of around 92.24 percent, demonstrating that the embedding regions accurately capture each classifier's label semantics. We further validate our framework on an undocumented HuggingFace classifier, enabling fine-grained label interpretation and supporting responsible AI auditing.
- Asia > Singapore > Central Region > Singapore (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Hunan Province > Changsha (0.04)
- (3 more...)
- Transportation > Air (1.00)
- Information Technology > Security & Privacy (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Nonholonomic Narrow Dead-End Escape with Deep Reinforcement Learning
Xiong, Denghan, Zhao, Yanzhe, Chen, Yutong, Wang, Zichun
Nonholonomic constraints restrict feasible velocities without reducing configuration-space dimension, which makes collision-free geometric paths generally non-executable for car-like robots. Ackermann steering further imposes curvature bounds and forbids in-place rotation, so escaping from narrow dead ends typically requires tightly sequenced forward and reverse maneuvers. Classical planners that decouple global search and local steering struggle in these settings because narrow passages occupy low-measure regions and nonholonomic reachability shrinks the set of valid connections, which degrades sampling efficiency and increases sensitivity to clearances. We study nonholonomic narrow dead-end escape for Ackermann vehicles and contribute three components. First, we construct a generator that samples multi-phase forward-reverse trajectories compatible with Ackermann kinematics and inflates their envelopes to synthesize families of narrow dead ends that are guaranteed to admit at least one feasible escape. Second, we construct a training environment that enforces kinematic constraints and train a policy using the soft actor-critic algorithm. Third, we evaluate against representative classical planners that combine global search with nonholonomic steering. Across parameterized dead-end families, the learned policy solves a larger fraction of instances, reduces maneuver count, and maintains comparable path length and planning time while under the same sensing and control limits. We provide our project as open source at https://github.com/gitagitty/cisDRL-RobotNav.git
- Asia > China > Zhejiang Province > Ningbo (0.04)
- Asia > China > Tianjin Province > Tianjin (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (4 more...)
Beyond Relational: Semantic-Aware Multi-Modal Analytics with LLM-Native Query Optimization
Zhu, Junhao, Chen, Lu, Ke, Xiangyu, Fang, Ziquan, Li, Tianyi, Gao, Yunjun, Jensen, Christian S.
Multi-modal analytical processing has the potential to transform applications in e-commerce, healthcare, entertainment, and beyond. However, real-world adoption remains elusive due to the limited ability of traditional relational query operators to capture query semantics. The emergence of foundation models, particularly the large language models (LLMs), opens up new opportunities to develop flexible, semantic-aware data analytics systems that transcend the relational paradigm. We present Nirvana, a multi-modal data analytics framework that incorporates programmable semantic operators while leveraging both logical and physical query optimization strategies, tailored for LLM-driven semantic query processing. Nirvana addresses two key challenges. First, it features an agentic logical optimizer that uses natural language-specified transformation rules and random-walk-based search to explore vast spaces of semantically equivalent query plans -- far beyond the capabilities of conventional optimizers. Second, it introduces a cost-aware physical optimizer that selects the most effective LLM backend for each operator using a novel improvement-score metric. To further enhance efficiency, Nirvana incorporates computation reuse and evaluation pushdown techniques guided by model capability hypotheses. Experimental evaluations on three real-world benchmarks demonstrate that Nirvana is able to reduce end-to-end runtime by 10%--85% and reduces system processing costs by 76% on average, outperforming state-of-the-art systems at both efficiency and scalability.
- Europe > Denmark > North Jutland > Aalborg (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- North America > United States > Texas > Kleberg County (0.04)
- (2 more...)
- Research Report (0.64)
- Workflow (0.46)
- Leisure & Entertainment (1.00)
- Information Technology (1.00)
- Media > Film (0.95)
An Exterior-Embedding Neural Operator Framework for Preserving Conservation Laws
Dong, Huanshuo, Wang, Hong, Wu, Hao, Zhuang, Zhiwei, Yang, Xuanze, Shu, Ruiqi, Gao, Yuan, Huang, Xiaomeng
Neural operators have demonstrated considerable effectiveness in accelerating the solution of time-dependent partial differential equations (PDEs) by directly learning governing physical laws from data. However, for PDEs governed by conservation laws(e.g., conservation of mass, energy, or matter), existing neural operators fail to satisfy conservation properties, which leads to degraded model performance and limited generalizability. Moreover, we observe that distinct PDE problems generally require different optimal neural network architectures. This finding underscores the inherent limitations of specialized models in generalizing across diverse problem domains. To address these limitations, we propose Exterior-Embedded Conservation Framework (ECF), a universal conserving framework that can be integrated with various data-driven neural operators to enforce conservation laws strictly in predictions. The framework consists of two key components: a conservation quantity encoder that extracts conserved quantities from input data, and a conservation quantity decoder that adjusts the neural operator's predictions using these quantities to ensure strict conservation compliance in the final output. Since our architecture enforces conservation laws, we theoretically prove that it enhances model performance. To validate the performance of our method, we conduct experiments on multiple conservation-law-constrained PDE scenarios, including adiabatic systems, shallow water equations, and the Allen-Cahn problem. These baselines demonstrate that our method effectively improves model accuracy while strictly enforcing conservation laws in the predictions.
- Asia > China > Beijing > Beijing (0.04)
- Asia > China > Anhui Province > Hefei (0.04)
- North America > United States > New York > New York County > New York City (0.04)
VISAGNN: Versatile Staleness-Aware Efficient Training on Large-Scale Graphs
Graph Neural Networks (GNNs) have shown exceptional success in graph representation learning and a wide range of real-world applications. However, scaling deeper GNNs poses challenges due to the neighbor explosion problem when training on large-scale graphs. To mitigate this, a promising class of GNN training algorithms utilizes historical embeddings to reduce computation and memory costs while preserving the expressiveness of the model. These methods leverage historical embeddings for out-of-batch nodes, effectively approximating full-batch training without losing any neighbor information-a limitation found in traditional sampling methods. However, the staleness of these historical embeddings often introduces significant bias, acting as a bottleneck that can adversely affect model performance. In this paper, we propose a novel VersatIle Staleness-Aware GNN, named VISAGNN, which dynamically and adaptively incorporates staleness criteria into the large-scale GNN training process. By embedding staleness into the message passing mechanism, loss function, and historical embeddings during training, our approach enables the model to adaptively mitigate the negative effects of stale embeddings, thereby reducing estimation errors and enhancing downstream accuracy. Comprehensive experiments demonstrate the effectiveness of our method in overcoming the staleness issue of existing historical embedding techniques, showcasing its superior performance and efficiency on large-scale benchmarks, along with significantly faster convergence.
- North America > United States > North Carolina (0.40)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > New Finding (0.67)
- Research Report > Promising Solution (0.46)
FLAD: Federated Learning for LLM-based Autonomous Driving in Vehicle-Edge-Cloud Networks
Xiang, Tianao, Zhi, Mingjian, Bi, Yuanguo, Cai, Lin, Chen, Yuhao
Large Language Models (LLMs) have impressive data fusion and reasoning capabilities for autonomous driving (AD). However, training LLMs for AD faces significant challenges including high computation transmission costs, and privacy concerns associated with sensitive driving data. Federated Learning (FL) is promising for enabling autonomous vehicles (AVs) to collaboratively train models without sharing raw data. We present Federated LLM-based Autonomous Driving (FLAD), an FL framework that leverages distributed multimodal sensory data across AVs in heterogeneous environment. FLAD has three key innovations: (1) a cloud-edge-vehicle collaborative architecture that reduces communication delay and preserving data privacy; (2) an intelligent parallelized collaborative training with a communication scheduling mechanism that optimizes training efficiency, leveraging end-devices otherwise having insufficient resources for model training; and (3) a knowledge distillation method that personalizes LLM according to heterogeneous edge data. In addition, we prototype FLAD in a testbed with NVIDIA Jetsons, overcoming practical implementation challenges including CPU/GPU memory sharing in resource-constrained devices, dynamic model partitions, and fault-tolerant training.Extensive experimental evaluation demonstrates that FLAD achieves superior end-to-end AD performance while efficiently utilizing distributed vehicular resources, opening up new possibilities for future collaborative AD model training and knowledge sharing.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (18 more...)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Automobiles & Trucks (1.00)
RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services
Zhao, Fei, Lu, Chonggang, Qian, Haofu, Shi, Fangcheng, Meng, Zijie, Huang, Jianzhao, Tang, Xu, Xie, Zheyong, Ye, Zheyu, Xu, Zhe, Hu, Yao, Cao, Shaosheng
As a key medium for human interaction and information exchange, social networking services (SNS) pose unique challenges for large language models (LLMs): heterogeneous workloads, fast-shifting norms and slang, and multilingual, culturally diverse corpora that induce sharp distribution shift. Supervised fine-tuning (SFT) can specialize models but often triggers a ``seesaw'' between in-distribution gains and out-of-distribution robustness, especially for smaller models. To address these challenges, we introduce RedOne 2.0, an SNS-oriented LLM trained with a progressive, RL-prioritized post-training paradigm designed for rapid and stable adaptation. The pipeline consist in three stages: (1) Exploratory Learning on curated SNS corpora to establish initial alignment and identify systematic weaknesses; (2) Targeted Fine-Tuning that selectively applies SFT to the diagnosed gaps while mixing a small fraction of general data to mitigate forgetting; and (3) Refinement Learning that re-applies RL with SNS-centric signals to consolidate improvements and harmonize trade-offs across tasks. Across various tasks spanning three categories, our 4B scale model delivers an average improvements about 2.41 over the 7B sub-optimal baseline. Additionally, RedOne 2.0 achieves average performance lift about 8.74 from the base model with less than half the data required by SFT-centric method RedOne, evidencing superior data efficiency and stability at compact scales. Overall, RedOne 2.0 establishes a competitive, cost-effective baseline for domain-specific LLMs in SNS scenario, advancing capability without sacrificing robustness.
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)