A Impact of hyper-parameters
Label smoothing and HXE achieve their best accuracy when their hyper-parameter is set to zero, which is equivalent to a flat softmax. Note that we use confidence threshold inference for all loss functions, regardless of the inference function that was used in the original publication.
Algorithm 1: Algorithm for finding the ordered Pareto set. The inputs x and y are lists of equal length.
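The body of Algorithm 1 is not reproduced here, so the following is only a minimal sketch of one common way to compute an ordered Pareto set from two equal-length lists. It assumes that larger values are better on both axes, and the function name `ordered_pareto_set` is hypothetical; the original algorithm may use a different ordering or tie-breaking rule.

```python
def ordered_pareto_set(x, y):
    """Return indices of non-dominated points, ordered by decreasing x.

    Assumption (not from the source): both x and y are to be maximized.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have equal length")

    # Visit points from largest x to smallest; ties go to the larger y first.
    order = sorted(range(len(x)), key=lambda i: (-x[i], -y[i]))

    front = []
    best_y = float("-inf")
    for i in order:
        # Every point visited earlier has x >= x[i], so point i is
        # non-dominated only if its y is strictly better than all of them.
        if y[i] > best_y:
            front.append(i)
            best_y = y[i]
    return front


if __name__ == "__main__":
    xs = [1, 2, 3, 2]
    ys = [3, 2, 1, 1]
    print(ordered_pareto_set(xs, ys))
```

Sorting first keeps the sweep linear after an O(n log n) sort, and the returned indices trace the trade-off curve from the best x to the best y.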
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Europe > United Kingdom > Scotland > Stirling > Stirling (0.04)
Scene-agnostic Hierarchical Bimanual Task Planning via Visual Affordance Reasoning
Lee, Kwang Bin, Kang, Jiho, Lee, Sung-Hee
Embodied agents operating in open environments must translate high-level instructions into grounded, executable behaviors, often requiring coordinated use of both hands. While recent foundation models offer strong semantic reasoning, existing robotic task planners remain predominantly unimanual and fail to address the spatial, geometric, and coordination challenges inherent to bimanual manipulation in scene-agnostic settings. We present a unified framework for scene-agnostic bimanual task planning that bridges high-level reasoning with 3D-grounded two-handed execution. Our approach integrates three key modules. Visual Point Grounding (VPG) analyzes a single scene image to detect relevant objects and generate world-aligned interaction points. Bimanual Subgoal Planner (BSP) reasons over spatial adjacency and cross-object accessibility to produce compact, motion-neutralized subgoals that exploit opportunities for coordinated two-handed actions. Interaction-Point-Driven Bimanual Prompting (IPBP) binds these subgoals to a structured skill library, instantiating synchronized unimanual or bimanual action sequences that satisfy hand-state and affordance constraints. Together, these modules enable agents to plan semantically meaningful, physically feasible, and parallelizable two-handed behaviors in cluttered, previously unseen scenes. Experiments show that the framework produces coherent, feasible, and compact two-handed plans and generalizes to cluttered scenes without retraining, demonstrating robust scene-agnostic affordance reasoning for bimanual tasks.
- Workflow (0.67)
- Research Report (0.41)
Dual-Path Knowledge-Augmented Contrastive Alignment Network for Spatially Resolved Transcriptomics
Zhang, Wei, Chu, Jiajun, Liu, Xinci, Tong, Chen, Li, Xinyue
Spatial Transcriptomics (ST) is a technology that measures gene expression profiles within tissue sections while retaining spatial context. It reveals localized gene expression patterns and tissue heterogeneity, both of which are essential for understanding disease etiology. However, its high cost has driven efforts to predict spatial gene expression from whole slide images. Despite recent advancements, current methods still face significant limitations, such as under-exploitation of high-level biological context, over-reliance on exemplar retrievals, and inadequate alignment of heterogeneous modalities. To address these challenges, we propose DKAN, a novel Dual-path Knowledge-Augmented contrastive alignment Network that predicts spatially resolved gene expression by integrating histopathological images and gene expression profiles through a biologically informed approach. Specifically, we introduce an effective gene semantic representation module that leverages the external gene database to provide additional biological insights, thereby enhancing gene expression prediction. Further, we adopt a unified, one-stage contrastive learning paradigm, seamlessly combining contrastive learning and supervised learning to eliminate reliance on exemplars, complemented with an adaptive weighting mechanism. Additionally, we propose a dual-path contrastive alignment module that employs gene semantic features as dynamic cross-modal coordinators to enable effective heterogeneous feature integration. Through extensive experiments across three public ST datasets, DKAN demonstrates superior performance over state-of-the-art models, establishing a new benchmark for spatial gene expression prediction and offering a powerful tool for advancing biological and clinical research.
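The abstract does not specify DKAN's loss, so the following is not the authors' method. It is a generic sketch of a symmetric InfoNCE-style objective, one common way to align paired embeddings from two modalities (here, image patches and gene expression profiles). The function names, the temperature value, and the use of plain Python lists are all assumptions made for illustration.

```python
import math


def _normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def _diag_cross_entropy(logits):
    # Mean cross-entropy where the correct column for row i is column i.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_info_nce(image_emb, gene_emb, temperature=0.07):
    """Generic symmetric contrastive loss over paired embeddings.

    image_emb[i] and gene_emb[i] are assumed to describe the same spot.
    """
    imgs = [_normalize(v) for v in image_emb]
    genes = [_normalize(v) for v in gene_emb]
    # Cosine similarity between every image and every gene embedding.
    logits = [
        [sum(a * b for a, b in zip(u, g)) / temperature for g in genes]
        for u in imgs
    ]
    transposed = [list(col) for col in zip(*logits)]
    # Average the image-to-gene and gene-to-image directions.
    return 0.5 * (_diag_cross_entropy(logits) + _diag_cross_entropy(transposed))


if __name__ == "__main__":
    eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    shuffled = eye[1:] + eye[:1]
    print(symmetric_info_nce(eye, eye) < symmetric_info_nce(eye, shuffled))
```

Correctly paired embeddings yield a lower loss than mismatched pairs, which is the property a contrastive alignment module relies on. In a one-stage setup such as the one described, a term like this would be combined with a supervised regression loss, but the weighting scheme is not given in the abstract.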
- Research Report > New Finding (0.66)
- Research Report > Promising Solution (0.48)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Energy-Aware Pattern Disentanglement: A Generalizable Pattern Assisted Architecture for Multi-task Time Series Analysis
Ma, Xiangkai, Hong, Xiaobin, Li, Wenzhong, Lu, Sanglu
Time series analysis has found widespread applications in areas such as weather forecasting, anomaly detection, and healthcare. While deep learning approaches have achieved significant success in this field, existing methods often adopt a "one-model one-task" architecture, limiting their generalization across different tasks. To address these limitations, we perform local energy analysis in the time-frequency domain to more precisely capture and disentangle transient and non-stationary oscillatory components. Furthermore, our representational analysis reveals that generative tasks tend to capture long-period patterns from low-frequency components, whereas discriminative tasks focus on high-frequency abrupt signals, which constitutes our core contribution. Concretely, we propose Pets, a novel "one-model many-tasks" architecture based on the General fluctuation Pattern Assisted (GPA) framework that is adaptable to versatile model structures for time series analysis. Pets integrates a Fluctuation Pattern Assisted (FPA) module and a Context-Guided Mixture of Predictors (MoP). The FPA module facilitates information fusion among diverse fluctuation patterns by capturing their dependencies and progressively modeling these patterns as latent representations at each layer. Meanwhile, the MoP module leverages these generalizable pattern representations to guide and regulate the reconstruction of distinct fluctuations hierarchically by energy proportion. Pets demonstrates strong versatility and achieves state-of-the-art performance across 60 benchmarks on various tasks, including forecasting, imputation, anomaly detection, and classification, while demonstrating strong generalization and robustness.
- Health & Medicine (0.87)
- Energy > Power Industry (0.46)