AITopics | residual

Collaborating Authors

residual

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Residual-SwinCA-Net: A Channel-Aware Integrated Residual CNN-Swin Transformer for Malignant Lesion Segmentation in BUSI

Naz, Saeeda, Khan, Saddam Hussain

arXiv.org Artificial IntelligenceDec-10-2025

A novel deep hybrid Residual-SwinCA-Net segmentation framework is proposed in the study for addressing such challenges by extracting locally correlated and robust features, incorporating residual CNN modules. Furthermore, for learning global dependencies, Swin Transformer blocks are customized using internal residual pathways, which reinforce gradient stability, refine local patterns, and facilitate global feature fusion. Formerly, for enhancing tissue continuity, ultrasound noise suppressions, and accentuating fine structural transitions Laplacian-of-Gaussian regional operator is applied, and for maintaining the morphological integrity of malignant lesion contours, a boundary-oriented operator has been incorporated. Subsequently, a contraction strategy was applied stage-wise by progressively reducing features-map progressively for capturing scale invariance and enhancing the robustness of structural variability. In addition, each decoder level prior augmentation integrates a new Multi-Scale Channel Attention and Squeezing (MSCAS) module. The MSCAS selectively emphasizes encoder salient maps, retains discriminative global context, and complementary local structures with minimal computational cost while suppressing redundant activations. Finally, the Pixel-Attention module encodes class-relevant spatial cues by adaptively weighing malignant lesion pixels while suppressing background interference. The Residual-SwinCA-Net and existing CNNs/ViTs techniques have been implemented on the publicly available BUSI dataset. The proposed Residual-SwinCA-Net framework outperformed and achieved 99.29% mean accuracy, 98.74% IoU, and 0.9041 Dice for breast lesion segmentation. The proposed Residual-SwinCA-Net framework improves the BUSI lesion diagnostic performance and strengthens timely clinical decision-making.

artificial intelligence, machine learning, segmentation, (18 more...)

arXiv.org Artificial Intelligence

2512.08243

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.70)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Enforcing governing equation constraints in neural PDE solvers via training-free projections

Rochman, Omer, Louppe, Gilles

arXiv.org Artificial IntelligenceNov-24-2025

Neural PDE solvers used for scientific simulation often violate governing equation constraints. While linear constraints can be projected cheaply, many constraints are nonlinear, complicating projection onto the feasible set. Dynamical PDEs are especially difficult because constraints induce long-range dependencies in time. In this work, we evaluate two training-free, post hoc projections of approximate solutions: a nonlinear optimization-based projection, and a local linearization-based projection using Jacobian-vector and vector-Jacobian products. We analyze constraints across representative PDEs and find that both projections substantially reduce violations and improve accuracy over physics-informed baselines.

artificial intelligence, machine learning, projection, (18 more...)

arXiv.org Artificial Intelligence

2511.17258

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

MoRe-ERL: Learning Motion Residuals using Episodic Reinforcement Learning

Huang, Xi, Zhou, Hongyi, Li, Ge, Tang, Yucheng, Liao, Weiran, Hein, Björn, Asfour, Tamim, Lioutikov, Rudolf

arXiv.org Artificial IntelligenceOct-21-2025

Abstract--We propose MoRe-ERL, a framework that combines Episodic Reinforcement Learning (ERL) and residual learning, which refines preplanned reference trajectories into safe, feasible, and efficient task-specific trajectories. This framework is general enough to incorporate into arbitrary ERL methods and motion generators seamlessly. MoRe-ERL identifies trajectory segments requiring modification while preserving critical task-related maneuvers. Then it generates smooth residual adjustments using B-Spline-based movement primitives to ensure adaptability to dynamic task contexts and smoothness in trajectory refinement. Experimental results demonstrate that residual learning significantly outperforms training from scratch using ERL methods, achieving superior sample efficiency and task performance. Hardware evaluations further validate the framework, showing that policies trained in simulation can be directly deployed in real-world systems, exhibiting a minimal sim-to-real gap. OBOTIC applications, such as multi-arm cooperation, often require frequent motion adaptation to ensure safety and task efficiency.

machine learning, reinforcement learning, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2508.01409

Country: Europe > Germany (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

MC-GNNAS-Dock: Multi-criteria GNN-based Algorithm Selection for Molecular Docking

Cao, Siyuan, Wu, Hongxuan, Wang, Jiabao Brad, Yuan, Yiliang, Misir, Mustafa

arXiv.org Artificial IntelligenceOct-1-2025

Molecular docking is a core tool in drug discovery for predicting ligand-target interactions. Despite the availability of diverse search-based and machine learning approaches, no single docking algorithm consistently dominates, as performance varies by context. To overcome this challenge, algorithm selection frameworks such as GNNAS-Dock, built on graph neural networks, have been proposed. This study introduces an enhanced system, MC-GNNAS-Dock, with three key advances. First, a multi-criteria evaluation integrates binding-pose accuracy (RMSD) with validity checks from PoseBusters, offering a more rigorous assessment. Second, architectural refinements by inclusion of residual connections strengthen predictive robustness. Third, rank-aware loss functions are incorporated to sharpen rank learning. Extensive experiments are performed on a curated dataset containing approximately 3200 protein-ligand complexes from PDBBind. MC-GNNAS-Dock demonstrates consistently superior performance, achieving up to 5.4% (3.4%) gains under composite criteria of RMSD below 1Å (2Å) with PoseBuster-validity compared to the single best solver (SBS) Uni-Mol Docking V2.

artificial intelligence, machine learning, mc-gnna-dock, (13 more...)

arXiv.org Artificial Intelligence

2509.26377

Genre: Research Report (0.65)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)

Add feedback

On Creating a Causally Grounded Usable Rating Method for Assessing the Robustness of Foundation Models Supporting Time Series

Lakkaraju, Kausik, Kaur, Rachneet, Zehtabi, Parisa, Patra, Sunandita, Valluru, Siva Likitha, Zeng, Zhen, Srivastava, Biplav, Valtorta, Marco

arXiv.org Artificial IntelligenceFeb-17-2025

Foundation Models (FMs) have improved time series forecasting in various sectors, such as finance, but their vulnerability to input disturbances can hinder their adoption by stakeholders, such as investors and analysts. To address this, we propose a causally grounded rating framework to study the robustness of Foundational Models for Time Series (FMTS) with respect to input perturbations. We evaluate our approach to the stock price prediction problem, a well-studied problem with easily accessible public data, evaluating six state-of-the-art (some multi-modal) FMTS across six prominent stocks spanning three industries. The ratings proposed by our framework effectively assess the robustness of FMTS and also offer actionable insights for model selection and deployment. Within the scope of our study, we find that (1) multi-modal FMTS exhibit better robustness and accuracy compared to their uni-modal versions and, (2) FMTS pre-trained on time series forecasting task exhibit better robustness and forecasting accuracy compared to general-purpose FMTS pre-trained across diverse settings. Further, to validate our framework's usability, we conduct a user study showcasing FMTS prediction errors along with our computed ratings. The study confirmed that our ratings reduced the difficulty for users in comparing the robustness of different systems.

data mining, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2502.12226

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry:

Health & Medicine (1.00)
Information Technology (0.67)
Banking & Finance (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Modeling & Simulation (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
(2 more...)

Add feedback

Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping

Zhang, Muru, Mishra, Mayank, Zhou, Zhongzhu, Brandon, William, Wang, Jue, Kim, Yoon, Ragan-Kelley, Jonathan, Song, Shuaiwen Leon, Athiwaratkun, Ben, Dao, Tri

arXiv.org Artificial IntelligenceJan-11-2025

Large language model inference is both memory-intensive and time-consuming, often requiring distributed algorithms to efficiently scale. Various model parallelism strategies are used in multi-gpu training and inference to partition computation across multiple devices, reducing memory load and computation time. However, using model parallelism necessitates communication of information between GPUs, which has been a major bottleneck and limits the gains obtained by scaling up the number of devices. We introduce Ladder Residual, a simple architectural modification applicable to all residual-based models that enables straightforward overlapping that effectively hides the latency of communication. Our insight is that in addition to systems optimization, one can also redesign the model architecture to decouple communication from computation. While Ladder Residual can allow communication-computation decoupling in conventional parallelism patterns, we focus on Tensor Parallelism in this paper, which is particularly bottlenecked by its heavy communication. For a Transformer model with 70B parameters, applying Ladder Residual to all its layers can achieve 30% end-to-end wall clock speed up at inference time with TP sharding over 8 devices. We refer the resulting Transformer model as the Ladder Transformer. We train a 1B and 3B Ladder Transformer from scratch and observe comparable performance to a standard dense transformer baseline. We also show that it is possible to convert parts of the Llama-3.1 8B model to our Ladder Residual architecture with minimal accuracy degradation by only retraining for 3B tokens.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2501.06589

Country:

North America > United States (0.67)
Europe (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)

Add feedback

Event USKT : U-State Space Model in Knowledge Transfer for Event Cameras

Lin, Yuhui, Zhang, Jiahao, Li, Siyuan, Xiao, Jimin, Xu, Ding, Wu, Wenjun, Lu, Jiaxuan

arXiv.org Artificial IntelligenceNov-22-2024

Event cameras, as an emerging imaging technology, offer distinct advantages over traditional RGB cameras, including reduced energy consumption and higher frame rates. However, the limited quantity of available event data presents a significant challenge, hindering their broader development. To alleviate this issue, we introduce a tailored U-shaped State Space Model Knowledge Transfer (USKT) framework for Event-to-RGB knowledge transfer. This framework generates inputs compatible with RGB frames, enabling event data to effectively reuse pre-trained RGB models and achieve competitive performance with minimal parameter tuning. Within the USKT architecture, we also propose a bidirectional reverse state space model. Unlike conventional bidirectional scanning mechanisms, the proposed Bidirectional Reverse State Space Model (BiR-SSM) leverages a shared weight strategy, which facilitates efficient modeling while conserving computational resources. In terms of effectiveness, integrating USKT with ResNet50 as the backbone improves model performance by 0.95%, 3.57%, and 2.9% on DVS128 Gesture, N-Caltech101, and CIFAR-10-DVS datasets, respectively, underscoring USKT's adaptability and effectiveness. The code will be made available upon acceptance.

artificial intelligence, computer vision, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.15276

Country:

North America > United States > Illinois (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Liaoning Province > Dalian (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Health Care Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback

Rating Multi-Modal Time-Series Forecasting Models (MM-TSFM) for Robustness Through a Causal Lens

Lakkaraju, Kausik, Kaur, Rachneet, Zeng, Zhen, Zehtabi, Parisa, Patra, Sunandita, Srivastava, Biplav, Valtorta, Marco

arXiv.org Machine LearningJun-12-2024

AI systems are notorious for their fragility; minor input changes can potentially cause major output swings. When such systems are deployed in critical areas like finance, the consequences of their uncertain behavior could be severe. In this paper, we focus on multi-modal time-series forecasting, where imprecision due to noisy or incorrect data can lead to erroneous predictions, impacting stakeholders such as analysts, investors, and traders. Recently, it has been shown that beyond numeric data, graphical transformations can be used with advanced visual models to achieve better performance. In this context, we introduce a rating methodology to assess the robustness of Multi-Modal Time-Series Forecasting Models (MM-TSFM) through causal analysis, which helps us understand and quantify the isolated impact of various attributes on the forecasting accuracy of MM-TSFM. We apply our novel rating method on a variety of numeric and multi-modal forecasting models in a large experimental setup (six input settings of control and perturbations, ten data distributions, time series from six leading stocks in three industries over a year of data, and five time-series forecasters) to draw insights on robust forecasting models and the context of their strengths. Within the scope of our study, our main result is that multi-modal (numeric + visual) forecasting, which was found to be more accurate than numeric forecasting in previous studies, can also be more robust in diverse settings. Our work will help different stakeholders of time-series forecasting understand the models` behaviors along trust (robustness) and accuracy dimensions to select an appropriate model for forecasting using our rating method, leading to improved decision-making.

mm-tsfm, perturbation, stock price, (15 more...)

arXiv.org Machine Learning

2406.12908

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > Greenland (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Information Technology (0.93)
Banking & Finance > Trading (0.90)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Less is More: Physical-enhanced Radar-Inertial Odometry

Huang, Qiucan, Liang, Yuchen, Qiao, Zhijian, Shen, Shaojie, Yin, Huan

arXiv.org Artificial IntelligenceFeb-3-2024

Radar offers the advantage of providing additional physical properties related to observed objects. In this study, we design a physical-enhanced radar-inertial odometry system that capitalizes on the Doppler velocities and radar cross-section information. The filter for static radar points, correspondence estimation, and residual functions are all strengthened by integrating the physical properties. We conduct experiments on both public datasets and our self-collected data, with different mobile platforms and sensor types. Our quantitative results demonstrate that the proposed radar-inertial odometry system outperforms alternative methods using the physical-enhanced components. Our findings also reveal that using the physical properties results in fewer radar points for odometry estimation, but the performance is still guaranteed and even improved, thus aligning with the ``less is more'' principle.

doppler velocity, estimation, radar point, (12 more...)

arXiv.org Artificial Intelligence

2402.022

Country:

Asia > China > Hong Kong (0.05)
North America > United States > Texas (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.99)
(2 more...)

Add feedback

Broadband Ground Motion Synthesis via Generative Adversarial Neural Operators: Development and Validation

Shi, Yaozhong, Lavrentiadis, Grigorios, Asimaki, Domniki, Ross, Zachary E., Azizzadenesheli, Kamyar

arXiv.org Artificial IntelligenceJan-5-2024

We present a data-driven model for ground-motion synthesis using a Generative Adversarial Neural Operator (GANO) that combines recent advancements in machine learning and open access strong motion data sets to generate three-component acceleration time histories conditioned on moment magnitude ($M$), rupture distance ($R_{rup}$), time-average shear-wave velocity at the top $30m$ ($V_{S30}$), and tectonic environment or style of faulting. We use Neural Operators, a resolution invariant architecture that guarantees that the model training is independent of the data sampling frequency. We first present the conditional ground-motion synthesis algorithm (referred to heretofore as cGM-GANO) and discuss its advantages compared to previous work. Next, we verify the cGM-GANO framework using simulated ground motions generated with the Southern California Earthquake Center (SCEC) Broadband Platform (BBP). We lastly train cGM-GANO on a KiK-net dataset from Japan, showing that the framework can recover the magnitude, distance, and $V_{S30}$ scaling of Fourier amplitude and pseudo-spectral accelerations. We evaluate cGM-GANO through residual analysis with the empirical dataset as well as by comparison with conventional Ground Motion Models (GMMs) for selected ground motion scenarios. Results show that cGM-GANO produces consistent median scaling with the GMMs for the corresponding tectonic environments. The largest misfit is observed at short distances due to the scarcity of training data. With the exception of short distances, the aleatory variability of the response spectral ordinates is also well captured, especially for subduction events due to the adequacy of training data. Applications of the presented framework include generation of risk-targeted ground motions for site-specific engineering applications.

artificial intelligence, ground motion, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2309.03447

Country:

North America > United States > California (1.00)
Asia > Japan (0.24)

Genre: Research Report > New Finding (0.34)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback