
Neural Information Processing Systems

We conducted experiments replacing the proposal generator, including MaskFormer [3] and the RPN in Mask R-CNN combined with a class-agnostic segmentation head [6, 7] (denoted as RPN+Seghead). We also report results for generating different numbers of proposals (N) with Mask2Former. Note that the original setting of MicroSeg uses Mask2Former with N = 100.
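As a rough illustration, varying the proposal budget N amounts to keeping only the top-N scoring class-agnostic mask proposals. A minimal sketch (function and variable names are our own, not from the paper):

```python
import numpy as np

def select_top_proposals(scores, masks, n):
    """Keep the n highest-scoring class-agnostic mask proposals.

    scores: (P,) proposal confidence scores; masks: (P, ...) binary masks.
    Returns the selected scores and masks, sorted by descending score.
    """
    order = np.argsort(scores)[::-1][:n]  # indices of the top-n scores
    return scores[order], masks[order]
```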


GNeSF: Generalizable Neural Semantic Fields Supplementary Material

Neural Information Processing Systems

To extract features from these source views, we employ a network with shared weights. Each vertex of the feature volume grid is projected onto the image feature maps, and its image features are obtained by interpolation. In comparison, our method segments accurately across various scenes. In several instances, our method correctly segments objects where Mask2Former produces incorrect results; for example, Mask2Former fails to segment objects such as the table in the third row. We show more qualitative comparisons with NeuralRecon in Figure 1.
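The vertex feature lookup described above (pinhole projection of each grid vertex, followed by bilinear interpolation on the feature map) can be sketched as follows. This is a minimal NumPy illustration under assumed camera conventions, not the paper's implementation:

```python
import numpy as np

def project_vertices(vertices, K, R, t):
    """Project 3D vertices (N, 3) to pixel coords via a pinhole camera.

    K: 3x3 intrinsics; R, t: world-to-camera rotation and translation.
    """
    cam = vertices @ R.T + t        # world frame -> camera frame
    uvw = cam @ K.T                 # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3] # perspective divide -> (N, 2) pixels

def bilinear_sample(feat, uv):
    """Bilinearly interpolate a feature map feat (H, W, C) at pixels uv (N, 2)."""
    H, W, _ = feat.shape
    u = np.clip(uv[:, 0], 0, W - 1 - 1e-6)
    v = np.clip(uv[:, 1], 0, H - 1 - 1e-6)
    u0, v0 = np.floor(u).astype(int), np.floor(v).astype(int)
    du, dv = (u - u0)[:, None], (v - v0)[:, None]
    f00, f01 = feat[v0, u0], feat[v0, u0 + 1]       # top-left, top-right
    f10, f11 = feat[v0 + 1, u0], feat[v0 + 1, u0 + 1]
    top = f00 * (1 - du) + f01 * du
    bot = f10 * (1 - du) + f11 * du
    return top * (1 - dv) + bot * dv                # (N, C) interpolated features
```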




GVCCS: A Dataset for Contrail Identification and Tracking on Visible Whole Sky Camera Sequences

Jarry, Gabriel, Dalmau, Ramon, Very, Philippe, Ballerini, Franck, Bocu, Stefania-Denisa

arXiv.org Artificial Intelligence

Aviation's climate impact includes not only CO2 emissions but also significant non-CO2 effects, especially from contrails. These ice clouds can alter Earth's radiative balance, potentially rivaling the warming effect of aviation CO2. Physics-based models provide useful estimates of contrail formation and climate impact, but their accuracy depends heavily on the quality of atmospheric input data and on assumptions used to represent complex processes like ice particle formation and humidity-driven persistence. Observational data from remote sensors, such as satellites and ground cameras, could be used to validate and calibrate these models. However, existing datasets do not cover all aspects of contrail dynamics and formation: they typically lack temporal tracking and do not attribute contrails to their source flights. To address these limitations, we present the Ground Visible Camera Contrail Sequences (GVCCS), a new open dataset of contrails recorded with a ground-based all-sky camera in the visible range. Each contrail is individually labeled and tracked over time, allowing a detailed analysis of its lifecycle. The dataset contains 122 video sequences (24,228 frames) and includes flight identifiers for contrails that form above the camera. As a reference, we also propose a unified deep learning framework for contrail analysis using a panoptic segmentation model that performs semantic segmentation (contrail pixel identification), instance segmentation (individual contrail separation), and temporal tracking in a single architecture. By providing high-quality, temporally resolved annotations and a benchmark for model evaluation, our work supports improved contrail monitoring and will facilitate better calibration of physical models. This sets the groundwork for more accurate climate impact understanding and assessments.


Vision-Guided Loco-Manipulation with a Snake Robot

Salagame, Adarsh, Potluri, Sasank, Vaidyanathan, Keshav Bharadwaj, Gangaraju, Kruthika, Sihite, Eric, Ramezani, Milad, Ramezani, Alireza

arXiv.org Artificial Intelligence

This paper presents the development and integration of a vision-guided loco-manipulation pipeline for Northeastern University's snake robot, COBRA. The system leverages a YOLOv8-based object detection model and depth data from an onboard stereo camera to estimate the 6-DOF pose of target objects in real time. We introduce a framework for autonomous detection and control, enabling closed-loop loco-manipulation for transporting objects to specified goal locations. Additionally, we demonstrate open-loop experiments in which COBRA successfully performs real-time object detection and loco-manipulation tasks.
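One typical way such a pipeline recovers the translational part of an object's pose is to back-project the detected bounding-box center using the stereo depth map and the camera intrinsics. The sketch below is a hypothetical illustration of that step, not the actual COBRA code:

```python
import numpy as np

def backproject_center(bbox, depth_map, K):
    """Back-project a detection's bbox center to a 3D point in the camera frame.

    Hypothetical helper: bbox is (x1, y1, x2, y2) in pixels, depth_map is an
    (H, W) metric depth image from the stereo camera, K is the 3x3 intrinsics.
    """
    u = (bbox[0] + bbox[2]) / 2.0                       # center pixel column
    v = (bbox[1] + bbox[3]) / 2.0                       # center pixel row
    z = float(depth_map[int(round(v)), int(round(u))])  # metric depth there
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * z / fx                               # inverse pinhole model
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```

A full 6-DOF estimate would add orientation, e.g. from the object's depth patch or a known model, which the sketch does not cover.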


Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Lei, Zhenxin, Yao, Man, Hu, Jiakui, Luo, Xinhao, Lu, Yanye, Xu, Bo, Li, Guoqi

arXiv.org Artificial Intelligence

Spiking Neural Networks (SNNs) have a low-power advantage but perform poorly in image segmentation tasks. The reason is that directly converting neural networks with complex architectural designs for segmentation tasks into spiking versions leads to performance degradation and non-convergence. To address this challenge, we first identify the modules in the architecture design that cause a severe reduction in spike firing, make targeted improvements, and propose the Spike2Former architecture. Second, we propose normalized integer spiking neurons to solve the training stability problem of SNNs with complex architectures. We set a new state of the art for SNNs on various semantic segmentation datasets, with significant improvements of +12.7% mIoU at 5.0× efficiency on ADE20K, +14.3% mIoU at 5.2× efficiency on VOC2012, and +9.1% mIoU at 6.6× efficiency on CityScapes.
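As a loose illustration of integer-valued spiking, a toy neuron can emit a bounded integer spike count per timestep instead of a binary spike, which carries more information through deep layers. This sketch is our own interpretation of the general idea, not the paper's normalized integer spiking neuron:

```python
import numpy as np

def integer_spike_neuron(inputs, threshold=1.0, v_max=4):
    """Toy integer-valued spiking neuron with soft reset.

    At each timestep the membrane potential accumulates the input, emits an
    integer spike count floor(v / threshold) clipped to [0, v_max], and
    subtracts the emitted charge (soft reset). Illustrative only.
    """
    v = 0.0
    spikes = []
    for x in inputs:
        v += x                                               # integrate input
        s = int(np.clip(np.floor(v / threshold), 0, v_max))  # integer spike count
        spikes.append(s)
        v -= s * threshold                                   # soft reset
    return spikes
```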


Benchmarking Pretrained Attention-based Models for Real-Time Recognition in Robot-Assisted Esophagectomy

de Jong, Ronald L. P. D., Khalil, Yasmina al, Jaspers, Tim J. M., van Jaarsveld, Romy C., Kuiper, Gino M., Li, Yiping, van Hillegersberg, Richard, Ruurda, Jelle P., Breeuwer, Marcel, van der Sommen, Fons

arXiv.org Artificial Intelligence

Esophageal cancer is among the most common types of cancer worldwide. It is traditionally treated using open esophagectomy, but in recent years, robot-assisted minimally invasive esophagectomy (RAMIE) has emerged as a promising alternative. However, robot-assisted surgery can be challenging for novice surgeons, as they often suffer from a loss of spatial orientation. Computer-aided anatomy recognition holds promise for improving surgical navigation, but research in this area remains limited. In this study, we developed a comprehensive dataset for semantic segmentation in RAMIE, featuring the largest collection of vital anatomical structures and surgical instruments to date. Handling this diverse set of classes presents challenges, including class imbalance and the recognition of complex structures such as nerves. This study aims to understand the challenges and limitations of current state-of-the-art algorithms on this novel dataset and problem. Therefore, we benchmarked eight real-time deep learning models using two pretraining datasets. We assessed both traditional and attention-based networks, hypothesizing that attention-based networks better capture global patterns and address challenges such as occlusion caused by blood or other tissues. The benchmark includes our RAMIE dataset and the publicly available CholecSeg8k dataset, enabling a thorough assessment of surgical segmentation tasks. Our findings indicate that pretraining on ADE20k, a dataset for semantic segmentation, is more effective than pretraining on ImageNet. Furthermore, attention-based models outperform traditional convolutional neural networks, with SegNeXt and Mask2Former achieving higher Dice scores, and Mask2Former additionally excelling in average symmetric surface distance.