AITopics | score map

We present region-based, fully convolutional networks for accurate and efficient object detection. In contrast to previous region-based detectors such as Fast/Faster R-CNN [7, 19] that apply a costly per-region subnetwork hundreds of times, our region-based detector is fully convolutional with almost all computation shared on the entire image. To achieve this goal, we propose position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection. Our method can thus naturally adopt fully convolutional image classifier backbones, such as the latest Residual Networks (ResNets) [10], for object detection. We show competitive results on the PASCAL VOC datasets (e.g., 83.6% mAP on the 2007 set) with the 101-layer ResNet. Meanwhile, our result is achieved at a test-time speed of 170ms per image, 2.5-20× faster than the Faster R-CNN counterpart.

artificial intelligence, convolutional layer, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

8395fdf356059eaa92afd39e3952a677-Paper-Conference.pdf

Neural Information Processing Systems | Feb-16-2026, 06:36:17 GMT

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

LF-Net: Learning Local Features from Images

Yuki Ono, Eduard Trulls, Pascal Fua, Kwang Moo Yi

Neural Information Processing Systems | Feb-15-2026, 07:07:10 GMT

Neural Information Processing Systems http://nips.cc/

descriptor, keypoint, pipeline, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Switzerland > Vaud > Lausanne (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

9a49a25d845a483fae4be7e341368e36-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 03:44:42 GMT

arxiv preprint arxiv, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Asia > Singapore (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts

Raymond Yeh, Jinjun Xiong, Wen-Mei Hwu, Minh Do, Alexander Schwing

Neural Information Processing Systems | Nov-21-2025, 09:03:36 GMT

Textual grounding is an important but challenging task for human-computer interaction, robotics and knowledge mining.

machine learning, natural language, proc, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Industry: Information Technology (0.30)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

LF-Net: Learning Local Features from Images

Yuki Ono, Eduard Trulls, Pascal Fua, Kwang Moo Yi

Neural Information Processing Systems | Nov-20-2025, 21:16:53 GMT

Establishing correspondences across images is at the heart of many Computer Vision algorithms, such as those for wide-baseline stereo, object detection, and image retrieval.

artificial intelligence, keypoint, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Switzerland > Vaud > Lausanne (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.88)
Information Technology > Artificial Intelligence > Vision (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Search-TTA: A Multimodal Test-Time Adaptation Framework for Visual Search in the Wild

Derek Ming Siang Tan, Shailesh, Boyang Liu, Alok Raj, Qi Xuan Ang, Weiheng Dai, Tanishq Duhan, Jimmy Chiun, Yuhong Cao, Florian Shkurti, Guillaume Sartoretti

arXiv.org Artificial Intelligence | Nov-10-2025

To perform outdoor visual navigation and search, a robot may leverage satellite imagery to generate visual priors. This can help inform high-level search strategies, even when such images lack sufficient resolution for target recognition. However, many existing informative path planning or search-based approaches either assume no prior information, or use priors without accounting for how they were obtained. Recent work instead utilizes large Vision Language Models (VLMs) for generalizable priors, but their outputs can be inaccurate due to hallucination, leading to inefficient search. To address these challenges, we introduce Search-TTA, a multimodal test-time adaptation framework with a flexible plug-and-play interface compatible with various input modalities (e.g., image, text, sound) and planning methods (e.g., RL-based). First, we pretrain a satellite image encoder to align with CLIP's visual encoder to output probability distributions of target presence used for visual search. Second, our TTA framework dynamically refines CLIP's predictions during search using uncertainty-weighted gradient updates inspired by Spatial Poisson Point Processes. To train and evaluate Search-TTA, we curate AVS-Bench, a visual search dataset based on internet-scale ecological data containing 380k images and taxonomy data. We find that Search-TTA improves planner performance by up to 30.0%, particularly in cases with poor initial CLIP predictions due to domain mismatch and limited training data. It also performs comparably with significantly larger VLMs, and achieves zero-shot generalization via emergent alignment to unseen modalities. Finally, we deploy Search-TTA on a real UAV via hardware-in-the-loop testing, by simulating its operation within a large-scale simulation that provides onboard sensing.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.1135

Country:

Asia (0.28)
North America (0.28)

Genre: Research Report (0.82)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

8395fdf356059eaa92afd39e3952a677-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 08:04:06 GMT

international conference, optimization, score map, (13 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

RANet: Region Attention Network for Semantic Segmentation - Supplementary Material - Dingguo Shen

Neural Information Processing Systems | Aug-22-2025, 00:34:13 GMT

The first two authors share the contribution equally. Di Lin is the corresponding author of this paper. However, using the intermediate pixels requires extra computation. In Figure 3, we provide the segmentation results with/without using the intermediate pixel. In Table 2, we compare different strategies of using the representative scores in the region interaction. We also study the strategy of using only the representative scores in the region interaction.

intermediate pixel, pixel, representative score, (11 more...)

Neural Information Processing Systems

Country: