AITopics | Pattern Recognition

NAVI: Category-Agnostic Image Collections with High-Quality 3D Shape and Pose Annotations

Neural Information Processing SystemsMay-25-2025, 16:08:14 GMT

Recent advances in neural reconstruction enable high-quality 3D object reconstruction from casually captured image collections. Current techniques mostly analyze their progress on relatively simple image collections where Structurefrom-Motion (SfM) techniques can provide ground-truth (GT) camera poses. We note that SfM techniques tend to fail on in-the-wild image collections such as image search results with varying backgrounds and illuminations. To enable systematic research progress on 3D reconstruction from casual image captures, we propose'NAVI': a new dataset of category-agnostic image collections of objects with high-quality 3D scans along with per-image 2D-3D alignments providing near-perfect GT camera parameters. These 2D-3D alignments allow us to extract accurate derivative annotations such as dense pixel correspondences, depth and segmentation maps. We demonstrate the use of NAVI image collections on different problem settings and show that NAVI enables more thorough evaluations that were not possible with existing datasets. We believe NAVI is beneficial for systematic research progress on 3D reconstruction and correspondence estimation.

artificial intelligence, machine learning, pattern recognition, (17 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Industry: Media > Photography (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Communications (0.93)
Information Technology > Sensing and Signal Processing > Image Processing (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.35)

Add feedback

e24570da4fa1c005b189104250993aee-Paper-Conference.pdf

Neural Information Processing SystemsMay-25-2025, 14:32:33 GMT

machine learning, natural language, pattern recognition, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Law > Civil Rights & Constitutional Law (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
(2 more...)

Add feedback

EV-Eye: Rethinking High-frequency Eye Tracking through the Lenses of Event Cameras,Yurun Yang

Neural Information Processing SystemsMay-25-2025, 11:07:15 GMT

In this paper, we present EV-Eye, a first-of-its-kind large-scale multimodal eye tracking dataset aimed at inspiring research on high-frequency eye/gaze tracking. EV-Eye utilizes the emerging bio-inspired event camera to capture independent pixel-level intensity changes induced by eye movements, achieving submicrosecond latency. Our dataset was curated over two weeks and collected from 48 participants encompassing diverse genders and age groups. It comprises over 1.5 million near-eye grayscale images and 2.7 billion event samples generated by two DAVIS346 event cameras. Additionally, the dataset contains 675 thousand scene images and 2.7 million gaze references captured by a Tobii Pro Glasses 3 eye tracker for cross-modality validation. Compared with existing event-based high-frequency eye tracking datasets, our dataset is significantly larger in size, and the gaze references involve more natural and diverse eye movement patterns, i.e., fixation, saccade, and smooth pursuit. Alongside the event data, we also present a hybrid eye tracking method as a benchmark, which leverages both the near-eye grayscale images and event data for robust and high-frequency eye tracking. We show that our method achieves higher accuracy for both pupil and gaze estimation tasks compared to the existing solution.

artificial intelligence, machine learning, pattern recognition, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.29)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(2 more...)

Add feedback

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model Lin Song 2 Hang Wang 1

Neural Information Processing SystemsMay-25-2025, 08:37:59 GMT

The contrastive vision-language pre-training, known as CLIP, demonstrates remarkable potential in perceiving open-world visual concepts, enabling effective zero-shot image recognition. Nevertheless, few-shot learning methods based on CLIP typically require offline fine-tuning of the parameters on few-shot samples, resulting in longer inference time and the risk of over-fitting in certain domains. To tackle these challenges, we propose the Meta-Adapter, a lightweight residualstyle adapter, to refine the CLIP features guided by the few-shot samples in an online manner. With a few training samples, our method can enable effective few-shot learning capabilities and generalize to unseen data or tasks without additional fine-tuning, achieving competitive performance and high efficiency. Without bells and whistles, our approach outperforms the state-of-the-art online few-shot learning method by an average of 3.6% on eight image classification datasets with higher inference speed. Furthermore, our model is simple and flexible, serving as a plug-and-play module directly applicable to downstream tasks. Without further fine-tuning, Meta-Adapter obtains notable performance improvements in open-vocabulary object detection and segmentation tasks.

large language model, machine learning, pattern recognition, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.37)
(2 more...)

Add feedback

a547d86953a4e36aa8a1390e6f4708e2-Paper-Conference.pdf

Neural Information Processing SystemsMay-25-2025, 07:38:11 GMT

machine learning, natural language, pattern recognition, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland (0.14)
North America > United States > Hawaii (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Israel (0.14)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

a0a53fefef4c2ad72d5ab79703ba70cb-Paper-Conference.pdf

Neural Information Processing SystemsMay-25-2025, 07:03:41 GMT

artificial intelligence, computer vision and pattern recognition, machine learning, (13 more...)

Neural Information Processing Systems

Country:

Asia > China (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States (0.14)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (0.68)
(2 more...)

Add feedback

Disentangling Voice and Content with Self-Supervision for Speaker Recognition, Kong Aik Lee

Neural Information Processing SystemsMay-25-2025, 06:41:47 GMT

For speaker recognition, it is difficult to extract an accurate speaker representation from speech because of its mixture of speaker traits and content. This paper proposes a disentanglement framework that simultaneously models speaker traits and content variability in speech. It is realized with the use of three Gaussian inference layers, each consisting of a learnable transition model that extracts distinct speech components. Notably, a strengthened transition model is specifically designed to model complex speech dynamics. We also propose a self-supervision method to dynamically disentangle content without the use of labels other than speaker identities. The efficacy of the proposed framework is validated via experiments conducted on the VoxCeleb and SITW datasets with 9.56% and 8.24% average reductions in EER and minDCF, respectively. Since neither additional model training nor data is specifically needed, it is easily applicable in practical use.

machine learning, natural language, pattern recognition, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Speech Recognition (0.63)

Add feedback

Enhancing User Intent Capture in Session-Based Recommendation with Attribute Patterns Zheng Li2 Yifan Gao 2 Jingfeng Yang

Neural Information Processing SystemsMay-24-2025, 03:12:51 GMT

The goal of session-based recommendation in E-commerce is to predict the next item that an anonymous user will purchase based on the browsing and purchase history. However, constructing global or local transition graphs to supplement session data can lead to noisy correlations and user intent vanishing. In this work, we propose the Frequent Attribute Pattern Augmented Transformer (FAPAT) that characterizes user intents by building attribute transition graphs and matching attribute patterns. Specifically, the frequent and compact attribute patterns are served as memory to augment session representations, followed by a gate and a transformer block to fuse the whole session information. Through extensive experiments on two public benchmarks and 100 million industrial data in three domains, we demonstrate that FAPAT consistently outperforms state-of-the-art methods by an average of 4.5% across various evaluation metrics (Hits, NDCG, MRR). Besides evaluating the next-item prediction, we estimate the models' capabilities to capture user intents via predicting items' attributes and period-item recommendations.

data mining, machine learning, pattern recognition, (22 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Services > e-Commerce Services (0.35)

Technology:

Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.69)
(2 more...)

Add feedback

ADT and Yale partner on Z-Wave lock with fingerprint recognition

PCWorldApr-30-2025, 13:00:00 GMT

ADT offers Yale Assure locks with its ADT home security systems, and now the security service provider has partnered with Yale and the Z-Wave Alliance to introduce the Yale Assure Lock 2 Touch with Z-Wave. This is the first Z-Wave lock with fingerprint recognition that is certified to use the Z-Wave User Credential Command Class specification that was released in June 2024. The new lock also features the latest generation Z-Wave 800 chipset, which promises longer battery life and improved range on a Z-Wave mesh network. Thanks to its use of the Z-Wave User Credential Command Class spec, ADT subscribers will be able arm and disarm their security system at the same time they lock or unlock the new deadbolt, all by just touching their previously enrolled finger to the new lock. ADT offers the Yale Assure Lock 2 Touch with Z-Wave with its ADT home security systems, which can be self- or professionally installed.

artificial intelligence, machine learning, pattern recognition, (10 more...)

PCWorld

Industry: Information Technology > Smart Houses & Appliances (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Fingerprint Recognition (0.62)

Add feedback

Google AI Mode rolls out to more testers with new image search feature

EngadgetApr-7-2025, 16:00:54 GMT

Google is bringing AI Mode to more people in the US. The company announced on Monday it would make the new search tool, first launched at the start of last month, to millions of more Labs users across the country. For uninitiated, AI Mode is a new dedicated tab within Search. It allows you to ask more complicated questions of Google, with a custom version of Gemini 2.0 doing the legwork to deliver a nuanced AI-generated response. Labs, meanwhile, is a beta program you can enroll your Google account in to gain access to new Search features before the company rolls them out to the public.

ai mode, machine learning, pattern recognition, (7 more...)

Engadget

Country: North America > United States (0.28)

Technology:

Information Technology > Information Management > Search (0.64)
Information Technology > Sensing and Signal Processing > Image Processing (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)

Add feedback

Filters

Collaborating Authors

Pattern Recognition

NAVI: Category-Agnostic Image Collections with High-Quality 3D Shape and Pose Annotations

e24570da4fa1c005b189104250993aee-Paper-Conference.pdf

EV-Eye: Rethinking High-frequency Eye Tracking through the Lenses of Event Cameras,Yurun Yang

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model Lin Song 2 Hang Wang 1

a547d86953a4e36aa8a1390e6f4708e2-Paper-Conference.pdf

a0a53fefef4c2ad72d5ab79703ba70cb-Paper-Conference.pdf

Disentangling Voice and Content with Self-Supervision for Speaker Recognition, Kong Aik Lee

Enhancing User Intent Capture in Session-Based Recommendation with Attribute Patterns Zheng Li2 Yifan Gao 2 Jingfeng Yang

ADT and Yale partner on Z-Wave lock with fingerprint recognition

Google AI Mode rolls out to more testers with new image search feature