
Object Recognition


GeoDE: a Geographically Diverse Evaluation Dataset for Object Recognition

Neural Information Processing Systems

Current dataset collection methods typically scrape large amounts of data from the web. While this technique is extremely scalable, data collected in this way tends to reinforce stereotypical biases, can contain personally identifiable information, and typically originates from Europe and North America. In this work, we rethink the dataset collection paradigm and introduce GeoDE, a geographically diverse dataset with 61,940 images from 40 classes and 6 world regions, and no personally identifiable information, collected by soliciting images from people across the world. We analyse GeoDE to understand differences in images collected in this manner compared to web-scraping. Despite the smaller size of this dataset, we demonstrate its use as both an evaluation and training dataset, allowing us to highlight shortcomings in current models, as well as demonstrate improved performance even when training on this small dataset.
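The core use of an evaluation set like GeoDE is to break model accuracy down by world region and surface geographic performance gaps. A minimal sketch of that per-region breakdown (the class and region names here are illustrative, not taken from the dataset):

```python
from collections import defaultdict

def per_region_accuracy(predictions, labels, regions):
    """Compute classification accuracy broken down by world region."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, label, region in zip(predictions, labels, regions):
        total[region] += 1
        if pred == label:
            correct[region] += 1
    return {r: correct[r] / total[r] for r in total}

# Toy example: a model that does worse on images from one region.
preds   = ["stove", "stove", "car", "stove", "car", "car"]
labels  = ["stove", "car",   "car", "stove", "car", "stove"]
regions = ["Europe", "Africa", "Europe", "Africa", "Europe", "Africa"]

print(per_region_accuracy(preds, labels, regions))
# → {'Europe': 1.0, 'Africa': 0.3333333333333333}
```

A gap like the one in the toy output is exactly the kind of shortcoming the authors use GeoDE to highlight in web-scraped models.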


Deep Predictive Coding Network with Local Recurrent Processing for Object Recognition

Neural Information Processing Systems

Inspired by predictive coding - a theory in neuroscience, we develop a bi-directional and dynamic neural network with local recurrent processing, namely predictive coding network (PCN). Unlike feedforward-only convolutional neural networks, PCN includes both feedback connections, which carry top-down predictions, and feedforward connections, which carry bottom-up errors of prediction. Feedback and feedforward connections enable adjacent layers to interact locally and recurrently to refine representations towards minimization of layer-wise prediction errors. When unfolded over time, the recurrent processing gives rise to an increasingly deeper hierarchy of non-linear transformation, allowing a shallow network to dynamically extend itself into an arbitrarily deep network.
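The layer-wise error-minimization loop described above can be reduced to a numerical sketch with a single linear layer pair (the actual PCN is convolutional and multi-layer; the feedback matrix, update rate, and toy dimensions here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer predictive coding loop (linear, no convolution):
# the higher layer predicts the lower representation; the prediction
# error is fed forward to refine the higher representation over time.
W_fb = rng.standard_normal((8, 16)) * 0.1   # feedback: higher -> lower prediction
x = rng.standard_normal(8)                   # lower-layer representation (fixed input)
r = np.zeros(16)                             # higher-layer representation
lr = 0.1                                     # update rate for recurrent refinement

errors = []
for t in range(20):                          # unfolded recurrent cycles
    pred = W_fb @ r                          # top-down prediction of x
    err = x - pred                           # bottom-up prediction error
    r = r + lr * (W_fb.T @ err)              # refine r to reduce the error
    errors.append(float(err @ err))

# Layer-wise prediction error shrinks across cycles.
print(errors[0], errors[-1])
```

Each pass through the loop is one "cycle" of recurrent processing; unfolding more cycles is what lets a shallow network behave like a deeper one.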


AI Assisted AR Assembly: Object Recognition and Computer Vision for Augmented Reality Assisted Assembly

Kyaw, Alexander Htet, Ma, Haotian, Zivkovic, Sasa, Sabin, Jenny

arXiv.org Artificial Intelligence

We present an AI-assisted Augmented Reality assembly workflow that uses deep learning-based object recognition to identify different assembly components and display step-by-step instructions. For each assembly step, the system displays a bounding box around the corresponding components in the physical space and indicates where each component should be placed. By connecting assembly instructions with the real-time locations of the relevant components, the system eliminates the need to manually search, sort, or label components before assembly. To demonstrate the feasibility of using object recognition for AR-assisted assembly, we highlight a case study involving the assembly of LEGO sculptures.
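The matching step described above, selecting which detected components to highlight for the current instruction, can be sketched as follows (the class names, box format, and step structure are illustrative assumptions, not the paper's actual data model):

```python
# Hypothetical sketch: match per-frame detections to the current assembly
# step and return the bounding boxes to highlight in the AR overlay.
# Boxes are (x, y, w, h) in image coordinates.

assembly_steps = [
    {"step": 1, "needs": ["2x4_brick_red"]},
    {"step": 2, "needs": ["2x2_brick_blue", "2x4_brick_red"]},
]

def boxes_to_highlight(detections, step):
    """Select detections whose class is required by the current step."""
    needed = set(step["needs"])
    return [d for d in detections if d["label"] in needed]

frame_detections = [
    {"label": "2x4_brick_red", "box": (120, 80, 40, 20), "score": 0.93},
    {"label": "2x2_brick_blue", "box": (300, 150, 20, 20), "score": 0.88},
    {"label": "baseplate", "box": (0, 0, 640, 480), "score": 0.99},
]

print(boxes_to_highlight(frame_detections, assembly_steps[0]))
```

Because the lookup runs per frame against live detections, the highlighted boxes track the components as they move, which is what removes the need for pre-sorting.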


Reviews: Deep Predictive Coding Network with Local Recurrent Processing for Object Recognition

Neural Information Processing Systems

This paper presents the predictive coding network (PCN), a convolutional architecture with local recurrent and feedback connections. Higher layers provide top-down predictions while lower layers return prediction errors, which are refined over time by the local recurrence. The idea is not new; other work (such as that of Lotter et al.) has applied it to tasks such as video prediction and object recognition, though it had yet to be shown to scale to larger tasks such as ImageNet. The authors compare the performance of PCN, with varying numbers of cycles of recurrent processing, to standard CNN architectures on multiple image datasets. In general, PCN achieves slightly lower error than standard architectures with a comparable number of parameters.


Why The Brain Separates Face Recognition From Object Recognition

Neural Information Processing Systems

Many studies have uncovered evidence that visual cortex contains specialized regions involved in processing faces but not other object classes. Recent electrophysiology studies of cells in several of these specialized regions revealed that at least some of these regions are organized in a hierarchical manner with viewpoint-specific cells projecting to downstream viewpoint-invariant identity-specific cells [1]. A separate computational line of reasoning leads to the claim that some transformations of visual inputs that preserve viewed object identity are class-specific. In particular, the 2D images evoked by a face undergoing a 3D rotation are not produced by the same image transformation (2D) that would produce the images evoked by an object of another class undergoing the same 3D rotation. However, within the class of faces, knowledge of the image transformation evoked by 3D rotation can be reliably transferred from previously viewed faces to help identify a novel face at a new viewpoint. We show, through computational simulations, that an architecture which applies this method of gaining invariance to class-specific transformations is effective when restricted to faces and fails spectacularly when applied to other object classes. We argue here that in order to accomplish viewpoint-invariant face identification from a single example view, visual cortex must separate the circuitry involved in discounting 3D rotations of faces from the generic circuitry involved in processing other objects. The resulting model of the ventral stream of visual cortex is consistent with the recent physiology results showing the hierarchical organization of the face processing network.


Compositionality, MDL Priors, and Object Recognition

Neural Information Processing Systems

Images are ambiguous at each of many levels of a contextual hierarchy. Nevertheless, the high-level interpretation of most scenes is unambiguous, as evidenced by the superior performance of humans. This observation argues for global vision models, such as deformable templates. Unfortunately, such models are computationally intractable for unconstrained problems. We propose a compositional model in which primitives are recursively composed, subject to syntactic restrictions, to form tree-structured objects and object groupings.


A Framework For Refining Text Classification and Object Recognition from Academic Articles

Li, Jinghong, Ota, Koichi, Gu, Wen, Hasegawa, Shinobu

arXiv.org Artificial Intelligence

With the widespread use of the internet, it has become increasingly crucial to extract specific information efficiently from vast amounts of academic articles. Data mining techniques are generally employed to solve this issue. However, data mining for academic articles is challenging since it requires automatically extracting specific patterns from documents with complex, unstructured layouts. Current data mining methods for academic articles employ rule-based (RB) or machine learning (ML) approaches. However, rule-based methods incur a high coding cost for articles with complex typesetting, while machine learning methods alone require costly annotation work for the complex content types within a paper. Furthermore, relying solely on machine learning can cause patterns that rule-based methods recognize easily to be extracted incorrectly. To overcome these issues, we analyze the standard layout and typesetting used in a specified publication and emphasize implementing methods tailored to the specific characteristics of academic articles. We have developed a novel Text Block Refinement Framework (TBRF), a hybrid of machine learning and rule-based schemes. We used the well-known ACL proceedings articles as experimental data for the validation experiment. The experiment shows that our approach achieved over 95% classification accuracy and 90% detection accuracy for tables and figures.
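The hybrid idea, letting cheap layout rules take precedence and falling back to a learned model only for ambiguous blocks, can be sketched as follows (the rules, labels, and stand-in model here are illustrative, not the paper's actual TBRF pipeline):

```python
import re

def rule_based_label(text):
    """Return a label when a cheap layout rule fires, else None."""
    if re.match(r"^(Table|Figure)\s+\d+", text):
        return "caption"
    if re.match(r"^\[\d+\]", text):
        return "reference"
    return None

def classify_block(text, ml_model):
    """Rules take precedence; the ML model handles the ambiguous rest."""
    label = rule_based_label(text)
    return label if label is not None else ml_model(text)

# Stand-in for a trained text classifier.
dummy_model = lambda text: "body"

print(classify_block("Table 3: Detection accuracy", dummy_model))   # caption
print(classify_block("We evaluate on ACL proceedings.", dummy_model))  # body
```

Routing the easy patterns through rules both avoids annotating them for training and prevents the failure mode the abstract mentions, where a learned model mis-extracts patterns that rules handle trivially.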


CNN-based Methods for Object Recognition with High-Resolution Tactile Sensors

Gandarias, Juan M., García-Cerezo, Alfonso J., Gómez-de-Gabriel, Jesús M.

arXiv.org Artificial Intelligence

Novel high-resolution pressure-sensor arrays allow pressure readings to be treated as standard images, so computer vision algorithms and methods such as Convolutional Neural Networks (CNNs) can be used to identify objects in contact. In this paper, a high-resolution tactile sensor has been attached to a robotic end-effector to identify contacted objects. Two CNN-based approaches have been employed to classify the pressure images: a transfer learning approach using a CNN pre-trained on an RGB-image dataset, and a custom-made CNN (TactNet) trained from scratch with tactile information. The transfer learning approach can be carried out by retraining the classification layers of the network or by replacing these layers with an SVM. Overall, 11 configurations based on these methods have been tested: 8 transfer-learning-based and 3 TactNet-based. Moreover, a study of the performance of the methods and a comparative discussion with the current state of the art in tactile object recognition are presented.
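The "retrain only the classification layers" variant boils down to keeping a pre-trained feature extractor frozen and fitting a new head on tactile data. A dependency-free numpy sketch of that idea, with a fixed random projection standing in for the pre-trained CNN backbone and fully synthetic "pressure images" (none of the shapes or data reflect the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(1)

def frozen_features(x, W):
    """Stand-in for pre-trained conv features: fixed projection + ReLU."""
    return np.maximum(0.0, x @ W)

# Synthetic 8x8 pressure images flattened to 64-dim vectors, two classes.
X = rng.standard_normal((200, 64))
true_w = rng.standard_normal(64)
y = (X @ true_w > 0).astype(float)

W_frozen = rng.standard_normal((64, 64)) * 0.2   # backbone: never updated
F = frozen_features(X, W_frozen)
F = F - F.mean(axis=0)                           # center (helps plain GD)

# Train only the head: logistic regression by gradient descent.
w, b = np.zeros(F.shape[1]), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
    w -= 1.0 * (F.T @ (p - y)) / len(y)
    b -= 1.0 * float(np.mean(p - y))

acc = float(np.mean(((F @ w + b) > 0) == (y > 0.5)))
print(f"training accuracy of the retrained head: {acc:.2f}")
```

Swapping the logistic head for an SVM, as in the paper's alternative configuration, changes only the final classifier; the frozen feature computation stays identical.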


What is Computer Vision and its Benefits - Rishabh Software

#artificialintelligence

Image Acquisition: The first step in computer vision is to acquire an image or video feed. This can be done using a camera or other imaging device. Pre-Processing: Once the image is acquired, it needs to be pre-processed to make it easier for the computer to analyze. This may involve noise reduction, image enhancement, or color correction. Feature Extraction: In this step, the computer analyzes the image to identify and extract specific features relevant to the task.
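The three steps above can be walked through end to end on a synthetic grayscale frame (a numpy array standing in for a camera capture; the blur kernel and gradient-based features are simple illustrative choices, not the only options):

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Image acquisition: a synthetic 32x32 frame with a bright square,
#    plus additive sensor noise.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
image += rng.normal(0.0, 0.1, image.shape)

# 2. Pre-processing: a 3x3 mean filter for noise reduction.
def box_blur(img):
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy : 1 + dy + img.shape[0],
                          1 + dx : 1 + dx + img.shape[1]]
    return out / 9.0

smoothed = box_blur(image)

# 3. Feature extraction: image gradients as crude edge features.
gy, gx = np.gradient(smoothed)
edge_strength = np.hypot(gx, gy)

print("mean edge strength:", float(edge_strength.mean()))
```

The edge map responds strongly along the square's border and weakly in flat regions, which is the kind of task-relevant feature a downstream recognizer would consume.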