AITopics | object category

Hierarchical Object Representation for Open-Ended Object Category Learning and Recognition

Neural Information Processing Systems | Mar-17-2026, 07:27:36 GMT

Most robots lack the ability to learn new objects from past experiences. To migrate a robot to a new environment one must often completely re-generate the knowledge-base that it is running with. Since in open-ended domains the set of categories to be learned is not predefined, it is not feasible to assume that one can pre-program all object categories required by robots. Therefore, autonomous robots must have the ability to continuously execute learning and recognition in a concurrent and interleaved fashion. This paper proposes an open-ended 3D object recognition system which concurrently learns both the object categories and the statistical features for encoding objects. In particular, we propose an extension of Latent Dirichlet Allocation to learn structural semantic features (i.e.

artificial intelligence, name change, proceedings

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Evaluating authenticity and quality of image captions via sentiment and semantic analyses

Krotov, Aleksei, Tebo, Alison, Picart, Dylan K., Algave, Aaron Dean

arXiv.org Artificial Intelligence | Sep-14-2024

The growth of deep learning (DL) relies heavily on huge amounts of labelled data for tasks such as natural language processing and computer vision. Specifically, in image-to-text or image-to-image pipelines, opinion (sentiment) may be inadvertently learned by a model from human-generated image captions. Additionally, learning may be affected by the variety and diversity of the provided captions. While labelling large datasets has largely relied on crowd-sourcing or data-worker pools, evaluating the quality of such training data is crucial. This study proposes an evaluation method focused on sentiment and semantic richness. That method was applied to the COCO-MS dataset, comprising approximately 150K images with segmented objects and corresponding crowd-sourced captions. We employed pre-trained models (Twitter-RoBERTa-base and BERT-base) to extract sentiment scores and variability of semantic embeddings from captions. The relation of the sentiment score and semantic variability with object categories was examined using multiple linear regression. Results indicate that while most captions were neutral, about 6% of the captions exhibited strong sentiment influenced by specific object categories. Semantic variability of within-image captions remained low and uncorrelated with object categories. Model-generated captions showed less than 1.5% of strong sentiment which was not influenced by object categories and did not correlate with the sentiment of the respective human-generated captions. This research demonstrates an approach to assess the quality of crowd- or worker-sourced captions informed by image content.

caption, category, sentiment

2409.0956

Country:

North America > United States > Virginia > Loudoun County > Ashburn (0.04)
North America > United States > New York > Bronx County > New York City (0.04)
Europe > Middle East > Malta > Eastern Region > Northern Harbour District > St. Julian's (0.04)

Genre: Research Report > New Finding (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

FreeA: Human-object Interaction Detection using Free Annotation Labels

Wang, Yuxiao, Wei, Zhenao, Jiang, Xinyu, Lei, Yu, Xue, Weiying, Liu, Jinxiu, Liu, Qi

arXiv.org Artificial Intelligence | Mar-4-2024

Recent human-object interaction (HOI) detection approaches rely on costly manual labor and require comprehensively annotated image datasets. In this paper, we propose a novel self-adaption language-driven HOI detection method, termed FreeA, that requires no labeling and leverages the adaptability of CLIP to generate latent HOI labels. Specifically, FreeA matches image features of human-object pairs with HOI text templates, and a mask method based on a priori knowledge is developed to suppress improbable interactions. In addition, FreeA utilizes the proposed interaction correlation matching method to enhance the likelihood of actions related to a specified action, further refining the generated HOI labels. Experiments on two benchmark datasets show that FreeA achieves state-of-the-art performance among weakly supervised HOI models. In localizing and classifying interactive actions, our approach is more accurate than the newest weakly model by +8.58 mean Average Precision (mAP) on HICO-DET and +1.23 mAP on V-COCO, and more accurate than the latest weakly+ model by +1.68 mAP and +7.28 mAP, respectively. Code will be available at https://drliuqi.github.io/.
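
The matching step in this abstract can be illustrated with a minimal sketch. It is not the FreeA implementation: the toy vectors stand in for CLIP image and text embeddings, and the function names, the binary `prior_mask`, and the `temperature` value are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pseudo_labels(pair_feature, template_features, prior_mask, temperature=0.1):
    """Score one human-object pair against every HOI text template.

    prior_mask[i] is 1 if interaction i is plausible for the object and 0
    otherwise (hypothetical stand-in for a knowledge-based mask). Masked
    interactions receive probability 0; the rest are softmax-normalized.
    """
    scores = [cosine(pair_feature, t) / temperature for t in template_features]
    allowed = [s for s, m in zip(scores, prior_mask) if m]
    if not allowed:
        return [0.0] * len(scores)
    top = max(allowed)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, prior_mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: three templates, the third one ruled out by the mask.
pair = [1.0, 0.0]
templates = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
mask = [1, 1, 0]
print(pseudo_labels(pair, templates, mask))
```

The mask is applied before normalization so that an implausible interaction cannot absorb probability mass, even when its template is close to the image feature.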

detection, interaction, proceedings

2403.0184

Country: Asia > China (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Unifying Foundation Models with Quadrotor Control for Visual Tracking Beyond Object Categories

Saviolo, Alessandro, Rao, Pratyaksh, Radhakrishnan, Vivek, Xiao, Jiuhong, Loianno, Giuseppe

arXiv.org Artificial Intelligence | Oct-17-2023

Visual control enables quadrotors to adaptively navigate using real-time sensory data, bridging perception with action. Yet, challenges persist, including generalization across scenarios, maintaining reliability, and ensuring real-time responsiveness. This paper introduces a perception framework grounded in foundation models for universal object detection and tracking, moving beyond specific training categories. Integral to our approach is a multi-layered tracker integrated with the foundation detector, ensuring continuous target visibility, even when faced with motion blur, abrupt light shifts, and occlusions. Complementing this, we introduce a model-free controller tailored for resilient quadrotor visual tracking. Our system operates efficiently on limited hardware, relying solely on an onboard camera and an inertial measurement unit. Through extensive validation in diverse challenging indoor and outdoor environments, we demonstrate our system's effectiveness and adaptability. In conclusion, our research represents a step forward in quadrotor visual tracking, moving from task-specific methods to more versatile and adaptable operations.

quadrotor control, unifying foundation model, visual tracking

2310.04781

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.40)

Efficient Unsupervised Learning for Localization and Detection in Object Categories

Neural Information Processing Systems | Apr-6-2023, 15:32:17 GMT

We describe a novel method for learning templates for recognition and localization of objects drawn from categories. A generative model represents the configuration of multiple object parts with respect to an object coordinate system; these parts in turn generate image features. The complexity of the model in the number of features is low, meaning our model is much more efficient to train than comparative methods. Moreover, a variational approximation is introduced that allows learning to be orders of magnitude faster than previous approaches while incorporating many more features. Our model has been carefully tested on standard datasets; we compare with a number of recent template models.

efficient unsupervised learning, localization and detection, object category

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.40)

InfoMax Control for Acoustic Exploration of Objects by a Mobile Robot

Rebguns, Antons (Department of Computer Science, School of Information: Science, Technology, and Arts, University of Arizona) | Ford, Daniel (Department of Electrical and Computer Engineering, University of Arizona) | Fasel, Ian R (School of Information: Science, Technology, and Arts, University of Arizona)

AAAI Conferences | Aug-8-2011

Recently, information gain has been proposed as a candidate intrinsic motivation for lifelong learning agents that may not always have a specific task. In the InfoMax control framework, reinforcement learning is used to find a control policy for a POMDP in which movement and sensing actions are selected to reduce Shannon entropy as quickly as possible. In this study, we implement InfoMax control on a robot which can move between objects and perform sound-producing manipulations on them. We formulate a novel latent variable mixture model for acoustic similarities and learn InfoMax policies that allow the robot to rapidly reduce uncertainty about the categories of the objects in a room. We find that InfoMax with our improved acoustic model yields policies that achieve high classification accuracy. Interestingly, we also find that with an insufficient model, the InfoMax policy eventually learns to "bury its head in the sand" to avoid getting additional evidence that might increase uncertainty. We discuss the implications of this finding for InfoMax as a principle of intrinsic motivation in lifelong learning agents.
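
The idea of selecting actions to reduce Shannon entropy can be illustrated with a minimal sketch. It is a one-step greedy calculation over a discrete belief, not the reinforcement-learned POMDP policy the abstract describes; the action names and the likelihood tables are hypothetical.

```python
import math


def entropy(belief):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in belief if p > 0)


def posterior(belief, likelihood, obs):
    """Bayes update, where likelihood[c][obs] = P(obs | category c)."""
    unnorm = [p * likelihood[c][obs] for c, p in enumerate(belief)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def expected_entropy(belief, likelihood):
    """Expected posterior entropy after one observation from an action."""
    n_obs = len(likelihood[0])
    result = 0.0
    for obs in range(n_obs):
        p_obs = sum(p * likelihood[c][obs] for c, p in enumerate(belief))
        if p_obs > 0:
            result += p_obs * entropy(posterior(belief, likelihood, obs))
    return result


def greedy_action(belief, actions):
    """Pick the action whose observation is expected to leave the least entropy."""
    return min(actions, key=lambda name: expected_entropy(belief, actions[name]))


# Two object categories, two hypothetical manipulations.
belief = [0.5, 0.5]
actions = {
    "tap": [[0.9, 0.1], [0.1, 0.9]],   # sound depends strongly on category
    "push": [[0.5, 0.5], [0.5, 0.5]],  # sound carries no category information
}
print(greedy_action(belief, actions))
```

With a uniform belief, the uninformative action leaves the expected entropy at one bit, so the greedy rule prefers the action whose sound distinguishes the categories.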

artificial intelligence, machine learning, robot

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Arizona > Pima County > Tucson (0.14)
North America > United States > Wisconsin (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
