AITopics | open image

impacts

Neural Information Processing Systems | Jun-20-2026, 18:40:04 GMT

The primary goal of PACBench is to catalyze the development of more capable, reliable, and physically grounded vision-language models (VLMs) and their fine-tuned variants for real-world robotic applications, often called vision-language-action models (VLAs). Because VLA fine-tuning typically relies on low-level trajectory data rather than higher-level reasoning, probing the underlying VLM's understanding of object Properties, action Affordances, and physical Constraints (PAC) gives us a grounded lens into the capabilities that downstream robotic policies will inherit. As a fine-grained diagnostic tool, PACBench can help researchers and developers identify specific PAC weaknesses in the base model and distinguish whether a VLA's performance stems from genuine physical common sense or simply memorized motion patterns, thereby guiding targeted improvements in model architectures, training methodologies, and dataset curation. This, in turn, can lead to robotic systems that are more predictable, less prone to errors stemming from a lack of physical common sense, able to perform a wide range of useful tasks, and better equipped for safe, effective collaboration in complex, everyday environments. The open release of our benchmark and its diverse data sources (including web-scale images, real-world humanoid captures, and simulated scenarios) is intended to foster broad community engagement and accelerate progress in this crucial area of AI. While any advancement in AI capabilities warrants ongoing consideration of its societal implications, our work focuses on enhancing the fundamental understanding and robustness of AI systems, which we see as a positive step towards more responsible AI development.

constraint, large language model, machine learning, (19 more...)

Industry: Appliances & Durable Goods (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

8af749935131cc8ea5dae4f6d8cdb304-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Feb-10-2026, 14:49:56 GMT

dataset, poisonable subset, subset, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)

8af749935131cc8ea5dae4f6d8cdb304-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Feb-10-2026, 14:49:51 GMT

backdoor attack, dataset, natural backdoor dataset, (13 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
Asia > Nepal (0.04)
Oceania > New Zealand > South Island > Marlborough District > Blenheim (0.04)
(2 more...)

Genre: Research Report (0.46)

Industry:

Information Technology > Security & Privacy (0.72)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Security & Privacy (0.92)

Culture Affordance Atlas: Reconciling Object Diversity Through Functional Mapping

Nwatu, Joan, Bai, Longju, Ignat, Oana, Mihalcea, Rada

arXiv.org Artificial Intelligence | Dec-4-2025

Culture shapes the objects people use and for what purposes, yet mainstream Vision-Language (VL) datasets frequently exhibit cultural biases, disproportionately favoring higher-income, Western contexts. This imbalance reduces model generalizability and perpetuates performance disparities, especially impacting lower-income and non-Western communities. To address these disparities, we propose a novel function-centric framework that categorizes objects by the functions they fulfill across diverse cultural and economic contexts. We implement this framework by creating the Culture Affordance Atlas, a re-annotated and culturally grounded restructuring of the Dollar Street dataset spanning 46 functions and 288 objects, publicly available at https://lit.eecs.umich.edu/CultureAffordance-Atlas/index.html. Through extensive empirical analyses using the CLIP model, we demonstrate that function-centric labels substantially reduce socioeconomic performance gaps between high- and low-income groups by a median of 6 pp (statistically significant), improving model effectiveness for lower-income contexts. Furthermore, our analyses reveal numerous culturally essential objects that are frequently overlooked in prominent VL datasets. Our contributions offer a scalable pathway toward building inclusive VL datasets and equitable AI systems.
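
To make the "percentage-point gap" metric in this abstract concrete, the following is a minimal sketch of one plausible way such a gap and its median reduction could be computed. The helper names `gap_pp` and `median_gap_reduction`, the per-function pairing, and every accuracy value are hypothetical placeholders chosen for illustration; they are not the authors' code, protocol, or results.

```python
from statistics import median


def gap_pp(high_acc: float, low_acc: float) -> float:
    """Accuracy gap between high- and low-income groups, in percentage points."""
    return (high_acc - low_acc) * 100.0


def median_gap_reduction(object_gaps, function_gaps):
    """Median per-category drop in the gap (pp) when switching label schemes."""
    return median(o - f for o, f in zip(object_gaps, function_gaps))


# Hypothetical (high-income, low-income) accuracies for three categories.
# These numbers are invented for the example, NOT taken from the paper.
object_centric = [(0.80, 0.60), (0.70, 0.55), (0.90, 0.80)]
function_centric = [(0.80, 0.70), (0.70, 0.60), (0.90, 0.85)]

object_gaps = [gap_pp(h, l) for h, l in object_centric]
function_gaps = [gap_pp(h, l) for h, l in function_centric]

reduction = median_gap_reduction(object_gaps, function_gaps)
print(f"median gap reduction: {reduction:.1f} pp")
```

Reporting the reduction in percentage points rather than as a relative percentage keeps the figure comparable across categories whose baseline accuracies differ.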

machine learning, natural language, open image, (18 more...)

arXiv:2512.03173

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Supplementary Materials A Extended Related Work (2)

Neural Information Processing Systems | Aug-16-2025, 19:21:22 GMT

We first discuss attacks that use physical objects as triggers, then discuss a few related works that use light as a trigger. We conclude by discussing the single proposed defense against physical backdoor attacks. As mentioned briefly in Section 2, [10] designs a backdoor attack against lane detection systems for autonomous vehicles. This attack expands the scope of physical backdoor attacks by attacking detection rather than classification models. Furthermore, it confirms the result from [43] that even when digitally altered images are used to poison a dataset, the triggers can be activated using physical objects (traffic cones in this setting) in real-world scenarios. A second work [31] evaluates the effectiveness of using facial characteristics as backdoor triggers.

artificial intelligence, machine learning, subset, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Finding Naturally Occurring Physical Backdoors in Image Datasets

Emily Wenger (University of Chicago), Roma Bhattacharjee

Neural Information Processing Systems | Aug-16-2025, 19:21:19 GMT

Extensive literature on backdoor poison attacks has studied attacks and defenses for backdoors using "digital trigger patterns." In contrast, "physical backdoors" use physical objects as triggers, have only recently been identified, and are qualitatively different enough to resist most defenses targeting digital trigger backdoors. Research on physical backdoors is limited by access to large datasets containing real images of physical objects co-located with misclassification targets. Building these datasets is time- and labor-intensive. This work seeks to address the challenge of accessibility for research on physical backdoor attacks.

artificial intelligence, deep learning, machine learning, (16 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.41)
Asia > Nepal (0.04)
Oceania > New Zealand > South Island > Marlborough District > Blenheim (0.04)
(2 more...)

Genre: Research Report (0.46)

Industry:

Information Technology > Security & Privacy (0.92)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Security & Privacy (0.92)

PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?

Gundawar, Atharva, Sagar, Som, Senanayake, Ransalu

arXiv.org Artificial Intelligence | Jul-1-2025

Vision-Language Models (VLMs) are increasingly pivotal for generalist robot manipulation, enabling tasks such as physical reasoning, policy generation, and failure detection. However, their proficiency in these high-level applications often assumes a deep understanding of low-level physical prerequisites, a capability that remains largely unverified. For robots to perform actions reliably, they must comprehend intrinsic object properties (e.g., material, weight), action affordances (e.g., graspable, stackable), and physical constraints (e.g., stability, reachability, or an object's state, such as being closed). Despite the widespread use of VLMs in manipulation tasks, we argue that off-the-shelf models may lack this granular, physically grounded understanding, as such prerequisites are often overlooked during training. To address this critical gap, we introduce PAC Bench, a comprehensive benchmark designed to systematically evaluate VLMs on their understanding of core Properties, Affordances, and Constraints (PAC) from a task executability perspective. PAC Bench features a diverse dataset with over 30,000 annotations, comprising 673 real-world images (115 object classes, 15 property types, and 1 to 3 affordances defined per class), 100 real-world humanoid-view scenarios, and 120 unique simulated constraint scenarios across four tasks. Our evaluations reveal significant gaps in the ability of current VLMs to grasp fundamental physical concepts, highlighting limitations in their suitability for reliable robot manipulation and pointing to key areas for targeted research. PAC Bench also serves as a standardized benchmark for rigorously evaluating physical reasoning in VLMs and guiding the development of more robust, physically grounded models for robotic applications. Project Page: https://pacbench.github.io/

constraint, large language model, machine learning, (19 more...)

arXiv:2506.23725

Country:

North America > United States > Arizona (0.04)
Africa > Mozambique > Gaza Province > Xai-Xai (0.04)

Genre: Research Report (0.63)

Industry: Appliances & Durable Goods (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

How to classify photos in 600 classes using nine million Open Images

#artificialintelligence | Dec-30-2019, 14:31:28 GMT

If you're looking to build an image classifier but need training data, look no further than Google Open Images. This massive image dataset contains over 30 million images and 15 million bounding boxes. Plus, Open Images is much more open and accessible than certain other image datasets at this scale. For example, ImageNet has restrictive licensing. However, it's not easy for developers on single machines to sift through that much data. You need to download and process multiple metadata files, and roll your own storage space (or apply for access to a Google Cloud bucket).

download, open image, sandwich, (8 more...)

Industry:

Information Technology > Services (0.37)
Leisure & Entertainment (0.33)

Technology:

Information Technology > Cloud Computing (0.57)
Information Technology > Artificial Intelligence > Machine Learning (0.53)

nocaps: novel object captioning at scale

Agrawal, Harsh, Desai, Karan, Chen, Xinlei, Jain, Rishabh, Batra, Dhruv, Parikh, Devi, Lee, Stefan, Anderson, Peter

arXiv.org Artificial Intelligence | Dec-20-2018

Image captioning models have achieved impressive results on datasets containing limited visual concepts and large amounts of paired image-caption training data. However, if these models are ever to function in the wild, a much larger variety of visual concepts must be learned, ideally from less supervision. To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task. Dubbed 'nocaps', for novel object captioning at scale, our benchmark consists of 166,100 human-generated captions describing 15,100 images from the Open Images validation and test sets. The associated training data consists of COCO image-caption pairs, plus Open Images image-level labels and object bounding boxes. Since Open Images contains many more classes than COCO, more than 500 object classes seen in test images have no training captions (hence, nocaps). We evaluate several existing approaches to novel object captioning on our challenging benchmark. In automatic evaluations, these approaches show modest improvements over a strong baseline trained only on image-caption data. However, even when using ground-truth object detections, the results are significantly weaker than our human baseline, indicating substantial room for improvement.
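
As a quick check on the benchmark's scale, the two counts quoted in the abstract imply an average caption density per image. The sketch below only divides those two published figures; the variable names are illustrative, and the abstract does not say how captions are distributed across individual images.

```python
# Counts quoted in the nocaps abstract.
total_captions = 166_100
total_images = 15_100

# Average number of human-written captions per image.
captions_per_image = total_captions / total_images
print(captions_per_image)  # 11.0
```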

artificial intelligence, caption, machine learning, (18 more...)

arXiv:1812.08658

Country: North America > United States > Texas > Jack County (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Leisure & Entertainment > Sports (1.00)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

No Classification without Representation: Assessing Geodiversity Issues in Open Data Sets for the Developing World

Shankar, Shreya, Halpern, Yoni, Breck, Eric, Atwood, James, Wilson, Jimbo, Sculley, D.

arXiv.org Machine Learning | Nov-22-2017

Modern machine learning systems such as image classifiers rely heavily on large-scale data sets for training. Such data sets are costly to create; thus, in practice, a small number of freely available, open-source data sets are widely used. We suggest that examining the geo-diversity of open data sets is critical before adopting a data set for use cases in the developing world. We analyze two large, publicly available image data sets to assess geo-diversity and find that these data sets appear to exhibit an observable amerocentric and eurocentric representation bias. Further, we analyze classifiers trained on these data sets to assess the impact of these training distributions and find strong differences in the relative performance on images from different locales. These results emphasize the need to ensure geo-representation when constructing data sets for use in the developing world.
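
One simple way to examine geo-diversity of the kind described here is to tabulate what share of a data set's images comes from each region. The following is a minimal sketch under stated assumptions: the `region_shares` helper and the ten-item sample are hypothetical and invented for illustration, and the abstract does not specify how the authors obtained or aggregated location labels.

```python
from collections import Counter


def region_shares(regions):
    """Map each region label to its fraction of all labeled images."""
    counts = Counter(regions)
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}


# Hypothetical per-image region labels, NOT data from the paper.
sample = ["North America"] * 6 + ["Europe"] * 3 + ["Asia"] * 1

shares = region_shares(sample)
for region, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{region}: {share:.0%}")
```

Comparing these shares against a reference distribution, such as world population by region, is one way to flag over- or under-representation before adopting a data set.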

artificial intelligence, machine learning, open image, (13 more...)

arXiv:1711.08536

Country:

Asia (0.50)
North America > United States (0.29)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
