Collaborating Authors

 Petryk, Suzanne


Data-Centric AI Governance: Addressing the Limitations of Model-Focused Policies

arXiv.org Artificial Intelligence

Current regulations on powerful AI capabilities are narrowly focused on "foundation" or "frontier" models. However, these terms are vague and inconsistently defined, leading to an unstable foundation for governance efforts. Critically, policy debates often fail to consider the data used with these models, despite the clear link between data and model performance. Even (relatively) "small" models that fall outside the typical definitions of foundation and frontier models can achieve equivalent outcomes when exposed to sufficiently specific datasets. In this work, we illustrate the importance of considering dataset size and content as essential factors in assessing the risks posed by models both today and in the future. More broadly, we emphasize the risk of reactive over-regulation and provide a path towards careful, quantitative evaluation of capabilities that can lead to a simplified regulatory environment.


An Introduction to Vision-Language Modeling

arXiv.org Artificial Intelligence

Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From visual assistants that can guide us through unfamiliar environments to generative models that produce images from only a high-level text description, vision-language model (VLM) applications will significantly impact our relationship with technology. However, many challenges need to be addressed to improve the reliability of these models. While language is discrete, vision evolves in a much higher-dimensional space in which concepts cannot always be easily discretized. To better understand the mechanics behind mapping vision to language, we present this introduction to VLMs, which we hope will help anyone who would like to enter the field. First, we introduce what VLMs are, how they work, and how to train them. Then, we present and discuss approaches to evaluate VLMs. Although this work primarily focuses on mapping images to language, we also discuss extending VLMs to videos.
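One of the canonical training recipes such an introduction covers is contrastive image-text alignment in the style of CLIP. The sketch below is illustrative only, not taken from the paper: a symmetric InfoNCE loss over a batch of paired image and text embeddings, with the embedding encoders and the temperature value assumed.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    img_emb, txt_emb: (B, d) outputs of an image and a text encoder
    (encoders assumed; any pair producing same-dimension embeddings works).
    """
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(len(logits), device=logits.device)
    # Each image should match its own text, and vice versa.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2
```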


ALOHa: A New Measure for Hallucination in Captioning Models

arXiv.org Artificial Intelligence

Despite recent advances in multimodal pre-training for visual description, state-of-the-art models still produce captions containing errors, such as hallucinating objects not present in a scene. The most prominent existing metric for object hallucination, CHAIR, is limited to a fixed set of MS COCO objects and synonyms. In this work, we propose a modernized open-vocabulary metric, ALOHa, which leverages large language models (LLMs) to measure object hallucinations. Specifically, we use an LLM to extract groundable objects from a candidate caption, measure their semantic similarity to reference objects from captions and object detections, and use Hungarian matching to produce a final hallucination score. We show that ALOHa correctly identifies 13.6% more hallucinated objects than CHAIR on HAT, a new gold-standard subset of MS COCO Captions annotated for hallucinations, and 30.8% more on nocaps, where objects extend beyond MS COCO categories. Our code is available at https://davidmchan.github.io/aloha/.
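The matching step described above is concrete enough to sketch. Below is a minimal illustration, assuming object phrases have already been extracted by an LLM and embedded (e.g., with any sentence encoder), with cosine similarity as the semantic measure; the aggregation into the paper's final ALOHa score may differ in detail.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def matched_object_scores(cand_emb, ref_emb):
    """cand_emb: (n, d) embeddings of objects from the candidate caption;
    ref_emb: (m, d) embeddings of reference objects (captions + detections).
    Returns a per-candidate-object similarity under a one-to-one matching."""
    c = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = c @ r.T
    # Hungarian matching maximizes total similarity (minimize the negative).
    rows, cols = linear_sum_assignment(-sim)
    scores = np.zeros(len(cand_emb))  # unmatched candidates stay at 0
    scores[rows] = sim[rows, cols]
    return scores  # a low score flags a likely hallucinated object
```

A caption-level hallucination score can then be derived from these per-object scores, e.g., by taking their minimum, so that a single poorly matched object penalizes the whole caption.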


CLAIR: Evaluating Image Captions with Large Language Models

arXiv.org Artificial Intelligence

The evaluation of machine-generated image captions poses an interesting yet persistent challenge. Effective evaluation measures must consider numerous dimensions of similarity, including semantic relevance, visual structure, object interactions, caption diversity, and specificity. Existing highly engineered measures attempt to capture specific aspects, but fall short of providing a holistic score that aligns closely with human judgments. Here, we propose CLAIR, a novel method that leverages the zero-shot language modeling capabilities of large language models (LLMs) to evaluate candidate captions. In our evaluations, CLAIR demonstrates a stronger correlation with human judgments of caption quality than existing measures. Notably, on Flickr8K-Expert, CLAIR achieves relative correlation improvements over SPICE of 39.6% and over image-augmented methods such as RefCLIP-S of 18.3%. Moreover, CLAIR provides noisily interpretable results by allowing the language model to identify the underlying reasoning behind its assigned score. Code is available at https://davidmchan.github.io/clair/
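The core of the method is a single LLM call that returns both a score and a rationale. A minimal sketch of that prompting pattern follows; the exact prompt wording, scoring scale handling, and output parsing in the released code may differ, and `llm` here is a placeholder callable (prompt string in, completion string out) standing in for any chat-completion API.

```python
import json

def clair_score(candidate, references, llm):
    """Score a candidate caption against reference captions with an LLM."""
    prompt = (
        "You are evaluating image captions. Candidate caption:\n"
        f"  {candidate}\n"
        "Reference captions:\n"
        + "".join(f"  - {r}\n" for r in references)
        + "On a scale of 0 to 100, how likely is it that the candidate "
        "describes the same image as the references? Respond with JSON: "
        '{"score": <int>, "reason": "<one sentence>"}'
    )
    # Assumes the model returns bare JSON; robust parsing is omitted.
    result = json.loads(llm(prompt))
    return result["score"] / 100.0, result["reason"]
```

The returned reason is what makes the score interpretable: the judge explains which details of the candidate matched or contradicted the references.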


Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting

arXiv.org Artificial Intelligence

The goal of continual learning (CL) is to learn a sequence of tasks without suffering from catastrophic forgetting. Previous work has shown that leveraging memory in the form of a replay buffer can reduce performance degradation on prior tasks. We hypothesize that forgetting can be further reduced when the model is encouraged to remember the evidence for previously made decisions. As a first step towards exploring this hypothesis, we propose a simple, novel training paradigm, called Remembering for the Right Reasons (RRR), that additionally stores a visual model explanation for each example in the buffer and ensures the model has "the right reasons" for its predictions by encouraging its explanations to remain consistent with those used to make decisions at training time. Without this constraint, explanations drift and forgetting increases as conventional continual learning algorithms learn new tasks. We demonstrate that RRR can be easily added to any memory- or regularization-based approach, reducing forgetting and, more importantly, improving model explanations. We evaluate our approach in the standard and few-shot settings and observe consistent improvements across various CL approaches using different architectures and explanation-generation techniques, demonstrating a promising connection between explainability and continual learning. Our code is available at https://github.com/SaynaEbrahimi/Remembering-for-the-Right-Reasons.
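The explanation-consistency constraint can be sketched compactly. Below is a minimal illustration assuming vanilla gradient saliency as the explanation technique (the paper evaluates several techniques for generating explanations); the function names and buffer tensors are illustrative, not the authors' API.

```python
import torch
import torch.nn.functional as F

def saliency(model, x, y):
    """Vanilla gradient saliency: |d score_y / d x|, summed over channels."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    score = logits.gather(1, y.unsqueeze(1)).sum()
    # create_graph=True keeps the saliency differentiable, so the
    # consistency loss below can itself update the model parameters.
    grad, = torch.autograd.grad(score, x, create_graph=True)
    return grad.abs().sum(dim=1)  # (B, H, W)

def rrr_loss(model, buf_x, buf_y, buf_expl):
    """Penalize drift between current explanations and the ones stored
    in the replay buffer when the examples were first learned."""
    return F.l1_loss(saliency(model, buf_x, buf_y), buf_expl)
```

In training, this term would be added to the usual task and replay losses with a weighting coefficient, so the model is penalized both for forgetting past labels and for changing the evidence it used to predict them.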