AITopics | Pickett, Marc

Collaborating Authors

Pickett, Marc

The Ungrounded Alignment Problem

Pickett, Marc, Nain, Aakash Kumar, Modayil, Joseph, Jones, Llion

arXiv.org Artificial Intelligence, Aug-8-2024

Modern machine learning systems have demonstrated substantial abilities with methods that either embrace or ignore human-provided knowledge, but combining the benefits of both styles remains a challenge. One particular challenge involves designing learning systems that exhibit built-in responses to specific abstract stimulus patterns, yet are still plastic enough to be agnostic about the modality and exact form of their inputs. In this paper, we investigate what we call The Ungrounded Alignment Problem, which asks: "How can we build in predefined knowledge in a system where we don't know how a given stimulus will be grounded?" This paper examines a simplified version of the general problem, where an unsupervised learner is presented with a sequence of images for the characters in a text corpus, and this learner is later evaluated on its ability to recognize specific (possibly rare) sequential patterns. Importantly, the learner is given no labels during learning or evaluation, but must map images from an unknown font or permutation to their correct class labels. That is, at no point is our learner given labeled images, where an image vector is explicitly associated with a class label. Despite ample work in unsupervised and self-supervised loss functions, all current methods require a labeled fine-tuning phase to map the learned representations to correct classes. Finding this mapping in the absence of labels may seem a fool's errand, but our main result resolves this seeming paradox. We show that leveraging only letter bigram frequencies is sufficient for an unsupervised learner both to reliably associate images to class labels and to reliably identify trigger words in the sequence of inputs. More generally, this method suggests an approach for encoding specific desired innate behaviour in modality-agnostic models.
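The idea that bigram statistics alone can pin down a label assignment can be illustrated with a toy sketch. This is not the authors' method: it assumes the images have already been grouped into integer cluster ids (a step not shown), uses a three-letter alphabet, and brute-forces every cluster-to-letter assignment, which would not scale to a real alphabet. The names `bigram_freqs` and `align_by_bigrams` are hypothetical.

```python
from collections import Counter
from itertools import permutations

def bigram_freqs(seq):
    """Relative frequency of each adjacent pair in a sequence."""
    counts = Counter(zip(seq, seq[1:]))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def align_by_bigrams(cluster_seq, reference_freqs, alphabet):
    """Brute-force the cluster -> letter map whose bigram statistics
    best match the reference (squared-error cost). Toy scale only."""
    clusters = sorted(set(cluster_seq))
    observed = bigram_freqs(cluster_seq)
    best_map, best_cost = None, float("inf")
    for perm in permutations(alphabet, len(clusters)):
        mapping = dict(zip(clusters, perm))
        mapped = {(mapping[a], mapping[b]): p for (a, b), p in observed.items()}
        keys = set(mapped) | set(reference_freqs)
        cost = sum((mapped.get(k, 0.0) - reference_freqs.get(k, 0.0)) ** 2
                   for k in keys)
        if cost < best_cost:
            best_map, best_cost = mapping, cost
    return best_map

# Assumed setup: a tiny corpus and a hidden letter -> cluster-id permutation.
text = "aaabacabaab"
hidden = {"a": 2, "b": 0, "c": 1}
cluster_seq = [hidden[ch] for ch in text]  # what the unlabeled learner sees
recovered = align_by_bigrams(cluster_seq, bigram_freqs(text), "abc")
```

Here the reference statistics come from the same text as the observations, so the correct assignment has zero cost; with a separate reference corpus the match would only be approximate.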

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2408.04242

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Better RAG using Relevant Information Gain

Pickett, Marc, Hartman, Jeremy, Bhowmick, Ayan Kumar, Alam, Raquib-ul, Vempaty, Aditya

arXiv.org Artificial Intelligence, Jul-16-2024

A common way to extend the memory of large language models (LLMs) is by retrieval augmented generation (RAG), which inserts text retrieved from a larger memory into an LLM's context window. However, the context window is typically limited to several thousand tokens, which limits the number of retrieved passages that can inform a model's response. For this reason, it's important to avoid occupying context window space with redundant information by ensuring a degree of diversity among retrieved passages. At the same time, the information should also be relevant to the current task. Most prior methods that encourage diversity among retrieved results, such as Maximal Marginal Relevance (MMR), do so by incorporating an objective that explicitly trades off diversity and relevance. We propose a novel, simple optimization metric based on relevant information gain, a probabilistic measure of the total information relevant to a query for a set of retrieved results. By optimizing this metric, diversity organically emerges from our system. When used as a drop-in replacement for the retrieval component of a RAG system, this method yields state-of-the-art performance on question answering tasks from the Retrieval Augmented Generation Benchmark (RGB), outperforming existing metrics that directly optimize for relevance and diversity.
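For context, the MMR baseline named in the abstract can be sketched as a greedy selection that trades off relevance against redundancy. This shows the prior method only, not the paper's relevant information gain metric, whose definition is not given here. The function names, the use of cosine similarity over plain embedding vectors, and the `lam` weight are assumptions for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mmr_select(query, passages, k, lam=0.5):
    """Greedy Maximal Marginal Relevance.

    Returns indices of up to k passages. lam=1.0 ranks by relevance
    alone; smaller lam penalizes similarity to already-chosen passages.
    """
    selected = []
    candidates = list(range(len(passages)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, passages[i])
            redundancy = max(
                (cosine(passages[i], passages[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With two near-duplicate passages and one distinct passage, a low `lam` skips the duplicate in favor of the distinct passage, while `lam=1.0` returns both duplicates.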

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2407.12101

Country: Asia (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Transformer Layers as Painters

Sun, Qi, Pickett, Marc, Nain, Aakash Kumar, Jones, Llion

arXiv.org Artificial Intelligence, Jul-12-2024

Despite their nearly universal adoption for large language models, the internal workings of transformers are not well understood. We aim to better understand the impact of removing or reorganizing information throughout the layers of a pretrained transformer. Such an understanding could both yield better usage of existing models and inform architectural improvements that produce new variants. We present a series of empirical studies on frozen models that show that the lower and final layers of pretrained transformers differ from middle layers, but that middle layers have a surprising amount of uniformity. We further show that some classes of problems have robustness to skipping layers, running the layers in an order different from how they were trained, or running the layers in parallel. Our observations suggest that even frozen pretrained models may gracefully trade accuracy for latency by skipping layers or running layers in parallel.
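The three execution variants mentioned above (skipping, reordering, and running layers in parallel) can be sketched on a toy stack of layers. This is a schematic illustration, not the paper's experimental code: the layers are stand-in functions on plain float vectors, and combining parallel outputs by averaging is an assumption. The names `run_layers`, `run_parallel`, and `make_layer` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]

def run_layers(x: Vector, layers: Sequence[Layer],
               order: Optional[Sequence[int]] = None) -> Vector:
    """Apply layers in the given index order (default: trained order).
    Indices left out of `order` are skipped; a shuffled order reorders."""
    indices = range(len(layers)) if order is None else order
    for i in indices:
        x = layers[i](x)
    return x

def run_parallel(x: Vector, layers: Sequence[Layer],
                 start: int, stop: int) -> Vector:
    """Run layers[start:stop] on the same input and average their outputs
    (averaging is an assumption); other layers run sequentially.
    Requires start < stop."""
    x = run_layers(x, layers[:start])
    outs = [layer(x) for layer in layers[start:stop]]
    x = [sum(vals) / len(outs) for vals in zip(*outs)]
    return run_layers(x, layers[stop:])

def make_layer(delta: float) -> Layer:
    """Toy residual-style layer: adds a constant to every component."""
    return lambda x: [v + delta for v in x]

layers = [make_layer(d) for d in (1.0, 2.0, 3.0, 4.0)]
full = run_layers([0.0], layers)                   # trained order
skipped = run_layers([0.0], layers, order=[0, 3])  # middle layers skipped
parallel = run_parallel([0.0], layers, 1, 3)       # middle layers in parallel
```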

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2407.09298

Country: Europe > Belgium (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

LaMDA: Language Models for Dialog Applications

Thoppilan, Romal, De Freitas, Daniel, Hall, Jamie, Shazeer, Noam, Kulshreshtha, Apoorv, Cheng, Heng-Tze, Jin, Alicia, Bos, Taylor, Baker, Leslie, Du, Yu, Li, YaGuang, Lee, Hongrae, Zheng, Huaixiu Steven, Ghafouri, Amin, Menegali, Marcelo, Huang, Yanping, Krikun, Maxim, Lepikhin, Dmitry, Qin, James, Chen, Dehao, Xu, Yuanzhong, Chen, Zhifeng, Roberts, Adam, Bosma, Maarten, Zhao, Vincent, Zhou, Yanqi, Chang, Chung-Ching, Krivokon, Igor, Rusch, Will, Pickett, Marc, Srinivasan, Pranesh, Man, Laichee, Meier-Hellstern, Kathleen, Morris, Meredith Ringel, Doshi, Tulsee, Santos, Renelito Delos, Duke, Toju, Soraker, Johnny, Zevenbergen, Ben, Prabhakaran, Vinodkumar, Diaz, Mark, Hutchinson, Ben, Olson, Kristen, Molina, Alejandra, Hoffman-John, Erin, Lee, Josh, Aroyo, Lora, Rajakumar, Ravi, Butryna, Alena, Lamm, Matthew, Kuzmina, Viktoriya, Fenton, Joe, Cohen, Aaron, Bernstein, Rachel, Kurzweil, Ray, Aguera-Arcas, Blaise, Cui, Claire, Croak, Marian, Chi, Ed, Le, Quoc

arXiv.org Artificial Intelligence, Feb-10-2022

We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it yields smaller improvements in safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.
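The candidate-filtering step described above can be sketched generically. This is a minimal illustration, not LaMDA's implementation: `safety_score` and `quality_score` are hypothetical stand-ins for learned classifiers, the threshold value is an assumption, and picking the highest-quality survivor is an assumed ranking rule that the abstract does not specify.

```python
from typing import Callable, List, Optional

def pick_response(
    candidates: List[str],
    safety_score: Callable[[str], float],   # hypothetical classifier, in [0, 1]
    quality_score: Callable[[str], float],  # hypothetical ranking score
    threshold: float = 0.5,                 # assumed cutoff
) -> Optional[str]:
    """Drop candidates scoring below the safety threshold, then return
    the remaining candidate with the highest quality score (or None)."""
    safe = [c for c in candidates if safety_score(c) >= threshold]
    if not safe:
        return None
    return max(safe, key=quality_score)
```

Returning `None` when every candidate is filtered out leaves the fallback behavior to the caller.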

consumer health, information retrieval, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2201.08239

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Questionnaire & Opinion Survey (1.00)
Overview (1.00)
Personal > Interview (0.93)
Research Report > New Finding (0.92)

Industry:

Media > Music (0.92)
Law (0.92)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.92)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CVPR 2020 Continual Learning in Computer Vision Competition: Approaches, Results, Current Challenges and Future Directions

Lomonaco, Vincenzo, Pellegrini, Lorenzo, Rodriguez, Pau, Caccia, Massimo, She, Qi, Chen, Yu, Jodelet, Quentin, Wang, Ruiping, Mai, Zheda, Vazquez, David, Parisi, German I., Churamani, Nikhil, Pickett, Marc, Laradji, Issam, Maltoni, Davide

arXiv.org Artificial Intelligence, Sep-14-2020

In the last few years, we have witnessed a renewed and fast-growing interest in continual learning with deep neural networks with the shared objective of making current AI systems more adaptive, efficient and autonomous. However, despite the significant and undoubted progress of the field in addressing the issue of catastrophic forgetting, benchmarking different continual learning approaches is a difficult task by itself. In fact, given the proliferation of different settings, training and evaluation protocols, metrics and nomenclature, it is often tricky to properly characterize a continual learning algorithm, relate it to other solutions and gauge its real-world applicability. The first Continual Learning in Computer Vision challenge, held at CVPR in 2020, was one of the first opportunities to evaluate different continual learning algorithms on common hardware with a large set of shared evaluation metrics and 3 different settings based on the realistic CORe50 video benchmark. In this paper, we report the main results of the competition, which had more than 79 registered teams, 11 finalists and $2,300 in prizes. We also summarize the winning approaches, current challenges and future research directions.

continual learning, deep learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

2009.09929

Country:

Asia (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The AAAI-13 Conference Workshops

AI Magazine, Jan-10-2014

The AAAI-13 Workshop Program, a part of the 27th AAAI Conference on Artificial Intelligence, was held Sunday and Monday, July 14–15, 2013, at the Hyatt Regency Bellevue Hotel in Bellevue, Washington, USA.

artificial intelligence, Computer Engineering, management and information, (2 more...)

AI Magazine

Industry: Information Technology (0.71)

Technology:

Information Technology > Artificial Intelligence > Robots (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

The AAAI-13 Conference Workshops

AI Magazine, Jan-10-2014

Benjamin Grosof (Coherent Knowledge Systems) on representing activity context through semantic rule methods, [...] from episodic memory to create semantic memory, using a combination of semantic memory and [...] experience in the episodic memory to guide users? [...] great progress is being made on methods to solve problems related to structure prediction, motion simulation, deriving from [...]

constraint-based reasoning, health & medicine, workshop, (19 more...)

AI Magazine

Country:

North America > United States > California (0.28)
North America > Canada > Alberta (0.28)
Europe > Germany > Bremen > Bremen (0.14)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Health & Medicine > Consumer Health (0.95)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Spontaneous Analogy by Piggybacking on a Perceptual System

Pickett, Marc, Aha, David W.

arXiv.org Artificial Intelligence, Oct-10-2013

Most computational models of analogy assume they are given a delineated source domain and often a specified target domain. These systems do not address how analogs can be isolated from large domains and spontaneously retrieved from long-term memory, a process we call spontaneous analogy. We present a system that represents relational structures as feature bags. Using this representation, our system leverages perceptual algorithms to automatically create an ontology of relational structures and to efficiently retrieve analogs for new relational structures from long-term memory. We provide a demonstration of our approach that takes a set of unsegmented stories, constructs an ontology of analogical schemas (corresponding to plot devices), and uses this ontology to efficiently find analogs within new stories, yielding significant time-savings over linear analog retrieval at a small accuracy cost.

artificial intelligence, neural network, relational structure, (14 more...)

arXiv.org Artificial Intelligence

1310.2955

Country: North America > United States > Maryland (0.28)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.80)

Building on Deep Learning

Pickett, Marc (Naval Research Laboratory)

AAAI Conferences, Jul-9-2013

We propose using deep learning as the "workhorse" of a cognitive architecture. We show how deep learning can be leveraged to learn representations, such as a hierarchy of analogical schemas, from relational data. This approach to higher cognition drives some desiderata of deep learning, particularly modality independence and the ability to make top-down predictions. Finally, we consider the problem of how relational representations might be learned from sensor data that is not explicitly relational.

deep learning

AAAI Conferences

Workshops at the Twenty-Seventh AAAI Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)