AITopics | Schwarz, Jonathan Richard

Schwarz, Jonathan Richard

Composable Interventions for Language Models

Kolbeinsson, Arinbjorn, O'Brien, Kyle, Huang, Tianjin, Gao, Shanghua, Liu, Shiwei, Schwarz, Jonathan Richard, Vaidya, Anurag, Mahmood, Faisal, Zitnik, Marinka, Chen, Tianlong, Hartvigsen, Thomas

arXiv.org Artificial Intelligence, Jul-8-2024

Test-time interventions for language models can enhance factual accuracy, mitigate harmful outputs, and improve model efficiency without costly retraining. But despite a flood of new methods, different types of interventions are largely developing independently. In practice, multiple interventions must be applied sequentially to the same model, yet we lack standardized ways to study how interventions interact. We fill this gap by introducing composable interventions, a framework to study the effects of using multiple interventions on the same language models, featuring new metrics and a unified codebase. Using our framework, we conduct extensive experiments and compose popular methods from three emerging intervention categories -- Knowledge Editing, Model Compression, and Machine Unlearning. Our results from 310 different compositions uncover meaningful interactions: compression hinders editing and unlearning, composing interventions hinges on their order of application, and popular general-purpose metrics are inadequate for assessing composability. Taken together, our findings showcase clear gaps in composability, suggesting a need for new multi-objective interventions. All of our code is public: https://github.com/hartvigsen-group/composable-interventions.
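The order-dependence finding can be illustrated with a toy sketch. This is not the paper's codebase: `edit`, `compress`, and `compose` are hypothetical stand-ins operating on a plain list of weights, chosen only to show why applying the same two interventions in a different order can give a different model.

```python
from typing import Callable, List, Sequence

Weights = List[float]
Intervention = Callable[[Weights], Weights]


def edit(weights: Weights) -> Weights:
    """Toy 'knowledge edit' (hypothetical): shift the first weight by 0.3."""
    out = list(weights)
    out[0] += 0.3
    return out


def compress(weights: Weights) -> Weights:
    """Toy 'compression' (hypothetical): round each weight to the nearest 0.5."""
    return [round(w * 2) / 2 for w in weights]


def compose(interventions: Sequence[Intervention], weights: Weights) -> Weights:
    """Apply interventions one after another, in the order given."""
    for intervention in interventions:
        weights = intervention(weights)
    return weights


base = [1.0, -0.2, 0.6]
edit_then_compress = compose([edit, compress], base)
compress_then_edit = compose([compress, edit], base)

# The two orders disagree on the edited weight: rounding after the edit
# moves it away from the value the edit intended.
print(edit_then_compress, compress_then_edit)
```

In this toy setting, compressing after editing overwrites the edited value, while editing after compressing preserves it but leaves the weights off the rounding grid. Real interventions interact in far more complex ways, which is what the framework's metrics are meant to measure.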

large language model, machine learning, natural language, (19 more...)

2407.06483

Country: North America > United States (0.92)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Empowering Biomedical Discovery with AI Agents

Gao, Shanghua, Fang, Ada, Huang, Yepeng, Giunchiglia, Valentina, Noori, Ayush, Schwarz, Jonathan Richard, Ektefaie, Yasha, Kondic, Jovana, Zitnik, Marinka

arXiv.org Artificial Intelligence, Apr-3-2024

A long-standing ambition for artificial intelligence (AI) in biomedicine is the development of AI systems that could eventually make major scientific discoveries, with the potential to be worthy of a Nobel Prize--fulfilling the Nobel Turing Challenge [1]. While the concept of an "AI scientist" is aspirational, advances in agent-based AI pave the way to the development of AI agents as conversable systems capable of skeptical learning and reasoning that coordinate large language models (LLMs), machine learning (ML) tools, experimental platforms, or even combinations of them [2-5] (Figure 1). The complexity of biological problems requires a multistage approach, where decomposing complex questions into simpler tasks is necessary. AI agents can break down a problem into manageable subtasks, which can then be addressed by agents with specialized functions for targeted problem-solving and integration of scientific knowledge, paving the way toward a future in which a major biomedical discovery is made solely by AI [2, 6].

large language model, machine learning, natural language, (17 more...)

2404.02831

Country: North America > United States > Massachusetts (0.14)

Genre:

Research Report > Experimental Study (0.92)
Personal > Honors (0.67)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Online Adaptation of Language Models with a Memory of Amortized Contexts

Tack, Jihoon, Kim, Jaehyung, Mitchell, Eric, Shin, Jinwoo, Teh, Yee Whye, Schwarz, Jonathan Richard

arXiv.org Artificial Intelligence, Mar-7-2024

Due to the rapid generation and dissemination of information, large language models (LLMs) quickly run out of date despite enormous development costs. Because models must be kept up to date, online learning has become a necessity when using LLMs in real-world applications. However, given the ever-expanding corpus of unseen documents and the large parameter space of modern LLMs, efficient adaptation is essential. To address these challenges, we propose Memory of Amortized Contexts (MAC), an efficient and effective online adaptation framework for LLMs with strong knowledge retention. We propose an amortized feature extraction and memory-augmentation approach to compress and extract information from new documents into compact modulations stored in a memory bank. When answering questions, our model attends to and extracts relevant knowledge from this memory bank. To learn informative modulations efficiently, we utilize amortization-based meta-learning, which substitutes the optimization process with a single forward pass of the encoder. Subsequently, we learn to choose from and aggregate selected documents into a single modulation by conditioning on the question, allowing us to adapt a frozen language model during test time without requiring further gradient updates. Our experiments demonstrate the superiority of MAC in multiple aspects, including online adaptation performance, time, and memory efficiency. Code is available at: https://github.com/jihoontack/MAC.
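The step of attending to a memory bank and aggregating stored modulations into one can be sketched as follows. This is a minimal illustration only: the abstract does not specify the aggregation network, so plain dot-product attention is assumed here, and `aggregate_modulation`, `keys`, and `modulations` are hypothetical names rather than the MAC API.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def aggregate_modulation(question: Vector,
                         keys: List[Vector],
                         modulations: List[Vector]) -> Vector:
    """Blend stored modulations, weighted by question-to-key similarity.

    Assumption: dot-product attention stands in for the learned
    aggregation described in the abstract.
    """
    weights = softmax([dot(question, key) for key in keys])
    dim = len(modulations[0])
    return [sum(w * m[i] for w, m in zip(weights, modulations))
            for i in range(dim)]


# Two documents in the memory bank; the question matches the first key.
keys = [[1.0, 0.0], [0.0, 1.0]]
modulations = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
question = [5.0, 0.0]

combined = aggregate_modulation(question, keys, modulations)
print(combined)
```

Because the weights come from a softmax, the result is a convex combination of the stored modulations and involves no gradient step, which matches the abstract's point that the frozen model is adapted at test time without further gradient updates.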

large language model, machine learning, natural language, (12 more...)

2403.04317

Country: Europe (0.14)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

C3: High-performance and low-complexity neural compression from a single image or video

Kim, Hyunjik, Bauer, Matthias, Theis, Lucas, Schwarz, Jonathan Richard, Dupont, Emilien

arXiv.org Machine Learning, Dec-5-2023

Most neural compression models are trained on large datasets of images or videos in order to generalize to unseen data. Such generalization typically requires large and expressive architectures with a high decoding complexity. Here we introduce C3, a neural compression method with strong rate-distortion (RD) performance that instead overfits a small model to each image or video separately. The resulting decoding complexity of C3 can be an order of magnitude lower than neural baselines with similar RD performance. C3 builds on COOL-CHIC (Ladune et al.) and makes several simple and effective improvements for images. We further develop new methodology to apply C3 to videos. On the CLIC2020 image benchmark, we match the RD performance of VTM, the reference implementation of the H.266 codec, with less than 3k MACs/pixel for decoding. On the UVG video benchmark, we match the RD performance of the Video Compression Transformer (Mentzer et al.), a well-established neural video codec, with less than 5k MACs/pixel for decoding.

artificial intelligence, deep learning, machine learning, (19 more...)

2312.02753

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Learning Large-scale Neural Fields via Context Pruned Meta-Learning

Tack, Jihoon, Kim, Subin, Yu, Sihyun, Lee, Jaeho, Shin, Jinwoo, Schwarz, Jonathan Richard

arXiv.org Artificial Intelligence, Oct-24-2023

We introduce an efficient optimization-based meta-learning technique for large-scale neural field training by realizing significant memory savings through automated online context point selection. This is achieved by focusing each learning step on the subset of data with the highest expected immediate improvement in model quality, resulting in the almost instantaneous modeling of global structure and subsequent refinement of high-frequency details. We further improve the quality of our meta-learned initialization by introducing a bootstrap correction resulting in the minimization of any error introduced by reduced context sets while simultaneously mitigating the well-known myopia of optimization-based meta-learning. Finally, we show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields in significantly shortened optimization procedures. Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals. We provide an extensive empirical evaluation on nine datasets across multiple modalities, demonstrating state-of-the-art results while providing additional insight through careful analysis of the algorithmic components constituting our method. Code is available at https://github.com/jihoontack/GradNCP
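The idea of focusing each learning step on the most useful subset of context points can be sketched as a top-k selection. This is an assumption-laden illustration, not the GradNCP implementation: per-point squared error is used as a simple proxy for "expected immediate improvement", and `prune_context` and `predict` are hypothetical names.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]  # (coordinate, target value)


def prune_context(points: List[Point],
                  predict: Callable[[float], float],
                  keep: int) -> List[Point]:
    """Keep the `keep` points where the current model is most wrong.

    Assumption: squared error stands in for the selection score
    described in the abstract.
    """
    ranked = sorted(points,
                    key=lambda p: (predict(p[0]) - p[1]) ** 2,
                    reverse=True)
    return ranked[:keep]


# Signal y = x^2, with a hypothetical current model that predicts y = x.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0), (3.0, 9.0)]
selected = prune_context(points, lambda x: x, keep=2)
print(selected)
```

Only the selected subset would then be used for the next update, which is where the memory savings come from: the cost of a learning step scales with `keep` rather than with the full number of context points.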

artificial intelligence, gradncp, machine learning, (16 more...)

2302.00617

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Modality-Agnostic Variational Compression of Implicit Neural Representations

Schwarz, Jonathan Richard, Tack, Jihoon, Teh, Yee Whye, Lee, Jaeho, Shin, Jinwoo

arXiv.org Artificial Intelligence, Apr-7-2023

We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR). Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism. This allows the specialisation of a shared INR network to each data item through subnetwork selection. After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression. Variational Compression of Implicit Neural Representations (VC-INR) shows improved performance given the same representational capacity pre-quantisation while also outperforming previous quantisation schemes used for other INR techniques. Our experiments demonstrate strong results over a large set of diverse modalities using the same algorithm without any modality-specific inductive biases. We show results on images, climate data, 3D shapes and scenes as well as audio and video, introducing VC-INR as the first INR-based method to outperform codecs as well-known and diverse as JPEG 2000, MP3 and AVC/HEVC on their respective modalities.

artificial intelligence, machine learning, modality-agnostic variational compression, (14 more...)

2301.09479

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Spatial Functa: Scaling Functa to ImageNet Classification and Generation

Bauer, Matthias, Dupont, Emilien, Brock, Andy, Rosenbaum, Dan, Schwarz, Jonathan Richard, Kim, Hyunjik

arXiv.org Artificial Intelligence, Feb-9-2023

Neural fields, also known as implicit neural representations, have emerged as a powerful means to represent complex signals of various modalities. Building on this, Dupont et al. (2022) introduced a framework that views neural fields as data, termed *functa*, and proposed doing deep learning directly on this dataset of neural fields. In this work, we show that the proposed framework faces limitations when scaling up to even moderately complex datasets such as CIFAR-10. We then propose *spatial functa*, which overcome these limitations by using spatially arranged latent representations of neural fields, thereby allowing us to scale up the approach to ImageNet-1k at 256x256 resolution. We demonstrate performance competitive with Vision Transformers (Steiner et al., 2022) on classification and with Latent Diffusion (Rombach et al., 2022) on image generation.

artificial intelligence, machine learning, representation, (18 more...)

2302.0313

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
