AITopics | Iyer, Abhiram

Plotting

Iyer, Abhiram

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Permutation Invariant Learning with High-Dimensional Particle Filters

Boopathy, Akhilan, Muppidi, Aneesh, Yang, Peggy, Iyer, Abhiram, Yue, William, Fiete, Ila

arXiv.org Artificial IntelligenceOct-30-2024

Sequential learning in deep models often suffers from challenges such as catastrophic forgetting and loss of plasticity, largely due to the permutation dependence of gradient-based algorithms, where the order of training data impacts the learning outcome. In this work, we introduce a novel permutation-invariant learning framework based on high-dimensional particle filters. We theoretically demonstrate that particle filters are invariant to the sequential ordering of training minibatches or tasks, offering a principled solution to mitigate catastrophic forgetting and loss-of-plasticity. We develop an efficient particle filter for optimizing high-dimensional models, combining the strengths of Bayesian methods with gradient-based optimization. Through extensive experiments on continual supervised and reinforcement learning benchmarks, including SplitMNIST, SplitCIFAR100, and ProcGen, we empirically show that our method consistently improves performance, while reducing variance compared to standard baselines.

artificial intelligence, machine learning, particle filter, (14 more...)

arXiv.org Artificial Intelligence

2410.22695

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Towards Exact Computation of Inductive Bias

Boopathy, Akhilan, Yue, William, Hwang, Jaedong, Iyer, Abhiram, Fiete, Ila

arXiv.org Machine LearningJun-22-2024

Much research in machine learning involves finding appropriate inductive biases (e.g. convolutional neural networks, momentum-based optimizers, transformers) to promote generalization on tasks. However, quantification of the amount of inductive bias associated with these architectures and hyperparameters has been limited. We propose a novel method for efficiently computing the inductive bias required for generalization on a task with a fixed training data budget; formally, this corresponds to the amount of information required to specify well-generalizing models within a specific hypothesis space of models. Our approach involves modeling the loss distribution of random hypotheses drawn from a hypothesis space to estimate the required inductive bias for a task relative to these hypotheses. Unlike prior work, our method provides a direct estimate of inductive bias without using bounds and is applicable to diverse hypothesis spaces. Moreover, we derive approximation error bounds for our estimation approach in terms of the number of sampled hypotheses. Consistent with prior results, our empirical results demonstrate that higher dimensional tasks require greater inductive bias. We show that relative to other expressive model classes, neural networks as a model class encode large amounts of inductive bias. Furthermore, our measure quantifies the relative difference in inductive bias between different neural network architectures. Our proposed inductive bias metric provides an information-theoretic interpretation of the benefits of specific model architectures for certain tasks and provides a quantitative guide to developing tasks requiring greater inductive bias, thereby encouraging the development of more powerful inductive biases.

artificial intelligence, inductive bias, machine learning, (17 more...)

arXiv.org Machine Learning

2406.15941

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Large Language Models

Duan, Sunny, Khona, Mikail, Iyer, Abhiram, Schaeffer, Rylan, Fiete, Ila R

arXiv.org Artificial IntelligenceJun-20-2024

Frontier AI systems are making transformative impacts across society, but such benefits are not without costs: models trained on web-scale datasets containing personal and private data raise profound concerns about data privacy and security. Language models are trained on extensive corpora including potentially sensitive or proprietary information, and the risk of data leakage -- where the model response reveals pieces of such information -- remains inadequately understood. Prior work has investigated what factors drive memorization and have identified that sequence complexity and the number of repetitions drive memorization. Here, we focus on the evolution of memorization over training. We begin by reproducing findings that the probability of memorizing a sequence scales logarithmically with the number of times it is present in the data. We next show that sequences which are apparently not memorized after the first encounter can be "uncovered" throughout the course of training even without subsequent encounters, a phenomenon we term "latent memorization". The presence of latent memorization presents a challenge for data privacy as memorized sequences may be hidden at the final checkpoint of the model but remain easily recoverable. To this end, we develop a diagnostic test relying on the cross entropy loss to uncover latent memorized sequences with high accuracy.

artificial intelligence, machine learning, sequence, (15 more...)

arXiv.org Artificial Intelligence

2406.14549

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Add feedback

Resampling-free Particle Filters in High-dimensions

Boopathy, Akhilan, Muppidi, Aneesh, Yang, Peggy, Iyer, Abhiram, Yue, William, Fiete, Ila

arXiv.org Machine LearningApr-21-2024

State estimation is crucial for the performance and safety of numerous robotic applications. Among the suite of estimation techniques, particle filters have been identified as a powerful solution due to their non-parametric nature. Yet, in high-dimensional state spaces, these filters face challenges such as 'particle deprivation' which hinders accurate representation of the true posterior distribution. This paper introduces a novel resampling-free particle filter designed to mitigate particle deprivation by forgoing the traditional resampling step. This ensures a broader and more diverse particle set, especially vital in high-dimensional scenarios. Theoretically, our proposed filter is shown to offer a near-accurate representation of the desired posterior distribution in high-dimensional contexts. Empirically, the effectiveness of our approach is underscored through a high-dimensional synthetic state estimation task and a 6D pose estimation derived from videos. We posit that as robotic systems evolve with greater degrees of freedom, particle filters tailored for high-dimensional state spaces will be indispensable.

artificial intelligence, machine learning, particle filter, (15 more...)

arXiv.org Machine Learning

2404.13698

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Avoiding Catastrophe: Active Dendrites Enable Multi-Task Learning in Dynamic Environments

Iyer, Abhiram, Grewal, Karan, Velu, Akash, Souza, Lucas Oliveira, Forest, Jeremy, Ahmad, Subutai

arXiv.org Artificial IntelligenceDec-31-2021

A key challenge for AI is to build embodied systems that operate in dynamically changing environments. Such systems must adapt to changing task contexts and learn continuously. Although standard deep learning systems achieve state of the art results on static benchmarks, they often struggle in dynamic scenarios. In these settings, error signals from multiple contexts can interfere with one another, ultimately leading to a phenomenon known as catastrophic forgetting. In this article we investigate biologically inspired architectures as solutions to these problems. Specifically, we show that the biophysical properties of dendrites and local inhibitory systems enable networks to dynamically restrict and route information in a context-specific manner. Our key contributions are as follows. First, we propose a novel artificial neural network architecture that incorporates active dendrites and sparse representations into the standard deep learning framework. Next, we study the performance of this architecture on two separate benchmarks requiring task-based adaptation: Meta-World, a multi-task reinforcement learning environment where a robotic agent must learn to solve a variety of manipulation tasks simultaneously; and a continual learning benchmark in which the model's prediction task changes throughout training. Analysis on both benchmarks demonstrates the emergence of overlapping but distinct and sparse subnetworks, allowing the system to fluidly learn multiple tasks with minimal forgetting. Our neural implementation marks the first time a single architecture has achieved competitive results on both multi-task and continual learning settings. Our research sheds light on how biological properties of neurons can inform deep learning systems to address dynamic scenarios that are typically impossible for traditional ANNs to solve.

artificial intelligence, health & medicine, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2201.00042

Country:

Asia (0.67)
North America > United States > California (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Collision Avoidance Robotics Via Meta-Learning (CARML)

Iyer, Abhiram, Mahadevan, Aravind

arXiv.org Artificial IntelligenceJul-16-2020

Inspired by the work done by Andrychowicz et al. in [7], they modeled an I. INTRODUCTION LSTM as a meta-learner, which helped to train another neural Today, most deep reinforcement learning techniques require network "learner" classifier using a few-shot framework. Unlike models to be trained on a large number of training samples. In common deep learning optimizers such as Momentum, contrast, Model-Agnostic Meta-Learning (MAML) proposed ADAM, and Adagrad, this method is able to train a model by Finn et.

deep learning, neural network, reward function, (19 more...)

arXiv.org Artificial Intelligence

2007.08616

Country:

North America > United States > New York (0.15)
North America > United States > California (0.14)
North America > United States > Arizona (0.14)

Genre: Research Report (0.40)

Industry: Transportation (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback