Basu, Sourya
G-RepsNet: A Fast and General Construction of Equivariant Networks for Arbitrary Matrix Groups
Basu, Sourya, Lohit, Suhas, Brand, Matthew
Group equivariance is a strong inductive bias useful in a wide range of deep learning tasks. However, constructing efficient equivariant networks for general groups and domains is difficult. Recent work by Finzi et al. (2021) directly solves the equivariance constraint for arbitrary matrix groups to obtain equivariant MLPs (EMLPs). But this method does not scale well, and scaling is crucial in deep learning. Here, we introduce Group Representation Networks (G-RepsNets), a lightweight equivariant network for arbitrary matrix groups with features represented using tensor polynomials. The key intuition for our design is that using tensor representations in the hidden layers of a neural network, along with simple inexpensive tensor operations, can lead to expressive universal equivariant networks. We find G-RepsNet to be competitive with EMLP on several tasks with group symmetries such as O(5), O(1, 3), and O(3), with scalars, vectors, and second-order tensors as data types. On image classification tasks, we find that G-RepsNet using second-order representations is competitive with, and often even outperforms, sophisticated state-of-the-art equivariant models such as GCNNs (Cohen & Welling, 2016a) and E(2)-CNNs (Weiler & Cesa, 2019). To further illustrate the generality of our approach, we show that G-RepsNet is competitive with EGNN (Satorras et al., 2021) on N-body predictions and with G-FNO (Helwig et al., 2023) on solving PDEs, while being efficient.
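To make the key intuition concrete, here is a minimal PyTorch sketch (an illustration of the idea, not the paper's implementation) of the kind of layer the abstract describes: order-1 tensor (vector) features are mixed across channels with learned scalar weights, followed by a norm-gated nonlinearity; both operations commute with any orthogonal transformation acting on the last axis.

import torch
import torch.nn as nn

class VectorChannelMix(nn.Module):
    # O(d)-equivariant channel mixing for order-1 tensor (vector) features:
    # scalar weights commute with rotations, R(sum_i w_i v_i) = sum_i w_i (R v_i).
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # No bias: adding a constant vector would break equivariance.
        self.mix = nn.Linear(in_channels, out_channels, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, d), a stack of order-1 tensors.
        return self.mix(x.transpose(1, 2)).transpose(1, 2)

class NormGate(nn.Module):
    # Equivariant nonlinearity: rescale each vector by a function of its
    # norm (an O(d)-invariant scalar), leaving its direction unchanged.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(x.norm(dim=-1, keepdim=True))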
Efficient Model-Agnostic Multi-Group Equivariant Networks
Baltaji, Razan, Basu, Sourya, Varshney, Lav R.
Constructing model-agnostic group equivariant networks, such as equitune (Basu et al., 2023b) and its generalizations (Kim et al., 2023), can be computationally expensive for large product groups. We address this by providing efficient model-agnostic equivariant designs for two related problems: one where the network has multiple inputs, each with potentially different groups acting on them, and another where there is a single input but the group acting on it is a large product group. For the first design, we initially consider a linear model and characterize the entire equivariant space that satisfies this constraint. This characterization gives rise to a novel fusion layer between different channels that satisfies an invariance-symmetry (IS) constraint, which we call an IS layer. We then extend this design beyond linear models, analogously to equitune, by composing equivariant and IS layers. We also show that the IS layer is a universal approximator of invariant-symmetric functions. Inspired by the first design, we use the notion of the IS property to design a second efficient model-agnostic equivariant architecture for large product groups acting on a single input. For the first design, we provide experiments on multi-image classification where each view is transformed independently, e.g., by rotations. We find that the equivariant models are robust to such transformations and perform competitively otherwise. For the second design, we consider three applications: extending language compositionality on the SCAN dataset to product groups; fairness in natural language generation from GPT-2 to address intersectionality; and robust zero-shot image classification with CLIP. Overall, our methods are simple and general, competitive with equitune and its variants, and computationally more efficient.
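As a rough illustration of the invariance-symmetry property (a sketch of the constraint itself, not the paper's exact fusion-layer parameterization), the following builds a C4 IS layer for image-shaped features by double group averaging: the inner average makes the output invariant to rotations of the input, and the outer average projects the output onto the rotation-symmetric subspace.

import torch
import torch.nn as nn

def c4_orbit(x):
    # Orbit of an image-shaped tensor under 90-degree rotations (group C4).
    return [torch.rot90(x, k, dims=(-2, -1)) for k in range(4)]

def is_layer(f: nn.Module, x: torch.Tensor) -> torch.Tensor:
    # f is any shape-preserving image-to-image network.
    inv = torch.stack([f(g) for g in c4_orbit(x)]).mean(0)  # invariant in x
    return torch.stack(c4_orbit(inv)).mean(0)               # symmetric output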
Efficient Equivariant Transfer Learning from Pretrained Models
Basu, Sourya, Katdare, Pulkit, Sattigeri, Prasanna, Chenthamarakshan, Vijil, Driggs-Campbell, Katherine, Das, Payel, Varshney, Lav R.
Efficient transfer learning algorithms are key to the success of foundation models on diverse downstream tasks even with limited data. Recent works of Basu et al. (2023) and Kaba et al. (2022) propose group averaging (equitune) and optimization-based methods, respectively, over features from group-transformed inputs to obtain equivariant outputs from non-equivariant neural networks. While Kaba et al. (2022) are only concerned with training from scratch, we find that equitune performs poorly on equivariant zero-shot tasks despite good finetuning results. We hypothesize that this is because pretrained models provide better quality features for certain transformations than others and simply averaging them is deleterious. Hence, we propose $\lambda$-equitune, which averages the features using importance weights, $\lambda$s. These weights are learned directly from the data using a small neural network, leading to excellent zero-shot and finetuned results that outperform equitune. Further, we prove that $\lambda$-equitune is equivariant and a universal approximator of equivariant functions. Additionally, we show that the method of Kaba et al. (2022) used with appropriate loss functions, which we call equizero, also gives excellent zero-shot and finetuned performance. Both equitune and equizero are special cases of $\lambda$-equitune. To show the simplicity and generality of our method, we validate on a wide range of diverse applications and models such as 1) image classification using CLIP, 2) deep Q-learning, 3) fairness in natural language generation (NLG), 4) compositional generalization in languages, and 5) image classification using pretrained CNNs such as Resnet and Alexnet.
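The weighted averaging at the heart of $\lambda$-equitune is compact enough to sketch directly. Below is a minimal PyTorch illustration for an invariant task (e.g., classification) under the rotation group C4; the shape conventions and the small $\lambda$ network are illustrative assumptions. Setting the weights to a constant recovers plain equitune (uniform group averaging).

import torch
import torch.nn as nn

def c4_orbit(x):
    # Group-transformed copies of the input (90-degree rotations).
    return [torch.rot90(x, k, dims=(-2, -1)) for k in range(4)]

class LambdaEquitune(nn.Module):
    def __init__(self, model: nn.Module, feat_dim: int):
        super().__init__()
        self.model = model  # pretrained feature extractor, frozen or finetuned
        # Small network producing a positive importance weight per orbit element.
        self.lam = nn.Sequential(nn.Linear(feat_dim, 1), nn.Softplus())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.stack([self.model(g) for g in c4_orbit(x)])  # (|G|, B, D)
        w = self.lam(feats)                                        # (|G|, B, 1)
        # Weighted average over the orbit: invariant to rotating x,
        # since a rotation of x only permutes the orbit elements.
        return (w * feats).sum(0) / w.sum(0).clamp_min(1e-8)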
Transformers are Universal Predictors
Basu, Sourya, Choraria, Moulik, Varshney, Lav R.
We find limits to the Transformer architecture for language modeling and show it has a universal prediction property in an information-theoretic sense. We further analyze performance in non-asymptotic data regimes to understand the role of various components of the Transformer architecture, especially in the context of data-efficient training. We validate our theoretical analysis with experiments on both synthetic and real datasets.
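For context, the standard information-theoretic notion behind such a claim (stated generically here; the paper's precise setup may differ) is that a predictor $q$ is universal for a model class $\mathcal{P}$ if its normalized log-loss regret against the best model in the class vanishes with sequence length $n$:

\[ \frac{1}{n}\Big[ -\log q(x^n) - \min_{p \in \mathcal{P}} \big(-\log p(x^n)\big) \Big] \longrightarrow 0 \quad \text{as } n \to \infty. \]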
Equi-Tuning: Group Equivariant Fine-Tuning of Pretrained Models
Basu, Sourya, Sattigeri, Prasanna, Ramamurthy, Karthikeyan Natesan, Chenthamarakshan, Vijil, Varshney, Kush R., Varshney, Lav R., Das, Payel
We introduce equi-tuning, a novel fine-tuning method that transforms (potentially non-equivariant) pretrained models into group equivariant models while incurring minimum $L_2$ loss between the feature representations of the pretrained and the equivariant models. Large pretrained models can be equi-tuned for different groups to satisfy the needs of various downstream tasks. Equi-tuned models benefit from both group equivariance as an inductive bias and semantic priors from pretrained models. We provide applications of equi-tuning on three different tasks: image classification, compositional generalization in language, and fairness in natural language generation (NLG). We also provide a novel group-theoretic definition for fairness in NLG. The effectiveness of this definition is shown by testing it against a standard empirical method of fairness in NLG. We provide experimental results for equi-tuning using a variety of pretrained models: Alexnet, Resnet, VGG, and Densenet for image classification; RNNs, GRUs, and LSTMs for compositional generalization; and GPT2 for fairness in NLG. We test these models on benchmark datasets across all considered tasks to show the generality and effectiveness of the proposed method.
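The equi-tuning transform itself is simple to state: project the pretrained model $M$ onto the space of equivariant functions by group averaging, $M_G(x) = \frac{1}{|G|}\sum_{g \in G} g^{-1} M(g x)$, which minimizes the $L_2$ distance to $M$ among equivariant functions. Below is a minimal PyTorch sketch for C4 acting on an image-to-image model; the setup is illustrative.

import torch
import torch.nn as nn

def equitune_forward(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    # Equivariant projection of a pretrained image-to-image model under C4:
    # average g^{-1} M(g x) over all 90-degree rotations g. Equi-tuning
    # then fine-tunes the weights of `model` through this averaged forward pass.
    outs = [
        torch.rot90(model(torch.rot90(x, k, dims=(-2, -1))), -k, dims=(-2, -1))
        for k in range(4)
    ]
    return torch.stack(outs).mean(0)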
Equivariant Mesh Attention Networks
Basu, Sourya, Gallego-Posada, Jose, Viganò, Francesco, Rowbottom, James, Cohen, Taco
Equivariance to symmetries has proven to be a powerful inductive bias in deep learning research. Recent works on mesh processing have concentrated on various kinds of natural symmetries, including translations, rotations, scaling, node permutations, and gauge transformations. To date, no existing architecture is equivariant to all of these transformations. In this paper, we present an attention-based architecture for mesh data that is provably equivariant to all transformations mentioned above. Our pipeline relies on the use of relative tangential features: a simple, effective, equivariance-friendly alternative to raw node positions as inputs. Experiments on the FAUST and TOSCA datasets confirm that our proposed architecture achieves improved performance on these benchmarks and is indeed equivariant, and therefore robust, to a wide variety of local/global transformations.
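As a rough sketch of what relative tangential features might look like (an illustrative reconstruction under assumed conventions, not necessarily the paper's exact construction): for each directed mesh edge, take the translation-invariant offset between the endpoint positions and project it onto the tangent plane at the source vertex.

import torch

def relative_tangential_features(pos: torch.Tensor, normals: torch.Tensor,
                                 edges: torch.Tensor) -> torch.Tensor:
    # pos:     (V, 3) vertex positions
    # normals: (V, 3) unit vertex normals
    # edges:   (E, 2) directed edges as (source, target) index pairs
    src, dst = edges[:, 0], edges[:, 1]
    rel = pos[dst] - pos[src]            # translation-invariant offsets
    n = normals[src]                     # normal at the source vertex
    # Drop the normal component, keeping the tangential part at the source.
    return rel - (rel * n).sum(-1, keepdim=True) * n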