AITopics | Soljacic, Marin

Collaborating Authors

Soljacic, Marin

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Model Stitching: Looking For Functional Similarity Between Representations

Hernandez, Adriano, Dangovski, Rumen, Lu, Peter Y., Soljacic, Marin

arXiv.org Artificial IntelligenceAug-31-2023

Model stitching (Lenc & Vedaldi 2015) is a compelling methodology to compare different neural network representations, because it allows us to measure to what degree they may be interchanged. We expand on a previous work from Bansal, Nakkiran & Barak which used model stitching to compare representations of the same shapes learned by differently seeded and/or trained neural networks of the same architecture. Our contribution enables us to compare the representations learned by layers with different shapes from neural networks with different architectures. We subsequently reveal unexpected behavior of model stitching. Namely, we find that stitching, based on convolutions, for small ResNets, can reach high accuracy if those layers come later in the first (sender) network than in the second (receiver), even if those layers are far apart.

artificial intelligence, functional similarity, machine learning, (2 more...)

arXiv.org Artificial Intelligence

2303.11277

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)

Add feedback

AI-Assisted Discovery of Quantitative and Formal Models in Social Science

Balla, Julia, Huang, Sihao, Dugan, Owen, Dangovski, Rumen, Soljacic, Marin

arXiv.org Artificial IntelligenceAug-16-2023

In social science, formal and quantitative models, such as ones describing economic growth and collective action, are used to formulate mechanistic explanations, provide predictions, and uncover questions about observed phenomena. Here, we demonstrate the use of a machine learning system to aid the discovery of symbolic models that capture nonlinear and dynamical relationships in social science datasets. By extending neuro-symbolic methods to find compact functions and differential equations in noisy and longitudinal data, we show that our system can be used to discover interpretable models from real-world data in economics and sociology. Augmenting existing workflows with symbolic regression can help uncover novel relationships and explore counterfactual models during the scientific process. We propose that this AI-assisted framework can bridge parametric and non-parametric models commonly employed in social science research by systematically exploring the space of nonlinear models and enabling fine-grained control over expressivity and interpretability.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2210.00563

Country:

Europe > United Kingdom (0.67)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.81)

Industry:

Banking & Finance > Economy (0.89)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries

Loh, Charlotte, Han, Seungwook, Sudalairaj, Shivchander, Dangovski, Rumen, Xu, Kai, Wenzel, Florian, Soljacic, Marin, Srivastava, Akash

arXiv.org Artificial IntelligenceJun-19-2023

Deep ensembles (DE) have been successful in improving model performance by learning diverse members via the stochasticity of random initialization. While recent works have attempted to promote further diversity in DE via hyperparameters or regularizing loss functions, these methods primarily still rely on a stochastic approach to explore the hypothesis space. In this work, we present Multi-Symmetry Ensembles (MSE), a framework for constructing diverse ensembles by capturing the multiplicity of hypotheses along symmetry axes, which explore the hypothesis space beyond stochastic perturbations of model weights and hyperparameters. We leverage recent advances in contrastive representation learning to create models that separately capture opposing hypotheses of invariant and equivariant functional classes and present a simple ensembling approach to efficiently combine appropriate hypotheses for a given task. We show that MSE effectively captures the multiplicity of conflicting hypotheses that is often required in large, diverse datasets like ImageNet. As a result of their inherent diversity, MSE improves classification performance, uncertainty quantification, and generalization across a series of transfer tasks.

artificial intelligence, hypothesis, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2303.02484

Country: North America > United States > Hawaii (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

WaveletNet: Logarithmic Scale Efficient Convolutional Neural Networks for Edge Devices

Jing, Li, Dangovski, Rumen, Soljacic, Marin

arXiv.org Machine LearningNov-28-2018

We present a logarithmic-scale efficient convolutional neural network architecture for edge devices, named WaveletNet. Our model is based on the well-known depthwise convolution, and on two new layers, which we introduce in this work: a wavelet convolution and a depthwise fast wavelet transform. By breaking the symmetry in channel dimensions and applying a fast algorithm, WaveletNet shrinks the complexity of convolutional blocks by an O(logD/D) factor, where D is the number of channels. Experiments on CIFAR-10 and ImageNet classification show superior and comparable performances of WaveletNet compared to state-of-the-art models such as MobileNetV2.

convolution, deep learning, neural network, (17 more...)

arXiv.org Machine Learning

1811.11644

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Add feedback

Gated Orthogonal Recurrent Units: On Learning to Forget

Jing, Li (Massachusetts Institute of Technology) | Gulcehre, Caglar (MILA - Universite de Montreal) | Peurifoy, John (Massachusetts Institute of Technology) | Shen, Yichen (Massachusetts Institute of Technology) | Tegmark, Max (Massachusetts Institute of Technology) | Soljacic, Marin (Massachusetts Institute of Technology) | Bengio, Yoshua (MILA - Universite de Montreal)

AAAI ConferencesApr-6-2018

We present a novel recurrent neural network (RNN) based model that combines the remembering ability of unitary RNNs with the ability of gated RNNs to effectively forget redundant/irrelevant information in its memory. We achieve this by extending unitary RNNs with a gating mechanism. Our model is able to outperform LSTMs, GRUs and Unitary RNNs on several long-term dependency benchmark tasks. We empirically both show the orthogonal/unitary RNNs lack the ability to forget and also the ability of GORU to simultaneously remember long term dependencies while forgetting irrelevant information. This plays an important role in recurrent neural networks. We provide competitive results along with an analysis of our model on many natural sequential tasks including the bAbI Question Answering, TIMIT speech spectrum prediction, Penn TreeBank, and synthetic tasks that involve long-term dependencies such as algorithmic, parenthesis, denoising and copying tasks.

gated orthogonal recurrent unit, learning

AAAI Conferences

Workshops at the Thirty-Second AAAI Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Rotational Unit of Memory

Dangovski, Rumen, Jing, Li, Soljacic, Marin

arXiv.org Machine LearningOct-26-2017

The concepts of unitary evolution matrices and associative memory have boosted the field of Recurrent Neural Networks (RNN) to state-of-the-art performance in a variety of sequential tasks. However, RNN still have a limited capacity to manipulate long-term memory. To bypass this weakness the most successful applications of RNN use external techniques such as attention mechanisms. In this paper we propose a novel RNN model that unifies the state-of-the-art approaches: Rotational Unit of Memory (RUM). The core of RUM is its rotational operation, which is, naturally, a unitary matrix, providing architectures with the power to learn long-term dependencies by overcoming the vanishing and exploding gradients problem. Moreover, the rotational unit also serves as associative memory. We evaluate our model on synthetic memorization, question answering and language modeling tasks. RUM learns the Copying Memory task completely and improves the state-of-the-art result in the Recall task. RUM's performance in the bAbI Question Answering task is comparable to that of models with attention mechanism. We also improve the state-of-the-art result to 1.189 bits-per-character (BPC) loss in the Character Level Penn Treebank (PTB) task, which is to signify the applications of RUM to real-world sequential data. The universality of our construction, at the core of RNN, establishes RUM as a promising approach to language modeling, speech recognition and machine translation.

deep learning, neural network, rum, (17 more...)

arXiv.org Machine Learning

1710.09537

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Add feedback