AITopics | Pereira, Francisco

More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing

Shaier, Sagi, Pereira, Francisco, von der Wense, Katharina, Hunter, Lawrence E, Jones, Matt

arXiv.org Artificial IntelligenceOct-18-2024

The evolution of biological neural systems has led to both modularity and sparse coding, which enables efficiency in energy usage, and robustness across the diversity of tasks in the lifespan. In contrast, standard neural networks rely on dense, non-specialized architectures, where all model parameters are simultaneously updated to learn multiple tasks, leading to representation interference. Current sparse neural network approaches aim to alleviate this issue, but are often hindered by limitations such as 1) trainable gating functions that cause representation collapse; 2) non-overlapping experts that result in redundant computation and slow learning; and 3) reliance on explicit input or task IDs that impose significant constraints on flexibility and scalability. In this paper we propose Conditionally Overlapping Mixture of ExperTs (COMET), a general deep learning method that addresses these challenges by inducing a modular, sparse architecture with an exponential number of overlapping experts. COMET replaces the trainable gating function used in Sparse Mixture of Experts with a fixed, biologically inspired random projection applied to individual input representations. This design causes the degree of expert overlap to depend on input similarity, so that similar inputs tend to share more parameters. This facilitates positive knowledge transfer, resulting in faster learning and improved generalization. We demonstrate the effectiveness of COMET on a range of tasks, including image classification, language modeling, and regression, using several popular deep learning architectures.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.08003

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Interpretable factorization of clinical questionnaires to identify latent factors of psychopathology

Lam, Ka Chun, Mahony, Bridget W, Raznahan, Armin, Pereira, Francisco

arXiv.org Artificial IntelligenceDec-12-2023

Psychiatry research seeks to understand the manifestations of psychopathology in behavior, as measured in questionnaire data, by identifying a small number of latent factors that explain them. While factor analysis is the traditional tool for this purpose, the resulting factors may not be interpretable, and may also be subject to confounding variables. Moreover, missing data are common, and explicit imputation is often required. To overcome these limitations, we introduce interpretability constrained questionnaire factorization (ICQF), a non-negative matrix factorization method with regularization tailored for questionnaire data. Our method aims to promote factor interpretability and solution stability. We provide an optimization procedure with theoretical convergence guarantees, and an automated procedure to detect latent dimensionality accurately. We validate these procedures using realistic synthetic data. We demonstrate the effectiveness of our method in a widely used general-purpose questionnaire, in two independent datasets (the Healthy Brain Network and Adolescent Brain Cognitive Development studies). Specifically, we show that ICQF improves interpretability, as defined by domain experts, while preserving diagnostic information across a range of disorders, and outperforms competing methods for smaller dataset sizes. This suggests that the regularization in our method matches domain characteristics. The python implementation for ICQF is available at \url{https://github.com/jefferykclam/ICQF}.

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Artificial Intelligence

2312.07762

Country: North America > United States (1.00)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Add feedback

Testing for context-dependent changes in neural encoding in naturalistic experiments

Chen, Yenho, Harris, Carl W., Ma, Xiaoyu, Li, Zheng, Pereira, Francisco, Zheng, Charles Y.

arXiv.org Artificial IntelligenceNov-16-2022

We propose a decoding-based approach to detect context effects on neural codes in longitudinal neural recording data. The approach is agnostic to how information is encoded in neural activity, and can control for a variety of possible confounding factors present in the data. We demonstrate our approach by determining whether it is possible to decode location encoding from prefrontal cortex in the mouse and, further, testing whether the encoding changes due to task engagement.

artificial intelligence, machine learning, modeling & simulation, (19 more...)

arXiv.org Artificial Intelligence

2211.09295

Country: North America > United States > Maryland (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Add feedback

Evaluating Adversarial Robustness for Deep Neural Network Interpretability in fMRI Decoding

McClure, Patrick, Moraczewski, Dustin, Lam, Ka Chun, Thomas, Adam, Pereira, Francisco

arXiv.org Machine LearningJun-4-2020

While deep neural networks (DNNs) are being increasingly used to make predictions from high-dimensional, complex data, they are widely seen as uninterpretable "black boxes", since it can be difficult to discover what input information is used to make predictions. This ability is particularly important for applications in cognitive neuroscience and neuroinformatics. A saliency map is a common approach for producing interpretable visualizations of the relative importance of input features for a prediction. However, many methods for creating these maps fail due to focusing too much on the input or being extremely sensitive to small input noise. It is also challenging to quantitatively evaluate how well saliency maps correspond to the truly relevant input information. In this paper, we develop two quantitative evaluation procedures for saliency methods, using the fact that the Human Connectome Project (HCP) dataset contains functional magnetic resonance imaging (fMRI) data from multiple tasks per subject to create ground truth saliency maps. We then introduce an adversarial training method that makes DNNs robust to small input noise, and demonstrate that it measurably improves interpretability.

deep learning, neural network, saliency map, (19 more...)

arXiv.org Machine Learning

2004.11114

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.30)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Bayesian Automatic Relevance Determination for Utility Function Specification in Discrete Choice Models

Rodrigues, Filipe, Ortelli, Nicola, Bierlaire, Michel, Pereira, Francisco

arXiv.org Machine LearningJun-10-2019

Specifying utility functions is a key step towards applying the discrete choice framework for understanding the behaviour processes that govern user choices. However, identifying the utility function specifications that best model and explain the observed choices can be a very challenging and time-consuming task. This paper seeks to help modellers by leveraging the Bayesian framework and the concept of automatic relevance determination (ARD), in order to automatically determine an optimal utility function specification from an exponentially large set of possible specifications in a purely data-driven manner. Based on recent advances in approximate Bayesian inference, a doubly stochastic variational inference is developed, which allows the proposed DCM-ARD model to scale to very large and high-dimensional datasets. Using semi-artificial choice data, the proposed approach is shown to very accurately recover the true utility function specifications that govern the observed choices. Moreover, when applied to real choice data, DCM-ARD is shown to be able discover high quality specifications that can outperform previous ones from the literature according to multiple criteria, thereby demonstrating its practical applicability.

game theory, specification, survey article, (20 more...)

arXiv.org Machine Learning

1906.03855

Country: Europe > Denmark (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.93)
Materials > Chemicals > Industrial Gases > Liquified Gas (0.67)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.67)
Energy > Oil & Gas > Midstream (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(2 more...)

Add feedback

Revealing interpretable object representations from human behavior

Zheng, Charles Y., Pereira, Francisco, Baker, Chris I., Hebart, Martin N.

arXiv.org Machine LearningJan-9-2019

To study how mental object representations are related to behavior, we estimated sparse, non-negative representations of objects using human behavioral judgments on images representative of 1,854 object categories. These representations predicted a latent similarity structure between objects, which captured most of the explainable variance in human behavioral judgments. Individual dimensions in the low-dimensional embedding were found to be highly reproducible and interpretable as conveying degrees of taxonomic membership, functionality, and perceptual attributes. We further demonstrated the predictive power of the embeddings for explaining other forms of human behavior, including categorization, typicality judgments, and feature ratings, suggesting that the dimensions reflect human conceptual representations of objects beyond the specific task.

dimension, health & medicine, text processing, (22 more...)

arXiv.org Machine Learning

1901.02915

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Distributed Weight Consolidation: A Brain Segmentation Case Study

McClure, Patrick, Zheng, Charles Y., Kaczmarzyk, Jakub, Rogers-Lee, John, Ghosh, Satra, Nielson, Dylan, Bandettini, Peter A., Pereira, Francisco

Neural Information Processing SystemsDec-31-2018

Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distributed data and combining the resulting networks is often viewed as continual learning, but these methods require networks to be trained sequentially. In this paper, we introduce distributed weight consolidation (DWC), a continual learning method to consolidate the weights of separate neural networks, each trained on an independent dataset. We evaluated DWC with a brain segmentation case study, where we consolidated dilated convolutional neural networks trained on independent structural magnetic resonance imaging (sMRI) datasets from different sites. We found that DWC led to increased performance on test sets from the different sites, while maintaining generalization performance for a very large and completely independent multi-site dataset, compared to an ensemble baseline.

dataset, deep learning, neural network, (20 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.68)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.94)

Add feedback

Distributed Weight Consolidation: A Brain Segmentation Case Study

McClure, Patrick, Zheng, Charles Y., Kaczmarzyk, Jakub, Rogers-Lee, John, Ghosh, Satra, Nielson, Dylan, Bandettini, Peter A., Pereira, Francisco

Neural Information Processing SystemsDec-31-2018

Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distributed data and combining the resulting networks is often viewed as continual learning, but these methods require networks to be trained sequentially. In this paper, we introduce distributed weight consolidation (DWC), a continual learning method to consolidate the weights of separate neural networks, each trained on an independent dataset. We evaluated DWC with a brain segmentation case study, where we consolidated dilated convolutional neural networks trained on independent structural magnetic resonance imaging (sMRI) datasets from different sites. We found that DWC led to increased performance on test sets from the different sites, while maintaining generalization performance for a very large and completely independent multi-site dataset, compared to an ensemble baseline.

dataset, deep learning, neural network, (21 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.94)

Add feedback

Knowing what you know in brain segmentation using deep neural networks

McClure, Patrick, Rho, Nao, Lee, John A., Kaczmarzyk, Jakub R., Zheng, Charles, Ghosh, Satrajit S., Nielson, Dylan, Thomas, Adam, Bandettini, Peter, Pereira, Francisco

arXiv.org Machine LearningDec-18-2018

In this paper, we describe a deep neural network trained to predict FreeSurfer segmentations of structural MRI volumes, in seconds rather than hours. The network was trained and evaluated on an extremely large dataset (n = 11,148), obtained by combining data from more than a hundred sites. We also show that the prediction uncertainty of the network at each voxel is a good indicator of whether the network has made an error. The resulting uncertainty volume can be used in conjunction with the predicted segmentation to improve downstream uses, such as calculation of measures derived from segmentation regions of interest or the building of prediction models. Finally, we demonstrate that the average prediction uncertainty across voxels in the brain is an excellent indicator of manual quality control ratings, outperforming the best available automated solutions.

deep learning, neural network, segmentation, (23 more...)

arXiv.org Machine Learning

1812.01719

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Add feedback

Learning Supervised Topic Models for Classification and Regression from Crowds

Rodrigues, Filipe, Lourenço, Mariana, Ribeiro, Bernardete, Pereira, Francisco

arXiv.org Machine LearningAug-17-2018

The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deem learning under a single-annotator assumption unrealistic or unpractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed model over state-of-the-art approaches.

annotator, optimization problem, survey article, (19 more...)

arXiv.org Machine Learning

doi: 10.1109/TPAMI.2017.2648786

1808.05902

Country:

Europe (0.47)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Media (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

Add feedback

Filters

Collaborating Authors

Pereira, Francisco

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing

Interpretable factorization of clinical questionnaires to identify latent factors of psychopathology

Testing for context-dependent changes in neural encoding in naturalistic experiments

Evaluating Adversarial Robustness for Deep Neural Network Interpretability in fMRI Decoding

Bayesian Automatic Relevance Determination for Utility Function Specification in Discrete Choice Models

Revealing interpretable object representations from human behavior

Distributed Weight Consolidation: A Brain Segmentation Case Study

Distributed Weight Consolidation: A Brain Segmentation Case Study

Knowing what you know in brain segmentation using deep neural networks

Learning Supervised Topic Models for Classification and Regression from Crowds