Turek, Javier S.
Assessing Episodic Memory in LLMs with Sequence Order Recall Tasks
Pink, Mathis, Vo, Vy A., Wu, Qinyuan, Mu, Jianing, Turek, Javier S., Hasson, Uri, Norman, Kenneth A., Michelmann, Sebastian, Huth, Alexander, Toneva, Mariya
Current LLM benchmarks focus on evaluating models' memory of facts and semantic relations, primarily assessing semantic aspects of long-term memory. However, in humans, long-term memory also includes episodic memory, which links memories to their contexts, such as the time and place they occurred. The ability to contextualize memories is crucial for many cognitive tasks and everyday functions, yet this form of memory has not been evaluated in LLMs with existing benchmarks. To address this gap, we introduce Sequence Order Recall Tasks (SORT), which we adapt from tasks used to study episodic memory in cognitive psychology. SORT requires LLMs to recall the correct order of text segments and provides a general framework that is easily extendable and requires no additional annotations. We present an initial evaluation dataset, Book-SORT, comprising 36k pairs of segments extracted from 9 books recently added to the public domain. In a human experiment with 155 participants, we show that humans can recall sequence order based on long-term memory of a book. We find that models can perform the task with high accuracy when the relevant text is given in-context during the SORT evaluation. However, when presented with the book text only during training, LLMs' performance on SORT falls short. By making it possible to evaluate more aspects of memory, we believe that SORT will aid the emerging development of memory-augmented models.
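The core of a SORT trial is simple: sample two non-overlapping segments from a source text, present them in shuffled order, and ask which came first. The sketch below is a minimal, hypothetical construction of such a trial; the function name and output format are illustrative, not the Book-SORT dataset code.

```python
import random

def make_sort_item(text, seg_len=50, rng=None):
    """Build one SORT trial: two non-overlapping segments from `text`,
    presented in shuffled order; the label names the segment that
    appears first in the source."""
    rng = rng or random.Random(0)
    # pick two non-overlapping start positions with a < b
    max_start = len(text) - 2 * seg_len
    a = rng.randrange(0, max_start)
    b = rng.randrange(a + seg_len, len(text) - seg_len)
    seg_a, seg_b = text[a:a + seg_len], text[b:b + seg_len]
    # shuffle presentation order; the model must recover source order
    if rng.random() < 0.5:
        return {"segment_1": seg_a, "segment_2": seg_b, "answer": "segment_1"}
    return {"segment_1": seg_b, "segment_2": seg_a, "answer": "segment_2"}
```

Because the trial needs only the source text and segment positions, no manual annotation is required, which is what makes the framework easy to extend to new corpora.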
A single-layer RNN can approximate stacked and bidirectional RNNs, and topologies in between
Turek, Javier S., Jain, Shailee, Capota, Mihai, Huth, Alexander G., Willke, Theodore L.
To enhance the expressiveness and representational capacity of recurrent neural networks (RNNs), a large body of work has emerged exploring stacked architectures with additional topological modifications like shortcut connections or bidirectionality. However, choosing the best network for a particular problem requires a combinatorial search over architectures and their hyperparameters. In this work, we show that a single-layer RNN can perfectly mimic an arbitrarily deep stacked RNN under specific constraints on its weight matrix and a delay between input and output. This obviates the need to manually select hyperparameters like the number of layers. Additionally, we show that weakening the weight constraints while keeping the delay gives rise to partial acausality in the single-layer RNN, much like a bidirectional network. Synthetic experiments confirm that the delayed RNN can match bidirectional networks in perfectly solving some acausal tasks, and outperforms them on others. Finally, we show that in a challenging language processing task, the delayed RNN performs within 0.3% of the accuracy of the bidirectional network while reducing computational costs.
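For a two-layer stack, the construction can be sketched in a few lines: keeping the state [h1_t; h2_{t-1}] and a block lower-triangular weight matrix lets a single-layer RNN reproduce the stacked network's top-layer output with a one-step delay. The NumPy sketch below demonstrates this equivalence on random weights; variable names are illustrative and this is the linear-algebra idea, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, T = 4, 3, 20  # hidden units per layer, input size, sequence length
A1, B1 = rng.normal(size=(n, n)) * 0.3, rng.normal(size=(n, d))
A2, B2 = rng.normal(size=(n, n)) * 0.3, rng.normal(size=(n, n))
x = rng.normal(size=(T, d))

# Two-layer stacked RNN:
#   h1_t = tanh(A1 h1_{t-1} + B1 x_t),  h2_t = tanh(A2 h2_{t-1} + B2 h1_t)
h1 = np.zeros(n)
h2 = np.zeros(n)
stacked_out = []
for t in range(T):
    h1 = np.tanh(A1 @ h1 + B1 @ x[t])
    h2 = np.tanh(A2 @ h2 + B2 @ h1)
    stacked_out.append(h2.copy())

# Single-layer RNN with block-structured weights; the second state block
# holds the top layer delayed by one step: z_t = [h1_t; h2_{t-1}]
M = np.block([[A1, np.zeros((n, n))],
              [B2, A2]])
N = np.vstack([B1, np.zeros((n, d))])
z = np.zeros(2 * n)
delayed_out = []
for t in range(T):
    z = np.tanh(M @ z + N @ x[t])
    delayed_out.append(z[n:].copy())  # equals h2_{t-1}
```

Reading the output one step late recovers the stacked network exactly: delayed_out[t+1] matches stacked_out[t], which is the input–output delay the abstract refers to.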
Sparse and low-rank approximations of large symmetric matrices using biharmonic interpolation
Turek, Javier S., Huth, Alexander
Symmetric matrices are widely used in machine learning problems such as kernel machines and manifold learning. Using large datasets often requires computing low-rank approximations of these symmetric matrices so that they fit in memory. In this paper, we present a novel method based on biharmonic interpolation for low-rank matrix approximation. The method exploits knowledge of the data manifold to learn an interpolation operator that approximates values using a subset of randomly selected landmark points. This operator is readily sparsified, reducing memory requirements by at least two orders of magnitude without significant loss in accuracy. We show that our method can approximate very large datasets using twenty times more landmarks than other methods. Further, numerical results suggest that our method is stable even when numerical difficulties arise for other methods.
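The paper's operator is built from biharmonic interpolation on the data manifold; as a generic point of comparison, the classical Nyström scheme below shows how a landmark-based low-rank approximation of a symmetric kernel matrix works. This is an illustrative stand-in, not the biharmonic method itself, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))  # n data points in 5 dimensions

def rbf_kernel(A, B, gamma=0.1):
    """Symmetric positive-definite RBF kernel between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Nystrom: approximate K ~= C W^+ C^T using m randomly selected landmarks,
# so only an (n, m) and an (m, m) block are ever formed or stored
m = 50
idx = rng.choice(len(X), size=m, replace=False)
C = rbf_kernel(X, X[idx])        # (n, m) cross-kernel against landmarks
W = rbf_kernel(X[idx], X[idx])   # (m, m) landmark kernel
K_approx = C @ np.linalg.pinv(W) @ C.T

K = rbf_kernel(X, X)             # full matrix, only for measuring the error
rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
```

The memory saving is the point: the factors cost O(nm + m^2) storage instead of O(n^2), and sparsifying the interpolation operator (as the paper does) reduces this further.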
A Searchlight Factor Model Approach for Locating Shared Information in Multi-Subject fMRI Analysis
Zhang, Hejia, Chen, Po-Hsuan, Chen, Janice, Zhu, Xia, Turek, Javier S., Willke, Theodore L., Hasson, Uri, Ramadge, Peter J.
There is a growing interest in joint multi-subject fMRI analysis. The challenge of such analysis comes from inherent anatomical and functional variability across subjects. One approach to resolving this is a shared response factor model, which assumes a shared, time-synchronized stimulus across subjects. Such a model can often identify shared information, but it may not be able to pinpoint with high resolution the spatial location of this information. In this work, we examine a searchlight-based shared response model to identify shared information in small contiguous regions (searchlights) across the whole brain. Validation using classification tasks demonstrates that we can pinpoint informative local regions.
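A deterministic variant of the shared response model can be fit by alternating orthogonal Procrustes updates. The synthetic sketch below (illustrative names, not the paper's implementation) shows the shared-stimulus assumption X_i ≈ W_i S with orthonormal per-subject bases W_i, which is the factor model a searchlight analysis would apply within each small region.

```python
import numpy as np

rng = np.random.default_rng(0)
v, k, t, n_subj = 30, 3, 100, 4     # voxels, shared features, timepoints, subjects
S_true = rng.normal(size=(k, t))    # shared, time-synchronized response
X = []
for _ in range(n_subj):
    W, _ = np.linalg.qr(rng.normal(size=(v, k)))      # orthonormal subject basis
    X.append(W @ S_true + 0.01 * rng.normal(size=(v, t)))

# Alternating least squares for the deterministic SRM: X_i ~= W_i S, W_i^T W_i = I
S = rng.normal(size=(k, t))
for _ in range(20):
    Ws = []
    for Xi in X:
        # W_i update is an orthogonal Procrustes problem, solved via SVD
        U, _, Vt = np.linalg.svd(Xi @ S.T, full_matrices=False)
        Ws.append(U @ Vt)
    # S update: average of back-projected subject data (exact minimizer)
    S = np.mean([W.T @ Xi for W, Xi in zip(Ws, X)], axis=0)

recon_err = np.mean([np.linalg.norm(Xi - W @ S) / np.linalg.norm(Xi)
                     for W, Xi in zip(Ws, X)])
```

Both update steps are exact coordinate minimizers, so the fit error decreases monotonically; in a searchlight variant, v would be the handful of voxels inside one searchlight.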
Enabling Factor Analysis on Thousand-Subject Neuroimaging Datasets
Anderson, Michael J., Capotă, Mihai, Turek, Javier S., Zhu, Xia, Willke, Theodore L., Wang, Yida, Chen, Po-Hsuan, Manning, Jeremy R., Ramadge, Peter J., Norman, Kenneth A.
The scale of functional magnetic resonance imaging data is rapidly increasing as large multi-subject datasets become widely available and high-resolution scanners are adopted. The inherent low-dimensionality of the information in this data has led neuroscientists to consider factor analysis methods to extract and analyze the underlying brain activity. In this work, we consider two recent multi-subject factor analysis methods: the Shared Response Model and Hierarchical Topographic Factor Analysis. We perform analytical, algorithmic, and code optimization to enable multi-node parallel implementations to scale. Single-node improvements result in 99x and 1812x speedups on these two methods, and enable the processing of larger datasets. Our distributed implementations show strong scaling of 3.3x and 5.5x respectively with 20 nodes on real datasets. We also demonstrate weak scaling on a synthetic dataset with 1024 subjects, on up to 1024 nodes and 32,768 cores.
A Convolutional Autoencoder for Multi-Subject fMRI Data Aggregation
Chen, Po-Hsuan, Zhu, Xia, Zhang, Hejia, Turek, Javier S., Chen, Janice, Willke, Theodore L., Hasson, Uri, Ramadge, Peter J.
Finding the most effective way to aggregate multi-subject fMRI data is a long-standing and challenging problem. It is of increasing interest in contemporary fMRI studies of human cognition due to the scarcity of data per subject and the variability of brain anatomy and functional response across subjects. Recent work on latent factor models shows promising results in this task, but this approach does not preserve spatial locality in the brain. We examine two ways to combine the ideas of a factor model and a searchlight-based analysis to aggregate multi-subject fMRI data while preserving spatial locality. We first do this directly by combining a recent factor method known as a shared response model with searchlight analysis. Then we design a multi-view convolutional autoencoder for the same task. Both approaches preserve spatial locality and have competitive or better performance compared with standard searchlight analysis and the shared response model applied across the whole brain. We also report a system design to handle the computational challenge of training the convolutional autoencoder.
A multilevel framework for sparse optimization with application to inverse covariance estimation and logistic regression
Treister, Eran, Turek, Javier S., Yavneh, Irad
Solving $\ell_1$-regularized optimization problems is common in the fields of computational biology, signal processing and machine learning. Such $\ell_1$ regularization is utilized to find sparse minimizers of convex functions. A well-known example is the LASSO problem, where the $\ell_1$ norm regularizes a quadratic function. A multilevel framework is presented for solving such $\ell_1$-regularized sparse optimization problems efficiently. We take advantage of the expected sparseness of the solution, and create a hierarchy of problems of similar type, which is traversed in order to accelerate the optimization process. This framework is applied to two problems: (1) the sparse inverse covariance estimation problem, and (2) $\ell_1$-regularized logistic regression. In the first problem, the inverse of an unknown covariance matrix of a multivariate normal distribution is estimated, under the assumption that it is sparse. To this end, an $\ell_1$-regularized log-determinant optimization problem needs to be solved. This task is challenging especially for large-scale datasets, due to time and memory limitations. In the second problem, the $\ell_1$ regularization is added to the logistic regression classification objective to reduce overfitting to the data and obtain a sparse model. Numerical experiments demonstrate the efficiency of the multilevel framework in accelerating existing iterative solvers for both of these problems.
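The kind of problem the framework targets can be made concrete with plain ISTA on a LASSO instance; the soft-thresholding step is what produces exact zeros in the iterates. The multilevel acceleration itself is not shown here — this is just a baseline solver of the type it would speed up, with illustrative names.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm: shrink toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_lasso(A, b, lam, steps=500):
    """Plain ISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1.
    Each step is a gradient step on the quadratic followed by
    soft-thresholding, which yields exactly sparse iterates."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(80, 200))
x_true = np.zeros(200)
x_true[:5] = 3.0 * rng.normal(size=5)    # sparse ground truth
b = A @ x_true
x_hat = ista_lasso(A, b, lam=0.1)
```

A multilevel scheme exploits exactly this sparsity: it restricts such iterations to a hierarchy of smaller problems over the likely-nonzero support, which is where the acceleration comes from.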
A Block-Coordinate Descent Approach for Large-scale Sparse Inverse Covariance Estimation
Treister, Eran, Turek, Javier S.
The sparse inverse covariance estimation problem arises in many statistical applications in machine learning and signal processing. In this problem, the inverse of a covariance matrix of a multivariate normal distribution is estimated, assuming that it is sparse. An $\ell_1$ regularized log-determinant optimization problem is typically solved to approximate such matrices. Because of memory limitations, most existing algorithms are unable to handle large scale instances of this problem. In this paper we present a new block-coordinate descent approach for solving the problem for large-scale data sets. Our method treats the sought matrix block-by-block using quadratic approximations, and we show that this approach has advantages over existing methods in several aspects. Numerical experiments on both synthetic and real gene expression data demonstrate that our approach outperforms the existing state of the art methods, especially for large-scale problems.