Collaborating Authors

 ortega


Human Decision-Making under Limited Time

Neural Information Processing Systems

Subjective expected utility theory assumes that decision-makers possess unlimited computational resources to reason about their choices; however, virtually all decisions in everyday life are made under resource constraints---i.e.



Echoes of the past: A unified perspective on fading memory and echo states

arXiv.org Machine Learning

Recurrent neural networks (RNNs) have become increasingly popular in information processing tasks involving time series and temporal data. A fundamental property of RNNs is their ability to create reliable input/output responses, often linked to how the network handles its memory of the information it processed. Various notions have been proposed to conceptualize the behavior of memory in RNNs, including steady states, echo states, state forgetting, input forgetting, and fading memory. Although these notions are often used interchangeably, their precise relationships remain unclear. This work aims to unify these notions in a common language, derive new implications and equivalences between them, and provide alternative proofs to some existing results. By clarifying the relationships between these concepts, this research contributes to a deeper understanding of RNNs and their temporal information processing capabilities.
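As a rough illustration of state forgetting (not taken from the paper), the sketch below drives a scalar recurrent map with the same input sequence from two different initial states. The update rule `x = tanh(a * x + b * u)`, the parameter values, and the function name `run_reservoir` are all assumptions chosen for illustration. With `|a| < 1` the map is a contraction in the state, so the gap between the two trajectories shrinks at least geometrically.

```python
import math

def run_reservoir(x0, inputs, a=0.5, b=1.0):
    """Iterate x_{t+1} = tanh(a * x_t + b * u_t) and return the final state."""
    x = x0
    for u in inputs:
        x = math.tanh(a * x + b * u)
    return x

# The same input sequence is fed to two different initial states.
inputs = [math.sin(0.3 * t) for t in range(50)]
x_a = run_reservoir(-0.9, inputs)
x_b = run_reservoir(0.9, inputs)

# Because tanh is 1-Lipschitz and |a| = 0.5, the gap is at most 0.5**50 * 1.8.
gap = abs(x_a - x_b)
print(gap)
```

The final state depends on the input history but, to numerical precision, no longer on where the state started, which is the intuition behind the echo state and state forgetting notions listed in the abstract.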


A tensor network approach for chaotic time series prediction

arXiv.org Artificial Intelligence

Making accurate predictions of chaotic time series is a complex challenge. Reservoir computing, a neuromorphic-inspired approach, has emerged as a powerful tool for this task. It exploits the memory and nonlinearity of dynamical systems without requiring extensive parameter tuning. However, selecting and optimizing reservoir architectures remains an open problem. Next-generation reservoir computing simplifies this problem by employing nonlinear vector autoregression based on truncated Volterra series, thereby reducing hyperparameter complexity. Nevertheless, the latter suffers from exponential parameter growth in terms of the maximum monomial degree. Tensor networks offer a promising solution to this issue by decomposing multidimensional arrays into low-dimensional structures, thus mitigating the curse of dimensionality. This paper explores the application of a previously proposed tensor network model for predicting chaotic time series, demonstrating its advantages in terms of accuracy and computational efficiency compared to conventional echo state networks. Using a state-of-the-art tensor network approach enables us to bridge the gap between the tensor network and reservoir computing communities, fostering advances in both fields.
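To make the parameter growth concrete, the following sketch counts the distinct monomials of total degree at most `p` in `d` variables, which is the binomial coefficient C(d + p, p). It assumes the feature vector contains every such monomial of a `d`-dimensional (delay-embedded) input; the function name `n_monomials` and the choice `d = 10` are illustrative and do not come from the paper.

```python
from math import comb

def n_monomials(d, p):
    """Number of monomials of total degree <= p in d variables (constant term included)."""
    return comb(d + p, p)

# Feature count for a 10-dimensional input as the maximum degree increases.
for p in range(1, 6):
    print(p, n_monomials(10, p))
```

Under this assumption the feature count rises from 11 at degree 1 to 3003 at degree 5, which is the kind of growth that low-rank tensor network decompositions are meant to tame.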


Optimizing $k$ in $k$NN Graphs with Graph Learning Perspective

arXiv.org Artificial Intelligence

In this paper, we propose a method, based on graph signal processing, to optimize the choice of $k$ in $k$-nearest neighbor graphs ($k$NNGs). $k$NN is one of the most popular approaches and is widely used in machine learning and signal processing. The parameter $k$ represents the number of neighbors that are connected to the target node; however, its appropriate selection is still a challenging problem. Therefore, most $k$NNGs use ad hoc selection methods for $k$. In the proposed method, we assume that a different $k$ can be chosen for each node. We formulate a discrete optimization problem to seek the best $k$ with a constraint on the sum of distances of the connected nodes. The optimal $k$ values are efficiently obtained without solving a complex optimization. Furthermore, we reveal that the proposed method is closely related to existing graph learning methods. In experiments on real datasets, we demonstrate that the $k$NNGs obtained with our method are sparse and can determine an appropriate variable number of edges per node. We validate the effectiveness of the proposed method for point cloud denoising, comparing our denoising performance with achievable graph construction methods that can be scaled to typical point cloud sizes (e.g., thousands of nodes).
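The idea of a per-node $k$ limited by the sum of neighbor distances can be sketched with a simple greedy rule. This is not the optimization proposed in the paper: the rule below, the function name `variable_knn_graph`, and the `budget` parameter are assumptions made for illustration only. Each node keeps adding its nearest remaining neighbor as long as the accumulated distance stays within the budget.

```python
import math

def variable_knn_graph(points, budget):
    """Connect each node to its nearest neighbors while the summed distance stays within budget."""
    edges = {}
    for i, p in enumerate(points):
        # Distances from node i to every other node, nearest first.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        total, neighbors = 0.0, []
        for d, j in dists:
            if total + d > budget:
                break
            total += d
            neighbors.append(j)
        edges[i] = neighbors
    return edges

# Three nearby points and one far-away point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
graph = variable_knn_graph(points, budget=2.0)
print(graph)
```

In this toy example the central node gets two neighbors, the two adjacent nodes get one each, and the distant point gets none, so the number of edges varies per node and the resulting graph stays sparse.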


Geometric Learning with Positively Decomposable Kernels

arXiv.org Machine Learning

Kernel methods are powerful tools in machine learning. Classical kernel methods are based on positive-definite kernels, which map data spaces into reproducing kernel Hilbert spaces (RKHS). For non-Euclidean data spaces, positive-definite kernels are difficult to come by. In this case, we propose the use of reproducing kernel Krein space (RKKS) based methods, which require only kernels that admit a positive decomposition. We show that one does not need to access this decomposition in order to learn in RKKS. We then investigate the conditions under which a kernel is positively decomposable. We show that invariant kernels admit a positive decomposition on homogeneous spaces under tractable regularity assumptions. This makes them much easier to construct than positive-definite kernels, providing a route for learning with kernels for non-Euclidean data. By the same token, this provides theoretical foundations for RKKS-based methods in general.


Evaluating Self-Supervised Speech Representations for Indigenous American Languages

arXiv.org Artificial Intelligence

The application of self-supervision to speech representation learning has garnered significant interest in recent years, due to its scalability to large amounts of unlabeled data. However, much progress, both in terms of pre-training and downstream evaluation, has remained concentrated in monolingual models that only consider English. Few models consider other languages, and even fewer consider indigenous ones. In our submission to the New Language Track of the ASRU 2023 ML-SUPERB Challenge, we present an ASR corpus for Quechua, an indigenous South American language. We benchmark the efficacy of large SSL models on Quechua, along with six other indigenous languages such as Guarani and Bribri, on low-resource ASR. Our results show surprisingly strong performance by state-of-the-art SSL models, indicating the potential generalizability of large-scale models to real-world data.


Infinite-dimensional reservoir computing

arXiv.org Artificial Intelligence

Reservoir computing (RC) [Jaeg 10, Maas 02, Jaeg 04, Maas 11] and in particular echo state networks (ESNs) [Matt 92, Matt 93, Jaeg 04] have gained much popularity in recent years due to their excellent performance in the forecasting of dynamical systems [Grig 14, Jaeg 04, Path 17, Path 18, Lu 18, Wikn 21, Arco 22] and due to the ease of their implementation. RC aims at approximating nonlinear input/output systems using randomly generated state-space systems (called reservoirs) in which only a linear readout is estimated. It has been theoretically established that this is indeed possible in a variety of deterministic and stochastic contexts [Grig 18b, Grig 18a, Gono 20c, Gono 21b, Gono 23] in which RC systems have been shown to have universal approximation properties. In this paper, we focus on deriving error bounds for a variant of the architectures that we just cited and consider as approximants randomly generated linear systems with readouts given by randomly generated neural networks in which only the output layer is trained. Thus, from a learning perspective, we combine linear echo state networks and what is referred to in the literature as random features [Rahi 07]/extreme learning machines (ELMs) [Huan 06]. We develop explicit and readily computable approximation and estimation bounds for a newly introduced concept class whose elements we refer to as recurrent (generalized) Barron functionals since they can be viewed as a dynamical analog of the (generalized) Barron functions introduced in [Barr 92, Barr 93] and extended later in [E 20b, E 20a, E 19].


Joint Graph and Vertex Importance Learning

arXiv.org Artificial Intelligence

To account for the difficulty associated with singular CGL matrices in inverse covariance estimation, the objective function is oftentimes modified [5, 9-12]. However, such an approach produces dense graphs, even if variables are weakly correlated (see Sec. 4 and [11]) because the modified objective function encourages well-connected graphs [9]. This issue can be solved by incorporating non-convex sparse regularization [11, 13] at the expense of a more complex graph learning algorithm.

In this paper, we explore the topic of graph learning from the perspective of the Irregularity-Aware Graph Fourier Transform, with the goal of learning the graph signal space inner product to better model data. We propose a novel method to learn a graph with smaller edge weight upper bounds compared to combinatorial Laplacian approaches. Experimentally, our approach yields much sparser graphs compared to a


Intra-operative Brain Tumor Detection with Deep Learning-Optimized Hyperspectral Imaging

arXiv.org Artificial Intelligence

Surgery for gliomas (intrinsic brain tumors), especially when low-grade, is challenging due to the infiltrative nature of the lesion. Currently, no real-time, intra-operative, label-free and wide-field tool is available to assist and guide the surgeon to find the relevant demarcations for these tumors. While marker-based methods exist for the high-grade glioma case, there is no convenient solution available for the low-grade case; thus, marker-free optical techniques represent an attractive option. Although RGB imaging is a standard tool in surgical microscopes, it does not contain sufficient information for tissue differentiation. We leverage the richer information from hyperspectral imaging (HSI), acquired with a snapscan camera in the 468-787 nm range, coupled to a surgical microscope, to build a deep-learning-based diagnostic tool for cancer resection with potential for intra-operative guidance. However, the main limitation of the HSI snapscan camera is the image acquisition time, which limits its widespread deployment in the operating theater. Here, we investigate the effect of HSI channel reduction and pre-selection to scope the design space for the development of cheaper and faster sensors. Neural networks are used to identify the most important spectral channels for tumor tissue differentiation, optimizing the trade-off between the number of channels and precision to enable real-time intra-surgical application. We evaluate the performance of our method on a clinical dataset that was acquired during surgery on five patients. By demonstrating that low-grade glioma can be detected efficiently, these results can lead to better cancer resection demarcations, potentially improving treatment effectiveness and patient outcome.