inner product
Neural Networks on Symmetric Spaces of Noncompact Type
Nguyen, Xuan Son, Yang, Shuo, Histace, Aymeric
Recent works have demonstrated promising performance of neural networks on hyperbolic spaces and symmetric positive definite (SPD) manifolds. These spaces belong to a family of Riemannian manifolds referred to as symmetric spaces of noncompact type. In this paper, we propose a novel approach for developing neural networks on such spaces. Our approach relies on a unified formulation of the distance from a point to a hyperplane on the considered spaces. We show that some existing formulations of the point-to-hyperplane distance can be recovered by our approach under specific settings. Furthermore, we derive a closed-form expression for the point-to-hyperplane distance in higher-rank symmetric spaces of noncompact type equipped with G-invariant Riemannian metrics. The derived distance then serves as a tool to design fully-connected (FC) layers and an attention mechanism for neural networks on the considered spaces. Our approach is validated on challenging benchmarks for image classification, electroencephalogram (EEG) signal classification, image generation, and natural language inference.
Provable Non-linear Inductive Matrix Completion
Consider a standard recommendation/retrieval problem in which, given a query, the goal is to retrieve the most relevant items. Inductive matrix completion (IMC) is a standard approach to this problem: the query and the items are embedded in a common low-dimensional space, and the inner product between a query embedding and an item embedding reflects the relevance of the (query, item) pair. Non-linear IMC (NIMC) uses non-linear networks to embed the query as well as the items, and is known to be highly effective for a variety of tasks, such as video recommendation for users and semantic web search. Despite its wide usage, the existing literature lacks a rigorous understanding of NIMC models.
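The scoring rule described in this abstract can be illustrated with a minimal sketch. It is not the paper's model: the one-layer `tanh` embedding, the weight matrices `W_query` and `W_item`, and all function names are assumptions chosen only to show how an inner product of two non-linear embeddings yields a relevance score.

```python
import math

def embed(features, weights):
    # Hypothetical one-layer non-linear embedding: tanh(W x).
    # Stands in for whatever network a real NIMC model would use.
    return [math.tanh(sum(w * x for w, x in zip(row, features)))
            for row in weights]

def inner(u, v):
    # Plain Euclidean inner product of two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))

def relevance(query_feats, item_feats, W_query, W_item):
    # Relevance of a (query, item) pair = inner product of their embeddings.
    return inner(embed(query_feats, W_query), embed(item_feats, W_item))

def rank_items(query_feats, items, W_query, W_item):
    # Item indices sorted from most to least relevant.
    scores = [relevance(query_feats, it, W_query, W_item) for it in items]
    return sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
```

With identity weights, a query `[1, 0]` ranks the item `[1, 0]` above `[0, 1]`, which in turn ranks above `[-1, 0]`.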
Spherization Layer: Representation Using Only Angles
In the neural network literature, angular similarity between feature vectors is frequently used for interpreting or re-using learned representations. However, the inner product in neural networks partially disperses information over the scales and angles of the involved input vectors and weight vectors. Therefore, when using only angular similarity on representations trained with the inner product, information loss occurs in downstream methods, which limits their performance. In this paper, we propose the $\textit{spherization layer}$ to represent all information on angular similarity. The layer 1) maps the pre-activations of input vectors into a specific range of angles, 2) converts the angular coordinates of the vectors to Cartesian coordinates with an additional dimension, and 3) trains decision boundaries from hyperplanes, without bias parameters, passing through the origin. This approach guarantees that representation learning always occurs on the hyperspherical surface without the loss of any information, unlike other projection-based methods. Furthermore, this method can be applied to any network by replacing an existing layer. We validate the functional correctness of the proposed method in a toy task, its retention ability in well-known image classification tasks, and its effectiveness in a word analogy test and few-shot learning.
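Step 2 of the layer, converting angles to Cartesian coordinates with one extra dimension, can be sketched with the standard hyperspherical coordinate formula. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, and the angle-squashing of step 1 and the training of step 3 are left out.

```python
import math

def angles_to_cartesian(angles):
    # Standard hyperspherical-to-Cartesian conversion on the unit sphere:
    #   x_1     = cos(phi_1)
    #   x_k     = sin(phi_1) ... sin(phi_{k-1}) * cos(phi_k)
    #   x_{n+1} = sin(phi_1) ... sin(phi_n)
    # n angles give n + 1 coordinates with unit Euclidean norm.
    coords = []
    sin_prod = 1.0
    for phi in angles:
        coords.append(sin_prod * math.cos(phi))
        sin_prod *= math.sin(phi)
    coords.append(sin_prod)
    return coords

def angular_similarity(u, v):
    # For unit vectors, the inner product equals the cosine of the angle.
    return sum(a * b for a, b in zip(u, v))
```

Because every output has unit norm, the inner product between two outputs depends only on the angle between them, which is the property the abstract relies on.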
The Inductive Bias of Quantum Kernels
It has been hypothesized that quantum computers may lend themselves well to applications in machine learning. In the present work, we analyze function classes defined via quantum kernels. Quantum computers offer the possibility to efficiently compute inner products of exponentially large density operators that are classically hard to compute.
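As a toy illustration of a kernel defined through inner products of density operators, the sketch below evaluates the Hilbert-Schmidt inner product Tr[ρ(x)ρ(y)] classically for a single qubit. The feature map in `encode` is a hypothetical choice made for this example and is not taken from the paper; on a real device the state space would be exponentially large, so this direct matrix computation would not scale.

```python
import math

def encode(x):
    # Hypothetical single-qubit feature map (an assumption for illustration):
    # |psi(x)> = cos(x/2)|0> + sin(x/2)|1>
    return [complex(math.cos(x / 2)), complex(math.sin(x / 2))]

def density(psi):
    # Density operator of a pure state: rho = |psi><psi|.
    return [[a * b.conjugate() for b in psi] for a in psi]

def hs_inner(rho, sigma):
    # Hilbert-Schmidt inner product Tr[rho sigma] of two Hermitian matrices.
    n = len(rho)
    return sum(rho[i][j] * sigma[j][i]
               for i in range(n) for j in range(n)).real

def quantum_kernel(x, y):
    # Kernel value k(x, y) = Tr[rho(x) rho(y)].
    return hs_inner(density(encode(x)), density(encode(y)))
```

For this particular encoding the kernel reduces to cos²((x − y)/2), so identical inputs give 1 and inputs that differ by π give 0.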
Impossibility Results for Grammar-Compressed Linear Algebra
To handle vast amounts of data, it is natural and popular to compress vectors and matrices. When we compress a vector from size N down to size n << N, the vector certainly becomes easier to store and transmit efficiently, but does it also become easier to process? In this paper we consider lossless compression schemes, and ask if we can run our computations on the compressed data as efficiently as if the original data were that small. That is, if an operation has time complexity T(input-size), can we perform it on the compressed representation in time T(n) rather than T(N)? We consider the most basic linear algebra operations: inner product, matrix-vector multiplication, and matrix multiplication.
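To make the question concrete, here is a sketch for a much simpler lossless scheme than the grammar compression studied in the paper: run-length encoding (RLE). For RLE the inner product can be computed directly on the compressed form, in time proportional to the number of runs rather than to N. The representation as `(value, run_length)` pairs and the function name are assumptions for this example, and nothing here bears on the paper's results for grammar-compressed data.

```python
def rle_inner_product(a, b):
    # a, b: run-length encoded vectors, each a list of (value, run_length)
    # pairs with positive run lengths and the same total length.
    # Runs in O(len(a) + len(b)) time, independent of the uncompressed size.
    total = 0
    i = j = 0
    rem_a = a[0][1] if a else 0  # entries left in the current run of a
    rem_b = b[0][1] if b else 0  # entries left in the current run of b
    while i < len(a) and j < len(b):
        # Over the overlap of the two current runs both values are constant.
        step = min(rem_a, rem_b)
        total += a[i][0] * b[j][0] * step
        rem_a -= step
        rem_b -= step
        if rem_a == 0:
            i += 1
            if i < len(a):
                rem_a = a[i][1]
        if rem_b == 0:
            j += 1
            if j < len(b):
                rem_b = b[j][1]
    return total
```

For example, `[(2, 3), (0, 2), (1, 1)]` encodes `[2, 2, 2, 0, 0, 1]` and `[(1, 2), (3, 4)]` encodes `[1, 1, 3, 3, 3, 3]`; their inner product is computed from five runs without expanding either vector.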