AITopics | mutual coherence

Collaborating Authors

mutual coherence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Understanding Counting in Small Transformers: The Interplay between Attention and Feed-Forward Layers

Behrens, Freya, Biggio, Luca, Zdeborová, Lenka

arXiv.org Artificial IntelligenceJul-16-2024

We provide a comprehensive analysis of simple transformer models trained on the histogram task, where the goal is to count the occurrences of each item in the input sequence from a fixed alphabet. Despite its apparent simplicity, this task exhibits a rich phenomenology that allows us to characterize how different architectural components contribute towards the emergence of distinct algorithmic solutions. In particular, we showcase the existence of two qualitatively different mechanisms that implement a solution, relation- and inventory-based counting. Which solution a model can implement depends non-trivially on the precise choice of the attention mechanism, activation function, memorization capacity and the presence of a beginning-of-sequence token. By introspecting learned models on the counting task, we find evidence for the formation of both mechanisms. From a broader perspective, our analysis offers a framework to understand how the interaction of different architectural components of transformer models shapes diverse algorithmic solutions and approximations.

accuracy, construction, explicit construction, (17 more...)

arXiv.org Artificial Intelligence

2407.11542

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
North America > Dominican Republic (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Variational Learning ISTA

Massoli, Fabio Valerio, Louizos, Christos, Behboodi, Arash

arXiv.org Machine LearningJul-9-2024

Compressed sensing combines the power of convex optimization techniques with a sparsity-inducing prior on the signal space to solve an underdetermined system of equations. For many problems, the sparsifying dictionary is not directly given, nor its existence can be assumed. Besides, the sensing matrix can change across different scenarios. Addressing these issues requires solving a sparse representation learning problem, namely dictionary learning, taking into account the epistemic uncertainty of the learned dictionaries and, finally, jointly learning sparse representations and reconstructions under varying sensing matrix conditions. We address both concerns by proposing a variant of the LISTA architecture. First, we introduce Augmented Dictionary Learning ISTA (A-DLISTA), which incorporates an augmentation module to adapt parameters to the current measurement setup. Then, we propose to learn a distribution over dictionaries via a variational approach, dubbed Variational Learning ISTA (VLISTA). VLISTA exploits A-DLISTA as the likelihood model and approximates a posterior distribution over the dictionaries as part of an unfolded LISTA-based recovery algorithm. As a result, VLISTA provides a probabilistic way to jointly learn the dictionary distribution and the reconstruction algorithm with varying sensing matrices. We provide theoretical and experimental support for our architecture and show that our model learns calibrated uncertainties.

a-dlista, matrix, vlista, (16 more...)

arXiv.org Machine Learning

2407.06646

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > New Finding (0.46)

Add feedback

CPRL - An Extension of Compressive Sensing to the Phase Retrieval Problem

Neural Information Processing SystemsMar-14-2024, 06:12:06 GMT

While compressive sensing (CS) has been one of the most vibrant research fields in the past few years, most development only applies to linear models. This limits its application in many areas where CS could make a difference. This paper presents a novel extension of CS to the phase retrieval problem, where intensity measurements of a linear system are used to recover a complex sparse signal. We propose a novel solution using a lifting technique - CPRL, which relaxes the NP-hard problem to a nonsmooth semidefinite program. Our analysis shows that CPRL inherits many desirable properties from CS, such as guarantees for exact recovery. We further provide scalable numerical solvers to accelerate its implementation.

algorithm, cprl, matrix, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Spain > Galicia > Madrid (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Add feedback

Revisiting RIP guarantees for sketching operators on mixture models

Belhadji, Ayoub, Gribonval, Rémi

arXiv.org Machine LearningDec-9-2023

In the context of sketching for compressive mixture modeling, we revisit existing proofs of the Restricted Isometry Property of sketching operators with respect to certain mixtures models. After examining the shortcomings of existing guarantees, we propose an alternative analysis that circumvents the need to assume importance sampling when drawing random Fourier features to build random sketching operators. Our analysis is based on new deterministic bounds on the restricted isometry constant that depend solely on the set of frequencies used to define the sketching operator; then we leverage these bounds to establish concentration inequalities for random sketching operators that lead to the desired RIP guarantees. Our analysis also opens the door to theoretical guarantees for structured sketching with frequencies associated to fast random linear operators.

artificial intelligence, machine learning, operator, (17 more...)

arXiv.org Machine Learning

2312.05573

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(5 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Data Science (0.67)

Add feedback

Sparse Models for Machine Learning

Lin, Jianyi

arXiv.org Artificial IntelligenceAug-26-2023

The sparse modeling is an evident manifestation capturing the parsimony principle just described, and sparse models are widespread in statistics, physics, information sciences, neuroscience, computational mathematics, and so on. In statistics the many applications of sparse modeling span regression, classification tasks, graphical model selection, sparse M-estimators and sparse dimensionality reduction. It is also particularly effective in many statistical and machine learning areas where the primary goal is to discover predictive patterns from data which would enhance our understanding and control of underlying physical, biological, and other natural processes, beyond just building accurate outcome black-box predictors. Common examples include selecting biomarkers in biological procedures, finding relevant brain activity locations which are predictive about brain states and processes based on fMRI data, and identifying network bottlenecks best explaining end-to-end performance. Moreover, the research and applications of efficient recovery of high-dimensional sparse signals from a relatively small number of observations, which is the main focus of compressed sensing or compressive sensing, have rapidly grown and became an extremely intense area of study beyond classical signal processing. Likewise interestingly, sparse modeling is directly related to various artificial vision tasks, such as image denoising, segmentation, restoration and superresolution, object or face detection and recognition in visual scenes, and action recognition. In this manuscript, we provide a brief introduction of the basic theory underlying sparse representation and compressive sensing, and then discuss some methods for recovering sparse solutions to optimization problems in effective way, together with some applications of sparse recovery in a machine learning problem known as sparse dictionary learning.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1201/9781003283980-5

2308.1396

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.54)

Add feedback

Heuristic Algorithms for the Approximation of Mutual Coherence

Betz, Gregor, Chekan, Vera, Mchedlidze, Tamara

arXiv.org Artificial IntelligenceJul-4-2023

Mutual coherence is a measure of similarity between two opinions. Although the notion comes from philosophy, it is essential for a wide range of technologies, e.g., the Wahl-O-Mat system. In Germany, this system helps voters to find candidates that are the closest to their political preferences. The exact computation of mutual coherence is highly time-consuming due to the iteration over all subsets of an opinion. Moreover, for every subset, an instance of the SAT model counting problem has to be solved which is known to be a hard problem in computer science. This work is the first study to accelerate this computation. We model the distribution of the so-called confirmation values as a mixture of three Gaussians and present efficient heuristics to estimate its model parameters. The mutual coherence is then approximated with the expected value of the distribution. Some of the presented algorithms are fully polynomial-time, others only require solving a small number of instances of the SAT model counting problem. The average squared error of our best algorithm lies below 0.0035 which is insignificant if the efficiency is taken into account. Furthermore, the accuracy is precise enough to be used in Wahl-O-Mat-like systems.

artificial intelligence, coherence, mutual coherence, (14 more...)

arXiv.org Artificial Intelligence

2307.01639

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Europe > Netherlands (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

Neural Optimization Kernel: Towards Robust Deep Learning

Lyu, Yueming, Tsang, Ivor

arXiv.org Machine LearningJun-10-2021

Recent studies show a close connection between neural networks (NN) and kernel methods. However, most of these analyses (e.g., NTK) focus on the influence of (infinite) width instead of the depth of NN models. There remains a gap between theory and practical network designs that benefit from the depth. This paper first proposes a novel kernel family named Neural Optimization Kernel (NOK). Our kernel is defined as the inner product between two $T$-step updated functionals in RKHS w.r.t. a regularized optimization problem. Theoretically, we proved the monotonic descent property of our update rule for both convex and non-convex problems, and a $O(1/T)$ convergence rate of our updates for convex problems. Moreover, we propose a data-dependent structured approximation of our NOK, which builds the connection between training deep NNs and kernel methods associated with NOK. The resultant computational graph is a ResNet-type finite width NN. Our structured approximation preserved the monotonic descent property and $O(1/T)$ convergence rate. Namely, a $T$-layer NN performs $T$-step monotonic descent updates. Notably, we show our $T$-layered structured NN with ReLU maintains a $O(1/T)$ convergence rate w.r.t. a convex regularized problem, which explains the success of ReLU on training deep NN from a NN architecture optimization perspective. For the unsupervised learning and the shared parameter case, we show the equivalence of training structured NN with GD and performing functional gradient descent in RKHS associated with a fixed (data-dependent) NOK at an infinity-width regime. For finite NOKs, we prove generalization bounds. Remarkably, we show that overparameterized deep NN (NOK) can increase the expressive power to reduce empirical risk and reduce the generalization bound at the same time. Extensive experiments verify the robustness of our structured NOK blocks.

approximation, inequality, neural network, (10 more...)

arXiv.org Machine Learning

2106.06097

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)

Add feedback

Dataless Model Selection with the Deep Frame Potential

Murdock, Calvin, Lucey, Simon

arXiv.org Machine LearningMar-30-2020

Choosing a deep neural network architecture is a fundamental problem in applications that require balancing performance and parameter efficiency. Standard approaches rely on ad-hoc engineering or computationally expensive validation on a specific dataset. We instead attempt to quantify networks by their intrinsic capacity for unique and robust representations, enabling efficient architecture comparisons without requiring any data. Building upon theoretical connections between deep learning and sparse approximation, we propose the deep frame potential: a measure of coherence that is approximately related to representation stability but has minimizers that depend only on network structure. This provides a framework for jointly quantifying the contributions of architectural hyper-parameters such as depth, width, and skip connections. We validate its use as a criterion for model selection and demonstrate correlation with generalization error on a variety of common residual and densely connected network architectures.

architecture, frame potential, representation, (13 more...)

arXiv.org Machine Learning

2003.13866

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Convolutional Neural Networks Analyzed via Inverse Problem Theory and Sparse Representations

Tarhan, Cem, Akar, Gozde Bozdagi

arXiv.org Machine LearningOct-23-2018

Inverse problems in imaging such as denoising, deblurring, superresolution (SR) have been addressed for many decades. In recent years, convolutional neural networks (CNNs) have been widely used for many inverse problem areas. Although their indisputable success, CNNs are not mathematically validated as to how and what they learn. In this paper, we prove that during training, CNN elements solve for inverse problems which are optimum solutions stored as CNN neuron filters. We discuss the necessity of mutual coherence between CNN layer elements in order for a network to converge to the optimum solution. We prove that required mutual coherence can be provided by the usage of residual learning and skip connections. We have set rules over training sets and depth of networks for better convergence, i.e. performance.

artificial intelligence, inverse problem, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1049/iet-spr.2018.5220

1807.07998

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Add feedback

Optimized projections for compressed sensing via rank-constrained nearest correlation matrix

Cleju, Nicolae

arXiv.org Machine LearningSep-14-2013

Optimizing the acquisition matrix is useful for compressed sensing of signals that are sparse in overcomplete dictionaries, because the acquisition matrix can be adapted to the particular correlations of the dictionary atoms. In this paper a novel formulation of the optimization problem is proposed, in the form of a rank-constrained nearest correlation matrix problem. Furthermore, improvements for three existing optimization algorithms are introduced, which are shown to be particular instances of the proposed formulation. Simulation results show notable improvements and superior robustness in sparse signal recovery. Keywords: acquisition, compressed sensing, nearest correlation matrix, optimization 1. Introduction Compressed Sensing (CS) [1] studies the possibility of acquiring a signal x that is a priori known to be sparse in some dictionary D with fewer linear measurements than required by the traditional sampling theorem. In many cases the dictionary D is an orthogonal basis, but we consider here the general case of an overcomplete dictionary.

algorithm, artificial intelligence, optimization problem, (14 more...)

arXiv.org Machine Learning

doi: 10.1016/j.acha.2013.08.005

1309.3676

Country: Europe > Romania (0.28)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback