AITopics | Banff

Collaborating Authors

Banff

Latent Coincidence Analysis: A Hidden Variable Model for Distance Metric Learning

Neural Information Processing SystemsMar-14-2024, 14:42:20 GMT

We describe a latent variable model for supervised dimensionality reduction and distance metric learning. The model discovers linear projections of high dimensional data that shrink the distance between similarly labeled inputs and expand the distance between differently labeled ones. The model's continuous latent variables locate pairs of examples in a latent space of lower dimensionality. The model differs significantly from classical factor analysis in that the posterior distribution over these latent variables is not always multivariate Gaussian. Nevertheless we show that inference is completely tractable and derive an Expectation-Maximization (EM) algorithm for parameter estimation. We also compare the model to other approaches in distance metric learning. The model's main advantage is its simplicity: at each iteration of the EM algorithm, the distance metric is re-estimated by solving an unconstrained least-squares problem. Experiments show that these simple updates are highly effective.

algorithm, classification, lca, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Oregon (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

The limits of squared Euclidean distance regularization

Neural Information Processing SystemsMar-13-2024, 11:00:55 GMT

Some of the simplest loss functions considered in Machine Learning are the square loss, the logistic loss and the hinge loss. The most common family of algorithms, including Gradient Descent (GD) with and without Weight Decay, always predict with a linear combination of the past instances. We give a random construction for sets of examples where the target linear weight vector is trivial to learn but any algorithm from the above family is drastically sub-optimal. Our lower bound on the latter algorithms holds even if the algorithms are enhanced with an arbitrary kernel function. This type of result was known for the square loss. However, we develop new techniques that let us prove such hardness results for any loss function satisfying some minimal requirements on the loss function (including the three listed above). We also show that algorithms that regularize with the squared Euclidean distance are easily confused by random features. Finally, we conclude by discussing related open problems regarding feed forward neural networks. We conjecture that our hardness results hold for any training algorithm that is based on the squared Euclidean distance regularization (i.e.

algorithm, linear combination, matrix, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Add feedback

Sequential Monte Carlo for Graphical Models

Neural Information Processing SystemsMar-13-2024, 09:00:30 GMT

We propose a new framework for how to use sequential Monte Carlo (SMC) algorithms for inference in probabilistic graphical models (PGM). Via a sequential decomposition of the PGM we find a sequence of auxiliary distributions defined on a monotonically increasing sequence of probability spaces. By targeting these auxiliary distributions using SMC we are able to approximate the full joint distribution defined by the PGM. One of the key merits of the SMC sampler is that it provides an unbiased estimate of the partition function of the model. We also show how it can be used within a particle Markov chain Monte Carlo framework in order to construct high-dimensional block-sampling algorithms for general PGMs.

partition function, sampler, smc sampler, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(7 more...)

Genre: Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Add feedback

Authorship Verification based on the Likelihood Ratio of Grammar Models

Nini, Andrea, Halvani, Oren, Graner, Lukas, Gherardi, Valerio, Ishihara, Shunichi

arXiv.org Artificial IntelligenceMar-13-2024

Authorship Verification (AV) is the process of analyzing a set of documents to determine whether they were written by a specific author. This problem often arises in forensic scenarios, e.g., in cases where the documents in question constitute evidence for a crime. Existing state-of-the-art AV methods use computational solutions that are not supported by a plausible scientific explanation for their functioning and that are often difficult for analysts to interpret. To address this, we propose a method relying on calculating a quantity we call $\lambda_G$ (LambdaG): the ratio between the likelihood of a document given a model of the Grammar for the candidate author and the likelihood of the same document given a model of the Grammar for a reference population. These Grammar Models are estimated using $n$-gram language models that are trained solely on grammatical features. Despite not needing large amounts of data for training, LambdaG still outperforms other established AV methods with higher computational complexity, including a fine-tuned Siamese Transformer network. Our empirical evaluation based on four baseline methods applied to twelve datasets shows that LambdaG leads to better results in terms of both accuracy and AUC in eleven cases and in all twelve cases if considering only topic-agnostic methods. The algorithm is also highly robust to important variations in the genre of the reference population in many cross-genre comparisons. In addition to these properties, we demonstrate how LambdaG is easier to interpret than the current state-of-the-art. We argue that the advantage of LambdaG over other methods is due to fact that it is compatible with Cognitive Linguistic theories of language processing.

authorship verification, corpus, lambdag, (16 more...)

arXiv.org Artificial Intelligence

2403.08462

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
(21 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (0.92)
Law Enforcement & Public Safety (0.67)
Media (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(5 more...)

Add feedback

Structural Positional Encoding for knowledge integration in transformer-based medical process monitoring

Irwin, Christopher, Dossena, Marco, Leonardi, Giorgio, Montani, Stefania

arXiv.org Artificial IntelligenceMar-13-2024

Predictive process monitoring is a process mining task aimed at forecasting information about a running process trace, such as the most correct next activity to be executed. In medical domains, predictive process monitoring can provide valuable decision support in atypical and nontrivial situations. Decision support and quality assessment in medicine cannot ignore domain knowledge, in order to be grounded on all the available information (which is not limited to data) and to be really acceptable by end users. In this paper, we propose a predictive process monitoring approach relying on the use of a {\em transformer}, a deep learning architecture based on the attention mechanism. A major contribution of our work lies in the incorporation of ontological domain-specific knowledge, carried out through a graph positional encoding technique. The paper presents and discusses the encouraging experimental result we are collecting in the domain of stroke management.

architecture, international conference, sequence, (15 more...)

arXiv.org Artificial Intelligence

2403.08836

Country:

Europe > Austria > Vienna (0.14)
Europe > Italy (0.05)
Europe > Greece > Central Macedonia > Thessaloniki (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.46)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Towards Model-Agnostic Posterior Approximation for Fast and Accurate Variational Autoencoders

Yacoby, Yaniv, Pan, Weiwei, Doshi-Velez, Finale

arXiv.org Machine LearningMar-13-2024

Inference for Variational Autoencoders (VAEs) consists of learning two models: (1) a generative model, which transforms a simple distribution over a latent space into the distribution over observed data, and (2) an inference model, which approximates the posterior of the latent codes given data. The two components are learned jointly via a lower bound to the generative model's log marginal likelihood. In early phases of joint training, the inference model poorly approximates the latent code posteriors. Recent work showed that this leads optimization to get stuck in local optima, negatively impacting the learned generative model. As such, recent work suggests ensuring a high-quality inference model via iterative training: maximizing the objective function relative to the inference model before every update to the generative model. Unfortunately, iterative training is inefficient, requiring heuristic criteria for reverting from iterative to joint training for speed. Here, we suggest an inference method that trains the generative and inference models independently. It approximates the posterior of the true model a priori; fixing this posterior approximation, we then maximize the lower bound relative to only the generative model. By conventional wisdom, this approach should rely on the true prior and likelihood of the true model to approximate its posterior (which are unknown). However, we show that we can compute a deterministic, model-agnostic posterior approximation (MAPA) of the true model's posterior. We then use MAPA to develop a proof-of-concept inference method. We present preliminary results on low-dimensional synthetic data that (1) MAPA captures the trend of the true posterior, and (2) our MAPA-based inference performs better density estimation with less computation than baselines. Lastly, we present a roadmap for scaling the MAPA-based inference method to high-dimensional data.

mapa, model-agnostic posterior approximation, posterior, (13 more...)

arXiv.org Machine Learning

2403.08941

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)

Add feedback

Deep Generative Domain Adaptation with Temporal Relation Knowledge for Cross-User Activity Recognition

Ye, Xiaozhou, Wang, Kevin I-Kai

arXiv.org Artificial IntelligenceMar-12-2024

In human activity recognition (HAR), the assumption that training and testing data are independent and identically distributed (i.i.d.) often fails, particularly in cross-user scenarios where data distributions vary significantly. This discrepancy highlights the limitations of conventional domain adaptation methods in HAR, which typically overlook the inherent temporal relations in time-series data. To bridge this gap, our study introduces a Conditional Variational Autoencoder with Universal Sequence Mapping (CVAE-USM) approach, which addresses the unique challenges of time-series domain adaptation in HAR by relaxing the i.i.d. assumption and leveraging temporal relations to align data distributions effectively across different users. This method combines the strengths of Variational Autoencoder (VAE) and Universal Sequence Mapping (USM) to capture and utilize common temporal patterns between users for improved activity recognition. Our results, evaluated on two public HAR datasets (OPPT and PAMAP2), demonstrate that CVAE-USM outperforms existing state-of-the-art methods, offering a more accurate and generalizable solution for cross-user activity recognition.

dataset, domain adaptation, temporal relation, (15 more...)

arXiv.org Artificial Intelligence

2403.14682

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry:

Information Technology (0.47)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Deep Generative Domain Adaptation with Temporal Attention for Cross-User Activity Recognition

Ye, Xiaozhou, Wang, Kevin I-Kai

arXiv.org Artificial IntelligenceMar-12-2024

In Human Activity Recognition (HAR), a predominant assumption is that the data utilized for training and evaluation purposes are drawn from the same distribution. It is also assumed that all data samples are independent and identically distributed ($\displaystyle i.i.d.$). Contrarily, practical implementations often challenge this notion, manifesting data distribution discrepancies, especially in scenarios such as cross-user HAR. Domain adaptation is the promising approach to address these challenges inherent in cross-user HAR tasks. However, a clear gap in domain adaptation techniques is the neglect of the temporal relation embedded within time series data during the phase of aligning data distributions. Addressing this oversight, our research presents the Deep Generative Domain Adaptation with Temporal Attention (DGDATA) method. This novel method uniquely recognises and integrates temporal relations during the domain adaptation process. By synergizing the capabilities of generative models with the Temporal Relation Attention mechanism, our method improves the classification performance in cross-user HAR. A comprehensive evaluation has been conducted on three public sensor-based HAR datasets targeting different scenarios and applications to demonstrate the efficacy of the proposed DGDATA method.

activity recognition, data distribution, temporal relation, (14 more...)

arXiv.org Artificial Intelligence

2403.17958

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.05)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Promising Solution (0.86)

Industry:

Information Technology (0.68)
Health & Medicine (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Transformations in Learned Image Compression from a Modulation Perspective

Bao, Youneng, Meng, Fangyang, Tan, Wen, Li, Chao, Tian, Yonghong, Liang, Yongsheng

arXiv.org Artificial IntelligenceMar-12-2024

In this paper, a unified transformation method in learned image compression(LIC) is proposed from the perspective of modulation. Firstly, the quantization in LIC is considered as a generalized channel with additive uniform noise. Moreover, the LIC is interpreted as a particular communication system according to the consistency in structures and optimization objectives. Thus, the technology of communication systems can be applied to guide the design of modules in LIC. Furthermore, a unified transform method based on signal modulation (TSM) is defined. In the view of TSM, the existing transformation methods are mathematically reduced to a linear modulation. A series of transformation methods, e.g. TPM and TJM, are obtained by extending to nonlinear modulation. The experimental results on various datasets and backbone architectures verify that the effectiveness and robustness of the proposed method. More importantly, it further confirms the feasibility of guiding LIC design from a communication perspective. For example, when backbone architecture is hyperprior combining context model, our method achieves 3.52$\%$ BD-rate reduction over GDN on Kodak dataset without increasing complexity.

communication system, image compression, transformation, (14 more...)

arXiv.org Artificial Intelligence

2203.02158

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(6 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Continual Learning by Three-Phase Consolidation

Maltoni, Davide, Pellegrini, Lorenzo

arXiv.org Artificial IntelligenceMar-12-2024

TPC (Three-Phase Consolidation) is here introduced as a simple but effective approach to continually learn new classes (and/or instances of known classes) while controlling forgetting of previous knowledge. Each experience (a.k.a. task) is learned in three phases characterized by different rules and learning dynamics, aimed at removing the class-bias problem (due to class unbalancing) and limiting gradient-based corrections to prevent forgetting of underrepresented classes. Several experiments on complex datasets demonstrate its accuracy and efficiency advantages over competitive existing approaches. The algorithm and all the results presented in this paper are fully reproducible thanks to its publication on the Avalanche open framework for continual learning.

correction, learning, scenario, (14 more...)

arXiv.org Artificial Intelligence

2403.14679

Country:

North America > United States (0.04)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback