AITopics | bernoulli model

Collaborating Authors

bernoulli model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reviews: Adversarially Robust Generalization Requires More Data

Neural Information Processing SystemsOct-8-2024, 10:18:22 GMT

The paper considered theoretical results on adversarially robust generalization, which studies the robustness of classifiers in the presence of even small noise. In particular, the work studied the generalization of adversarially robust learning by investigating the sample complexity in a comparison to that of standard learning. Specifically, the study focused on two simple concrete distribution models: gaussian model and Bernoulli model. For both models, the authors established the lower and upper bounds for the sample complexities. From these results, they drew the conclusion that the sample complexity of robust generalization is much larger than standard generalization.

adversarially robust generalization require, generalization, sample complexity, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.57)

Add feedback

Probabilistic Modeling for Sequences of Sets in Continuous-Time

Chang, Yuxin, Boyd, Alex, Smyth, Padhraic

arXiv.org Machine LearningJan-4-2024

Neural marked temporal point processes have been a valuable addition to the existing toolbox of statistical parametric models for continuous-time event data. These models are useful for sequences where each event is associated with a single item (a single type of event or a "mark") -- but such models are not suited for the practical situation where each event is associated with a set of items. In this work, we develop a general framework for modeling set-valued data in continuous-time, compatible with any intensity-based recurrent neural point process model. In addition, we develop inference methods that can use such models to answer probabilistic queries such as "the probability of item $A$ being observed before item $B$," conditioned on sequence history. Computing exact answers for such queries is generally intractable for neural models due to both the continuous-time nature of the problem setting and the combinatorially-large space of potential outcomes for each event. To address this, we develop a class of importance sampling methods for querying with set-based sequences and demonstrate orders-of-magnitude improvements in efficiency over direct sampling via systematic experiments with four real-world datasets. We also illustrate how to use this framework to perform model selection using likelihoods that do not involve one-step-ahead prediction.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2312.15045

Country:

North America > United States > California > Orange County > Irvine (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Education (0.48)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.64)

Add feedback

A majorization-minimization algorithm for nonnegative binary matrix factorization

Magron, Paul, Févotte, Cédric

arXiv.org Artificial IntelligenceApr-20-2022

This paper tackles the problem of decomposing binary data using matrix factorization. We consider the family of mean-parametrized Bernoulli models, a class of generative models that are well suited for modeling binary data and enables interpretability of the factors. We factorize the Bernoulli parameter and consider an additional Beta prior on one of the factors to further improve the model's expressive power. While similar models have been proposed in the literature, they only exploit the Beta prior as a proxy to ensure a valid Bernoulli parameter in a Bayesian setting; in practice it reduces to a uniform or uninformative prior. Besides, estimation in these models has focused on costly Bayesian inference. In this paper, we propose a simple yet very efficient majorization-minimization algorithm for maximum a posteriori estimation. Our approach leverages the Beta prior whose parameters can be tuned to improve performance in matrix completion tasks. Experiments conducted on three public binary datasets show that our approach offers an excellent trade-off between prediction performance, computational complexity, and interpretability.

algorithm, factorization, matrix factorization, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LSP.2022.3187368

2204.09741

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Probabilistic Embeddings with Laplacian Graph Priors

Yrjänäinen, Väinö, Magnusson, Måns

arXiv.org Machine LearningMar-25-2022

Probabilistic word embedding models have emerged as a way to model textual data in scientific applications [Rudolph We introduce probabilistic embeddings using et al., 2016, Bamler and Mandt, 2017]. The advantages include Laplacian priors (PELP). The proposed model enables flexible inclusion of prior knowledge, explicit handling incorporating graph side-information into of uncertainty, straightforward estimation, and usefulness static word embeddings. We theoretically show in decision-making [Ghahramani, 2015]. More versatile that the model unifies several previously proposed and flexible word embedding methods allow for increasingly embedding methods under one umbrella.

machine learning, natural language, pelp model, (18 more...)

arXiv.org Machine Learning

2204.01846

Country:

North America > United States > Maine (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)
North America > United States > Wyoming (0.04)
(7 more...)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.68)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Dual Confidence Regions: A Simple Introduction - DataScienceCentral.com

#artificialintelligenceFeb-26-2022, 05:12:05 GMT

This tutorial explains how to build confidence regions (the 2D version of a confidence interval) using as little statistical theory as possible. I also avoid the traditional terminology and notation such as α, Z1-α, critical value, confidence level, significance level and so on. These can be confusing to beginners and professionals alike. Instead, I use simulations and two keywords only: confidence region, and confidence level. The purpose is to explain the concept using a framework that will appeal to machine learning professionals, software engineers and non-statisticians.

confidence region, probability, simulation, (14 more...)

#artificialintelligence

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.61)

Add feedback

More Data Can Expand the Generalization Gap Between Adversarially Robust and Standard Models

Chen, Lin, Min, Yifei, Zhang, Mingrui, Karbasi, Amin

arXiv.org Machine LearningFeb-11-2020

As modern machine learning models continue to gain traction in the real world, a wide variety of novel problems have come to the forefront of the research community. One particularly important challenge has been that of adversarial attacks (Szegedy et al., 2013; Goodfellow et al., 2014; Kos et al., 2018; Carlini & Wagner, 2018). To be specific, given a model with excellent performance on a standard data set, one can add small perturbations to the test data that can fool the model and cause it to make wrong predictions. What is more worrying is that these small perturbations can possibly be designed to be imperceptible to human beings, which raises concerns about potential safety issues and risks, especially when it comes to applications such as autonomous vehicles where human lives are at stake. The problem of adversarial robustness in machine learning models has been explored from several different perspectives since its discovery. One direction has been to propose attacks that challenge these models and their training procedures (Carlini & Wagner, 2017; Gu & Rigazio, 2014; Athalye et al., 2018; Papernot et al., 2016a; Moosavi-Dezfooli et al., 2016).

adversary regime, generalization gap, regime, (11 more...)

arXiv.org Machine Learning

2002.04725

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Subspace Tracking from Missing and Outlier Corrupted Data

Narayanamurthy, Praneeth, Daneshpajooh, Vahid, Vaswani, Namrata

arXiv.org Machine LearningOct-6-2018

We study the related problems of subspace tracking in the presence of missing data (ST-miss) as well as robust subspace tracking with missing data (RST-miss). Here "robust" refers to robustness to sparse outliers. In recent work, we have studied the RST problem without missing data. In this work, we show that simple modifications of our solution approach for RST also provably solve ST-miss and RST-miss under weaker and similar assumptions respectively. To our knowledge, our result is the first complete guarantee for both ST-miss and RST-miss. This means we are able to show that, under assumptions on only the algorithm inputs (input data and/or initialization), the output subspace estimates are close to the true data subspaces at all times. Our guarantees hold under mild and easily interpretable assumptions and handle time-varying subspaces (unlike all previous work). We also show that our algorithm and its extensions are fast and have competitive experimental performance when compared with existing methods.

artificial intelligence, data quality, machine learning, (15 more...)

arXiv.org Machine Learning

1810.03051

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Data Science > Data Quality (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Reliable Learning of Bernoulli Mixture Models

Najafi, Amir, Motahari, Abolfazl, Rabiee, Hamid R.

arXiv.org Machine LearningOct-5-2017

In this paper, we have derived a set of sufficient conditions for reliable clustering of data produced by Bernoulli Mixture Models (BMM), when the number of clusters is unknown. A BMM refers to a random binary vector whose components are independent Bernoulli trials with cluster-specific frequencies. The problem of clustering BMM data arises in many real-world applications, most notably in population genetics where researchers aim at inferring the population structure from multilocus genotype data. Our findings stipulate a minimum dataset size and a minimum number of Bernoulli trials (or genotyped loci) per sample, such that the existence of a clustering algorithm with a sufficient accuracy is guaranteed. Moreover, the mathematical intuitions and tools behind our work can help researchers in designing more effective and theoretically-plausible heuristic methods for similar problems.

artificial intelligence, bernoulli model, machine learning, (17 more...)

arXiv.org Machine Learning

1710.02101

Country: Asia > Middle East (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)

Add feedback

High Dimensional Low Rank plus Sparse Matrix Decomposition

Rahmani, Mostafa, Atia, George

arXiv.org Machine LearningMar-16-2017

This paper is concerned with the problem of low rank plus sparse matrix decomposition for big data. Conventional algorithms for matrix decomposition use the entire data to extract the low-rank and sparse components, and are based on optimization problems with complexity that scales with the dimension of the data, which limits their scalability. Furthermore, existing randomized approaches mostly rely on uniform random sampling, which is quite inefficient for many real world data matrices that exhibit additional structures (e.g. clustering). In this paper, a scalable subspace-pursuit approach that transforms the decomposition problem to a subspace learning problem is proposed. The decomposition is carried out using a small data sketch formed from sampled columns/rows. Even when the data is sampled uniformly at random, it is shown that the sufficient number of sampled columns/rows is roughly O(r\mu), where \mu is the coherency parameter and r the rank of the low rank component. In addition, adaptive sampling algorithms are proposed to address the problem of column/row sampling from structured data. We provide an analysis of the proposed method with adaptive sampling and show that adaptive sampling makes the required number of sampled columns/rows invariant to the distribution of the data. The proposed approach is amenable to online implementation and an online scheme is proposed.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Machine Learning

doi: 10.1109/TSP.2017.2649482

1502.00182

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science (0.88)

Add feedback

Matrix completion with column manipulation: Near-optimal sample-robustness-rank tradeoffs

Chen, Yudong, Xu, Huan, Caramanis, Constantine, Sanghavi, Sujay

arXiv.org Machine LearningApr-24-2016

This paper considers the problem of matrix completion when some number of the columns are completely and arbitrarily corrupted, potentially by a malicious adversary. It is well-known that standard algorithms for matrix completion can return arbitrarily poor results, if even a single column is corrupted. One direct application comes from robust collaborative filtering. Here, some number of users are so-called manipulators who try to skew the predictions of the algorithm by calibrating their inputs to the system. In this paper, we develop an efficient algorithm for this problem based on a combination of a trimming procedure and a convex program that minimizes the nuclear norm and the $\ell_{1,2}$ norm. Our theoretical results show that given a vanishing fraction of observed entries, it is nevertheless possible to complete the underlying matrix even when the number of corrupted columns grows. Significantly, our results hold without any assumptions on the locations or values of the observed entries of the manipulated columns. Moreover, we show by an information-theoretic argument that our guarantees are nearly optimal in terms of the fraction of sampled entries on the authentic columns, the fraction of corrupted columns, and the rank of the underlying matrix. Our results therefore sharply characterize the tradeoffs between sample, robustness and rank in matrix completion.

corrupted column, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1102.2254

Country: North America > United States (0.67)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Communications (1.00)
Information Technology > Data Science > Data Mining (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.48)

Add feedback