AITopics | bayes-net

Subadditivity of Probability Divergences on Bayes-Nets with Applications to Time Series GANs

Ding, Mucong, Daskalakis, Constantinos, Feizi, Soheil

arXiv.org Machine Learning, Mar-1-2020

GANs for time series data often use sliding windows or self-attention to capture underlying time dependencies. While these techniques have no clear theoretical justification, they are successful in significantly reducing the discriminator size, speeding up the training process, and improving the generation quality. In this paper, we provide both theoretical foundations and a practical framework of GANs for high-dimensional distributions with conditional independence structure captured by a Bayesian network, such as time series data. We prove that several probability divergences satisfy subadditivity properties with respect to the neighborhoods of the Bayes-net graph, providing an upper bound on the distance between two Bayes-nets by the sum of (local) distances between their marginals on every neighborhood of the graph. This leads to our proposed Subadditive GAN framework that uses a set of simple discriminators on the neighborhoods of the Bayes-net, rather than a giant discriminator on the entire network, providing significant statistical and computational benefits. We show that several probability distances, including Jensen-Shannon, Total Variation, and Wasserstein, satisfy subadditivity or generalized subadditivity. Moreover, we prove that Integral Probability Metrics (IPMs), which encompass commonly-used loss functions in GANs, also enjoy a notion of subadditivity under some mild conditions. Furthermore, we prove that nearly all f-divergences satisfy local subadditivity, meaning that subadditivity holds when the distributions are relatively close. Our experiments on synthetic as well as real-world datasets verify the proposed theory and the benefits of subadditive GANs.
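To make the link between Bayes-net neighborhoods and sliding windows concrete, here is a minimal sketch. It assumes, purely for illustration, that the time series is modeled as a Markov chain of fixed order, so each time step's parents are the preceding `order` steps. The helper names `markov_parents` and `neighborhoods` are hypothetical and are not taken from the paper, whose actual architecture is not described in the abstract.

```python
def markov_parents(n_steps, order):
    """Parents of each time step t in an order-`order` Markov chain (an assumed model)."""
    return {t: tuple(range(max(0, t - order), t)) for t in range(n_steps)}


def neighborhoods(parents):
    """Neighborhood of each node v: its parents together with v itself."""
    return [list(ps) + [v] for v, ps in sorted(parents.items())]


# Each neighborhood is a window of at most order + 1 consecutive time steps.
windows = neighborhoods(markov_parents(n_steps=5, order=2))
print(windows)  # [[0], [0, 1], [0, 1, 2], [1, 2, 3], [2, 3, 4]]
```

Under this assumed structure, a local discriminator per neighborhood would only ever see a short window of the series, which is one way to read the abstract's remark that sliding windows reduce discriminator size.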

bayes-net, probability divergence, subadditivity

arXiv:2003.00652

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)


Square Hellinger Subadditivity for Bayesian Networks and its Applications to Identity Testing

Daskalakis, Constantinos, Pan, Qinxuan

arXiv.org Machine Learning, Dec-9-2016

We show that the square Hellinger distance between two Bayesian networks on the same directed graph, $G$, is subadditive with respect to the neighborhoods of $G$. Namely, if $P$ and $Q$ are the probability distributions defined by two Bayesian networks on the same DAG, our inequality states that the square Hellinger distance, $H^2(P,Q)$, between $P$ and $Q$ is upper bounded by the sum, $\sum_v H^2(P_{\{v\} \cup \Pi_v}, Q_{\{v\} \cup \Pi_v})$, of the square Hellinger distances between the marginals of $P$ and $Q$ on every node $v$ and its parents $\Pi_v$ in the DAG. Importantly, our bound does not involve the conditionals but the marginals of $P$ and $Q$. We derive a similar inequality for more general Markov Random Fields. As an application of our inequality, we show that distinguishing whether two Bayesian networks $P$ and $Q$ on the same (but potentially unknown) DAG satisfy $P=Q$ vs $d_{\rm TV}(P,Q)>\epsilon$ can be performed from $\tilde{O}(|\Sigma|^{3/4(d+1)} \cdot n/\epsilon^2)$ samples, where $d$ is the maximum in-degree of the DAG and $\Sigma$ the domain of each variable of the Bayesian networks. If $P$ and $Q$ are defined on potentially different and potentially unknown trees, the sample complexity becomes $\tilde{O}(|\Sigma|^{4.5} n/\epsilon^2)$, whose dependence on $n, \epsilon$ is optimal up to logarithmic factors. Lastly, if $P$ and $Q$ are product distributions over $\{0,1\}^n$ and $Q$ is known, the sample complexity becomes $O(\sqrt{n}/\epsilon^2)$, which is optimal up to constant factors.
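The subadditivity inequality stated in the abstract can be checked numerically on a toy example. The sketch below is an illustration only, not the authors' code: it assumes a hypothetical three-node binary chain 0 -> 1 -> 2 with randomly drawn conditional probability tables, and uses the normalization $H^2(P,Q) = 1 - \sum_x \sqrt{P(x)Q(x)}$ (the inequality is unaffected by a constant rescaling of $H^2$).

```python
import itertools
import math
import random

# Hypothetical DAG for illustration: a binary chain 0 -> 1 -> 2.
PARENTS = {0: (), 1: (0,), 2: (1,)}


def random_cpts(rng):
    """For each node, map every parent assignment to P(node = 1 | parents)."""
    return {
        v: {pa: rng.uniform(0.05, 0.95)
            for pa in itertools.product((0, 1), repeat=len(ps))}
        for v, ps in PARENTS.items()
    }


def joint(cpts):
    """Joint distribution of the Bayes net as a dict from assignment to probability."""
    dist = {}
    for x in itertools.product((0, 1), repeat=len(PARENTS)):
        p = 1.0
        for v, ps in PARENTS.items():
            p1 = cpts[v][tuple(x[u] for u in ps)]
            p *= p1 if x[v] == 1 else 1.0 - p1
        dist[x] = p
    return dist


def marginal(dist, nodes):
    """Marginal of `dist` on the given tuple of nodes."""
    out = {}
    for x, p in dist.items():
        key = tuple(x[u] for u in nodes)
        out[key] = out.get(key, 0.0) + p
    return out


def sq_hellinger(p, q):
    """Squared Hellinger distance, normalized as 1 - sum_x sqrt(p(x) q(x))."""
    return 1.0 - sum(math.sqrt(p[x] * q[x]) for x in p)


def check(seed):
    """Return (H^2(P, Q), sum over v of H^2 between marginals on {v} and its parents)."""
    rng = random.Random(seed)
    P, Q = joint(random_cpts(rng)), joint(random_cpts(rng))
    lhs = sq_hellinger(P, Q)
    rhs = sum(
        sq_hellinger(marginal(P, (v,) + ps), marginal(Q, (v,) + ps))
        for v, ps in PARENTS.items()
    )
    return lhs, rhs


lhs, rhs = check(seed=0)
print(f"H^2(P,Q) = {lhs:.4f} <= sum of local H^2 = {rhs:.4f}: {lhs <= rhs}")
```

Note that the right-hand side uses only marginals on each node together with its parents, never the conditionals, which matches the point the abstract emphasizes. A single random draw is of course not a proof; it only illustrates what the bound says.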

artificial intelligence, hellinger distance, machine learning, (18 more...)

arXiv:1612.03164

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
