Yu, Zhongjie
Characteristic Circuits
Yu, Zhongjie, Trapp, Martin, Kersting, Kristian
In many real-world scenarios, it is crucial to be able to reliably and efficiently reason under uncertainty while capturing complex relationships in data. Probabilistic circuits (PCs), a prominent family of tractable probabilistic models, offer a remedy to this challenge by composing simple, tractable distributions into a high-dimensional probability distribution. However, learning PCs on heterogeneous data is challenging and densities of some parametric distributions are not available in closed form, limiting their potential use. We introduce characteristic circuits (CCs), a family of tractable probabilistic models providing a unified formalization of distributions over heterogeneous data in the spectral domain. The one-to-one relationship between characteristic functions and probability measures enables us to learn high-dimensional distributions on heterogeneous data domains and facilitates efficient probabilistic inference even when no closed-form density function is available. We show that the structure and parameters of CCs can be learned efficiently from the data and find that CCs outperform state-of-the-art density estimators for heterogeneous data domains on common benchmark data sets.
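To make the construction concrete, here is a minimal sketch of how such a circuit can be evaluated in the spectral domain, assuming Gaussian and categorical leaves. The dictionary-based node representation and the toy structure are illustrative choices, not the learned circuits from the paper: leaves return univariate characteristic functions, product nodes multiply characteristic functions over disjoint scopes, and sum nodes form convex combinations.

    import numpy as np

    # Minimal sketch of evaluating a characteristic circuit at a frequency vector t.
    # Leaves return univariate characteristic functions (CFs), product nodes multiply
    # CFs over disjoint scopes, and sum nodes form convex combinations of child CFs.

    def gaussian_cf(t, mu, sigma):
        # CF of N(mu, sigma^2): exp(i*t*mu - 0.5*sigma^2*t^2)
        return np.exp(1j * t * mu - 0.5 * (sigma ** 2) * (t ** 2))

    def categorical_cf(t, probs):
        # CF of a categorical variable on {0, ..., K-1}: sum_k p_k * exp(i*t*k)
        ks = np.arange(len(probs))
        return np.sum(probs * np.exp(1j * t * ks))

    def evaluate(node, t):
        # t maps each variable (scope index) to a real-valued frequency.
        if node["type"] == "leaf":
            return node["cf"](t[node["scope"]])
        if node["type"] == "product":      # independent scopes: CFs multiply
            return np.prod([evaluate(c, t) for c in node["children"]])
        if node["type"] == "sum":          # mixture: convex combination of CFs
            return np.sum([w * evaluate(c, t)
                           for w, c in zip(node["weights"], node["children"])])

    # Toy circuit over a continuous variable X0 and a discrete variable X1.
    circuit = {"type": "sum", "weights": [0.3, 0.7], "children": [
        {"type": "product", "children": [
            {"type": "leaf", "scope": 0, "cf": lambda u: gaussian_cf(u, mu=-1.0, sigma=0.5)},
            {"type": "leaf", "scope": 1, "cf": lambda u: categorical_cf(u, np.array([0.8, 0.2]))}]},
        {"type": "product", "children": [
            {"type": "leaf", "scope": 0, "cf": lambda u: gaussian_cf(u, mu=2.0, sigma=1.0)},
            {"type": "leaf", "scope": 1, "cf": lambda u: categorical_cf(u, np.array([0.1, 0.9]))}]}]}

    print(evaluate(circuit, t=np.array([0.5, -1.2])))   # complex value of the joint CF at t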
Probabilistic Circuits That Know What They Don't Know
Ventola, Fabrizio, Braun, Steven, Yu, Zhongjie, Mundt, Martin, Kersting, Kristian
Probabilistic circuits (PCs) are models that allow exact and tractable probabilistic inference. In contrast to neural networks, they are often assumed to be well-calibrated and robust to out-of-distribution (OOD) data. In this paper, we show that PCs are in fact not robust to OOD data, i.e., they don't know what they don't know. We then show how this challenge can be overcome by model uncertainty quantification. To this end, we propose tractable dropout inference (TDI), an inference procedure to estimate uncertainty by deriving an analytical solution to Monte Carlo dropout (MCD) through variance propagation. Unlike MCD in neural networks, which comes at the cost of multiple network evaluations, TDI provides tractable sampling-free uncertainty estimates in a single forward pass. TDI improves the robustness of PCs to distribution shift and OOD data, demonstrated through a series of experiments evaluating the classification confidence and uncertainty estimates on real-world data.
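As a rough illustration of the idea behind TDI, the following sketch propagates means and variances through sum and product nodes under Bernoulli dropout of sum-node inputs. It assumes independent children, deterministic leaves with zero variance, and computation in linear rather than log space, so it is a simplified stand-in for the paper's exact derivation rather than the TDI algorithm itself.

    import numpy as np

    # Sampling-free moment propagation through a circuit with Bernoulli(q) dropout
    # applied to sum-node inputs, in the spirit of tractable dropout inference (TDI).

    def propagate(node, x, q=0.9):
        """Return (mean, variance) of the node's output under dropout."""
        if node["type"] == "leaf":
            return node["pdf"](x[node["scope"]]), 0.0        # deterministic leaf
        stats = [propagate(c, x, q) for c in node["children"]]
        if node["type"] == "product":
            # E[prod C_k] = prod E[C_k];  E[(prod C_k)^2] = prod E[C_k^2]
            mean = np.prod([m for m, _ in stats])
            second = np.prod([v + m ** 2 for m, v in stats])
            return mean, second - mean ** 2
        if node["type"] == "sum":
            # S = sum_k delta_k * w_k * C_k with delta_k ~ Bernoulli(q)
            mean = q * sum(w * m for w, (m, _) in zip(node["weights"], stats))
            var = sum((w ** 2) * (q * (v + m ** 2) - (q ** 2) * (m ** 2))
                      for w, (m, v) in zip(node["weights"], stats))
            return mean, var

The variance returned at the root can then serve as a single-pass uncertainty estimate, which is the property the experiments in the paper exploit for OOD detection.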
HalluAudio: Hallucinating Frequency as Concepts for Few-Shot Audio Classification
Yu, Zhongjie, Wang, Shuyang, Chen, Lin, Cheng, Zhongwei
Few-shot audio classification is an emerging topic that has attracted increasing attention from the research community. Most existing work ignores the specific structure of the audio spectrogram and largely relies on embedding spaces borrowed from image tasks. In this work, we take advantage of this special audio format and propose a new method that hallucinates the high-frequency and low-frequency parts of the spectrogram as structured concepts. Extensive experiments on ESC-50 and our curated, balanced Kaggle18 dataset show that the proposed method outperforms the baseline by a notable margin. Hallucinating the high-frequency and low-frequency parts in this way also makes the method interpretable and opens up new potential for few-shot audio classification.
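The core idea of treating frequency bands as concepts can be sketched as follows; the split point, the hypothetical `encoder` network, and the prototype-style use of the embeddings are illustrative assumptions rather than the paper's exact implementation.

    import numpy as np

    # Sketch of hallucinating the high- and low-frequency halves of a spectrogram
    # as separate structured concepts.  `encoder` is a hypothetical embedding network.

    def concept_embeddings(spectrogram, encoder, split=None):
        """spectrogram: (n_mels, n_frames) array; returns full/low/high embeddings."""
        n_mels = spectrogram.shape[0]
        split = split if split is not None else n_mels // 2
        low_band = spectrogram[:split]        # low-frequency rows of the spectrogram
        high_band = spectrogram[split:]       # high-frequency rows
        return {
            "full": encoder(spectrogram),
            "low": encoder(low_band),         # low-frequency concept
            "high": encoder(high_band),       # high-frequency concept
        }

    # A prototype-style few-shot classifier could then average each concept over the
    # support set per class and combine the query's distances to all three prototypes.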
Leveraging Probabilistic Circuits for Nonparametric Multi-Output Regression
Yu, Zhongjie, Zhu, Mingye, Trapp, Martin, Skryagin, Arseny, Kersting, Kristian
Inspired by recent advances in the field of expert-based approximations of Gaussian processes (GPs), we present an expert-based approach to large-scale multi-output regression using single-output GP experts. Employing a deeply structured mixture of single-output GPs encoded via a probabilistic circuit allows us to capture correlations between multiple output dimensions accurately. By recursively partitioning the covariate space and the output space, posterior inference in our model reduces to inference on the single-output GP experts.
Exact posterior inference in GPs scales cubically with the number of observations, thus limiting their use to moderately sized data sets. To enable posterior inference in GPs on large-scale problems, recent work (see, e.g., Liu et al. [2020] for a detailed review) mainly resorts to global approximations of the posterior, e.g., using inducing points, or to local approximations that aim to distribute the computation of the posterior onto local experts. Unfortunately, most of these approaches focus only on single-output regression, i.e., the dependent variable is univariate, and, in the case of local approximations, do not easily extend to multi-output regression tasks; see Bruinsma et al. [2020] for a detailed discussion of recent techniques for multi-output GPs.
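A minimal structural sketch of such a deeply structured mixture of single-output GP experts is given below; the median split rule, the fixed recursion depth, and the use of scikit-learn GPs are illustrative assumptions, not the construction or learning procedure from the paper.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    # Sketch: sum nodes recursively partition the covariate space, product nodes
    # factorise over output dimensions, and leaves are single-output GP experts
    # fitted on the local subset of the data.

    def build(X, Y, depth=2):
        """X: (n, d) covariates, Y: (n, m) outputs.  Returns a nested expert structure."""
        j = np.random.randint(X.shape[1])
        mask = X[:, j] <= np.median(X[:, j])
        if depth == 0 or mask.all() or not mask.any():
            # Product node: one single-output GP expert per output dimension.
            experts = [GaussianProcessRegressor().fit(X, Y[:, k]) for k in range(Y.shape[1])]
            return {"type": "product", "experts": experts}
        # Sum node: split the covariate space at the median of a random covariate.
        children = [build(X[mask], Y[mask], depth - 1), build(X[~mask], Y[~mask], depth - 1)]
        return {"type": "sum", "weights": [mask.mean(), 1.0 - mask.mean()], "children": children}

    def predict(node, x):
        """Posterior mean of all output dimensions at a single covariate x of shape (d,)."""
        if node["type"] == "product":
            return np.array([gp.predict(x[None])[0] for gp in node["experts"]])
        return sum(w * predict(c, x) for w, c in zip(node["weights"], node["children"]))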
RECOWNs: Probabilistic Circuits for Trustworthy Time Series Forecasting
Thoma, Nils, Yu, Zhongjie, Ventola, Fabrizio, Kersting, Kristian
Time series forecasting is a relevant task performed in several real-world scenarios, such as product sales analysis and prediction of energy demand. Given their predictive accuracy, Recurrent Neural Networks (RNNs) are currently the models of choice for this task. Despite their success in time series forecasting, less attention has been paid to making RNNs trustworthy. For example, RNNs cannot naturally attach an uncertainty measure to their predictions, even though such a measure would be extremely useful in practice, e.g., to detect when a prediction might be completely wrong due to an unusual pattern in the time series. Whittle Sum-Product Networks (WSPNs), prominent deep tractable probabilistic circuits (PCs) for time series, can assist an RNN by providing meaningful probabilities as an uncertainty measure. To this end, we propose RECOWN, a novel architecture that combines RNNs with a discriminative variant of WSPNs called Conditional WSPNs (CWSPNs). We also formulate a Log-Likelihood Ratio Score as a better estimate of uncertainty that is tailored to time series and Whittle likelihoods. In our experiments, we show that RECOWNs are accurate and trustworthy time series predictors, able to "know when they do not know".
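To give a flavour of the Whittle likelihood underlying WSPN leaves, the sketch below approximates a time-series log-likelihood in the frequency domain via the periodogram; the AR(1) spectral density and the simple ratio score mentioned at the end are illustrative assumptions, not the paper's exact formulation.

    import numpy as np

    # Whittle approximation: the time-domain likelihood is replaced by a sum over
    # Fourier frequencies of log spectral density plus periodogram / spectral density.

    def periodogram(x):
        n = len(x)
        freqs = np.fft.rfftfreq(n)[1:]                  # drop the zero frequency
        I = (np.abs(np.fft.rfft(x)[1:]) ** 2) / n       # periodogram ordinates
        return freqs, I

    def ar1_spectral_density(freqs, phi, sigma2):
        # Spectral density of an AR(1) process x_t = phi * x_{t-1} + eps_t.
        return sigma2 / (1.0 - 2.0 * phi * np.cos(2.0 * np.pi * freqs) + phi ** 2)

    def whittle_loglik(x, phi=0.8, sigma2=1.0):
        freqs, I = periodogram(x)
        S = ar1_spectral_density(freqs, phi, sigma2)
        return -np.sum(np.log(S) + I / S)

    # A log-likelihood ratio style score could then contrast the likelihood of a
    # forecast with that of the observed continuation, e.g.
    # whittle_loglik(forecast) - whittle_loglik(observed); scores far from zero
    # flag predictions that should not be trusted.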
Looking back to lower-level information in few-shot learning
Yu, Zhongjie, Raschka, Sebastian
Humans are capable of learning new concepts from small numbers of examples. In contrast, supervised deep learning models usually lack the ability to extract reliable predictive rules from limited-data scenarios when attempting to classify new examples. This challenging setting is commonly known as few-shot learning. Few-shot learning has garnered increasing attention in recent years due to its significance for many real-world problems. Recently, methods relying on meta-learning paradigms combined with graph-based structures, which model the relationships between examples, have shown promising results on a variety of few-shot classification tasks. However, existing work on few-shot learning focuses only on the feature embeddings produced by the last layer of the neural network. In this work, we propose to utilize lower-level, supporting information, namely the feature embeddings of the hidden neural network layers, to improve classifier accuracy. Based on a graph-based meta-learning framework, we develop a method called Looking-Back, in which such lower-level information is used to construct additional graphs for label propagation in limited-data settings. Our experiments on two popular few-shot learning datasets, miniImageNet and tieredImageNet, show that our method can exploit the lower-level information in the network to improve state-of-the-art classification performance.
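The idea of reusing hidden-layer embeddings can be illustrated with classical label propagation (Zhou et al., 2004) on one graph per embedding level, as sketched below; the Gaussian affinity, the closed-form propagation, and the simple averaging over layers are illustrative assumptions rather than the paper's exact graph-based meta-learning framework.

    import numpy as np

    # Build a graph from one set of embeddings, propagate support labels on it,
    # and combine the query scores obtained from several network layers.

    def propagate(emb, y_support, n_classes, alpha=0.5, sigma=1.0):
        """emb: (n, d) embeddings with support examples first; y_support: int labels."""
        d2 = ((emb[:, None, :] - emb[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (2 * sigma ** 2))
        np.fill_diagonal(W, 0.0)
        D = np.diag(1.0 / np.sqrt(W.sum(1)))
        S = D @ W @ D                                      # normalised affinity matrix
        Y = np.zeros((len(emb), n_classes))
        Y[np.arange(len(y_support)), y_support] = 1.0      # one-hot labels for the support set
        return np.linalg.solve(np.eye(len(emb)) - alpha * S, Y)   # F* = (I - alpha*S)^-1 Y

    def looking_back_predict(embeddings_per_layer, y_support, n_classes):
        # Average the propagated scores from the last layer and from lower layers.
        scores = [propagate(e, y_support, n_classes) for e in embeddings_per_layer]
        return np.mean(scores, axis=0).argmax(1)[len(y_support):]   # query predictions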