Directed Networks
On Controlling the Size of Clusters in Probabilistic Clustering
Jitta, Aditya (University of Helsinki) | Klami, Arto (University of Helsinki)
Classical model-based partitional clustering algorithms, such ask-means or mixture of Gaussians, provide only loose and indirect control over the size of the resulting clusters. In this work, we present a family of probabilistic clustering models that can be steered towards clusters of desired size by providing a prior distribution over the possible sizes, allowing the analyst to fine-tune exploratory analysis or to produce clusters of suitable size for future down-stream processing.Our formulation supports arbitrary multimodal prior distributions, generalizing the previous work on clustering algorithms searching for clusters of equal size or algorithms designed for the microclustering task of finding small clusters. We provide practical methods for solving the problem, using integer programming for making the cluster assignments, and demonstrate that we can also automatically infer the number of clusters.
Variational Probability Flow for Biologically Plausible Training of Deep Neural Networks
Liu, Zuozhu (Singapore University of Technology and Design) | Quek, Tony Q. S. (Singapore University of Technology and Design) | Lin, Shaowei (Singapore University of Technology and Design)
The quest for biologically plausible deep learning is driven, not just by the desire to explain experimentally-observed properties of biological neural networks, but also by the hope of discovering more efficient methods for training artificial networks. In this paper, we propose a new algorithm named Variational Probably Flow (VPF), an extension of minimum probability flow for training binary Deep Boltzmann Machines (DBMs). We show that weight updates in VPF are local, depending only on the states and firing rates of the adjacent neurons. Unlike contrastive divergence, there is no need for Gibbs confabulations; and unlike backpropagation, alternating feedforward and feedback phases are not required. Moreover, the learning algorithm is effective for training DBMs with intra-layer connections between the hidden nodes. Experiments with MNIST and Fashion MNIST demonstrate that VPF learns reasonable features quickly, reconstructs corrupted images more accurately, and generates samples with a high estimated log-likelihood. Lastly, we note that, interestingly, if an asymmetric version of VPF exists, the weight updates directly explain experimental results in Spike-Timing-Dependent Plasticity (STDP).
Bayesian Network Structure Learning: The Two-Step Clustering-Based Algorithm
Zhang, Yikun (Sun Yat-sen University) | Liu, Jiming (Hong Kong Baptist University) | Liu, Yang (Hong Kong Baptist University)
In this paper we introduce a two-step clustering-based strategy, which can automatically generate prior information from data in order to further improve the accuracy and time efficiency of state-of-the-art algorithms for Bayesian network structure learning. Our clustering-based strategy is composed of two steps. In the first step, we divide the potential nodes into several groups via clustering analysis and apply Bayesian network structure learning to obtain some pre-existing arcs within each cluster. In the second step, with all the within-cluster arcs being well preserved, we learn the between-cluster structure of the given network. Experimental results on benchmark datasets show that a wide range of structure learning algorithms benefit from the proposed clustering-based strategy in terms of both accuracy and efficiency.
Constructing Hierarchical Bayesian Networks With Pooling
Nishino, Kaneharu (The University of Tokyo) | Inaba, Mary (The University of Tokyo)
Inspired by the Bayesian brain hypothesis and deep learning, we develop a Bayesian autoencoder, a method of constructing recognition systems using a Bayesian network. We construct hierarchical Bayesian networks based on feature extraction and implement pooling to achieve invariance within a Bayesian network framework. The constructed networks propagate information bidirectionally between layers. We expect they will be able to achieve brain-like recognition using local features and global information such as their environments.
Investigating the Role of Ensemble Learning in High-Value Wine Identification
Portinale, Luigi (Computer Science Institute,ย University of Piemonte Orientale) | Locatelli, Monica (University of Piemonte Orientale)
We tackle the problem of authenticating high value Italian wines through machine learning classification. The problem is a seriuos one, since protection of high quality wines from forgeries is worth several million of Euros each year. In a previous work we have identified some base models (in particular classifiers based on Bayesian network (BNC), multi-layer perceptron (MLP) and sequential minimal optimization (SMO)) that well behave using unexpensive chemical analyses of the interested wines. In the present paper, we investigate the role of esemble learning in the construction of more robust classifiers; results suggest that, while bagging and boosting may significantly improve both BNC and MLP, the SMO model is already very robust and efficient as a base learner. We report on results concerning both cross validation on two different datasets, as well as experiments with models trained with the above datasets and tested with a dataset of potentially fake wines; this has been synthesized from a generative probabilistic model learned from real samples and expert knowledge.
Face Sketch Synthesis From Coarse to Fine
Zhang, Mingjin (Xidian University) | Wang, Nannan (Xidian University) | Li, Yunsong (Xidian University) | Wang, Ruxin (Yunnan Union Vision Innovations Technology Co., Ltd) | Gao, Xinbo (Xidian University)
Synthesizing fine face sketches from photos is a valuable yet challenging problem in digital entertainment. Face sketches synthesized by conventional methods usually exhibit coarse structures of faces, whereas fine details are lost especially on some critical facial components. In this paper, by imitating the coarse-to-fine drawing process of artists, we propose a novel face sketch synthesis framework consisting of a coarse stage and a fine stage. In the coarse stage, a mapping relationship between face photos and sketches is learned via the convolutional neural network. It ensures that the synthesized sketches keep coarse structures of faces. Given the test photo and the coarse synthesized sketch, a probabilistic graphic model is designed to synthesize the delicate face sketch which has fine and critical details. Experimental results on public face sketch databases illustrate that our proposed framework outperforms the state-of-the-art methods in both quantitive and visual comparisons.
Semi-Supervised Bayesian Attribute Learning for Person Re-Identification
Liu, Wenhe (University of Technology, Sydney) | Chang, Xiaojun (Carnegie Mellon University) | Chen, Ling (University of Technology, Sydney) | Yang, Yi (University of Technology, Sydney)
Person re-identification (re-ID) tasks aim to identify the same person in multiple images captured from non-overlapping camera views. Most previous re-ID studies have attempted to solve this problem through either representation learning or metric learning, or by combining both techniques. Representation learning relies on the latent factors or attributes of the data. In most of these works, the dimensionality of the factors/attributes has to be manually determined for each new dataset. Thus, this approach is not robust. Metric learning optimizes a metric across the dataset to measure similarity according to distance. However, choosing the optimal method for computing these distances is data dependent, and learning the appropriate metric relies on a sufficient number of pair-wise labels. To overcome these limitations, we propose a novel algorithm for person re-ID, called semi-supervised Bayesian attribute learning. We introduce an Indian Buffet Process to identify the priors of the latent attributes. The dimensionality of attributes factors is then automatically determined by nonparametric Bayesian learning. Meanwhile, unlike traditional distance metric learning, we propose a re-identification probability distribution to describe how likely it is that a pair of images contains the same person. This technique relies solely on the latent attributes of both images. Moreover, pair-wise labels that are not known can be estimated from pair-wise labels that are known, making this a robust approach for semi-supervised learning. Extensive experiments demonstrate the superior performance of our algorithm over several state-of-the-art algorithms on small-scale datasets and comparable performance on large-scale re-ID datasets.
Asymmetric Joint Learning for Heterogeneous Face Recognition
Cao, Bing (Xidian University) | Wang, Nannan (Xidian University) | Gao, Xinbo (Xidian University) | Li, Jie (Xidian University)
Heterogeneous face recognition (HFR) refers to matching a probe face image taken from one modality to face images acquired from another modality. It plays an important role in security scenarios. However, HFR is still a challenging problem due to great discrepancies between cross-modality images. This paper proposes an asymmetric joint learning (AJL) approach to handle this issue. The proposed method transforms the cross-modality differences mutually by incorporating the synthesized images into the learning process which provides more discriminative information. Although the aggregated data would augment the scale of intra-classes, it also reduces the diversity (i.e. discriminative information) for inter-classes. Then, we develop the AJL model to balance this dilemma. Finally, we could obtain the similarity score between two heterogeneous face images through the log-likelihood ratio. Extensive experiments on viewed sketch database, forensic sketch database and near infrared image database illustrate that the proposed AJL-HFR method achieve superior performance in comparison to state-of-the-art methods.
Towards Training Probabilistic Topic Models on Neuromorphic Multi-Chip Systems
Xiao, Zihao (Tsinghua University) | Chen, Jianfei (Tsinghua University) | Zhu, Jun (Tsinghua University)
Probabilistic topic models are popular unsupervised learning methods, including probabilistic latent semantic indexing (pLSI) and latent Dirichlet allocation (LDA). By now, their training is implemented on general purpose computers (GPCs), which are flexible in programming but energy-consuming. Towards low-energy implementations, this paper investigates their training on an emerging hardware technology called the neuromorphic multi-chip systems (NMSs). NMSs are very effective for a family of algorithms called spiking neural networks (SNNs). We present three SNNs to train topic models.The first SNN is a batch algorithm combining the conventional collapsed Gibbs sampling (CGS) algorithm and an inference SNN to train LDA. The other two SNNs are online algorithms targeting at both energy- and storage-limited environments. The two online algorithms are equivalent with training LDA by using maximum-a-posterior estimation and maximizing the semi-collapsed likelihood, respectively.They use novel, tailored ordinary differential equations for stochastic optimization. We simulate the new algorithms and show that they are comparable with the GPC algorithms, while being suitable for NMS implementation. We also propose an extension to train pLSI and a method to prune the network to obey the limited fan-in of some NMSs.
Conditional PSDDs: Modeling and Learning With Modular Knowledge
Shen, Yujia (University of California, Los Angeles) | Choi, Arthur (University of California, Los Angeles) | Darwiche, Adnan (University of California, Los Angeles)
Probabilistic Sentential Decision Diagrams (PSDDs) have been proposed for learning tractable probability distributions from a combination of data and background knowledge (in the form of Boolean constraints). In this paper, we propose a variant on PSDDs, called conditional PSDDs, for representing a family of distributions that are conditioned on the same set of variables. Conditional PSDDs can also be learned from a combination of data and (modular) background knowledge. We use conditional PSDDs to define a more structured version of Bayesian networks, in which nodes can have an exponential number of states, hence expanding the scope of domains where Bayesian networks can be applied. Compared to classical PSDDs, the new representation exploits the independencies captured by a Bayesian network to decompose the learning process into localized learning tasks, which enables the learning of better models while using less computation. We illustrate the promise of conditional PSDDs and structured Bayesian networks empirically, and by providing a case study to the modeling of distributions over routes on a map.