It is known that exact computation of conditional probabilities in belief networks is NP-hard. Many investigators in the AI community have tacitly assumed that algorithms for performing approximate inference with belief networks are of polynomial complexity. Indeed, special cases of approximate inference can be performed in time polynomial in the input size. However, we have discovered that the general problem of approximating conditional probabilities with belief networks, like exact inference, resides in the NP-hard complexity class. We develop a complexity analysis to elucidate the difficulty of approximate probabilistic inference.
Homomorphic Encryption (HE) is one of the most promising security solutions to emerging Machine Learning as a Service (MLaaS). Several Leveled-HE (LHE)-enabled Convolutional Neural Networks (LHECNNs) are proposed to implement MLaaS to avoid the large bootstrapping overhead. However, prior LHECNNs have to pay significant computational overhead but achieve only low inference accuracy, due to their polynomial approximation activations and poolings. Stacking many polynomial approximation activation layers in a network greatly reduces the inference accuracy, since the polynomial approximation activation errors lead to a low distortion of the output distribution of the next batch normalization layer. So the polynomial approximation activations and poolings have become the obstacle to a fast and accurate LHECNN model.
The fundamental idea of Bayesian inference is to become "less wrong" with more data. The process is straightforward: we have an initial belief, known as a prior, which we update as we gain additional information. Although we don't think about it as Bayesian Inference, we use this technique all the time. For example, we might initially think there is a 50% chance we will get a promotion at the end of the quarter. If we receive positive feedback from our manager, we adjust our estimate upwards, and conversely, we might decrease the probability if we make a mess with the coffee machine.
Learning, taking into account full distribution of the data, referred to as generative, is not feasible with deep neural networks (DNNs) because they model only the conditional distribution of the outputs given the inputs. Current solutions are either based on joint probability models facing difficult estimation problems or learn two separate networks, mapping inputs to outputs (recognition) and vice-versa (generation). We propose an intermediate approach. First, we show that forward computation in DNNs with logistic sigmoid activations corresponds to a simplified approximate Bayesian inference in a directed probabilistic multi-layer model. This connection allows to interpret DNN as a probabilistic model of the output and all hidden units given the input. Second, we propose that in order for the recognition and generation networks to be more consistent with the joint model of the data, weights of the recognition and generator network should be related by transposition. We demonstrate in a tentative experiment that such a coupled pair can be learned generatively, modelling the full distribution of the data, and has enough capacity to perform well in both recognition and generation.
We define a new notion of conditional belief, which plays the same role for Dempster-Shafer belief functions as conditional probability does for probability functions. Our definition is different from the standard definition given by Dempster, and avoids many of the well-known problems of that definition. Just as the conditional probability Pr (lB) is a probability function which is the result of conditioning on B being true, so too our conditional belief function Bel (lB) is a belief function which is the result of conditioning on B being true. We define the conditional belief as the lower envelope (that is, the inf) of a family of conditional probability functions, and provide a closed form expression for it. An alternate way of understanding our definition of conditional belief is provided by considering ideas from an earlier paper [Fagin and Halpern, 1989], where we connect belief functions with inner measures. In particular, we show here how to extend the definition of conditional probability to non measurable sets, in order to get notions of inner and outer conditional probabilities, which can be viewed as best approximations to the true conditional probability, given our lack of information. Our definition of conditional belief turns out to be an exact analogue of our definition of inner conditional probability.