noise probability
The Trustworthy Pal: Controlling the False Discovery Rate in Boolean Matrix Factorization
Hess, Sibylle, Piatkowski, Nico, Morik, Katharina
Boolean matrix factorization (BMF) is a popular and powerful technique for inferring knowledge from data. The mining result is the Boolean product of two matrices, approximating the input dataset. The Boolean product is a disjunction of rank-1 binary matrices, each describing a feature-relation, called pattern, for a group of samples. Yet, there are no guarantees that any of the returned patterns do not actually arise from noise, i.e., are false discoveries. In this paper, we propose and discuss the usage of the false discovery rate in the unsupervised BMF setting. We prove two bounds on the probability that a found pattern is constituted of random Bernoulli-distributed noise. Each bound exploits a specific property of the factorization which minimizes the approximation error---yielding new insights on the minimizers of Boolean matrix factorization. This leads to improved BMF algorithms by replacing heuristic rank selection techniques with a theoretically well-based approach. Our empirical demonstration shows that both bounds deliver excellent results in various practical settings.
Fast Rates for a kNN Classifier Robust to Unknown Asymmetric Label Noise
Reeve, Henry W. J., Kaban, Ata
We consider classification in the presence of class-dependent asymmetric label noise with unknown noise probabilities. In this setting, identifiability conditions are known, but additional assumptions were shown to be required for finite sample rates, and so far only the parametric rate has been obtained. Assuming these identifiability conditions, together with a measure-smoothness condition on the regression function and Tsybakov's margin condition, we show that the Robust kNN classifier of Gao et al. attains, the minimax optimal rates of the noise-free setting, up to a log factor, even when trained on data with unknown asymmetric label noise. Hence, our results provide a solid theoretical backing for this empirically successful algorithm. By contrast the standard kNN is not even consistent in the setting of asymmetric label noise. A key idea in our analysis is a simple kNN based method for estimating the maximum of a function that requires far less assumptions than existing mode estimators do, and which may be of independent interest for noise proportion estimation and randomised optimisation problems.
- North America > United States (0.28)
- Europe > United Kingdom > England (0.28)
Classification with unknown class conditional label noise on non-compact feature spaces
We investigate the problem of classification in the presence of unknown class conditional label noise in which the labels observed by the learner have been corrupted with some unknown class dependent probability. In order to obtain finite sample rates, previous approaches to classification with unknown class conditional label noise have required that the regression function attains its extrema uniformly on sets of positive measure. We shall consider this problem in the setting of non-compact metric spaces, where the regression function need not attain its extrema. In this setting we determine the minimax optimal learning rates (up to logarithmic factors). The rate displays interesting threshold behaviour: When the regression function approaches its extrema at a sufficient rate, the optimal learning rates are of the same order as those obtained in the label-noise free setting. If the regression function approaches its extrema more gradually then classification performance necessarily degrades. In addition, we present an algorithm which attains these rates without prior knowledge of either the distributional parameters or the local density. This identifies for the first time a scenario in which finite sample rates are achievable in the label noise setting, but they differ from the optimal rates without label noise.
Negation Without Negation in Probabilistic Logic Programming
Buchman, David (University of British Columbia) | Poole, David (University of British Columbia)
Probabilistic logic programs without negation can have cycles (with a preference for false), but cannot represent all conditional distributions. Probabilistic logic programs with negation can represent arbitrary conditional probabilities, but with cycles they create logical inconsistencies. We show how allowing negative noise probabilities allows us to represent arbitrary conditional probabilities without negations. Noise probabilities for non-exclusive rules are difficult to interpret and unintuitive to manipulate; to alleviate this we define ``probability-strengths'' which provide an intuitive additive algebra for combining rules. For acyclic programs we prove what constraints on the strengths allow for proper distributions on the non-noise variables and allow for all non-extreme distributions to be represented. We show how arbitrary CPDs can be converted into this form in a canonical way. Furthermore, if a joint distribution can be compactly represented by a cyclic program with negations, we show how it can also be compactly represented with negative noise probabilities and no negations. This allows algorithms for exact inference that do not support negations to be applicable to probabilistic logic programs with negations.
- North America > United States (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)