Z Advanced Computing, Inc. (ZAC) of Potomac, MD announced on August 27 that it is funded by the US Air Force, to use ZAC's detailed 3D image recognition technology, based on Explainable-AI, for drones (unmanned aerial vehicle or UAV) for aerial image/object recognition. ZAC is the first to demonstrate Explainable-AI, where various attributes and details of 3D (three dimensional) objects can be recognized from any view or angle. "With our superior approach, complex 3D objects can be recognized from any direction, using only a small number of training samples," said Dr. Saied Tadayon, CTO of ZAC. "For complex tasks, such as drone vision, you need ZAC's superior technology to handle detailed 3D image recognition." "You cannot do this with the other techniques, such as Deep Convolutional Neural Networks, even with an extremely large number of training samples. That's basically hitting the limits of the CNNs," continued Dr. Bijan Tadayon, CEO of ZAC.

Zanella, Giacomo, Betancourt, Brenda, Wallach, Hanna, Miller, Jeffrey, Zaidi, Abbas, Steorts, Rebecca C.

Most generative models for clustering implicitly assume that the number of data points in each cluster grows linearly with the total number of data points. Finite mixture models, Dirichlet process mixture models, and Pitman--Yor process mixture models make this assumption, as do all other infinitely exchangeable clustering models. However, for some applications, this assumption is inappropriate. For example, when performing entity resolution, the size of each cluster should be unrelated to the size of the data set, and each cluster should contain a negligible fraction of the total number of data points. These applications require models that yield clusters whose sizes grow sublinearly with the size of the data set. We address this requirement by defining the microclustering property and introducing a new class of models that can exhibit this property. We compare models within this class to two commonly used clustering models using four entity-resolution data sets.

Herbrich, Ralf, Graepel, Thore

We present a bound on the generalisation error of linear classifiers in terms of a refined margin quantity on the training set. The result is obtained in a PAC-Bayesian framework and is based on geometrical arguments in the space of linear classifiers. The new bound constitutes an exponential improvement of the so far tightest margin bound by Shawe-Taylor et al. [8] and scales logarithmically in the inverse margin. Even in the case of less training examples than input dimensions sufficiently large margins lead to nontrivial bound values and - for maximum margins - to a vanishing complexity term.Furthermore, the classical margin is too coarse a measure for the essential quantity that controls the generalisation error: the volume ratio between the whole hypothesis space and the subset of consistent hypotheses. The practical relevance of the result lies in the fact that the well-known support vector machine is optimal w.r.t. the new bound only if the feature vectors are all of the same length. As a consequence we recommend to use SVMs on normalised feature vectors only - a recommendation that is well supported by our numerical experiments on two benchmark data sets. 1 Introduction Linear classifiers are exceedingly popular in the machine learning community due to their straightforward applicability and high flexibility which has recently been boosted by the so-called kernel methods [13]. A natural and popular framework for the theoretical analysis of classifiers is the PAC (probably approximately correct) framework[11] which is closely related to Vapnik's work on the generalisation error [12]. For binary classifiers it turned out that the growth function is an appropriate measureof "complexity" and can tightly be upper bounded by the VC (Vapnik-Chervonenkis) dimension [14].

Betancourt, Brenda, Zanella, Giacomo, Miller, Jeffrey W., Wallach, Hanna, Zaidi, Abbas, Steorts, Rebecca C.

Most generative models for clustering implicitly assume that the number of data points in each cluster grows linearly with the total number of data points. Finite mixture models, Dirichlet process mixture models, and Pitman-Yor process mixture models make this assumption, as do all other infinitely exchangeable clustering models. However, for some applications, this assumption is inappropriate. For example, when performing entity resolution, the size of each cluster should be unrelated to the size of the data set, and each cluster should contain a negligible fraction of the total number of data points. These applications require models that yield clusters whose sizes grow sublinearly with the size of the data set. We address this requirement by defining the microclustering property and introducing a new class of models that can exhibit this property. We compare models within this class to two commonly used clustering models using four entity-resolution data sets.

We extend the Chow-Liu algorithm for general random variables while the previous versions only considered finite cases. In particular, this paper applies the generalization to Suzuki's learning algorithm that generates from data forests rather than trees based on the minimum description length by balancing the fitness of the data to the forest and the simplicity of the forest. As a result, we successfully obtain an algorithm when both of the Gaussian and finite random variables are present.