Haffner, Patrick
The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-Language Models
Wu, Chenwei, Li, Li Erran, Ermon, Stefano, Haffner, Patrick, Ge, Rong, Zhang, Zaiwei
Compositionality is a common property of many modalities, including natural language and images, but the compositional generalization of multi-modal models is not well understood. In this paper, we identify two sources of visual-linguistic compositionality: linguistic priors and the interplay between images and texts. We show that current attempts to improve compositional generalization rely on linguistic priors rather than on information in the image. We also propose a new metric for compositionality without such linguistic priors.
Rapid Connectionist Speaker Adaptation
Witbrock, Michael, Haffner, Patrick
We present SVCnet, a system for modelling speaker variability. Encoder neural networks specialized for each speech sound produce low-dimensionality models of acoustical variation, and these models are further combined into an overall model of voice variability. A training procedure is described which minimizes the dependence of this model on which sounds have been uttered. Using the trained model (SVCnet) and a brief, unconstrained sample of a new speaker's voice, the system produces a Speaker Voice Code that can be used to adapt a recognition system to the new speaker without retraining. A system which combines SVCnet with an MS-TDNN recognizer is also described.
Learning to Adapt by Minimizing Discrepancy
Ororbia, Alexander G. II, Haffner, Patrick, Reitter, David, Giles, C. Lee
We explore whether useful temporal neural generative models can be learned from sequential data without back-propagation through time. We investigate the viability of a more neurocognitively-grounded approach in the context of unsupervised generative modeling of sequences. Specifically, we build on the concept of predictive coding, which has gained influence in cognitive science, in a neural framework. To do so, we develop a novel architecture, the Temporal Neural Coding Network, and its learning algorithm, Discrepancy Reduction. The underlying directed generative model is fully recurrent, meaning that it employs structural feedback connections and temporal feedback connections, yielding information propagation cycles that create local learning signals. This facilitates a unified bottom-up and top-down approach for information transfer inside the architecture. Our proposed algorithm shows promise on the bouncing balls generative modeling problem. Further experiments could be conducted to explore the strengths and weaknesses of our approach.
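The core idea of the abstract can be sketched in a few lines. The toy loop below is an illustration of local, error-driven learning in the predictive-coding style, not the paper's Temporal Neural Coding Network or its Discrepancy Reduction algorithm: a generative weight matrix predicts an observation from a latent state, and the prediction error alone drives purely local updates, with no backpropagation through time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy predictive-coding loop (illustration only, not the paper's TNCN):
# generative weights W predict the observation x from latent state z;
# the prediction error e -- the "discrepancy" -- is the only learning
# signal, and every update is local to the weights it touches.
W = rng.normal(scale=0.1, size=(4, 2))  # generative weights: latent -> observation
x = rng.normal(size=4)                  # a fixed observation
z = np.zeros(2)                         # latent state, inferred iteratively

for _ in range(100):
    e = x - W @ z               # prediction error (the local signal)
    z += 0.1 * (W.T @ e)        # settle the latent state to reduce e
    W += 0.01 * np.outer(e, z)  # Hebbian-like local weight update

final_error = np.linalg.norm(x - W @ z)  # should fall below the initial ||x||
```

Since z starts at zero, the initial reconstruction error equals the norm of x; the local updates then shrink it without any gradient being propagated through time.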
Rational Kernels
Cortes, Corinna, Haffner, Patrick, Mohri, Mehryar
We introduce a general family of kernels based on weighted transducers or rational relations, rational kernels, that can be used for analysis of variable-length sequences or more generally weighted automata, in applications such as computational biology or speech recognition. We show that rational kernels can be computed efficiently using a general algorithm of composition of weighted transducers and a general single-source shortest-distance algorithm. We also describe several general families of positive definite symmetric rational kernels. These general kernels can be combined with Support Vector Machines to form efficient and powerful techniques for spoken-dialog classification: highly complex kernels become easy to design and implement and lead to substantial improvements in the classification accuracy. We also show that the string kernels considered in applications to computational biology are all specific instances of rational kernels.
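The abstract's final claim can be made concrete with the simplest such instance: the n-gram (spectrum) string kernel, the inner product of n-gram count vectors, is a specific case of a rational kernel. A minimal sketch (the transducer-composition and shortest-distance algorithms compute this same quantity, and far more general ones, over weighted automata):

```python
from collections import Counter

def ngram_counts(s, n):
    """Count all contiguous n-grams of a string."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def ngram_kernel(x, y, n=2):
    """n-gram (spectrum) kernel: inner product of n-gram count vectors.

    Being an explicit inner product, it is positive definite symmetric,
    so it can be plugged directly into an SVM. String kernels of this
    kind are specific instances of rational kernels: the same value is
    obtained by composing weighted transducers and running a
    single-source shortest-distance algorithm.
    """
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    return sum(v * cy[g] for g, v in cx.items())
```

For example, `ngram_kernel("abab", "ab")` counts the bigram "ab" twice in the first string and once in the second, giving 2.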
Escaping the Convex Hull with Extrapolated Vector Machines
Haffner, Patrick
Maximum margin classifiers such as Support Vector Machines (SVMs) depend critically upon the convex hulls of the training samples of each class, as they implicitly search for the minimum distance between the convex hulls. We propose Extrapolated Vector Machines (XVMs), which rely on extrapolations outside these convex hulls. XVMs improve SVM generalization very significantly on the MNIST OCR data [7]. They share similarities with the Fisher discriminant: they maximize the inter-class margin while minimizing the intra-class disparity.
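The geometric idea can be sketched as follows. This is only an illustration with a fixed coefficient, not the paper's method: XVMs choose the extrapolations so as to maximize the inter-class margin rather than using a fixed factor.

```python
import numpy as np

def extrapolate_class(X, lam=0.5):
    """Extrapolate each sample away from its class centroid.

    A point of the form x + lam * (x - c), with c the class centroid
    and lam > 0, lies outside the convex hull of the class whenever x
    lies on the hull boundary, since the ray from the interior point c
    exits the hull at x. Illustration of the XVM idea only; the paper
    optimizes the extrapolations jointly with the margin instead of
    fixing lam.
    """
    c = X.mean(axis=0)
    return X + lam * (X - c)
```

Widening each class's hull in this way increases the gap a maximum-margin separator can exploit between the two classes.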