
Collaborating Authors

 bouchard


One-vs-Each Approximation to Softmax for Scalable Estimation of Probabilities

Neural Information Processing Systems

The softmax representation of probabilities for categorical variables plays a prominent role in modern machine learning with numerous applications in areas such as large scale classification, neural language modeling and recommendation systems. However, softmax estimation is very expensive for large scale inference because of the high cost associated with computing the normalizing constant. Here, we introduce an efficient approximation to softmax probabilities which takes the form of a rigorous lower bound on the exact probability. This bound is expressed as a product over pairwise probabilities and it leads to scalable estimation based on stochastic optimization. It allows us to perform doubly stochastic estimation by subsampling both training instances and class labels. We show that the new bound has interesting theoretical properties and we demonstrate its use in classification problems.
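A minimal sketch of the kind of bound the abstract describes, assuming the usual one-vs-each form in which the softmax probability of class k is bounded below by the product, over all other classes m, of sigmoid(f_k - f_m). The function names and the example scores are hypothetical and are not taken from the paper; the code only illustrates that the product of pairwise probabilities never exceeds the exact softmax value.

```python
import math


def softmax_prob(scores, k):
    """Exact softmax probability of class k (max-shifted for stability)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    return exps[k] / sum(exps)


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def one_vs_each_bound(scores, k):
    """Product of pairwise probabilities sigmoid(f_k - f_m) over m != k.

    Assumed form of the lower bound; each factor involves only two classes,
    so no normalizing constant over all classes is needed.
    """
    bound = 1.0
    for m, score in enumerate(scores):
        if m != k:
            bound *= sigmoid(scores[k] - score)
    return bound


# Hypothetical class scores, used only for illustration.
example_scores = [2.0, 0.5, -1.0, 0.0]
for k in range(len(example_scores)):
    exact = softmax_prob(example_scores, k)
    bound = one_vs_each_bound(example_scores, k)
    print(f"class {k}: softmax={exact:.4f}  one-vs-each bound={bound:.4f}")
```

Because the logarithm of the bound is a sum over the other classes, that sum can be estimated from a random subset of classes, which is what makes subsampling class labels possible. With only two classes the bound coincides with the exact softmax probability.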


Unsupervised Protoform Reconstruction through Parsimonious Rule-guided Heuristics and Evolutionary Search

arXiv.org Artificial Intelligence

We propose an unsupervised method for the reconstruction of protoforms, i.e., ancestral word forms from which modern language forms are derived. While prior work has primarily relied on probabilistic models of phonological edits to infer protoforms from cognate sets, such approaches are limited by their predominantly data-driven nature. In contrast, our model integrates data-driven inference with rule-based heuristics within an evolutionary optimization framework. This hybrid approach leverages both statistical patterns and linguistically motivated constraints to guide the reconstruction process. We evaluate our method on the task of reconstructing Latin protoforms using a dataset of cognates from five Romance languages. Experimental results demonstrate substantial improvements over established baselines across both character-level accuracy and phonological plausibility metrics. Keywords: protoform reconstruction, historical linguistics, evolutionary algorithms, phonological modeling, rule-based inference.


LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have been observed to exhibit bias in numerous ways, potentially creating or worsening outcomes for specific groups identified by protected attributes such as sex, race, sexual orientation, or age. To help address this gap, we introduce LangFair, an open-source Python package that aims to equip LLM practitioners with the tools to evaluate bias and fairness risks relevant to their specific use cases. The package offers functionality to easily generate evaluation datasets, comprised of LLM responses to use-case-specific prompts, and subsequently calculate applicable metrics for the practitioner's use case. To guide in metric selection, LangFair offers an actionable decision framework.


The Fisher-Rao geometry of CES distributions

arXiv.org Machine Learning

When dealing with a parametric statistical model, a Riemannian manifold can naturally appear by endowing the parameter space with the Fisher information metric. The geometry induced on the parameters by this metric is then referred to as the Fisher-Rao information geometry. Interestingly, this yields a point of view that allows for leveraging many tools from differential geometry. After a brief introduction about these concepts, we will present some practical uses of these geometric tools in the framework of elliptical distributions. This second part of the exposition is divided into three main axes: Riemannian optimization for covariance matrix estimation, Intrinsic Cramér-Rao bounds, and classification using Riemannian distances.



Reconsidering Analytical Variational Bounds for Output Layers of Deep Networks

arXiv.org Machine Learning

The combination of the re-parameterization trick with the use of variational auto-encoders has caused a sensation in Bayesian deep learning, allowing the training of realistic generative models of images, and has considerably increased our ability to use scalable latent variable models. The re-parameterization trick is necessary for models in which no analytical variational bound is available and allows noisy gradients to be computed for arbitrary models. However, for certain standard output layers of a neural network, analytical bounds are available and the variational auto-encoder may be used without either the re-parameterization trick or any Monte Carlo approximation. In this work, we show that, using the Jaakkola and Jordan bound, we can produce a binary classification layer that allows a Bayesian output layer to be trained using the standard stochastic gradient descent algorithm. We further demonstrate that a latent variable model utilizing the Bouchard bound for multi-class classification allows for fast training of a fully probabilistic latent factor model, even when the number of classes is very large.
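As an illustration of what an analytical bound on a binary output layer can look like, here is a small sketch of the Jaakkola and Jordan lower bound on the log-sigmoid in its standard textbook form: log sigmoid(x) >= log sigmoid(xi) + (x - xi)/2 - lambda(xi)(x^2 - xi^2), with lambda(xi) = tanh(xi/2)/(4 xi). The abstract only names the bound, so the formula, the function names, and the test values below are assumptions and not the authors' implementation.

```python
import math


def log_sigmoid(x):
    """Numerically stable log of the logistic function."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def jj_lambda(xi):
    """lambda(xi) = tanh(xi / 2) / (4 * xi), with its limit 1/8 at xi = 0."""
    if abs(xi) < 1e-8:
        return 0.125
    return math.tanh(xi / 2.0) / (4.0 * xi)


def jj_log_bound(x, xi):
    """Quadratic-in-x lower bound on log_sigmoid(x), tight at x = +xi and x = -xi.

    xi is the variational parameter; the bound holds for any value of xi.
    """
    return log_sigmoid(xi) + (x - xi) / 2.0 - jj_lambda(xi) * (x * x - xi * xi)


# Hypothetical inputs, used only to compare the bound with the exact value.
for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  exact={log_sigmoid(x):.4f}  bound(xi=2)={jj_log_bound(x, 2.0):.4f}")
```

Since the bound is quadratic in x, expectations of it under a Gaussian over x have a closed form, which is why no sampling is needed for this kind of layer.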


Rise of the robot

#artificialintelligence

A four-part look at how robots are changing the way we work. About 180 robots here are doing work that humans used to do at a GE Aviation plant that makes parts for jet engines. But they haven't replaced the humans. Indeed, the opposite is true. Since a new, automated section of the plant ramped up at the start of the decade, the number of people working here has risen to more than 900 from 600. "A machine is not replacing three jobs," said Eric Bouchard, senior operations manager at the Bromont plant.


The charity that wants video game karts in every hospital

Engadget

In many ways, Jonathan Watson is like other 11-year-olds. He does his homework, dreams of becoming a doctor and plays video games when he can. Depending on the day, his favorite is either Minecraft or The Elder Scrolls V: Skyrim. Unlike most kids his age, though, Jonathan is at the hospital every three weeks for blood transfusions -- a procedure that can take up to six hours at a time. When I visited him at Mott Children's Hospital in Ann Arbor, Michigan, he wasn't slaying dragons or building a pixelated fortress; he was replaying the opening levels of Rayman Legends on a kart that had just been wheeled in.

