
Advanced posterior analyses of hidden Markov models: finite Markov chain imbedding and hybrid decoding

arXiv.org Machine Learning

Two major tasks in applications of hidden Markov models are to (i) compute distributions of summary statistics of the hidden state sequence, and (ii) decode the hidden state sequence. We describe finite Markov chain imbedding (FMCI) and hybrid decoding to solve each of these two tasks. In the first part of our paper we use FMCI to compute posterior distributions of summary statistics such as the number of visits to a hidden state, the total time spent in a hidden state, the dwell time in a hidden state, and the longest run length. We use simulations from the hidden state sequence, conditional on the observed sequence, to establish the FMCI framework. In the second part of our paper we apply hybrid segmentation for improved decoding of an HMM. We demonstrate that hybrid decoding shows increased performance compared to Viterbi or posterior decoding (often also referred to as global or local decoding), and we introduce a novel procedure for choosing the tuning parameter in the hybrid procedure. Furthermore, we provide an alternative derivation of the hybrid loss function based on weighted geometric means. We demonstrate and apply FMCI and hybrid decoding on various classical data sets, and supply accompanying code for reproducibility.

Key words: Artemis analysis, decoding, finite Markov chain imbedding, hidden Markov model, hybrid decoding, pattern distributions.
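
Below is a minimal sketch of one way such a hybrid objective can be realized, assuming it interpolates between the Viterbi (global) and posterior (local) criteria with a tuning weight lam, consistent with the weighted-geometric-mean view mentioned in the abstract. The paper's exact loss and its tuning-parameter selection procedure are not reproduced here.

```python
# Hedged sketch: hybrid HMM decoding as a Viterbi-style dynamic program over
# lam * (joint log-probability) + (1 - lam) * (log posterior marginals).
# Assumes strictly positive pi, A, B so logs are finite.
import numpy as np

def forward_backward(pi, A, B, obs):
    """Scaled forward-backward; returns gamma[t, k] = P(x_t = k | y)."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K)); beta = np.zeros((T, K)); c = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]; c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

def hybrid_decode(pi, A, B, obs, lam):
    """Most probable path under the interpolated (hybrid) score."""
    gamma = forward_backward(pi, A, B, obs)
    T, K = len(obs), len(pi)
    logA, logB = np.log(A), np.log(B)
    score = lam * (np.log(pi) + logB[:, obs[0]]) + (1 - lam) * np.log(gamma[0])
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + lam * (logA + logB[None, :, obs[t]])
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + (1 - lam) * np.log(gamma[t])
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

Setting lam = 1 recovers Viterbi decoding and lam = 0 recovers posterior decoding; intermediate values trade overall path plausibility against pointwise state accuracy.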


Some Optimizers are More Equal: Understanding the Role of Optimizers in Group Fairness

arXiv.org Machine Learning

We study whether and how the choice of optimization algorithm can impact group fairness in deep neural networks. Through stochastic differential equation analysis of optimization dynamics in an analytically tractable setup, we demonstrate that the choice of optimization algorithm indeed influences fairness outcomes, particularly under severe imbalance. Furthermore, we show that when comparing two categories of optimizers, adaptive methods and stochastic methods, RMSProp (from the adaptive category) has a higher likelihood of converging to fairer minima than SGD (from the stochastic category). Building on this insight, we derive two new theoretical guarantees showing that, under appropriate conditions, RMSProp exhibits fairer parameter updates and improved fairness in a single optimization step compared to SGD. We then validate these findings through extensive experiments on three publicly available datasets, namely CelebA, FairFace, and MS-COCO, across different tasks such as facial expression recognition, gender classification, and multi-label classification, using various backbones. Under multiple fairness definitions, including equalized odds, equal opportunity, and demographic parity, adaptive optimizers like RMSProp and Adam consistently outperform SGD in terms of group fairness, while maintaining comparable predictive accuracy. Our results highlight the role of adaptive updates as a crucial yet overlooked mechanism for promoting fair outcomes.
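
To make the single-step contrast concrete, here is a toy comparison of one SGD update and one RMSProp update under a hypothetical group-imbalanced gradient. The imbalance setup and all numbers are illustrative assumptions, not the paper's experimental protocol.

```python
# Hedged sketch: one SGD step vs. one RMSProp step on a 2-D parameter,
# where each coordinate is driven by a different (imbalanced) group.
import numpy as np

w = np.zeros(2)

# Hypothetical per-group gradients: coordinate 0 is driven by a large
# majority group, coordinate 1 by a small minority group (severe imbalance).
g = np.array([10.0, 0.0]) + np.array([0.0, 0.1])

# SGD: the update is dominated by the majority-group direction.
lr = 0.01
w_sgd = w - lr * g

# RMSProp: per-coordinate scaling by a running second-moment estimate
# equalizes the effective step sizes across the two directions.
v = np.zeros(2)
beta, eps = 0.9, 1e-8
v = beta * v + (1 - beta) * g ** 2
w_rms = w - lr * g / (np.sqrt(v) + eps)

print("SGD step:    ", w_sgd)  # minority coordinate barely moves
print("RMSProp step:", w_rms)  # both coordinates move comparably
```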


On Learning Parallel Pancakes with Mostly Uniform Weights

arXiv.org Machine Learning

We study the complexity of learning $k$-mixtures of Gaussians ($k$-GMMs) on $\mathbb{R}^d$. This task is known to have complexity $d^{\Omega(k)}$ in full generality. To circumvent this exponential lower bound on the number of components, research has focused on learning families of GMMs satisfying additional structural properties. A natural assumption posits that the component weights are not exponentially small and that the components have the same unknown covariance. Recent work gave a $d^{O(\log(1/w_{\min}))}$-time algorithm for this class of GMMs, where $w_{\min}$ is the minimum weight. Our first main result is a Statistical Query (SQ) lower bound showing that this quasi-polynomial upper bound is essentially best possible, even for the special case of uniform weights. Specifically, we show that it is SQ-hard to distinguish between such a mixture and the standard Gaussian. We further explore how the distribution of weights affects the complexity of this task. Our second main result is a quasi-polynomial upper bound for the aforementioned testing task when most of the weights are uniform while a small fraction of the weights are potentially arbitrary.
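
For geometric intuition, the following small sketch generates a "parallel pancakes"-style instance: Gaussian components with identical covariance whose means are collinear along a hidden direction, so every projection orthogonal to that direction looks standard normal. This is only an illustration of the geometry; the paper's SQ-hard instances additionally match many moments of the standard Gaussian, which this toy construction does not.

```python
# Hedged sketch of the parallel-pancakes geometry (illustrative parameters).
import numpy as np

def sample_parallel_pancakes(n, d, k, sep=2.0, rng=None):
    rng = rng or np.random.default_rng(0)
    v = rng.standard_normal(d); v /= np.linalg.norm(v)  # hidden direction
    offsets = sep * (np.arange(k) - (k - 1) / 2)        # collinear means
    z = rng.integers(k, size=n)                          # uniform weights
    return rng.standard_normal((n, d)) + offsets[z, None] * v, v

X, v = sample_parallel_pancakes(n=5000, d=50, k=4)
# Along v the data is a clear 1-D mixture; along any orthogonal direction it
# is indistinguishable from N(0, 1), which is the source of the hardness.
u = np.linalg.svd(v[None])[2][1]  # some direction orthogonal to v
print(np.std(X @ v), np.std(X @ u))
```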


Faster Algorithms for Agnostically Learning Disjunctions and their Implications

arXiv.org Machine Learning

We study the algorithmic task of learning Boolean disjunctions in the distribution-free agnostic PAC model. The best known agnostic learner for the class of disjunctions over $\{0, 1\}^n$ is the $L_1$-polynomial regression algorithm, achieving complexity $2^{\tilde{O}(n^{1/2})}$. This complexity bound is known to be nearly best possible within the class of Correlational Statistical Query (CSQ) algorithms. In this work, we develop an agnostic learner for this concept class with complexity $2^{\tilde{O}(n^{1/3})}$. Our algorithm can be implemented in the Statistical Query (SQ) model, providing the first separation between the SQ and CSQ models in distribution-free agnostic learning.
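
The cited prior-work learner, $L_1$-polynomial regression, is straightforward to sketch: fit a low-degree polynomial minimizing empirical $L_1$ error, which is a linear program, then threshold at 1/2. The degree and data below are toy-scale illustrations; the classical analysis for disjunctions uses degree on the order of $\sqrt{n}$, and the paper's faster $2^{\tilde{O}(n^{1/3})}$ algorithm is not reproduced here.

```python
# Hedged sketch of L1 polynomial regression for agnostic learning,
# formulated as a linear program with one slack variable per example.
import numpy as np
from itertools import combinations
from scipy.optimize import linprog

def monomial_features(X, degree):
    cols = [np.ones(len(X))]
    for d in range(1, degree + 1):
        for S in combinations(range(X.shape[1]), d):
            cols.append(X[:, list(S)].prod(axis=1))
    return np.stack(cols, axis=1)

def l1_poly_regression(X, y, degree=2):
    """min_c sum_i |Phi_i c - y_i| via slack variables t_i >= 0."""
    Phi = monomial_features(X, degree)
    m, p = Phi.shape
    # Variables: [c (p entries, free), t (m entries, >= 0)];
    # constraints: +(Phi c - y) <= t and -(Phi c - y) <= t.
    A_ub = np.block([[Phi, -np.eye(m)], [-Phi, -np.eye(m)]])
    b_ub = np.concatenate([y, -y])
    cost = np.concatenate([np.zeros(p), np.ones(m)])
    bounds = [(None, None)] * p + [(0, None)] * m
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    c = res.x[:p]
    return lambda Xn: (monomial_features(Xn, degree) @ c >= 0.5).astype(int)

# Toy check: a noisy disjunction x1 OR x3 over {0, 1}^6.
rng = np.random.default_rng(1)
X = rng.integers(2, size=(300, 6))
y = ((X[:, 0] | X[:, 2]) ^ (rng.random(300) < 0.05)).astype(float)
h = l1_poly_regression(X, y)
print("train accuracy:", (h(X) == y).mean())
```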


Uncertainty quantification of neural network models of evolving processes via Langevin sampling

arXiv.org Machine Learning

We propose a scalable, approximate inference hypernetwork framework for a general model of history-dependent processes. The flexible data model is based on a neural ordinary differential equation (NODE) representing the evolution of internal states together with a trainable observation model subcomponent. The posterior distribution corresponding to the data model parameters (weights and biases) follows a stochastic differential equation with a drift term related to the score of the posterior that is learned jointly with the data model parameters. This Langevin sampling approach offers flexibility in balancing the computational budget between the evaluation cost of the data model and the approximation of the posterior density of its parameters. We demonstrate the performance of the hypernetwork on chemical reaction and material physics data and compare it to mean-field variational inference.
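
A bare-bones sketch of the Langevin update underlying this kind of approach, using unadjusted Langevin dynamics on a Bayesian linear-regression stand-in for the paper's NODE data model; the learned score and the hypernetwork coupling are not reproduced here.

```python
# Hedged sketch: unadjusted Langevin dynamics over model parameters,
#   theta <- theta + (eps / 2) * grad log p(theta | data) + sqrt(eps) * noise.
import numpy as np

def grad_log_posterior(theta, X, y, prior_var=1.0):
    """Gradient of log posterior for Bayesian linear regression with unit
    observation noise (a stand-in for the paper's data model)."""
    return X.T @ (y - X @ theta) - theta / prior_var

def langevin_sample(theta0, X, y, eps=1e-3, steps=5000, rng=None):
    rng = rng or np.random.default_rng(0)
    theta, samples = theta0.copy(), []
    for _ in range(steps):
        theta = theta + 0.5 * eps * grad_log_posterior(theta, X, y) \
                + np.sqrt(eps) * rng.standard_normal(theta.shape)
        samples.append(theta.copy())
    return np.array(samples[steps // 2:])  # discard burn-in

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta + 0.1 * rng.standard_normal(100)
S = langevin_sample(np.zeros(3), X, y)
print("posterior mean:", S.mean(axis=0))  # close to true_theta
```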


'Easter truce' in Russia's Ukraine war marked by accusations of violations

Al Jazeera

Ukraine and Russia have accused each other of breaching an "Easter truce" announced by Russian President Vladimir Putin that Ukraine said was being violated from the moment it started. In a surprise announcement on Saturday, Putin ordered his forces to "stop all military activity" along the front line in the war against Ukraine, citing humanitarian reasons. The 30-hour cessation of hostilities would have been the most significant pause in the fighting throughout the three-year conflict. But just hours after the order was meant to have come into effect, air raid sirens sounded in Kyiv and several other Ukrainian regions, with President Volodymyr Zelenskyy accusing Russia of having maintained its attacks and engaging in a PR stunt. Russia's Ministry of Defence also alleged on Sunday that Ukraine had broken the truce more than 1,000 times.


The Thinking Machine: Jensen Huang, Nvidia and the World's Most Coveted Microchip – review

The Guardian

This is the latest confirmation that the "great man" theory of history continues to thrive in Silicon Valley. As such, it joins a genre that includes Walter Isaacson's twin tomes on Steve Jobs and Elon Musk, Brad Stone's book on Jeff Bezos, Michael Becraft's on Bill Gates, Max Chafkin's on Peter Thiel and Michael Lewis's on Sam Bankman-Fried. Notable characteristics of the genre include a tendency towards founder worship, discreet hagiography and a Whiggish interpretation of the life under examination. The great man under Witt's microscope is the co-founder and chief executive of Nvidia, a chip design company that went from being a small but plucky purveyor of graphics processing units (GPUs) for computer gaming to its current position as the third most valuable company in the world. Two things drove this astonishing transition.


How to Get Out of Your Own Way When Writing

Slate

Gabfest Reads is a monthly series from the hosts of Slate's Political Gabfest podcast. Recently, Maggie Smith talked with John Dickerson about her new book Dear Writer: Pep Talks & Practical Advice for the Creative Life. Maggie's first love is poetry, and they discuss how to tell when your creative endeavor is complete. This partial transcript has been edited and condensed for clarity. John Dickerson: What does it feel like when you've arrived with a poem--when you think it's "done"?


It's not too late to stop Trump and the Silicon Valley broligarchy from controlling our lives, but we must act now | Carole Cadwalladr

The Guardian

To walk into the lion's den once might be considered foolhardy. To do so again after being mauled by the lion? Six years ago I gave a talk at Ted, the world's leading technology and ideas conference. It led to a gruelling lawsuit and a series of consequences that reverberate through my life to this day. And last week I returned. To give another talk that would incorporate some of my experience: a Ted Talk about being sued for giving a Ted Talk, and how the lessons I'd learned from surviving all that were a model for surviving "broligarchy" – a concept I first wrote about in the Observer in July last year: the alignment of Silicon Valley and autocracy, and a kind of power the world has never seen before.


Scientists Are Mapping the Bizarre, Chaotic Spacetime Inside Black Holes

WIRED

The original version of this story appeared in Quanta Magazine. At the beginning of time and the center of every black hole lies a point of infinite density called a singularity. To explore these enigmas, we take what we know about space, time, gravity, and quantum mechanics and apply it to a place where all of those things simply break down. There is, perhaps, nothing in the universe that challenges the imagination more. Physicists still believe that if they can come up with a coherent explanation for what actually happens in and around singularities, something revelatory will emerge, perhaps a new understanding of what space and time are made of.