Sudderth, Erik
Differentiable and Stable Long-Range Tracking of Multiple Posterior Modes
Younis, Ali, Sudderth, Erik
Particle filters flexibly represent multiple posterior modes nonparametrically, via a collection of weighted samples, but have classically been applied to tracking problems with known dynamics and observation likelihoods. Such generative models may be inaccurate or unavailable for high-dimensional observations like images. We instead leverage training data to discriminatively learn particle-based representations of uncertainty in latent object states, conditioned on arbitrary observations via deep neural network encoders. While prior discriminative particle filters have used heuristic relaxations of discrete particle resampling, or biased learning by truncating gradients at resampling steps, we achieve unbiased and low-variance gradient estimates by representing posteriors as continuous mixture densities. Our theory and experiments expose dramatic failures of existing reparameterization-based estimators for mixture gradients, an issue we address via an importance-sampling gradient estimator. Unlike standard recurrent neural networks, our mixture density particle filter represents multimodal uncertainty in continuous latent states, improving accuracy and robustness. On a range of challenging tracking and robot localization problems, our approach achieves dramatic improvements in accuracy, while also showing much greater stability across multiple training runs.
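A minimal sketch of the kind of importance-sampling gradient estimator the abstract alludes to, for a toy 1-D Gaussian mixture. The function names and the PyTorch setup are illustrative assumptions, not the paper's implementation; this is just one simple variant of reweighting samples so that gradients flow through the mixture density rather than through discrete component draws.

```python
import torch

def mixture_log_prob(x, logits, means, log_stds):
    """log q_theta(x) for a 1-D Gaussian mixture with K components."""
    log_w = torch.log_softmax(logits, dim=-1)                  # (K,)
    comp = torch.distributions.Normal(means, log_stds.exp())   # K components
    return torch.logsumexp(log_w + comp.log_prob(x[:, None]), dim=-1)  # (N,)

def mixture_grad_surrogate(f, logits, means, log_stds, n_samples=1024):
    """Surrogate loss whose gradient is an unbiased estimate of
    d/dtheta E_{x ~ q_theta}[f(x)]: sample from the current mixture (treated
    as a fixed importance proposal), then reweight by
    q_theta(x) / stop_grad(q_theta(x)), so gradients flow through log q_theta
    instead of through the non-differentiable component draws."""
    with torch.no_grad():
        k = torch.distributions.Categorical(logits=logits).sample((n_samples,))
        x = torch.normal(means[k], log_stds.exp()[k])
    log_q = mixture_log_prob(x, logits, means, log_stds)
    w = (log_q - log_q.detach()).exp()   # value 1, gradient = score of q_theta
    return (w * f(x)).mean()             # call .backward() on this surrogate
```

Calling `.backward()` on the returned surrogate (with `logits`, `means`, and `log_stds` as leaf tensors requiring gradients) yields score-function-style gradients of the expectation with respect to the mixture parameters.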
Multiscale Semi-Markov Dynamics for Intracortical Brain-Computer Interfaces
Milstein, Daniel, Pacheco, Jason, Hochberg, Leigh, Simeral, John D., Jarosiewicz, Beata, Sudderth, Erik
Intracortical brain-computer interfaces (iBCIs) have allowed people with tetraplegia to control a computer cursor by imagining the movement of their paralyzed arm or hand. State-of-the-art decoders deployed in human iBCIs are derived from a Kalman filter that assumes Markov dynamics on the angle of intended movement, and a unimodal dependence on intended angle for each channel of neural activity. Due to errors made in the decoding of noisy neural data, as a user attempts to move the cursor to a goal, the angle between cursor and goal positions may change rapidly. We propose a dynamic Bayesian network that includes the on-screen goal position as part of its latent state, and thus allows the person's intended angle of movement to be aggregated over a much longer history of neural activity. This multiscale model explicitly captures the relationship between instantaneous angles of motion and long-term goals, and incorporates semi-Markov dynamics for motion trajectories. We also introduce a multimodal likelihood model for recordings of neural populations that can be rapidly calibrated for clinical applications. In offline experiments with recorded neural data, we demonstrate significantly improved prediction of motion directions compared to the Kalman filter. We derive an efficient online inference algorithm, enabling a clinical trial participant with tetraplegia to control a computer cursor with neural activity in real time. The resulting cursor trajectories are objectively straighter and smoother than those produced by prior iBCI decoding models, without loss of responsiveness.
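A toy illustration of the central idea (aggregating noisy instantaneous direction estimates into a longer-horizon posterior over candidate goals). The discrete goal grid, the cosine likelihood, the sticky transition matrix, and all constants are simplifying assumptions for the example, not the paper's semi-Markov model or neural likelihood.

```python
import numpy as np

def update_goal_posterior(log_post, cursor, noisy_angle, goals,
                          kappa=4.0, stay_prob=0.98):
    """One filtering step over G candidate goals.
    log_post: (G,) current log posterior (start at np.full(G, -np.log(G))).
    cursor: (2,) cursor position; noisy_angle: decoded movement angle (radians);
    goals: (G, 2) candidate goal positions."""
    G = len(goals)
    trans = np.full((G, G), (1.0 - stay_prob) / (G - 1))   # rare goal switches
    np.fill_diagonal(trans, stay_prob)
    log_pred = np.log(np.exp(log_post) @ trans + 1e-300)
    goal_angles = np.arctan2(goals[:, 1] - cursor[1], goals[:, 0] - cursor[0])
    log_like = kappa * np.cos(noisy_angle - goal_angles)   # von Mises-style score
    log_post = log_pred + log_like
    return log_post - np.logaddexp.reduce(log_post)        # normalize
```

Decoding toward the posterior-weighted goal, rather than the latest instantaneous angle, is what lets evidence accumulate over a longer history of neural activity.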
Scalable Adaptation of State Complexity for Nonparametric Hidden Markov Models
Hughes, Michael C., Stephenson, William T., Sudderth, Erik
Bayesian nonparametric hidden Markov models are typically learned via fixed truncations of the infinite state space or local Monte Carlo proposals that make small changes to the state space. We develop an inference algorithm for the sticky hierarchical Dirichlet process hidden Markov model that scales to big datasets by processing a few sequences at a time yet allows rapid adaptation of the state space cardinality. Unlike previous point-estimate methods, our novel variational bound penalizes redundant or irrelevant states and thus enables optimization of the state space. Our birth proposals use observed data statistics to create useful new states that escape local optima. Merge and delete proposals remove ineffective states to yield simpler models with more affordable future computations. Experiments on speaker diarization, motion capture, and epigenetic chromatin datasets discover models that are more compact, more interpretable, and better aligned to ground truth segmentations than competitors. We have released an open-source Python implementation which can parallelize local inference steps across sequences.
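For context, a truncated generative sketch of the sticky hierarchical Dirichlet process HMM that this inference algorithm targets: top-level state weights from a stick-breaking process, and per-state transition rows drawn from a Dirichlet centered on those weights with extra self-transition mass. The truncation level and hyperparameter values are arbitrary choices for the demo, not settings from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
K, gamma, alpha, kappa = 20, 5.0, 10.0, 50.0     # truncation + hyperparameters

# Stick-breaking (GEM) weights over the truncated set of global states.
v = rng.beta(1.0, gamma, size=K)
beta = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
beta /= beta.sum()                                # renormalize the truncation

# Each state's outgoing transition distribution is tied to beta, with kappa
# boosting self-transitions ("stickiness") to encourage state persistence.
pi = np.vstack([rng.dirichlet(alpha * beta + kappa * np.eye(K)[j])
                for j in range(K)])

z = [0]
for t in range(100):                              # sample a toy state sequence
    z.append(rng.choice(K, p=pi[z[-1]]))
```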
Efficient Online Inference for Bayesian Nonparametric Relational Models
Kim, Dae Il, Gopalan, Prem K., Blei, David, Sudderth, Erik
Stochastic block models characterize observed network relationships via latent community memberships. In large social networks, we expect entities to participate in multiple communities, and the number of communities to grow with the network size. We introduce a new model for these phenomena, the hierarchical Dirichlet process relational model, which allows nodes to have mixed membership in an unbounded set of communities. To allow scalable learning, we derive an online stochastic variational inference algorithm. Focusing on assortative models of undirected networks, we also propose an efficient structured mean field variational bound, and online methods for automatically pruning unused communities. Compared to state-of-the-art online learning methods for parametric relational models, we show significantly improved perplexity and link prediction accuracy for sparse networks with tens of thousands of nodes. We also showcase an analysis of LittleSis, a large network of who-knows-who at the heights of business and government.
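A generic sketch of the online stochastic variational update pattern the abstract relies on: a noisy estimate of the global variational parameters is formed from a sampled subset of the network and blended with the current estimate under a Robbins-Monro step size. The function names, shapes, and step-size schedule are illustrative assumptions, not the paper's exact updates.

```python
import numpy as np

def noisy_global_estimate(prior, batch_stats, n_total, n_batch):
    """Unbiased stand-in for the full-data update: rescale the mini-batch
    sufficient statistics as if the whole network looked like this batch."""
    return prior + (n_total / n_batch) * batch_stats

def svi_step(lam, lam_hat, t, tau0=1.0, kappa=0.6):
    """Blend the noisy estimate with a Robbins-Monro step size
    rho_t = (t + tau0)^(-kappa), with kappa in (0.5, 1]."""
    rho_t = (t + tau0) ** (-kappa)
    return (1.0 - rho_t) * lam + rho_t * lam_hat
```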
Memoized Online Variational Inference for Dirichlet Process Mixture Models
Hughes, Michael C., Sudderth, Erik
Variational inference algorithms provide the most effective framework for large-scale training of Bayesian nonparametric models. Stochastic online approaches are promising, but are sensitive to the chosen learning rate and often converge to poor local optima. We present a new algorithm, memoized online variational inference, which scales to very large (yet finite) datasets while avoiding the complexities of stochastic gradient. Our algorithm maintains finite-dimensional sufficient statistics from batches of the full dataset, requiring some additional memory but still scaling to millions of examples. Exploiting nested families of variational bounds for infinite nonparametric models, we develop principled birth and merge moves allowing non-local optimization. Births adaptively add components to the model to escape local optima, while merges remove redundancy and improve speed. Using Dirichlet process mixture models for image clustering and denoising, we demonstrate major improvements in robustness and accuracy.
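A minimal runnable sketch of the memoized bookkeeping described above: cache each batch's finite-dimensional sufficient statistics, and on every revisit swap the stale cached summary for a fresh one, so whole-dataset statistics stay exact without a stochastic learning rate. Hard k-means-style assignments stand in for the true variational local step, and the birth/merge moves are omitted; this is an illustration of the bookkeeping pattern, not the released implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
K, D = 3, 2
X = rng.normal(size=(600, D)) + rng.choice([-4.0, 0.0, 4.0], size=(600, 1))
batches = np.array_split(X, 6)

mu = rng.normal(size=(K, D))                        # global parameters
cached = [dict(count=np.zeros(K), sum=np.zeros((K, D))) for _ in batches]
total = dict(count=np.zeros(K), sum=np.zeros((K, D)))

for sweep in range(5):
    for b, xb in enumerate(batches):
        assign = np.argmin(((xb[:, None, :] - mu) ** 2).sum(-1), axis=1)  # local step
        fresh = dict(count=np.bincount(assign, minlength=K).astype(float),
                     sum=np.vstack([xb[assign == k].sum(0) for k in range(K)]))
        for key in total:                           # replace the stale batch summary
            total[key] += fresh[key] - cached[b][key]
        cached[b] = fresh
        mu = total["sum"] / np.maximum(total["count"][:, None], 1e-9)  # global step
```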
The Nonparametric Metadata Dependent Relational Model
Kim, Dae Il, Hughes, Michael, Sudderth, Erik
We introduce the nonparametric metadata dependent relational (NMDR) model, a Bayesian nonparametric stochastic block model for network data. The NMDR allows the entities associated with each node to have mixed membership in an unbounded collection of latent communities. Learned regression models allow these memberships to depend on, and be predicted from, arbitrary node metadata. We develop efficient MCMC algorithms for learning NMDR models from partially observed node relationships. Retrospective MCMC methods allow our sampler to work directly with the infinite stick-breaking representation of the NMDR, avoiding the need for finite truncations. Our results demonstrate recovery of useful latent communities from real-world social and ecological networks, and the usefulness of metadata in link prediction tasks.
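A simplified, illustrative-only sketch of the general idea of metadata-dependent mixed memberships via a truncated stick-breaking construction: per-community regressions on node metadata set the stick proportions, so memberships can be predicted for new nodes from their metadata alone. This is a generic stand-in, not the exact NMDR construction.

```python
import numpy as np

def memberships_from_metadata(phi, W, b):
    """phi: (N, D) node metadata; W: (K, D), b: (K,) regression parameters.
    Returns (N, K) mixed-membership weights summing to 1 per node."""
    v = 1.0 / (1.0 + np.exp(-(phi @ W.T + b)))          # stick proportions in (0, 1)
    remaining = np.cumprod(1.0 - v, axis=1)
    pi = v * np.concatenate([np.ones((len(phi), 1)), remaining[:, :-1]], axis=1)
    pi[:, -1] += 1.0 - pi.sum(axis=1)                   # leftover mass to last stick
    return pi
```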
Global Seismic Monitoring: A Bayesian Approach
Arora, Nimar S. (University of California, Berkeley), Russell, Stuart (University of California, Berkeley), Kidwell, Paul (Lawrence Livermore National Lab), Sudderth, Erik (Brown University)
The automated processing of multiple seismic signals to detect and localize seismic events is a central tool in both geophysics and nuclear treaty verification. This paper reports on a project, begun in 2009, to reformulate this problem in a Bayesian framework. A Bayesian seismic monitoring system, NET-VISA, has been built, comprising a spatial event prior, generative models of event transmission and detection, and an inference algorithm. Applied in the context of the International Monitoring System (IMS), a global sensor network developed for the Comprehensive Nuclear-Test-Ban Treaty (CTBT), NET-VISA achieves a reduction of around 50% in the number of missed events compared to the currently deployed system. It also finds events that are missed even by the human analysts who post-process the IMS output.
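A toy forward model in the spirit of the Bayesian formulation described above, drastically simplified and with made-up numbers (not NET-VISA's actual event prior, travel-time model, or detection model): events follow a Poisson count prior, detected arrival times equal origin time plus travel time plus Gaussian noise, and unassociated arrivals are scored as false alarms.

```python
import numpy as np
from scipy.stats import norm, poisson

def log_joint(events, detections, stations, rate_events=2.0, rate_false=0.5,
              travel_speed=8.0, noise_std=1.5):
    """events: list of (x, y, t0) hypotheses.
    detections: dict station_index -> list of (event_index or None, arrival_time).
    stations: list of (x, y) station positions. Purely illustrative scoring."""
    lp = poisson.logpmf(len(events), rate_events)            # event-count prior
    for s, (sx, sy) in enumerate(stations):
        for ev_idx, t_arr in detections.get(s, []):
            if ev_idx is None:                               # false alarm at station s
                lp += np.log(rate_false)
            else:                                            # arrival from event ev_idx
                x, y, t0 = events[ev_idx]
                travel = np.hypot(x - sx, y - sy) / travel_speed
                lp += norm.logpdf(t_arr, loc=t0 + travel, scale=noise_std)
    return lp
```

Inference then amounts to searching over event hypotheses and detection associations to maximize (or sample from) a score of this form.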