Information Technology
Nested Variational Inference Hao Wu Jan-Willem van de Meent
We develop nested variational inference (NVI), a family of methods that learn proposals for nested importance samplers by minimizing a forward or reverse KL divergence at each level of nesting. NVI is applicable to many commonly used importance sampling strategies and provides a mechanism for learning intermediate densities, which can serve as heuristics to guide the sampler. Our experiments apply NVI to (a) sample from a multimodal distribution using a learned annealing path, (b) learn heuristics that approximate the likelihood of future observations in a hidden Markov model, and (c) perform amortized inference in hierarchical deep generative models. We observe that optimizing nested objectives leads to improved sample quality in terms of log average weight and effective sample size.
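The two sample-quality metrics named above can be computed directly from a sampler's log importance weights. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and it assumes the weights are supplied as a plain list of log values.

```python
import math


def log_average_weight(log_weights):
    """Return log((1/N) * sum_i w_i), computed stably from the log w_i."""
    m = max(log_weights)  # subtract the max before exponentiating
    total = sum(math.exp(lw - m) for lw in log_weights)
    return m + math.log(total) - math.log(len(log_weights))


def effective_sample_size(log_weights):
    """Return (sum_i w_i)^2 / sum_i w_i^2, which lies between 1 and N."""
    m = max(log_weights)  # the ratio is invariant to rescaling the weights
    scaled = [math.exp(lw - m) for lw in log_weights]
    return sum(scaled) ** 2 / sum(s * s for s in scaled)
```

With N equal weights the effective sample size is N, and it falls toward 1 as a single weight dominates.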
This beast of a robot vacuum is heavily discounted at Amazon -- save 700 on the Roborock Qrevo Master
SAVE 700: As of May 22, the Roborock Qrevo Master robot vacuum and mop is on sale at Amazon for 44% off, now down to 899.99. And with this vacuum, you're getting a whole lot to be excited about. The Qrevo Master handles both vacuuming and mopping, with minimal effort required on your end. Its self-emptying dock means up to seven weeks of hands-free cleaning, and with 10,000Pa suction and the Carpet Boost System, it's seriously effective, removing up to 99% of hair from carpets.
Russia-Ukraine war: List of key events, day 1,183
Russia's Defence Ministry said air defences shot down 105 Ukrainian drones over Russian regions, including 35 over the Moscow region, after the ministry said a day earlier that it had downed more than 300 Ukrainian drones. Kherson Governor Oleksandr Prokudin said one person was killed in a Russian artillery attack on the region. He said over the past day, 35 areas in Kherson, including Kherson city, came under artillery shelling and air attacks, wounding 11 people. Ukrainian President Zelenskyy said the "most intense situation" is in the Donetsk region, and the army is continuing "active operations in the Kursk and Belgorod regions".
Generalizing Bayesian Optimization with Decision-theoretic Entropies Willie Neiswanger
Bayesian optimization (BO) is a popular method for efficiently inferring optima of an expensive black-box function via a sequence of queries. Existing information-theoretic BO procedures aim to make queries that most reduce the uncertainty about optima, where the uncertainty is captured by Shannon entropy. However, an optimal measure of uncertainty would, ideally, factor in how we intend to use the inferred quantity in some downstream procedure. In this paper, we instead consider a generalization of Shannon entropy from work in statistical decision theory [13, 39], which contains a broad class of uncertainty measures parameterized by a problem-specific loss function corresponding to a downstream task. We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures such as knowledge gradient, expected improvement, and entropy search. We then show how alternative choices for the loss yield a flexible family of acquisition functions that can be customized for use in novel optimization settings.
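As a concrete reference point for one of the acquisition functions named above, here is a minimal sketch of the textbook closed form of expected improvement for maximization. It is not the paper's entropy-based derivation: it assumes the model's posterior at a candidate point is Gaussian, and the function and parameter names (`mean`, `std`, `best`) are hypothetical.

```python
import math


def expected_improvement(mean, std, best):
    """Expected improvement over `best` (maximization), Gaussian posterior assumed.

    mean, std: posterior mean and standard deviation at the candidate point.
    best: the best objective value observed so far.
    """
    if std <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
    return (mean - best) * cdf + std * pdf
```

The next query would typically be the candidate that maximizes this value. The score grows with both a higher predicted mean and a larger predictive uncertainty.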
Causal Discovery from Event Sequences by Local Cause-Effect Attribution
Sequences of events, such as crashes in the stock market or outages in a network, contain strong temporal dependencies, whose understanding is crucial to react to and influence future events. In this paper, we study the problem of discovering the underlying causal structure from event sequences. To this end, we introduce a new causal model, where individual events of the cause trigger events of the effect with dynamic delays. We show that in contrast to existing methods based on Granger causality, our model is identifiable for both instant and delayed effects. We base our approach on the Algorithmic Markov Condition, by which we identify the true causal network as the one that minimizes the Kolmogorov complexity. As the Kolmogorov complexity is not computable, we instantiate our model using Minimum Description Length and show that the resulting score identifies the causal direction.
A Optimal K-priors for GLMs
We present theoretical results to show that K-priors with limited memory can achieve low gradient-reconstruction error. We then discuss the optimal K-prior, which can theoretically achieve perfect reconstruction. Note that this prior is difficult to realize in practice since it requires all past training-data inputs X. Our goal here is to establish a theoretical limit, not to give practical choices. Our key idea is to choose a few input locations that provide a good representation of the training-data inputs X.
Invariant and Transportable Representations for Anti-Causal Domain Shifts and Victor Veitch Department of Computer Science, University of Chicago Department of Statistics, University of Chicago
Real-world classification problems must contend with domain shift, the (potential) mismatch between the domain where a model is deployed and the domain(s) where the training data was gathered. Methods to handle such problems must specify what structure is common between the domains and what varies. A natural assumption is that causal (structural) relationships are invariant in all domains. Then, it is tempting to learn a predictor for label Y that depends only on its causal parents. However, many real-world problems are "anti-causal" in the sense that Y is a cause of the covariates X--in this case, Y has no causal parents and the naive causal invariance is useless.
A The Embeddings
In this section, we briefly introduce the four kinds of embeddings that make up the fusion embedding. The goal of the position embedding module is to calibrate the position of each time point in the sequence so that the self-attention mechanism can recognize the relative positions between different time points in the input sequence. We design the token embedding module to enrich the features of each time point by fusing features from adjacent time points within a certain interval. The role of the spatial embedding is to locate and encode the spatial locations of different nodes, so that each node at a different location has a unique spatial embedding. This enables the model to identify nodes in different spatial and temporal planes after the dimensionality is compressed in later computation.
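The text does not say which form the position embedding takes. As an illustration only, the sketch below assumes the fixed sinusoidal encoding commonly used with self-attention models; the function name and the base constant 10000 are assumptions, not details from this section.

```python
import math


def sinusoidal_position_embedding(seq_len, dim):
    """Return a seq_len x dim table of fixed sinusoidal position embeddings.

    Assumed form: even columns hold sin(pos / 10000^(i/dim)) and odd columns
    hold the matching cos, for i = 0, 2, 4, ...
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table
```

Each time point receives a distinct vector, and that vector would be added to the other embeddings before self-attention so the model can tell positions apart.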