xin
Stable Blanket with Hidden Variables and Cycles
Stabilized regression aims to identify a set of predictors whose conditional relationship with a response variable remains invariant across different environments. Existing graphical characterizations of the stable blanket are mainly developed for structural causal models (SCMs) without hidden variables or causal cycles. However, latent variables and feedback relationships naturally arise in many applications, and they can change both the Markov blanket and the set of predictors that remain stable under interventions. This paper studies stable blankets in graphical causal models with hidden variables, causal cycles, and both features simultaneously. For models with hidden variables, we use acyclic directed mixed graphs (ADMGs) and $m$-separation to characterize the Markov blanket and to construct intervention-stable predictor sets. We introduce the notion of an intervened sub-district and use it to describe how interventions may affect districts connected to the response. For models with cycles, we work with directed graphs (DGs) and directed mixed graphs (DMGs) together with $\sigma$-separation, treating strongly connected components (SCCs) as the basic graphical units. We then combine these ideas to analyze models with both hidden variables and cycles. The main results give graphical characterizations of Markov blankets, stable frontiers, and stable blankets in these generalized settings. In particular, we identify conditions under which the response is conditionally independent of intervention variables given a suitable predictor set, and we describe when such sets are minimal or unique. These results extend the graphical interpretation of stabilized regression beyond acyclic fully observed models.
Explanation of Dynamic Physical Field Predictions using WassersteinGrad: Application to Autoregressive Weather Forecasting
Essafouri, Younes, Raynaud, Laure, Drozda, Luciano, Risser, Laurent
As the demand to integrate Artificial Intelligence into high-stakes environments continues to grow, explaining the reasoning behind neural-network predictions has shifted from a theoretical curiosity to a strict operational requirement. Our work is motivated by the need to explain autoregressive neural predictions on dynamic physical fields, as in weather forecasting. Gradient-based feature attribution methods are widely used to explain predictions on such data, in particular because they scale to high-dimensional inputs. Gradient-based techniques such as SmoothGrad are now standard on images, where they make explanations more robust by taking pointwise averages of the attribution maps obtained from several noised inputs. Our goal is to efficiently adapt this aggregation strategy to dynamic physical fields. To do so, our first contribution is to identify a fundamental failure mode when averaging perturbed attribution maps on dynamic physical fields: stochastic input perturbations do not induce stationary amplitude noise in attribution maps, but instead cause a geometric displacement of the attributions. Consequently, pointwise averaging blurs these spatially misaligned features. To tackle this issue, we introduce WassersteinGrad, which extracts a geometric consensus of perturbed attribution maps by computing their entropic Wasserstein barycenter. The results, obtained on regional weather data and a meteorologist-validated neural model, demonstrate promising explainability properties of WassersteinGrad over gradient-based baselines across both single-step and autoregressive forecasting settings.
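The failure mode described in this abstract can be illustrated with a toy 1D sketch. This is not the authors' implementation: the Gaussian "attribution maps", the grid, and the function names `bump` and `barycenter_1d` are all assumptions made for illustration. It also swaps in the unregularized 1D Wasserstein barycenter, computed by averaging quantile functions, in place of the entropic barycenter the abstract names, because the 1D case has a simple closed form.

```python
import numpy as np

def bump(x, center, width=1.0):
    """Normalized Gaussian bump standing in for one attribution map (toy assumption)."""
    g = np.exp(-0.5 * ((x - center) / width) ** 2)
    return g / g.sum()

def pointwise_mean(maps):
    """SmoothGrad-style aggregation: average the maps value by value."""
    return np.mean(maps, axis=0)

def barycenter_1d(x, maps, n_quantiles=2001):
    """Unregularized 1D Wasserstein barycenter via quantile averaging.

    Not the entropic barycenter from the abstract; a simpler stand-in
    that is only valid for 1D maps on a uniform grid.
    """
    dx = x[1] - x[0]
    edges = np.concatenate([x - dx / 2, [x[-1] + dx / 2]])
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    # Inverse CDF of every map, sampled at the same quantile levels.
    inv_cdfs = [np.interp(levels, np.cumsum(m), x) for m in maps]
    mean_inv_cdf = np.mean(inv_cdfs, axis=0)
    # Turn the averaged quantile function back into a histogram on the grid.
    hist, _ = np.histogram(mean_inv_cdf, bins=edges)
    return hist / hist.sum()

x = np.linspace(-10.0, 10.0, 401)
# Three copies of the same feature, displaced rather than amplitude-perturbed.
maps = [bump(x, c) for c in (-2.0, 0.0, 2.0)]
avg = pointwise_mean(maps)
bary = barycenter_1d(x, maps)
print("peak of one map:       ", maps[1].max())
print("peak of pointwise mean:", avg.max())
print("peak of barycenter:    ", bary.max())
```

In this toy setting the pointwise mean spreads the feature over all three positions and lowers its peak, while the barycenter keeps a single bump of roughly the original shape at the average position.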
Parallel Backpropagation for Shared-Feature Visualization
High-level visual brain regions contain subareas in which neurons appear to respond more strongly to examples of a particular semantic category, like faces or bodies, rather than objects. However, recent work has shown that while this finding holds on average, some out-of-category stimuli also activate neurons in these regions.
VC Theory for Inventory Policies
Xie, Yaqi, Ma, Will, Xin, Linwei
Advances in computational power and AI have increased interest in reinforcement learning approaches to inventory management. This paper provides a theoretical foundation for these approaches and investigates the benefits of restricting to policy structures that are well-established by decades of inventory theory. In particular, we prove generalization guarantees for learning several well-known classes of inventory policies, including base-stock and (s, S) policies, by leveraging the celebrated Vapnik-Chervonenkis (VC) theory. We apply the concepts of the Pseudo-dimension and Fat-shattering dimension from VC theory to determine the generalizability of inventory policies, that is, the difference between an inventory policy's performance on training data and its expected performance on unseen data. We focus on a classical setting without contexts, but allow for an arbitrary distribution over demand sequences and do not make any assumptions such as independence over time. We corroborate our supervised learning results using numerical simulations. Managerially, our theory and simulations translate to the following insights. First, there is a principle of "learning less is more" in inventory management: depending on the amount of data available, it may be beneficial to restrict oneself to a simpler, albeit suboptimal, class of inventory policies to minimize overfitting errors. Second, the number of parameters in a policy class may not be the correct measure of overfitting error: in fact, the class of policies defined by T time-varying base-stock levels exhibits a generalization error comparable to that of the two-parameter (s, S) policy class. Finally, our research suggests situations in which it could be beneficial to incorporate the concepts of base-stock and inventory position into black-box learning machines, instead of having these machines directly learn the order quantity actions.
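The two policy classes named in this abstract can be sketched as follows. This is a minimal illustration, not the paper's setup: zero lead time, full backlogging, the cost parameters, the strict `<` reorder convention, and the helpers `average_cost` and `fit_base_stock` are all assumptions introduced here.

```python
def base_stock_order(inventory_position, S):
    """Base-stock policy: order up to level S whenever the position is below it."""
    return max(0.0, S - inventory_position)

def s_S_order(inventory_position, s, S):
    """(s, S) policy: order up to S only when the position drops below s.

    The strict '<' is an assumed convention; some texts use '<='.
    """
    return S - inventory_position if inventory_position < s else 0.0

def average_cost(demands, policy, holding=1.0, backlog=4.0, fixed=0.0):
    """Average per-period cost of a policy on one demand sequence.

    Assumes zero lead time and backlogged demand, so the inventory
    position equals the inventory level. Cost parameters are hypothetical.
    """
    inv, total = 0.0, 0.0
    for d in demands:
        q = policy(inv)
        if q > 0:
            total += fixed
        inv = inv + q - d
        total += holding * max(inv, 0.0) + backlog * max(-inv, 0.0)
    return total / len(demands)

def fit_base_stock(train_sequences, grid):
    """Pick the base-stock level with the lowest average training cost."""
    def train_cost(S):
        costs = [average_cost(seq, lambda pos: base_stock_order(pos, S))
                 for seq in train_sequences]
        return sum(costs) / len(costs)
    return min(grid, key=train_cost)

train = [[3, 7]]
test = [[6, 2]]
best_S = fit_base_stock(train, grid=[3, 5, 7])
train_cost = average_cost(train[0], lambda pos: base_stock_order(pos, best_S))
test_cost = average_cost(test[0], lambda pos: base_stock_order(pos, best_S))
print(best_S, train_cost, test_cost)
```

The difference between `test_cost` and `train_cost` is a one-sample version of the generalization gap the abstract refers to: the level is tuned on the training sequence and then evaluated on a sequence it has not seen.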
Cross-strait Variations on Two Near-synonymous Loanwords xie2shang1 and tan2pan4: A Corpus-based Comparative Study
This study investigates cross-strait variation in two typical near-synonymous loanwords in Chinese, xie2shang1 and tan2pan4, drawing on MARVS theory. Through a comparative analysis, the study finds distributional, eventual, and contextual similarities and differences between Taiwan and Mainland Mandarin. Compared with the underused tan2pan4, xie2shang1 is significantly overused in Taiwan Mandarin, and vice versa in Mainland Mandarin. Additionally, though both words can refer to an inchoative process in Mainland and Taiwan Mandarin, the starting point for xie2shang1 in Mainland Mandarin is somewhat blurred compared with its usage in Taiwan Mandarin. Furthermore, in Taiwan Mandarin, tan2pan4 can be used in economic and diplomatic contexts, while xie2shang1 is used almost exclusively in political contexts. In Mainland Mandarin, however, the two words can be used in a hybrid manner within political contexts; moreover, tan2pan4 is prominently used in diplomatic contexts with less reference to economic activities, while xie2shang1 can be found in both political and legal contexts, emphasizing a role of mediation.
Xin
Most existing cross-domain recommendation algorithms focus on modeling ratings and ignore review texts. Review text, however, contains rich information that can be used to alleviate data sparsity and to interpret transfer patterns. In this paper, we investigate how to use review text to improve cross-domain collaborative filtering models. The challenge lies in the non-linear properties of some transfer patterns. Given this, we extend previous transfer learning models in collaborative filtering from linear mapping functions to non-linear ones, and propose a cross-domain recommendation framework that incorporates review text. Experiments demonstrate that, for new users with sparse feedback, utilizing the review text yields a 10% improvement in the AUC metric, and that the nonlinear method outperforms the linear ones by 4%.
Using artificial intelligence to advance energy technologies
Hongliang Xin, an associate professor of chemical engineering in the College of Engineering, and his collaborators have devised a new artificial intelligence framework that can accelerate discovery of materials for important technologies, such as fuel cells and carbon capture devices. Titled "Infusing theory into deep learning for interpretable reactivity prediction," their paper in the journal Nature Communications details a new approach called TinNet (short for theory-infused neural network) that combines machine-learning algorithms and theories for identifying new catalysts. Catalysts are materials that trigger or speed up chemical reactions. TinNet is based on deep learning, a subfield of machine learning that uses algorithms to mimic how human brains work. The 1996 victory of IBM's Deep Blue computer over world chess champion Garry Kasparov was one of the first advances in machine learning.
Artificial intelligence to advance energy technologies
Hongliang Xin, an associate professor of chemical engineering in the College of Engineering, and his collaborators have devised a new artificial intelligence framework that can accelerate discovery of materials for important technologies, such as fuel cells and carbon capture devices. Titled "Infusing theory into deep learning for interpretable reactivity prediction," their paper in the journal Nature Communications details a new approach called TinNet (short for theory-infused neural network) that combines machine-learning algorithms and theories for identifying new catalysts. Catalysts are materials that trigger or speed up chemical reactions. TinNet is based on deep learning, a subfield of machine learning that uses algorithms to mimic how human brains work. The 1996 victory of IBM's Deep Blue computer over world chess champion Garry Kasparov was one of the first advances in machine learning.
Unlocking the secrets of chemical bonding with machine learning
In a report published in Nature Communications, Hongliang Xin, associate professor of chemical engineering at Virginia Tech, and his team of researchers developed a Bayesian learning model of chemisorption, or Bayeschem for short, aiming to use artificial intelligence to unlock the nature of chemical bonding at catalyst surfaces. "It all comes down to how catalysts bind with molecules," said Xin. "The interaction has to be strong enough to break some chemical bonds at reasonably low temperatures, but not too strong that catalysts would be poisoned by reaction intermediates. This rule is known as the Sabatier principle in catalysis." Understanding how catalysts interact with different intermediates and determining how to control their bond strengths so that they are within that "goldilocks zone" is the key to designing efficient catalytic processes, Xin said. The research provides a tool for that purpose. Bayeschem works using Bayesian learning, a specific machine learning algorithm for inferring models from data.