Regression
Aspects of importance sampling in parameter selection for neural networks using ridgelet transform
The choice of parameters in neural networks is crucial to performance, and an oracle distribution derived from the ridgelet transform enables us to obtain suitable initial parameters. In other words, the distribution of parameters is connected to the integral representation of target functions. The oracle distribution allows us to avoid the conventional backpropagation learning process; in simple cases, a linear regression alone is enough to construct the neural network. This study provides a new look at the oracle distributions and ridgelet transforms, namely as a form of importance sampling. In addition, we propose extensions of the parameter sampling methods. We demonstrate the importance-sampling view and the proposed sampling algorithms via one-dimensional and high-dimensional examples; the results imply that the magnitude of weight parameters could be more crucial than the intercept parameters.
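The importance-sampling reading of this abstract can be sketched in a few lines of math. The sketch assumes the standard integral representation of a shallow network with activation $\sigma$ and hidden parameters $(a, b)$, and it assumes the oracle distribution is proportional to $|T|$, where $T$ is the ridgelet transform of the target $f$. The sampling density $p$, the number of hidden units $J$, and the coefficients $c_j$ are notation introduced here, not taken from the abstract.

```latex
\begin{align}
f(x) &= \int T(a, b)\, \sigma(a \cdot x - b)\, \mathrm{d}a\, \mathrm{d}b
      = \mathbb{E}_{(a, b) \sim p}\!\left[ \frac{T(a, b)}{p(a, b)}\, \sigma(a \cdot x - b) \right] \\
     &\approx \frac{1}{J} \sum_{j=1}^{J} c_j\, \sigma(a_j \cdot x - b_j),
      \qquad (a_j, b_j) \sim p, \quad c_j = \frac{T(a_j, b_j)}{p(a_j, b_j)}.
\end{align}
```

Under the assumption $p \propto |T|$, every $|c_j|$ takes the same value, which is the usual variance-reduction argument for importance sampling. Once the hidden parameters $(a_j, b_j)$ are drawn, the output coefficients can be fitted by linear regression instead of being computed from $T$.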
Score matching through the roof: linear, nonlinear, and latent variables causal discovery
Montagna, Francesco, Faller, Philipp M., Bloebaum, Patrick, Kirschbaum, Elke, Locatello, Francesco
Causal discovery from observational data holds great promise, but existing methods rely on strong assumptions about the underlying causal structure, often requiring full observability of all relevant variables. We tackle these challenges by leveraging the score function $\nabla \log p(X)$ of observed variables for causal discovery and propose the following contributions. First, we generalize the existing results of identifiability with the score to additive noise models with minimal requirements on the causal mechanisms. Second, we establish conditions for inferring causal relations from the score even in the presence of hidden variables; this result is two-faced: we demonstrate the score's potential as an alternative to conditional independence tests to infer the equivalence class of causal graphs with hidden variables, and we provide the necessary conditions for identifying direct causes in latent variable models. Building on these insights, we propose a flexible algorithm for causal discovery across linear, nonlinear, and latent variable models, which we empirically validate.
Achieving interpretable machine learning by functional decomposition of black-box models into explainable predictor effects
Köhler, David, Rügamer, David, Schmid, Matthias
Machine learning (ML) has increased greatly in both popularity and significance, driven by an increase in methods, computing power and data availability [33]. On July 5, 2024, a search on Web of Science for publications including the term "machine learning" yielded more than 350,000 results, corresponding to an average annual increase by more than 20% since 2006. ML models are often characterized by their high generalizability, making them particularly successful when used for supervised learning tasks like classification and risk prediction. In recent years, ML models based on deep artificial neural networks (ANNs) have led to groundbreaking results in the development of high-performing prediction models. The high prediction accuracy of modern ML models is usually achieved by optimizing complex "black-box" architectures with thousands of parameters. As a consequence, they often result in predictions that are difficult, if not impossible, to interpret. This interpretability problem has been hindering the use of ML in fields like medicine, ecology and insurance, where an understanding of the model and its inner workings is paramount to ensure user acceptance and fairness. In a recent environmental study, for example, we explored the use of ML to derive predictions of stream biological condition in the Chesapeake Bay watershed of the mid-Atlantic coast of North America [26]. Clearly, if these predictions are intended to inform future management policies (projecting, e.g., changes in land use, climate and watershed characteristics), they are required to be interpretable in terms of relevant features as well as the directions and strengths of the feature effects.
Artificial Intelligence-based Decision Support Systems for Precision and Digital Health
Deliu, Nina, Chakraborty, Bibhas
Precision health, increasingly supported by digital technologies, is a domain of research that broadens the paradigm of precision medicine, advancing everyday healthcare. This vision goes hand in hand with the groundbreaking advent of artificial intelligence (AI), which is reshaping the way we diagnose, treat, and monitor both clinical subjects and the general population. AI tools powered by machine learning have shown considerable improvements in a variety of healthcare domains. In particular, reinforcement learning (RL) holds great promise for sequential and dynamic problems such as dynamic treatment regimes and just-in-time adaptive interventions in digital health. In this work, we discuss the opportunity offered by AI, more specifically RL, to current trends in healthcare, providing a methodological survey of RL methods in the context of precision and digital health. Focusing on the area of adaptive interventions, we expand the methodological survey with illustrative case studies that used RL in real practice. This invited article has undergone anonymous review and is intended as a book chapter for the volume "Frontiers of Statistics and Data Science" edited by Subhashis Ghoshal and Anindya Roy for the International Indian Statistical Association Series on Statistics and Data Science, published by Springer. It covers the material from a short course titled "Artificial Intelligence in Precision and Digital Health" taught by the author Bibhas Chakraborty at the IISA 2022 Conference, December 26-30 2022, at the Indian Institute of Science, Bengaluru.
Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction
Byambadalai, Undral, Oka, Tatsushi, Yasui, Shota
We propose a novel regression adjustment method designed for estimating distributional treatment effect parameters in randomized experiments. Randomized experiments have been extensively used to estimate treatment effects in various scientific fields. However, to gain deeper insights, it is essential to estimate distributional treatment effects rather than relying solely on average effects. Our approach incorporates pre-treatment covariates into a distributional regression framework, utilizing machine learning techniques to improve the precision of distributional treatment effect estimators. The proposed approach can be readily implemented with off-the-shelf machine learning methods and remains valid as long as the nuisance components are reasonably well estimated. Also, we establish the asymptotic properties of the proposed estimator and present a uniformly valid inference method. Through simulation results and real data analysis, we demonstrate the effectiveness of integrating machine learning techniques in reducing the variance of distributional treatment effect estimators in finite samples.
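A toy illustration of regression adjustment for a distributional treatment effect is sketched below. It assumes one common form of adjustment, which the abstract does not spell out: the arm-specific CDF estimate is the mean residual within the arm plus the mean model prediction over the full sample. The helper names (`fit_stratum_means`, `adjusted_cdf`, `dte`), the within-stratum mean used as a stand-in for a machine learning model, and the eight-unit data set are all hypothetical. The sketch omits cross-fitting and inference.

```python
def fit_stratum_means(x, target):
    """Toy stand-in for an ML model: mean of target within each covariate value."""
    sums, counts = {}, {}
    for xi, ti in zip(x, target):
        sums[xi] = sums.get(xi, 0.0) + ti
        counts[xi] = counts.get(xi, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}


def adjusted_cdf(y, outcomes, covariates, treatments, arm):
    """Regression-adjusted estimate of P(Y(arm) <= y).

    Assumes every covariate value in the sample also occurs inside the arm.
    """
    in_arm = [i for i, w in enumerate(treatments) if w == arm]
    ind = [1.0 if outcomes[i] <= y else 0.0 for i in in_arm]
    x_arm = [covariates[i] for i in in_arm]
    model = fit_stratum_means(x_arm, ind)
    # Mean residual within the arm (zero for in-sample stratum means,
    # generally nonzero for other models).
    residual_mean = sum(d - model[xv] for d, xv in zip(ind, x_arm)) / len(in_arm)
    # Mean prediction over the full sample, treated and control units alike.
    prediction_mean = sum(model[xv] for xv in covariates) / len(covariates)
    return residual_mean + prediction_mean


def dte(y, outcomes, covariates, treatments):
    """Distributional treatment effect at y: difference of adjusted CDFs."""
    treated = adjusted_cdf(y, outcomes, covariates, treatments, 1)
    control = adjusted_cdf(y, outcomes, covariates, treatments, 0)
    return treated - control


# Hypothetical data: a binary covariate that is imbalanced across the arms.
covariates = [0, 0, 0, 0, 1, 1, 1, 1]
treatments = [1, 1, 0, 0, 1, 0, 0, 0]
outcomes = [1.0, 3.0, 2.0, 4.0, 5.0, 6.0, 8.0, 7.0]

effect_at_3_5 = dte(3.5, outcomes, covariates, treatments)
```

In this toy sample the unadjusted difference of empirical CDFs at 3.5 is 2/3 - 1/5, whereas the adjusted estimate reweights each arm to the covariate distribution of the full sample.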
Decoding Digital Influence: The Role of Social Media Behavior in Scientific Stratification Through Logistic Attribution Method
Scientific social stratification is a classic theme in the sociology of science. The deep integration of social media has bridged the gap between scientometrics and sociology of science. This study comprehensively analyzes the impact of social media on scientific stratification and mobility, delving into the complex interplay between academic status and social media activity in the digital age. [Research Method] Innovatively, this paper employs an Explainable Logistic Attribution Analysis from a meso-level perspective to explore the correlation between social media behaviors and scientific social stratification. It examines the impact of scientists' use of social media in the digital age on scientific stratification and mobility, uniquely combining statistical methods with machine learning. This fusion effectively integrates hypothesis testing with a substantive interpretation of the contribution of independent variables to the model. [Research Conclusion] Empirical evidence demonstrates that social media promotes stratification and mobility within the scientific community, revealing a nuanced and non-linear facilitation mechanism. Social media activities positively impact scientists' status within the scientific social hierarchy to a certain extent, but beyond a specific threshold, this impact turns negative. It shows that the advent of social media has opened new channels for academic influence, transcending the limitations of traditional academic publishing, and prompting changes in scientific stratification. Additionally, the study acknowledges the limitations of its experimental design and suggests future research directions.
Explainable AI-based Intrusion Detection System for Industry 5.0: An Overview of the Literature, associated Challenges, the existing Solutions, and Potential Research Directions
Khan, Naseem, Ahmad, Kashif, Tamimi, Aref Al, Alani, Mohammed M., Bermak, Amine, Khalil, Issa
Industry 5.0, which focuses on human and Artificial Intelligence (AI) collaboration for performing different tasks in manufacturing, involves a higher number of robots, Internet of Things (IoT) devices and interconnections, Augmented/Virtual Reality (AR), and other smart devices. The heavy involvement of these devices and interconnections in various critical areas, such as economy, health, education and defense systems, poses several types of potential security flaws. AI itself has proven to be a very effective and powerful tool in different areas of cybersecurity, such as intrusion detection, malware detection, and phishing detection, among others. As in many application areas, cybersecurity professionals were reluctant to accept black-box ML solutions for cybersecurity applications. This reluctance pushed forward the adoption of eXplainable Artificial Intelligence (XAI) as a tool that helps explain how decisions are made in ML-based systems. In this survey, we present a comprehensive study of different XAI-based intrusion detection systems for Industry 5.0, and we also examine the impact of explainability and interpretability on cybersecurity practices through the lens of Adversarial XIDS (Adv-XIDS) approaches. Furthermore, we analyze the possible opportunities and challenges in XAI cybersecurity systems for Industry 5.0 that elicit future research toward XAI-based solutions to be adopted by high-stakes Industry 5.0 applications. We believe this rigorous analysis will establish a foundational framework for subsequent research endeavors within the specified domain.
Causal Inference with Complex Treatments: A Survey
Wang, Yingrong, Li, Haoxuan, Zhu, Minqin, Wu, Anpeng, Xiong, Ruoxuan, Wu, Fei, Kuang, Kun
Causal inference plays an important role in explanatory analysis and decision making across various fields like statistics, marketing, health care, and education. Its main task is to estimate treatment effects and make intervention policies. Most previous works focus on the binary treatment setting, in which a unit either adopts a single treatment or does not. However, in practice, the treatment can be much more complex, encompassing multi-valued, continuous, or bundled options. In this paper, we refer to these as complex treatments and systematically and comprehensively review the causal inference methods for addressing them. First, we formally revisit the problem definition, the basic assumptions, and their possible variations under specific conditions. Second, we sequentially review the related methods for multi-valued, continuous, and bundled treatment settings. In each situation, we tentatively divide the methods into two categories: those conforming to the unconfoundedness assumption and those violating it. Subsequently, we discuss the available datasets and open-source code. Finally, we provide a brief summary of these works and suggest potential directions for future research.
Deep Domain Adaptation Regression for Force Calibration of Optical Tactile Sensors
Chen, Zhuo, Ou, Ni, Jiang, Jiaqi, Luo, Shan
Optical tactile sensors provide robots with rich force information for robot grasping in unstructured environments. The fast and accurate calibration of three-dimensional contact forces holds significance for new sensors and existing tactile sensors which may have incurred damage or aging. However, the conventional neural-network-based force calibration method necessitates a large volume of force-labeled tactile images to minimize force prediction errors, with the need for accurate Force/Torque measurement tools as well as a time-consuming data collection process. To address this challenge, we propose a novel deep domain-adaptation force calibration method, designed to transfer the force prediction ability from a calibrated optical tactile sensor to uncalibrated ones with various combinations of domain gaps, including marker presence, illumination condition, and elastomer modulus. Experimental results show the effectiveness of the proposed unsupervised force calibration method, with lowest force prediction errors of 0.102N (3.4\% in full force range) for normal force, and 0.095N (6.3\%) and 0.062N (4.1\%) for shear forces along the x-axis and y-axis, respectively. This study presents a promising, general force calibration methodology for optical tactile sensors.
Conformal Thresholded Intervals for Efficient Regression
This paper introduces Conformal Thresholded Intervals (CTI), a novel conformal regression method that aims to produce the smallest possible prediction set with guaranteed coverage. Unlike existing methods that rely on a nested conformal framework and full conditional distribution estimation, CTI estimates the conditional probability density for a new response to fall into each interquantile interval using off-the-shelf multi-output quantile regression. CTI constructs prediction sets by thresholding the estimated conditional interquantile intervals based on their length, which is inversely proportional to the estimated probability density. The threshold is determined using a calibration set to ensure marginal coverage. Experimental results demonstrate that CTI achieves optimal performance across various datasets.
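The thresholding and calibration steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the quantile levels are equally spaced (so each interquantile interval carries the same estimated probability) and uses the standard split-conformal rank rule for the threshold. The quantile model `predict_quantiles`, its fixed `OFFSETS`, and the calibration data are hypothetical stand-ins for a fitted multi-output quantile regressor.

```python
import math
from bisect import bisect_left

# Hypothetical quantile model: fixed offsets around x stand in for the
# output of a fitted multi-output quantile regressor.
OFFSETS = [-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0]


def predict_quantiles(x):
    """Return sorted quantile estimates for the response at x (toy model)."""
    return [x + offset for offset in OFFSETS]


def score(quantiles, y):
    """Length of the interquantile interval (q[k-1], q[k]] that contains y."""
    if y <= quantiles[0] or y > quantiles[-1]:
        return math.inf  # y falls outside every estimated interval
    k = bisect_left(quantiles, y)  # smallest k with quantiles[k] >= y
    return quantiles[k] - quantiles[k - 1]


def calibrate_threshold(cal_x, cal_y, alpha):
    """Split-conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    scores = sorted(score(predict_quantiles(x), y) for x, y in zip(cal_x, cal_y))
    rank = math.ceil((len(scores) + 1) * (1 - alpha))
    return scores[rank - 1] if rank <= len(scores) else math.inf


def prediction_set(x, threshold):
    """Keep the interquantile intervals whose length does not exceed the threshold."""
    q = predict_quantiles(x)
    return [(lo, hi) for lo, hi in zip(q, q[1:]) if hi - lo <= threshold]


# Hypothetical calibration data: y = x + residual.
cal_x = [float(i) for i in range(10)]
residuals = [0.1, -0.2, 0.3, -0.4, 0.7, -0.8, 0.2, -0.1, 2.0, 0.4]
cal_y = [x + r for x, r in zip(cal_x, residuals)]

threshold = calibrate_threshold(cal_x, cal_y, alpha=0.2)
region = prediction_set(10.0, threshold)
```

Short intervals correspond to high estimated density, so keeping only intervals below the calibrated length discards the low-density tails first. Adjacent kept intervals could be merged into one; they are left separate here to keep the thresholding step visible.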