Bayesian Learning
Bayesian Constraint Inference from User Demonstrations Based on Margin-Respecting Preference Models
Papadimitriou, Dimitris, Brown, Daniel S.
It is crucial for robots to be aware of the presence of constraints in order to acquire safe policies. However, explicitly specifying all constraints in an environment can be a challenging task. State-of-the-art constraint inference algorithms learn constraints from demonstrations, but tend to be computationally expensive and prone to instability issues. In this paper, we propose a novel Bayesian method that infers constraints based on preferences over demonstrations. The main advantages of our proposed approach are that it 1) infers constraints without calculating a new policy at each iteration, 2) uses a simple and more realistic ranking of groups of demonstrations, without requiring pairwise comparisons over all demonstrations, and 3) adapts to cases where there are varying levels of constraint violation. Our empirical results demonstrate that our proposed Bayesian approach infers constraints of varying severity more accurately than state-of-the-art constraint inference methods.
Scalable Vision-Based 3D Object Detection and Monocular Depth Estimation for Autonomous Driving
This dissertation is a multifaceted contribution to the advancement of vision-based 3D perception technologies. In the first segment, the thesis introduces structural enhancements to both monocular and stereo 3D object detection algorithms. By integrating ground-referenced geometric priors into monocular detection models, this research achieves unparalleled accuracy in benchmark evaluations for monocular 3D detection. Concurrently, the work refines stereo 3D detection paradigms by incorporating insights and inferential structures gleaned from monocular networks, thereby augmenting the operational efficiency of stereo detection systems. The second segment is devoted to data-driven strategies and their real-world applications in 3D vision detection. A novel training regimen is introduced that amalgamates datasets annotated with either 2D or 3D labels. This approach not only augments the detection models through the utilization of a substantially expanded dataset but also facilitates economical model deployment in real-world scenarios where only 2D annotations are readily available. Lastly, the dissertation presents an innovative pipeline tailored for unsupervised depth estimation in autonomous driving contexts. Extensive empirical analyses affirm the robustness and efficacy of this newly proposed pipeline. Collectively, these contributions lay a robust foundation for the widespread adoption of vision-based 3D perception technologies in autonomous driving applications.
Arabic Text Sentiment Analysis: Reinforcing Human-Performed Surveys with Wider Topic Analysis
Almurqren, Latifah, Hodgson, Ryan, Cristea, Alexandra
Sentiment analysis (SA) has been, and is still, a thriving research area. However, the task of Arabic sentiment analysis (ASA) is still underrepresented in the body of research. This study offers the first in-depth and in-breadth analysis of existing ASA studies of textual content and identifies their common themes, domains of application, methods, approaches, technologies and algorithms used. The in-depth study manually analyses 133 ASA papers published in the English language between 2002 and 2020 from four academic databases (SAGE, IEEE, Springer, WILEY) and from Google Scholar. The in-breadth study uses modern, automatic machine learning techniques, such as topic modelling and temporal analysis, on Open Access resources, to reinforce themes and trends identified by the prior study, on 2297 ASA publications between 2010 and 2020. The main findings show the different approaches used for ASA: machine learning, lexicon-based and hybrid approaches. Other findings include ASA 'winning' algorithms (SVM, NB, hybrid methods). Deep learning methods, such as LSTM, can provide higher accuracy, but for ASA the corpora are sometimes not large enough to support them. Additionally, whilst there are some ASA corpora and lexicons, more are required. Specifically, Arabic tweet corpora and datasets are currently only moderately sized. Moreover, Arabic lexicons that have high coverage contain only Modern Standard Arabic (MSA) words, and those with Arabic dialects are quite small. Thus, new corpora need to be created. ASA tools, in turn, are severely lacking. There is a need to develop ASA tools that can be used in industry, as well as in academia, for Arabic text SA. Hence, our study offers insights into the challenges associated with ASA research, such as the lack of Dialectal Arabic resources, Arabic tweet corpora and data sets for SA, and provides suggestions for ways to move the field forward.
Machine and deep learning methods for predicting 3D genome organization
Wall, Brydon P. G., Nguyen, My, Harrell, J. Chuck, Dozmorov, Mikhail G.
Three-dimensional (3D) chromatin interactions, such as enhancer-promoter interactions (EPIs), loops, Topologically Associating Domains (TADs), and A/B compartments, play critical roles in a wide range of cellular processes by regulating gene expression. Recent development of chromatin conformation capture technologies has enabled genome-wide profiling of various 3D structures, even with single cells. However, current catalogs of 3D structures remain incomplete and unreliable due to differences in technology, tools, and low data resolution. Machine learning methods have emerged as an alternative to obtain missing 3D interactions and/or improve resolution. Such methods frequently use genome annotation data (ChIP-seq, DNase-seq, etc.), DNA sequencing information (k-mers, Transcription Factor Binding Site (TFBS) motifs), and other genomic properties to learn the associations between genomic features and chromatin interactions. In this review, we discuss computational tools for predicting three types of 3D interactions (EPIs, chromatin interactions, TAD boundaries) and analyze their pros and cons. We also point out obstacles to computational prediction of 3D interactions and suggest future research directions.
Applied Causal Inference Powered by ML and AI
Chernozhukov, Victor, Hansen, Christian, Kallus, Nathan, Spindler, Martin, Syrgkanis, Vasilis
This book aims to provide a working introduction to the emerging fusion of modern statistical inference - aka machine learning (ML) or artificial intelligence (AI) - and causal inference methods. The book is aimed at upper level undergraduates and master's-level students as well as doctoral students focusing on applied empirical research. A sufficient background for the core material is one semester of introductory econometrics and one semester of machine learning. We hope the book is also useful to empirical researchers looking to apply modern methods in their work. The book provides an overview of key ideas in both predictive inference and causal inference and shows how predictive tools are key ingredients to answering many causal questions.
Error bounds for particle gradient descent, and extensions of the log-Sobolev and Talagrand inequalities
Caprio, Rocco, Kuntz, Juan, Power, Samuel, Johansen, Adam M.
We prove non-asymptotic error bounds for particle gradient descent (PGD) (Kuntz et al., 2023), a recently introduced algorithm for maximum likelihood estimation of large latent variable models obtained by discretizing a gradient flow of the free energy. We begin by showing that, for models satisfying a condition generalizing both the log-Sobolev and the Polyak–Łojasiewicz inequalities (LSI and PŁI, respectively), the flow converges exponentially fast to the set of minimizers of the free energy. We achieve this by extending a result well-known in the optimal transport literature (that the LSI implies the Talagrand inequality) and its counterpart in the optimization literature (that the PŁI implies the so-called quadratic growth condition), and applying it to our new setting. We also generalize the Bakry–Émery Theorem and show that the LSI/PŁI generalization holds for models with strongly concave log-likelihoods. For such models, we further control PGD's discretization error, obtaining non-asymptotic error bounds. While we are motivated by the study of PGD, we believe that the inequalities and results we extend may be of independent interest.
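To make the algorithm under study concrete, here is a minimal sketch of a PGD-style iteration on a toy model. The abstract does not state the update rule, so the form used here (a gradient step on the parameter averaged over particles, plus a Langevin step on each particle) is an assumption based on the usual description of PGD, and the toy model, the function name `pgd_toy`, and all step sizes are hypothetical choices for illustration. The model is x ~ N(theta, 1), y | x ~ N(x, 1), whose log-likelihood is strongly concave and whose maximum likelihood estimate is theta = y.

```python
import numpy as np


def pgd_toy(y, n_particles=200, step=0.05, n_steps=2000, seed=0):
    """PGD-style iteration for the toy model x ~ N(theta, 1), y | x ~ N(x, 1).

    log p_theta(x, y) = -(x - theta)^2 / 2 - (y - x)^2 / 2 + const.
    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    theta = 0.0
    x = rng.standard_normal(n_particles)  # initial particle cloud
    for _ in range(n_steps):
        # Gradient of log p in theta, averaged over the particles.
        grad_theta = np.mean(x - theta)
        # Gradient of log p in x, evaluated at each particle.
        grad_x = -(x - theta) + (y - x)
        # Both updates use the previous iterate (theta, x).
        noise = rng.standard_normal(n_particles)
        x = x + step * grad_x + np.sqrt(2.0 * step) * noise
        theta = theta + step * grad_theta
    return theta, x


theta_hat, particles = pgd_toy(y=3.0)
print(f"estimated theta: {theta_hat:.3f} (marginal MLE is y = 3.0)")
```

With a finite number of particles the estimate keeps fluctuating around the maximizer rather than settling on it exactly, which is the kind of error the non-asymptotic bounds in the abstract are meant to quantify.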
Deep Horseshoe Gaussian Processes
Castillo, Ismaël, Randrianarisoa, Thibault
Deep Gaussian processes have recently been proposed as natural objects to fit, similarly to deep neural networks, possibly complex features present in modern data samples, such as compositional structures. Adopting a Bayesian nonparametric approach, it is natural to use deep Gaussian processes as prior distributions, and use the corresponding posterior distributions for statistical inference. We introduce the deep Horseshoe Gaussian process (Deep-HGP), a new simple prior based on deep Gaussian processes with a squared-exponential kernel, that in particular enables data-driven choices of the key lengthscale parameters. For nonparametric regression with random design, we show that the associated tempered posterior distribution recovers the unknown true regression curve optimally in terms of quadratic loss, up to a logarithmic factor, in an adaptive way. The convergence rates are simultaneously adaptive to both the smoothness of the regression function and to its structure in terms of compositions. The dependence of the rates on the dimension is explicit, allowing in particular for input spaces of dimension increasing with the number of observations.
Supplementary Materials of "BAST: Bayesian Additive Regression Spanning Trees for Complex Constrained Domain"
These appendices provide supplementary details and results of BAST. Appendix A contains additional details on Bayesian estimation and prediction. Supplementary simulation details and results including hyperparameter tuning and computation time can be found in Appendix B. Finally, Appendix C provides the proof of Proposition 1. Appendix A.1 Estimation: This appendix provides details on the Markov chain Monte Carlo (MCMC) algorithm discussed in Section 3.1. This probability specification works well in our experiments, but one can modify it if desired. Appendix A.2 Prediction in Two-dimensional Constrained Domains: In this subsection we provide details on specifying the neighbor set N. To sample the cluster membership of u, we need to determine the cluster memberships for vertices on the domain boundary, which can be done by, for instance, assigning a boundary vertex to the same cluster as its nearest vertex in S with respect to the graph distance in the CDT mesh (when the number of vertices in the CDT graph is large, we expect this to approximate the geodesic distance well).
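The nearest-vertex rule described for boundary vertices can be sketched as a multi-source shortest-path search. This is only an illustration under stated assumptions: the graph is given as an adjacency dictionary with edge lengths as weights, `labels` holds the known cluster ids of the vertices in S, and the function name `assign_by_nearest_labeled` is hypothetical. The supplementary text does not specify how the graph distance is computed, so Dijkstra's algorithm is a swapped-in choice here.

```python
import heapq


def assign_by_nearest_labeled(adjacency, labels):
    """Give every reachable vertex the cluster of its nearest labeled vertex.

    adjacency: dict mapping vertex -> list of (neighbor, edge_length) pairs.
    labels:    dict mapping each vertex in S -> cluster id.
    Runs Dijkstra's algorithm started from all labeled vertices at once.
    """
    dist = {v: 0.0 for v in labels}
    cluster = dict(labels)
    heap = [(0.0, v) for v in labels]
    heapq.heapify(heap)
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for w, length in adjacency.get(v, []):
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                cluster[w] = cluster[v]  # inherit the cluster of the nearest source
                heapq.heappush(heap, (nd, w))
    return cluster


# Toy path graph a - b - c - d - e; "a" and "e" play the role of S.
adjacency = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 1.0)],
    "d": [("c", 1.0), ("e", 0.5)],
    "e": [("d", 0.5)],
}
result = assign_by_nearest_labeled(adjacency, {"a": 0, "e": 1})
print(result)
```

In this toy graph, vertex "b" is closest to "a" and takes cluster 0, while "c" and "d" are closer to "e" and take cluster 1.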
Statistical Mechanics of Dynamical System Identification
Klishin, Andrei A., Bakarji, Joseph, Kutz, J. Nathan, Manohar, Krithika
Recovering dynamical equations from observed noisy data is the central challenge of system identification. We develop a statistical mechanical approach to analyze sparse equation discovery algorithms, which typically balance data fit and parsimony through a trial-and-error selection of hyperparameters. In this framework, statistical mechanics offers tools to analyze the interplay between complexity and fitness, in analogy to that done between entropy and energy. To establish this analogy, we define the optimization procedure as a two-level Bayesian inference problem that separates variable selection from coefficient values and enables the computation of the posterior parameter distribution in closed form. A key advantage of employing statistical mechanical concepts, such as free energy and the partition function, is in the quantification of uncertainty, especially in the low-data limit, which is frequently encountered in real-world applications. As the data volume increases, our approach mirrors the thermodynamic limit, leading to distinct sparsity- and noise-induced phase transitions that delineate correct from incorrect identification. This perspective on sparse equation discovery is versatile and can be adapted to various other equation discovery algorithms.