Directed Networks
Semi-supervised Logistic Learning Based on Exponential Tilt Mixture Models
Consider semi-supervised learning for classification, where both labeled and unlabeled data are available for training. The goal is to exploit both datasets to achieve higher prediction accuracy than using labeled data alone. We develop a semi-supervised logistic learning method based on exponential tilt mixture models, by extending a statistical equivalence between logistic regression and exponential tilt modeling. We study maximum nonparametric likelihood estimation and derive novel objective functions which are shown to be Fisher consistent. We also propose regularized estimation and construct simple and highly interpretable EM algorithms. Finally, we present numerical results which demonstrate the advantage of the proposed methods compared with existing methods.
TitAnt: Online Real-time Transaction Fraud Detection in Ant Financial
Cao, Shaosheng, Yang, Xinxing, Chen, Cen, Zhou, Jun, Li, Xiaolong, Qi, Yuan
With the explosive growth of e-commerce and the boom in e-payment, detecting online transaction fraud in real time has become increasingly important to Fintech businesses. To tackle this problem, we introduce TitAnt, a transaction fraud detection system deployed in Ant Financial, one of the largest Fintech companies in the world. The system is able to predict online real-time transaction fraud in mere milliseconds. We present the problem definition, feature extraction, detection methods, implementation and deployment of the system, as well as empirical effectiveness. Extensive experiments have been conducted on large real-world transaction data to show the effectiveness and the efficiency of the proposed system.
Uncovering Probabilistic Implications in Typological Knowledge Bases
Bjerva, Johannes, Kementchedjhieva, Yova, Cotterell, Ryan, Augenstein, Isabelle
The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages with object-verb word ordering tend to have post-positions. Uncovering such implications typically amounts to time-consuming manual processing by trained and experienced linguists, which potentially leaves key linguistic universals unexplored. In this paper, we present a computational model which successfully identifies known universals, including Greenberg universals, but also uncovers new ones, worthy of further linguistic investigation. Our approach outperforms baselines previously used for this problem, as well as a strong baseline from knowledge base population.
Introduction to Bayesian Modeling with PyMC3 - Dr. Juan Camilo Orduz
We can also see this visually. We can verify the convergence of the chains formally using the Gelman-Rubin test. Values close to 1.0 indicate convergence. We can also test for correlation between samples in the chains. We are aiming for zero autocorrelation to get "random" samples from the posterior distribution.
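The two diagnostics mentioned above can be sketched directly in NumPy. This is a minimal illustration, not PyMC3's own implementation: the function names `gelman_rubin` and `autocorr` are hypothetical, the chains are simulated rather than sampled from a model, and the statistic is the classic (non-split) Gelman-Rubin ratio.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic Gelman-Rubin statistic for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    # W: mean of the within-chain variances.
    W = chains.var(axis=1, ddof=1).mean()
    # B: between-chain variance, scaled by the number of draws per chain.
    B = n * chains.mean(axis=1).var(ddof=1)
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * W + B / n
    return float(np.sqrt(var_hat / W))

def autocorr(x, lag):
    """Sample autocorrelation of a single chain at a given positive lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

# Simulated "chains" (an assumption for illustration only).
rng = np.random.default_rng(0)
good = rng.normal(size=(4, 2000))                       # four chains, same target
bad = good + np.array([[0.0], [1.0], [2.0], [3.0]])     # chains stuck at different means

r_good = gelman_rubin(good)   # close to 1.0: chains agree
r_bad = gelman_rubin(bad)     # well above 1.0: chains disagree
ac1 = autocorr(good[0], 1)    # near 0 for independent draws
```

In practice one would normally rely on the diagnostics shipped with the sampling library rather than hand-rolled versions; the sketch only shows what the numbers measure.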
Analyses of Multi-collection Corpora via Compound Topic Modeling
George, Clint P., Xia, Wei, Michailidis, George
As electronically stored data grow in daily life, obtaining novel and relevant information becomes challenging in text mining. Thus people have sought statistical methods based on term frequency, matrix algebra, or topic modeling for text mining. Popular topic models have centered on a single text collection, which is inadequate for comparative text analyses. We consider a setting where one can partition the corpus into subcollections. Each subcollection shares a common set of topics, but there exists relative variation in topic proportions among collections. Incorporating any prior knowledge about the corpus (e.g. organization structure), we propose the compound latent Dirichlet allocation (cLDA) model, improving on previous work, encouraging generalizability, and depending less on user-input parameters. To identify the parameters of interest in cLDA, we study Markov chain Monte Carlo (MCMC) and variational inference approaches extensively, and suggest an efficient MCMC method. We evaluate cLDA qualitatively and quantitatively using both synthetic and real-world corpora. The usability study on some real-world corpora illustrates the superiority of cLDA in not only exploring the underlying topics automatically but also modeling their connections and variations across multiple collections.
Replacing the do-calculus with Bayes rule
Lattimore, Finnian, Rohde, David
The concept of causality has a controversial history. The question of whether it is possible to represent and address causal problems with probability theory, or whether fundamentally new mathematics such as the do-calculus is required, has been hotly debated; for example, Pearl (2001) states "the building blocks of our scientific and everyday knowledge are elementary facts such as 'mud does not cause rain' and 'symptoms do not cause disease' and those facts, strangely enough, cannot be expressed in the vocabulary of probability calculus". This has led to a dichotomy between advocates of causal graphical modeling and the do-calculus, and researchers applying Bayesian methods. In this paper we demonstrate that, while it is critical to explicitly model our assumptions on the impact of intervening in a system, provided we do so, estimating causal effects can be done entirely within the standard Bayesian paradigm. The invariance assumptions underlying causal graphical models can be encoded in ordinary probabilistic graphical models, allowing causal estimation with Bayesian statistics, equivalent to the do-calculus. Elucidating the connections between these approaches is a key step toward enabling the insights provided by each to be combined to solve real problems.
A Bayesian Solution to the M-Bias Problem
When using regression-type models to infer causal effects, it is commonly the case that inferring the correct causal relationship requires extra covariates to be included, or "adjusted for". Without this adjustment, erroneous causal effects can be inferred. Given this phenomenon, it is common practice to include as many covariates as possible; however, such advice comes unstuck in the presence of M-bias. M-bias is a problem in causal inference where the correct estimation of treatment effects requires that certain variables are not adjusted for, i.e. are simply left out of the model. This issue caused a storm of controversy in 2009 when Rubin, Pearl and others disagreed about whether it could be problematic to include additional variables in models when inferring causal effects. This paper makes two contributions to this issue. Firstly, we provide a Bayesian solution to the M-bias problem. The solution replicates Pearl's solution, but, consistent with Rubin's advice, we condition on all variables. Secondly, the fact that we are able to offer a solution to this problem in Bayesian terms shows that it is indeed possible to represent causal relationships within the Bayesian paradigm, albeit in an extended space. We make several remarks on the similarities and differences between causal graphical models which implement the do-calculus and probabilistic graphical models which enable Bayesian statistics. We hope this work will stimulate more research on unifying Pearl's causal calculus using causal graphical models with traditional Bayesian statistics and probabilistic graphical models.
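The M-bias phenomenon described above can be illustrated with a small simulation. This is a sketch under assumed settings, not the paper's Bayesian solution: the linear-Gaussian structure, the variable names (`u1`, `u2`, `m`, `x`, `y`), the true effect of 1.0, and the use of ordinary least squares are all choices made here for illustration. Two unobserved causes `u1` and `u2` both influence a covariate `m`, while `u1` also drives the treatment `x` and `u2` also drives the outcome `y`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Unobserved causes (assumed standard normal for illustration).
u1 = rng.normal(size=n)
u2 = rng.normal(size=n)

# M-structure: u1 -> x, u1 -> m <- u2, u2 -> y, and x -> y.
x = u1 + rng.normal(size=n)
m = u1 + u2 + rng.normal(size=n)
true_effect = 1.0
y = true_effect * x + u2 + rng.normal(size=n)

def ols(target, *covariates):
    """Least-squares coefficients (intercept first) of target on the covariates."""
    design = np.column_stack([np.ones_like(target), *covariates])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

# Coefficient on x without adjusting for m: recovers the true effect.
unadjusted = ols(y, x)[1]

# Coefficient on x after adjusting for m: conditioning on the collider m
# opens the path x <- u1 -> m <- u2 -> y and biases the estimate.
adjusted = ols(y, x, m)[1]
```

Under these particular variances the adjusted coefficient converges to 0.8 rather than 1.0, which shows how including `m` in a naive regression distorts the estimate even though `m` is measured before treatment.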
A Survey of Optimization Methods from a Machine Learning Perspective
Sun, Shiliang, Cao, Zehui, Zhu, Han, Zhao, Jing
Machine learning is developing rapidly; it has made many theoretical breakthroughs and is widely applied in various fields. Optimization, as an important part of machine learning, has attracted much attention from researchers. With the exponential growth in the amount of data and the increase in model complexity, optimization methods in machine learning face more and more challenges. A great deal of work on solving optimization problems or improving optimization methods in machine learning has been proposed. A systematic review and summary of optimization methods from the perspective of machine learning is of great significance, as it can offer guidance for developments in both optimization and machine learning research. In this paper, we first describe the optimization problems in machine learning. Then, we introduce the principles and progress of commonly used optimization methods. Next, we summarize the applications and developments of optimization methods in some popular machine learning fields. Finally, we explore and present some challenges and open problems for optimization in machine learning.
From Incomplete, Dynamic Data to Bayesian Networks
Bayesian networks are a versatile and powerful tool to model complex phenomena and the interplay of their components in a probabilistically principled way. Moving beyond the comparatively simple case of completely observed, static data, which has received the most attention in the literature, in this paper we will review how Bayesian networks can model dynamic data and data with incomplete observations. Such data are the norm at the forefront of research and applications, and Bayesian networks are uniquely positioned to model them due to their explainability and interpretability.
Artificial Intelligence Made Easy with H2O.ai
If you're anything like my dad, you've worked in IT for decades but have only tangentially touched data science. Now, your new C-something-O wants you to fire up a data analytics team and work with a new set of buzzwords you've only vaguely heard about at conferences. Or perhaps you're a developer at a fast-moving startup and have spent weeks finalizing an algorithm, only to be stymied by issues with deploying the model onto your web application for real-time use. In both cases, H2O.ai is definitely a solution worth looking into. H2O.ai positions itself as a software package that streamlines the machine learning process through its open source package H2O and AutoML.