 Diagnosis



Causal programming: inference with structural causal models as finding instances of a relation

arXiv.org Artificial Intelligence

This paper proposes a causal inference relation and causal programming as general frameworks for causal inference with structural causal models. A tuple, $\langle M, I, Q, F \rangle$, is an instance of the relation if a formula, $F$, computes a causal query, $Q$, as a function of known population probabilities, $I$, in every model entailed by a set of model assumptions, $M$. Many problems in causal inference can be viewed as the problem of enumerating instances of the relation that satisfy given criteria. This unifies a number of previously studied problems, including causal effect identification, causal discovery and recovery from selection bias. In addition, the relation supports formalizing new problems in causal inference with structural causal models, such as the problem of research design. Causal programming is proposed as a further generalization of causal inference as the problem of finding optimal instances of the relation, with respect to a cost function.
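As an illustration of one instance of the relation, the sketch below uses the well-known back-door adjustment. It is not taken from the paper. The model assumptions $M$ (a graph with Z -> X, Z -> Y, X -> Y), the numeric probabilities, and all function names are assumptions chosen for the example. The known probabilities $I$ are the observational joint distribution, the query $Q$ is $P(Y=1 \mid do(X=1))$, and the formula $F$ is $\sum_z P(y \mid x, z) P(z)$.

```python
from itertools import product

# Hypothetical parameters of a model with Z -> X, Z -> Y, X -> Y (all binary).
P_Z = {0: 0.6, 1: 0.4}
P_X_GIVEN_Z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}  # indexed [z][x]
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}  # key (x, z)


def joint(z, x, y):
    """Observational probability P(z, x, y) implied by the assumed model."""
    p_y1 = P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * P_X_GIVEN_Z[z][x] * (p_y1 if y == 1 else 1.0 - p_y1)


# I: the known population probabilities (the full observational joint).
known_probs = {(z, x, y): joint(z, x, y) for z, x, y in product((0, 1), repeat=3)}


def prob(table, **fixed):
    """Probability of the event that fixes some of z, x, y to given values."""
    total = 0.0
    for (z, x, y), p in table.items():
        values = {"z": z, "x": x, "y": y}
        if all(values[name] == value for name, value in fixed.items()):
            total += p
    return total


def backdoor_formula(table, x, y):
    """F: computes P(y | do(x)) as sum_z P(y | x, z) * P(z), using only I."""
    total = 0.0
    for z in (0, 1):
        p_y_given_xz = prob(table, z=z, x=x, y=y) / prob(table, z=z, x=x)
        total += p_y_given_xz * prob(table, z=z)
    return total


causal_effect = backdoor_formula(known_probs, x=1, y=1)
conditional = prob(known_probs, x=1, y=1) / prob(known_probs, x=1)
print(f"P(Y=1 | do(X=1)) = {causal_effect:.2f}")
print(f"P(Y=1 | X=1)     = {conditional:.2f}")
```

With these assumed numbers the adjusted quantity and the plain conditional probability differ, which is why the formula $F$ has to be justified by the model assumptions $M$ rather than read off the data alone.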


Decision Tree Classification models to predict employee turnover

#artificialintelligence

In this project I have attempted to create supervised learning models to assist in classifying certain employee data. I pre-processed the data by removing one outlier and producing new features in Excel as the data set was small at 1056 rows. Some categorical features were also converted to numeric values in Excel. For example, Gender was originally "M" or "F", which was converted to 0 and 1 respectively. I also removed employee number as it provides no value as a feature and could compromise privacy.
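The author did these steps in Excel. A minimal Python sketch of the same two steps (encoding Gender and dropping the employee number) might look like the following. The column names `EmployeeNumber`, `Gender`, and `Age` and the sample rows are hypothetical, not taken from the original data set.

```python
# Hypothetical sample rows; the real data set has 1056 rows and other columns.
rows = [
    {"EmployeeNumber": 101, "Gender": "M", "Age": 34},
    {"EmployeeNumber": 102, "Gender": "F", "Age": 29},
]

# Encoding described in the text: "M" -> 0, "F" -> 1.
GENDER_CODES = {"M": 0, "F": 1}


def preprocess(records):
    """Drop the identifier column and encode Gender as a number."""
    cleaned = []
    for record in records:
        # Remove the employee number: no predictive value, and a privacy risk.
        out = {key: value for key, value in record.items() if key != "EmployeeNumber"}
        out["Gender"] = GENDER_CODES[out["Gender"]]
        cleaned.append(out)
    return cleaned


print(preprocess(rows))
```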


Computer-aided diagnosis prior to conventional interpretation of prostate mpMRI: an international multi-reader study

#artificialintelligence

Nine radiologists (three each with high, intermediate, and low experience) from eight institutions participated. A total of 163 patients who underwent 3-T mpMRI between 4/2012 and 6/2015 were included: 110 cancer patients who had a prostatectomy after mpMRI, and 53 patients with no lesions on mpMRI and a negative TRUS-guided biopsy. Readers were blinded to all outcomes and detected lesions per PI-RADSv2 on mpMRI. After 5 weeks, readers re-evaluated patients using CAD to detect lesions. Prostatectomy specimens registered to MRI served as ground truth, with index lesions defined on pathology.


Deep Transfer Network with Joint Distribution Adaptation: A New Intelligent Fault Diagnosis Framework for Industry Application

arXiv.org Machine Learning

In recent years, deep learning models have become increasingly popular for intelligent condition monitoring, diagnosis, and prognostics of mechanical systems and structures. Previous studies, however, accept by default a major assumption: that the training and testing data are drawn from the same feature distribution. Unfortunately, this assumption is mostly invalid in real applications, which limits the applicability of traditional diagnosis approaches. Inspired by the idea of transfer learning, which leverages the knowledge learnt from rich labeled data in a source domain to facilitate diagnosing a new but similar target task, this paper proposes a new intelligent fault diagnosis framework, the deep transfer network (DTN), which generalizes the deep learning model to the domain adaptation scenario. By extending marginal distribution adaptation (MDA) to joint distribution adaptation (JDA), the proposed framework can exploit the discrimination structures associated with the labeled data in the source domain to adapt the conditional distribution of unlabeled target data, and thus guarantee a more accurate distribution matching. Extensive empirical evaluations on three fault datasets validate the applicability and practicability of DTN, which achieves many state-of-the-art transfer results across diverse operating conditions, fault severities and fault types.
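The difference between marginal and joint distribution adaptation can be sketched with a toy discrepancy measure. The abstract does not say which measure DTN uses, so this sketch assumes the squared distance between feature means (a linear-kernel maximum mean discrepancy), and assumes pseudo-labels stand in for the missing target labels. All function names and data are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


def squared_distance(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def marginal_discrepancy(src_x, tgt_x):
    """Mismatch between the overall feature means of the two domains."""
    return squared_distance(mean_vector(src_x), mean_vector(tgt_x))


def conditional_discrepancy(src_x, src_y, tgt_x, tgt_pseudo_y):
    """Sum of per-class mean mismatches, using pseudo-labels on the target."""
    total = 0.0
    for label in sorted(set(src_y)):
        src_rows = [x for x, y in zip(src_x, src_y) if y == label]
        tgt_rows = [x for x, y in zip(tgt_x, tgt_pseudo_y) if y == label]
        if src_rows and tgt_rows:  # skip classes missing from either domain
            total += squared_distance(mean_vector(src_rows), mean_vector(tgt_rows))
    return total


def joint_discrepancy(src_x, src_y, tgt_x, tgt_pseudo_y):
    """Marginal term plus class-conditional term."""
    return marginal_discrepancy(src_x, tgt_x) + conditional_discrepancy(
        src_x, src_y, tgt_x, tgt_pseudo_y
    )


# Toy data: the overall means agree, but the class-wise means are swapped.
source_x = [[0.0, 0.0], [2.0, 2.0]]
source_y = [0, 1]
target_x = [[2.0, 2.0], [0.0, 0.0]]
target_pseudo_y = [0, 1]

print(marginal_discrepancy(source_x, target_x))
print(joint_discrepancy(source_x, source_y, target_x, target_pseudo_y))
```

In this toy case the marginal term is zero even though every class is misaligned, which is the kind of mismatch a class-conditional term is meant to expose.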


Asynchronous Parallel Sampling Gradient Boosting Decision Tree

arXiv.org Machine Learning

With the development of big data technology, the Gradient Boosting Decision Tree (GBDT) has become one of the most important machine learning algorithms because of its accurate output. However, training a GBDT requires a lot of computational resources and time. To accelerate GBDT training, this paper proposes the asynchronous parallel sampling gradient boosting decision tree (asynch-SGBDT). By introducing sampling, we turn the numerical optimization process of traditional GBDT training into a stochastic optimization process, and we use asynchronous parallel stochastic gradient descent to accelerate training. We also provide a theoretical analysis of asynch-SGBDT. Experimental results show that asynch-SGBDT accelerates GBDT training. Our asynchronous parallel strategy achieves an almost linear speedup, especially for high-dimensional sparse datasets.
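The sampling idea can be sketched on a small scale. The code below is a sequential illustration only and does not reproduce the paper's asynchronous parallel scheme. It assumes squared loss, depth-one trees (stumps) on a single feature, and uniform row subsampling without replacement. All names and parameter values are hypothetical.

```python
import random


def fit_stump(xs, residuals):
    """Least-squares one-split regression stump on a single feature."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # every sampled x was identical: use a constant
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_sampled_gbdt(xs, ys, rounds=50, learning_rate=0.1, sample_rate=0.5, seed=0):
    """Gradient boosting where each tree sees only a random subsample."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    sample_size = min(len(xs), max(2, int(sample_rate * len(xs))))
    for _ in range(rounds):
        chosen = rng.sample(range(len(xs)), sample_size)
        sub_x = [xs[i] for i in chosen]
        # Negative gradient of squared loss is the residual y - prediction.
        sub_residuals = [ys[i] - preds[i] for i in chosen]
        stump = fit_stump(sub_x, sub_residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy step-function data.
train_x = [float(i) for i in range(10)]
train_y = [0.0 if x < 5 else 1.0 for x in train_x]
model = fit_sampled_gbdt(train_x, train_y)
print(mean_squared_error(model, train_x, train_y))
```

Because each round's gradient estimate comes from a random subset of rows, the update is a stochastic gradient step in function space, which is the property the abstract relies on to apply asynchronous parallel stochastic gradient descent.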


What Is A Decision Tree Algorithm? – SeattleDataGuy – Medium

#artificialintelligence

Guest written by Rebecca Njeri! What is a Decision Tree? Let's start with a story. Suppose you have a business and you want to acquire some new customers. You also have a limited budget, and you want to ensure that, in advertising, you focus on customers who are the most likely to be converted.


From Random Differential Equations to Structural Causal Models: the stochastic case

arXiv.org Machine Learning

Random Differential Equations provide a natural extension of Ordinary Differential Equations to the stochastic setting. We show how, and under which conditions, every equilibrium state of a Random Differential Equation (RDE) can be described by a Structural Causal Model (SCM), while preserving the causal semantics. This provides an SCM that captures the stochastic and causal behavior of the RDE, which can model both cycles and confounders. This enables the study of the equilibrium states of the RDE by applying the theory and statistical tools available for SCMs, for example, marginalizations and Markov properties, as we illustrate by means of an example. Our work thus provides a direct connection between two fields that so far have been developing in isolation.


A Survey on Application of Machine Learning Techniques in Optical Networks

arXiv.org Machine Learning

Today, the amount of data that can be retrieved from communications networks is extremely large and diverse (e.g., data regarding users' behavior, traffic traces, network alarms, signal quality indicators, etc.). Advanced mathematical tools are required to extract useful information from this large set of network data. In particular, Machine Learning (ML) is regarded as a promising methodological area to perform network-data analysis and enable, e.g., automated network self-configuration and fault management. In this survey we classify and describe relevant studies dealing with the applications of ML to optical communications and networking. Optical networks and systems are facing unprecedented growth in complexity due to the introduction of a huge number of adjustable parameters (such as routing configurations, modulation format, symbol rate, coding schemes, etc.), mainly due to the adoption of, among others, coherent transmission/reception technology and advanced digital signal processing, and to the presence of nonlinear effects in optical fiber systems. Although a good number of research papers have appeared in recent years, the application of ML to optical networks is still in its early stages. In this survey we provide an introductory reference for researchers and practitioners interested in this field. To stimulate further work in this area, we conclude the paper by proposing possible new research directions.


Compare outlier detection methods with the OutliersO3 package

#artificialintelligence

There are many different methods for identifying outliers and a lot of them are available in R. But are outliers a matter of opinion? Do all methods give the same results? Articles on outlier methods use a mixture of theory and practice. Theory is all very well, but outliers are outliers because they don't follow theory. Practice involves testing methods on data, sometimes with data simulated based on theory, better with 'real' datasets.
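Whether all methods give the same results can be checked on a small example. The sketch below is written in Python rather than R and does not use the OutliersO3 package. It compares two common rules, a z-score cutoff and Tukey's interquartile-range fences, on hypothetical data. The thresholds 3.0 and 1.5 are conventional defaults, assumed here for illustration.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / spread > threshold]


def iqr_outliers(values, k=1.5):
    """Flag values outside Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical data with one value far from the rest.
data = [10, 11, 12, 12, 13, 13, 14, 15, 40]

print("z-score rule:", zscore_outliers(data))
print("IQR rule:    ", iqr_outliers(data))
```

On this data the two rules disagree: the extreme value inflates the standard deviation enough to stay under the z-score cutoff, while the quartile-based fences still flag it.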