Diagnosis of chronic diseases and assistance in medical decisions is based on machine learning algorithms. In this paper, we review the classification algorithms used in the health care system (chronic diseases) and present the neural network-based Ensemble learning method. We briefly describe the commonly used algorithms and describe their critical properties. Materials and Methods: In this study, modern classification algorithms used in healthcare, examine the principles of these methods and guidelines, and to accurately diagnose and predict chronic diseases, superior machine learning algorithms with the neural network-based ensemble learning Is used. To do this, we use experimental data, real data on chronic patients (diabetes, heart, cancer) available on the UCI site. Results: We found that group algorithms designed to diagnose chronic diseases can be more effective than baseline algorithms. It also identifies several challenges to further advancing the classification of machine learning in the diagnosis of chronic diseases. Conclusion: The results show the high performance of the neural network-based Ensemble learning approach for the diagnosis and prediction of chronic diseases, which in this study reached 98.5, 99, and 100% accuracy, respectively.
Time series data is a collection of chronological observations which is generated by several domains such as medical and financial fields. Over the years, different tasks such as classification, forecasting, and clustering have been proposed to analyze this type of data. Time series data has been also used to study the effect of interventions over time. Moreover, in many fields of science, learning the causal structure of dynamic systems and time series data is considered an interesting task which plays an important role in scientific discoveries. Estimating the effect of an intervention and identifying the causal relations from the data can be performed via causal inference. Existing surveys on time series discuss traditional tasks such as classification and forecasting or explain the details of the approaches proposed to solve a specific task. In this paper, we focus on two causal inference tasks, i.e., treatment effect estimation and causal discovery for time series data, and provide a comprehensive review of the approaches in each task. Furthermore, we curate a list of commonly used evaluation metrics and datasets for each task and provide in-depth insight. These metrics and datasets can serve as benchmarks for research in the field.
This graduate textbook on machine learning tells a story of how patterns in data support predictions and consequential actions. Starting with the foundations of decision making, we cover representation, optimization, and generalization as the constituents of supervised learning. A chapter on datasets as benchmarks examines their histories and scientific bases. Self-contained introductions to causality, the practice of causal inference, sequential decision making, and reinforcement learning equip the reader with concepts and tools to reason about actions and their consequences. Throughout, the text discusses historical context and societal impact. We invite readers from all backgrounds; some experience with probability, calculus, and linear algebra suffices.
The information describing the conditions of a system or a person is constantly evolving and may become obsolete and contradict other information. A database, therefore, must be consistently updated upon the acquisition of new valid observations that contradict obsolete ones contained in the database. In this paper, we propose a novel approach for dealing with the information obsolescence problem. Our approach aims to detect, in real-time, contradictions between observations and then identify the obsolete ones, given a representation model. Since we work within an uncertain environment characterized by the lack of information, we choose to use a Bayesian network as our representation model and propose a new approximate concept, $\epsilon$-Contradiction. The new concept is parameterised by a confidence level of having a contradiction in a set of observations. We propose a polynomial-time algorithm for detecting obsolete information. We show that the resulting obsolete information is better represented by an AND-OR tree than a simple set of observations. Finally, we demonstrate the effectiveness of our approach on a real elderly fall-prevention database and showcase how this tree can be used to give reliable recommendations to doctors. Our experiments give systematically and substantially very good results.
Over the last decade, the long-running endeavour to automate high-level processes in machine learning (ML) has risen to mainstream prominence, stimulated by advances in optimisation techniques and their impact on selecting ML models/algorithms. Central to this drive is the appeal of engineering a computational system that both discovers and deploys high-performance solutions to arbitrary ML problems with minimal human interaction. Beyond this, an even loftier goal is the pursuit of autonomy, which describes the capability of the system to independently adjust an ML solution over a lifetime of changing contexts. However, these ambitions are unlikely to be achieved in a robust manner without the broader synthesis of various mechanisms and theoretical frameworks, which, at the present time, remain scattered across numerous research threads. Accordingly, this review seeks to motivate a more expansive perspective on what constitutes an automated/autonomous ML system, alongside consideration of how best to consolidate those elements. In doing so, we survey developments in the following research areas: hyperparameter optimisation, multi-component models, neural architecture search, automated feature engineering, meta-learning, multi-level ensembling, dynamic adaptation, multi-objective evaluation, resource constraints, flexible user involvement, and the principles of generalisation. We also develop a conceptual framework throughout the review, augmented by each topic, to illustrate one possible way of fusing high-level mechanisms into an autonomous ML system. Ultimately, we conclude that the notion of architectural integration deserves more discussion, without which the field of automated ML risks stifling both its technical advantages and general uptake.
The recent success of Bayesian methods in neuroscience and artificial intelligence gives rise to the hypothesis that the brain is a Bayesian machine. Since logic and learning are both practices of the human brain, it leads to another hypothesis that there is a Bayesian interpretation underlying both logical reasoning and machine learning. In this paper, we introduce a generative model of logical consequence relations. It formalises the process of how the truth value of a sentence is probabilistically generated from the probability distribution over states of the world. We show that the generative model characterises a classical consequence relation, paraconsistent consequence relation and nonmonotonic consequence relation. In particular, the generative model gives a new consequence relation that outperforms them in reasoning with inconsistent knowledge. We also show that the generative model gives a new classification algorithm that outperforms several representative algorithms in predictive accuracy and complexity on the Kaggle Titanic dataset.
We propose a flexible ensemble classification framework, Random Subspace Ensemble (RaSE), for sparse classification. In the RaSE algorithm, we aggregate many weak learners, where each weak learner is a base classifier trained in a subspace optimally selected from a collection of random subspaces. To conduct subspace selection, we propose a new criterion, ratio information criterion (RIC), based on weighted Kullback-Leibler divergence. The theoretical analysis includes the risk and Monte-Carlo variance of RaSE classifier, establishing the screening consistency and weak consistency of RIC, and providing an upper bound for the misclassification rate of RaSE classifier. In addition, we show that in a high-dimensional framework, the number of random subspaces needs to be very large to guarantee that a subspace covering signals is selected. Therefore, we propose an iterative version of RaSE algorithm and prove that under some specific conditions, a smaller number of generated random subspaces are needed to find a desirable subspace through iteration. An array of simulations under various models and real-data applications demonstrate the effectiveness and robustness of the RaSE classifier and its iterative version in terms of low misclassifi-cation rate and accurate feature ranking. The RaSE algorithm is implemented in the R package RaSEn on CRAN.
Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics. This article surveys fundamental concepts and practical methods for creating and curating large knowledge bases. It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies. On top of this, the article discusses the automatic extraction of entity-centric properties. To support the long-term life-cycle and the quality assurance of machine knowledge, the article presents methods for constructing open schemas and for knowledge curation. Case studies on academic projects and industrial knowledge graphs complement the survey of concepts and methods.
We propose a generative Causal Adversarial Network (CAN) for learning and sampling from conditional and interventional distributions. In contrast to the existing CausalGAN which requires the causal graph to be given, our proposed framework learns the causal relations from the data and generates samples accordingly. The proposed CAN comprises a two-fold process namely Label Generation Network (LGN) and Conditional Image Generation Network (CIGN). The LGN is a GAN-based architecture which learns and samples from the causal model over labels. The sampled labels are then fed to CIGN, a conditional GAN architecture, which learns the relationships amongst labels and pixels and pixels themselves and generates samples based on them. This framework is equipped with an intervention mechanism which enables. the model to generate samples from interventional distributions. We quantitatively and qualitatively assess the performance of CAN and empirically show that our model is able to generate both interventional and conditional samples without having access to the causal graph for the application of face generation on CelebA data.
Machine learning systems are increasingly being used to make impactful decisions such as loan applications and criminal justice risk assessments, and as such, ensuring fairness of these systems is critical. This is often challenging as the labels in the data are biased. This paper studies learning fair probability distributions from biased data by explicitly modeling a latent variable that represents a hidden, unbiased label. In particular, we aim to achieve demographic parity by enforcing certain independencies in the learned model. We also show that group fairness guarantees are meaningful only if the distribution used to provide those guarantees indeed captures the real-world data. In order to closely model the data distribution, we employ probabilistic circuits, an expressive and tractable probabilistic model, and propose an algorithm to learn them from incomplete data. We evaluate our approach on a synthetic dataset in which observed labels indeed come from fair labels but with added bias, and demonstrate that the fair labels are successfully retrieved. Moreover, we show on real-world datasets that our approach not only is a better model than existing methods of how the data was generated but also achieves competitive accuracy.