Performance Analysis
Maturation Trajectories of Cortical Resting-State Networks Depend on the Mediating Frequency Band
Khan, Sheraz, Hashmi, Javeria, Mamashli, Fahimeh, Michmizos, Konstantinos, Kitzbichler, Manfred, Bharadwaj, Hari, Bekhti, Yousra, Ganesan, Santosh, Garel, Keri A, Whitfield-Gabrieli, Susan, Gollub, Randy, Kong, Jian, Vaina, Lucia M, Rana, Kunjan, Stufflebeam, Steven, Hamalainen, Matti, Kenet, Tal
The functional significance of resting state networks and their abnormal manifestations in psychiatric disorders are firmly established, as is the importance of the cortical rhythms in mediating these networks. Resting state networks are known to undergo substantial reorganization from childhood to adulthood, but whether distinct cortical rhythms, which are generated by separable neural mechanisms and are often manifested abnormally in psychiatric conditions, mediate maturation differentially, remains unknown. Using magnetoencephalography (MEG) to map frequency band specific maturation of resting state networks from age 7 to 29 in 162 participants (31 independent), we found significant changes with age in networks mediated by the beta (13-30Hz) and gamma (31-80Hz) bands. More specifically, gamma band mediated networks followed an expected asymptotic trajectory, but beta band mediated networks followed a linear trajectory. Network integration increased with age in gamma band mediated networks, while local segregation increased with age in beta band mediated networks. Spatially, the hubs that changed in importance with age in the beta band mediated networks had relatively little overlap with those that showed the greatest changes in the gamma band mediated networks. These findings are relevant for our understanding of the neural mechanisms of cortical maturation, in both typical and atypical development.
Bridge type classification: supervised learning on a modified NBI dataset
Jootoo, Achyuthan, Lattanzi, David
A key phase in the bridge design process is the selection of the structural system. Due to budget and time constraints, engineers typically rely on engineering judgment and prior experience when selecting a structural system, often considering a limited range of design alternatives. The objective of this study was to explore the suitability of supervised machine learning as a preliminary design aid that provides guidance to engineers regarding the statistically optimal bridge type to choose, ultimately improving the likelihood of optimized design, design standardization, and reduced maintenance costs. To devise this supervised learning system, data for over 600,000 bridges from the National Bridge Inventory (NBI) database were analyzed. Key attributes for determining the bridge structure type were identified through three feature selection techniques. Potentially useful attributes, such as seismic intensity and historic data on the cost of materials (steel and concrete), were then added from the US Geological Survey (USGS) database and Engineering News Record. Decision tree, Bayesian network, and support vector machine (SVM) classifiers were used to predict the bridge design type. Due to state-to-state variations in material availability, material costs, and design codes, supervised learning models based on the complete data set did not yield favorable results. Supervised learning models were then trained and tested using 10-fold cross-validation on data for each state. Inclusion of seismic data improved the model performance noticeably. The data were then resampled to reduce the bias of the models towards more common design types, and the supervised learning models thus constructed showed further improvements in performance. The average recall and precision for the state models were 88.6% and 88.0% using decision trees, 84.0% and 83.7% using Bayesian networks, and 80.8% and 75.6% using SVMs.
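The evaluation protocol described above (k-fold cross-validation, then average precision and recall) can be sketched in plain Python. This is only an illustration: the fold assignment, the equal-weight (macro) averaging over classes, the function names, and the bridge-type labels are all assumptions, and the study may have used a different averaging scheme.

```python
from collections import Counter

def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

def macro_precision_recall(y_true, y_pred):
    """Return (precision, recall), averaged with equal weight over classes."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    precisions, recalls = [], []
    for c in labels:
        predicted_c = tp[c] + fp[c]
        actual_c = tp[c] + fn[c]
        precisions.append(tp[c] / predicted_c if predicted_c else 0.0)
        recalls.append(tp[c] / actual_c if actual_c else 0.0)
    return sum(precisions) / len(labels), sum(recalls) / len(labels)

# Hypothetical labels, for illustration only (not data from the study).
y_true = ["girder", "girder", "slab", "slab", "truss", "truss"]
y_pred = ["girder", "slab", "slab", "slab", "truss", "girder"]
precision, recall = macro_precision_recall(y_true, y_pred)
```

In a per-state setup, one would loop over `k_fold_indices` for each state's records, fit a classifier on the training indices, and pool the test-fold predictions before computing the two scores.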
Introduction to Python Ensembles
Ensembles have rapidly become one of the hottest and most popular methods in applied machine learning. Virtually every winning Kaggle solution features them, and many data science pipelines have ensembles in them. Put simply, ensembles combine predictions from different models to generate a final prediction, and adding more models often improves performance, provided the models make different kinds of errors. Better still, because ensembles combine baseline predictions, they often perform at least as well as the best baseline model. Ensembles give us a performance boost almost for free!
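A minimal sketch of the idea, assuming the simplest possible combiner (a majority vote over class labels). The three "models" here are just hard-coded, hypothetical prediction lists, chosen so that each one errs on different samples.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

def accuracy(y_true, y_pred):
    """Fraction of samples where the prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical predictions, for illustration only.
y_true  = [1, 1, 1, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 1]  # wrong on samples 2 and 5
model_b = [1, 0, 1, 0, 1, 0]  # wrong on samples 1 and 4
model_c = [0, 1, 1, 1, 0, 0]  # wrong on samples 0 and 3

ensemble = majority_vote([model_a, model_b, model_c])
ensemble_acc = accuracy(y_true, ensemble)

# Three copies of the same model share all their errors, so voting cannot help.
correlated_acc = accuracy(y_true, majority_vote([model_a, model_a, model_a]))
```

Each individual model is right on four of six samples, while the vote is right on all six because no two models are wrong on the same sample. The second case shows the flip side: when the models' errors coincide, the ensemble is no better than its members.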
Communicating the Business Value of Data Science – Hacker Noon
You've created a model to predict which sales leads are likely to convert into paying customers. The model is a true work of algorithmic beauty -- it's your Mona Lisa, but better, since it actually does something useful. You have tuned the parameters and cross-validated the results, which means the only thing left to do is present your work to the company's executives. "No problem," you think to yourself; you'll just talk them through a confusion matrix and ROC curve and wait for them to lavish you with riches and praise. Anyone who has tried to walk business stakeholders through predictive model performance knows that you are more likely to get strained expressions and blank stares than a standing ovation.
Association of Pathological Fibrosis With Renal Survival Using Deep Neural Networks
A set of bars within the histogram was colored according to the Kidney Disease Outcomes Quality Initiative (KDOQI) guideline-driven cutoff values for high and low creatinine. Model predictions were performed on the remaining 30% of the data (n = 662), and a receiver operating characteristic (ROC) curve was generated. A set of bars within the histogram was colored according to the KDOQI guideline-driven cutoff value for nephrotic-range proteinuria (g/d). Model predictions were performed on the remaining 30% of the data (n = 648), and an ROC curve was generated.
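The hold-out evaluation mentioned above (predict on a held-out 30%, then draw an ROC curve) can be sketched as follows. This is a generic illustration, not the authors' pipeline: the function names, the random split, and the toy labels and scores are assumptions.

```python
import random

def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and return (train, test) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def roc_points(y_true, scores):
    """Return ROC points (FPR, TPR), one per distinct score threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical held-out labels and model scores, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
area = auc(curve)  # 8 of the 9 positive/negative pairs are ranked correctly
```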
Convex Formulations for Fair Principal Component Analysis
Though there is a growing body of literature on fairness for supervised learning, the problem of incorporating fairness into unsupervised learning has been less well studied. This paper studies fairness in the context of principal component analysis (PCA). We first present a definition of fairness for dimensionality reduction; our definition can be interpreted as saying that a reduction is fair if information about a protected class (e.g., race or gender) cannot be inferred from the dimensionality-reduced data points. Next, we develop convex optimization formulations that can improve the fairness (with respect to our definition) of PCA and kernel PCA. These formulations are semidefinite programs (SDPs), and we demonstrate the effectiveness of our formulations using several datasets. We conclude by showing how our approach can be used to perform a fair (with respect to age) clustering of health data that may be used to set health insurance rates.
PCA-Based Missing Information Imputation for Real-Time Crash Likelihood Prediction Under Imbalanced Data
Ke, Jintao, Zhang, Shuaichao, Yang, Hai, Chen, Xiqun
Real-time crash likelihood prediction has been an important research topic. Various classifiers, such as support vector machine (SVM) and tree-based boosting algorithms, have been proposed in traffic safety studies. However, few studies focus on missing data imputation in real-time crash likelihood prediction, although missing values are commonly observed due to sensor breakdowns or external interference. In addition, classifying imbalanced data is a difficult problem in real-time crash likelihood prediction, since it is hard to distinguish crash-prone cases from non-crash cases, which compose the majority of the observed samples. In this paper, principal component analysis (PCA) based approaches, including LS-PCA, PPCA, and VBPCA, are employed for imputing missing values, while two kinds of solutions are developed to address the problem of imbalanced data. The results show that PPCA and VBPCA not only outperform LS-PCA and other imputation methods (including mean imputation and k-means clustering imputation) in terms of the root mean square error (RMSE), but also help the classifiers achieve better predictive performance. The two solutions, i.e., cost-sensitive learning and synthetic minority oversampling technique (SMOTE), help improve the sensitivity by adjusting the classifiers to ...
Corresponding author. Email address: chenxiqun@zju.edu.cn
Keywords: Real-time crash likelihood prediction, PCA-based missing data imputation, cost-sensitive learning, SMOTE, support vector machine, AdaBoost
1. Introduction
Prediction of traffic crashes has been a major research topic in transportation safety studies. Crashes, especially on urban expressways, can trigger heavy traffic congestion, impose huge external costs, and reduce the level of service of transportation infrastructure. Therefore, the accurate and reliable prediction of crash risks is critical to the success of proactive safety management strategies on urban expressways.
There have been fruitful studies in the domain of the real-time crash likelihood estimation (Abdel-Aty and Pemmanaboina, 2006; Abdel-Aty et al., 2007, 2008; Ahmed and Abdel-Aty, 2012). It has been reported that crash occurrence was affected by four major factors: real-time traffic state, drivers' behavior, environment factors, and road geometry (Ahmed and Abdel-Aty, 2013b).
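The abstract of this paper names SMOTE as one of two remedies for class imbalance. The core of that technique, creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the function name, the choice of k, and the toy feature vectors are assumptions.

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic samples on line segments between minority neighbours."""
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority samples.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: dist2(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical crash-case feature vectors, for illustration only.
crash_cases = [(1.0, 1.0), (2.0, 1.0), (1.0, 3.0), (4.0, 4.0)]
new_cases = smote(crash_cases, n_synthetic=6)
```

The synthetic points are appended to the minority class before training, so the classifier sees a more balanced sample without exact duplicates.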
Recovering Loss to Followup Information Using Denoising Autoencoders
Imagine this scenario: in a clinical trial investigating the toxicity of a new chemotherapy drug to treat breast cancer, some patients drop out of the trial before completion for various reasons, so we do not have final outcome data for those patients. What if the patients who drop out before completion are the ones who experienced toxicity and are unwilling to continue the treatment? This reason, however, is not recorded in the database, and the patients are simply marked as "lost to followup". If the investigators were to analyze the data using conventional methods, where loss to followup is ignored and not properly accounted for, they would estimate the toxicity to be far lower than it really is. Such results can lead to adopting a drug that is in fact unsafe. Similarly, if patients who are feeling better drop out of the trial before completion, the estimates of toxicity would be far greater than the real value, leading to the rejection of a potentially life-saving drug.
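The bias described in this scenario can be made concrete with a small simulation. It illustrates only the problem (informative dropout), not the denoising-autoencoder remedy named in the title, and every rate used below is a made-up assumption.

```python
import random

def simulate_trial(n=10000, true_toxicity=0.3,
                   p_drop_toxic=0.5, p_drop_nontoxic=0.1, seed=0):
    """Return (toxicity rate over all patients, rate among completers only)."""
    rng = random.Random(seed)
    toxic_total = completed = toxic_completed = 0
    for _ in range(n):
        toxic = rng.random() < true_toxicity
        # Patients who experience toxicity are assumed more likely to drop out.
        dropped = rng.random() < (p_drop_toxic if toxic else p_drop_nontoxic)
        toxic_total += toxic
        if not dropped:
            completed += 1
            toxic_completed += toxic
    return toxic_total / n, toxic_completed / completed

full_rate, completers_rate = simulate_trial()
```

Under these assumed rates, the completers-only estimate should sit near 0.15 / 0.78, or about 0.19, well below the 0.30 used to generate the data, which is the underestimate the scenario warns about. Swapping the two dropout probabilities reverses the direction of the bias.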
Critères de qualité d'un classifieur généraliste
This paper considers the problem of choosing a good classifier. For each problem there exists an optimal classifier, but no classifier is optimal, with regard to the error rate, in all cases. Because there exists a large number of classifiers, a user would prefer an all-purpose classifier that is easy to adjust, in the hope that it will do almost as well as the optimal one. In this paper we establish a list of criteria that a good generalist classifier should satisfy. We first discuss data analysis, and then present these criteria. Six of the most popular classifiers are selected and scored according to these criteria. Tables make it easy to compare the relative merits of each. In the end, random forests turn out to be the best classifiers.
Enhanced version of AdaBoostM1 with J48 Tree learning method
Kang, Kyongche, Michalak, Jack
Machine learning focuses on the construction and study of systems that can learn from data. This is connected with the classification problem, which is usually what machine learning algorithms are designed to solve. When a machine learning method is used by people with no special expertise in machine learning, it is important that the method be 'robust' in classification, in the sense that reasonable performance is obtained with minimal tuning for the problem at hand. Algorithms are evaluated based on how robustly they classify the given data. In this paper, we propose a quantifiable measure of 'robustness' and describe a particular learning method that is robust according to this measure in the context of the classification problem. We propose Adaptive Boosting (AdaBoostM1) with J48 (C4.5 tree) as the base learner, tuning the weight threshold (P) and the number of iterations (I) of the boosting algorithm. To benchmark performance, we used a baseline classifier, AdaBoostM1 with Decision Stump as the base learner and no parameter tuning. By tuning parameters and using J48 as the base learner, we are able to reduce the overall average error rate ratio (errorC/errorNB) from 2.4 to 0.9 on the development data sets and from 2.1 to 1.2 on the evaluation data sets.
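The abstract reports an "overall average error rate ratio (errorC/errorNB)" without defining its terms. One plausible reading, shown here purely as an assumption, is that errorC is the evaluated classifier's error rate on a data set, errorNB is a reference classifier's error rate on the same data set, and the ratios are averaged over data sets. The function name and the numbers below are hypothetical.

```python
def average_error_ratio(errors_c, errors_nb):
    """Mean of per-dataset ratios errorC / errorNB (assumed interpretation)."""
    ratios = [c / nb for c, nb in zip(errors_c, errors_nb)]
    return sum(ratios) / len(ratios)

# Hypothetical error rates on two data sets, for illustration only.
errors_c = [0.10, 0.30]   # evaluated classifier
errors_nb = [0.20, 0.20]  # reference classifier
ratio = average_error_ratio(errors_c, errors_nb)  # per-dataset ratios 0.5 and 1.5
```

Under this reading, a value below 1 means the evaluated classifier makes fewer errors than the reference on average, and a value above 1 means it makes more.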