Performance Analysis
An Explainable Classification Model for Chronic Kidney Disease Patients
Chronic Kidney Disease (CKD) has a globally increasing incidence and imposes a high cost on health systems. Delayed recognition leads to premature mortality due to progressive loss of kidney function. Using data mining to discover subtle patterns in CKD indicators would contribute to early diagnosis. This work develops a classifier model to support healthcare professionals in the early diagnosis of CKD patients. Through a data pipeline, an exhaustive search is performed to find the best data mining classifier under different parameters of the data preparation sub-stages, such as missing-data handling and feature selection. As a result, Extra Trees is selected as the best classifier, with an accuracy of 100% under cross-validation and 99% on new unseen data. Moreover, the 8 selected features are used to assess the explainability of the model's results, indicating which features are most relevant to the model's output.
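The exhaustive search over data-preparation options described above can be sketched as a plain grid enumeration. This is a minimal sketch under stated assumptions: the option lists, the `exhaustive_search` helper, and the `placeholder_score` function are all hypothetical, since the abstract names the sub-stages but not their concrete values. A real scorer would fit the pipeline and return a cross-validated accuracy.

```python
from itertools import product

# Hypothetical option grids; the abstract does not list the actual values.
IMPUTERS = ["mean", "median", "most_frequent"]
N_FEATURES = [4, 8, 12]
CLASSIFIERS = ["extra_trees", "random_forest", "logistic_regression"]


def exhaustive_search(evaluate):
    """Score every (imputer, feature count, classifier) combination.

    `evaluate` maps a config dict to a number (higher is better).
    Returns the best config and its score; ties keep the first one seen.
    """
    best_config, best_score = None, float("-inf")
    for imputer, k, clf in product(IMPUTERS, N_FEATURES, CLASSIFIERS):
        config = {"imputer": imputer, "n_features": k, "classifier": clf}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def placeholder_score(config):
    # Arbitrary stand-in so the sketch runs; it carries no clinical meaning.
    # A real scorer would train the pipeline and return mean CV accuracy.
    return len(config["imputer"]) + config["n_features"]


if __name__ == "__main__":
    print(exhaustive_search(placeholder_score))
```

Enumerating every combination is only practical while the grids stay small, because the number of evaluations is the product of the list lengths (27 in this sketch).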
Stance Detection with BERT Embeddings for Credibility Analysis of Information on Social Media
Karande, Hema, Walambe, Rahee, Benjamin, Victor, Kotecha, Ketan, Raghu, T. S.
The evolution of electronic media is a mixed blessing. Due to the easy access, low cost, and faster reach of information, people seek out and devour news from online social networks. However, the increasing acceptance of social media reporting leads to the spread of fake news. This is a minacious problem that causes disputes and endangers societal stability and harmony. The spread of fake news has gained attention from researchers due to its vicious nature. The proliferation of misinformation in all media, from the internet to cable news, paid advertising, and local news outlets, has made it essential for people to identify misinformation and sort through the facts. Researchers are trying to analyze the credibility of information and curtail false information on such platforms. Credibility is the believability of the piece of information at hand. Analyzing the credibility of fake news is challenging due to the intent of its creation and the polychromatic nature of the news. In this work, we propose a model for detecting fake news. Our method investigates the content of the news at an early stage, i.e., when the news is published but is yet to be disseminated through social media. Our work interprets the content with automatic feature extraction and the relevance of the text pieces. In summary, we introduce stance as one of the features along with the content of the article and employ the pre-trained contextualized word embeddings of BERT to obtain state-of-the-art results for fake news detection. The experiment conducted on a real-world dataset indicates that our model outperforms previous work and enables fake news detection with an accuracy of 95.32%.
Revisiting the Negative Data of Distantly Supervised Relation Extraction
Xie, Chenhao, Liang, Jiaqing, Liu, Jingping, Huang, Chengsong, Huang, Wenhao, Xiao, Yanghua
Distant supervision automatically generates plenty of training samples for relation extraction. However, it also incurs two major problems: noisy labels and imbalanced training data. Previous works focus more on reducing wrongly labeled relations (false positives), while few explore the missing relations that are caused by the incompleteness of the knowledge base (false negatives). Furthermore, the quantity of negative labels overwhelmingly surpasses the positive ones in previous problem formulations. In this paper, we first provide a thorough analysis of the above challenges caused by negative data. Next, we formulate relation extraction as a positive-unlabeled learning task to alleviate the false negative problem. Thirdly, we propose a pipeline approach, dubbed \textsc{ReRe}, that performs sentence-level relation detection and then subject/object extraction to achieve sample-efficient training. Experimental results show that the proposed method consistently outperforms existing approaches and maintains excellent performance even when trained with a large quantity of false positive samples.
Machine Learning for Accurate Intraoperative Pediatric Middle Ear Effusion Diagnosis
OBJECTIVES: Misdiagnosis of acute and chronic otitis media in children can result in significant consequences from either undertreatment or overtreatment. Our objective was to develop and train an artificial intelligence algorithm to accurately predict the presence of middle ear effusion in pediatric patients presenting to the operating room for myringotomy and tube placement. METHODS: We trained a neural network to classify images as "normal" (no effusion) or "abnormal" (effusion present) using tympanic membrane images from children taken to the operating room with the intent of performing myringotomy and possible tube placement for recurrent acute otitis media or otitis media with effusion. Model performance was tested on held-out cases and fivefold cross-validation. RESULTS: The mean training time for the neural network model was 76.0 (SD ± 0.01) seconds. Our model approach achieved a mean image classification accuracy of 83.8% (95% confidence interval [CI]: 82.7–84.8). In support of this classification accuracy, the model produced an area under the receiver operating characteristic curve performance of 0.93 (95% CI: 0.91–0.94) and F1-score of 0.80 (95% CI: 0.77–0.82). CONCLUSIONS: Artificial intelligence-assisted diagnosis of acute or chronic otitis media in children may generate value for patients, families, and the health care system by improving point-of-care diagnostic accuracy. With a small training data set composed of intraoperative images obtained at time of tympanostomy tube insertion, our neural network was accurate in predicting the presence of a middle ear effusion in pediatric ear cases. This diagnostic accuracy performance is considerably higher than human-expert otoscopy-based diagnostic performance reported in previous studies.
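For readers less familiar with the two summary metrics reported above, the following sketch shows how an area under the ROC curve and an F1-score can be computed from labels and model outputs. It is illustrative only: the function names and the toy inputs are assumptions, and it says nothing about how the study itself computed its figures or confidence intervals.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def f1_score(labels, predictions):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy inputs (1 = effusion present), invented for illustration.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.6, 0.1]
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print(roc_auc(labels, scores), f1_score(labels, predictions))
```

The pairwise AUC is quadratic in the number of cases, which is fine for small test sets; rank-based formulations are preferable for large ones.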
Ensemble machine learning approach for screening of coronary heart disease based on echocardiography and risk factors
Zhang, Jingyi, Zhu, Huolan, Chen, Yongkai, Yang, Chenguang, Cheng, Huimin, Li, Yi, Zhong, Wenxuan, Wang, Fang
Background: Extensive clinical evidence suggests that a preventive screening of coronary heart disease (CHD) at an earlier stage can greatly reduce the mortality rate. We use 64 two-dimensional speckle tracking echocardiography (2D-STE) features and seven clinical features to predict whether one has CHD. Methods: We develop a machine learning approach that integrates a number of popular classification methods together by model stacking, and generalize the traditional stacking method to a two-step stacking method to improve the diagnostic performance. Results: By borrowing strengths from multiple classification models through the proposed method, we improve the CHD classification accuracy from around 70% to 87.7% on the testing set. The sensitivity of the proposed method is 0.903 and the specificity is 0.843, with an AUC of 0.904, which is significantly higher than those of the individual classification models. Conclusions: Our work lays a foundation for the deployment of speckle tracking echocardiography-based screening tools for coronary heart disease.
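The sensitivity, specificity, and accuracy quoted above all derive from the same four confusion-matrix counts. A minimal sketch follows, with hypothetical helper names and made-up toy labels, not the study's data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # diseased, predicted diseased
    fp: int  # healthy, predicted diseased
    tn: int  # healthy, predicted healthy
    fn: int  # diseased, predicted healthy


def confusion_counts(labels, predictions):
    """Tally binary outcomes, where 1 marks the positive (CHD) class."""
    pairs = list(zip(labels, predictions))
    return ConfusionCounts(
        tp=sum(1 for y, p in pairs if y == 1 and p == 1),
        fp=sum(1 for y, p in pairs if y == 0 and p == 1),
        tn=sum(1 for y, p in pairs if y == 0 and p == 0),
        fn=sum(1 for y, p in pairs if y == 1 and p == 0),
    )


def sensitivity(c):
    """Fraction of true positives among all actual positives."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Fraction of true negatives among all actual negatives."""
    return c.tn / (c.tn + c.fp)


def accuracy(c):
    """Fraction of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


if __name__ == "__main__":
    counts = confusion_counts([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
    print(counts, sensitivity(counts), specificity(counts), accuracy(counts))
```

In a screening setting, sensitivity is usually the figure to watch first, since a missed case (false negative) tends to cost more than a follow-up examination.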
Towards Personalized Fairness based on Causal Notion
Li, Yunqi, Chen, Hanxiong, Xu, Shuyuan, Ge, Yingqiang, Zhang, Yongfeng
Recommender systems have an increasing and critical impact on humans and society, since a growing number of users rely on them for information seeking and decision making. Therefore, it is crucial to address the potential unfairness problems in recommendations. Just as users have personalized preferences on items, users' demands for fairness are also personalized in many scenarios. Therefore, it is important to provide personalized fair recommendations for users to satisfy their personalized fairness demands. Besides, previous works on fair recommendation mainly focus on association-based fairness. However, it is important to advance from associative fairness notions to causal fairness notions for assessing fairness more properly in recommender systems. Based on the above considerations, this paper focuses on achieving personalized counterfactual fairness for users in recommender systems. To this end, we introduce a framework for achieving counterfactually fair recommendations through adversarial learning by generating feature-independent user embeddings for recommendation. The framework allows recommender systems to achieve personalized fairness for users while also covering non-personalized situations. Experiments on two real-world datasets with shallow and deep recommendation algorithms show that our method can generate fairer recommendations for users with a desirable recommendation performance.
Measuring Model Fairness under Noisy Covariates: A Theoretical Perspective
Prost, Flavien, Awasthi, Pranjal, Blumm, Nick, Kumthekar, Aditee, Potter, Trevor, Wei, Li, Wang, Xuezhi, Chi, Ed H., Chen, Jilin, Beutel, Alex
In this work we study the problem of measuring the fairness of a machine learning model under noisy information. Focusing on group fairness metrics, we investigate the particular but common situation when the evaluation requires controlling for the confounding effect of covariate variables. In a practical setting, we might not be able to jointly observe the covariate and group information, and a standard workaround is to then use proxies for one or more of these variables. Prior works have demonstrated the challenges with using a proxy for sensitive attributes, and strong independence assumptions are needed to provide guarantees on the accuracy of the noisy estimates. In contrast, in this work we study using a proxy for the covariate variable and present a theoretical analysis that aims to characterize weaker conditions under which accurate fairness evaluation is possible. Furthermore, our theory identifies potential sources of errors and decouples them into two interpretable parts $\gamma$ and $\epsilon$. The first part $\gamma$ depends solely on the performance of the proxy, such as precision and recall, whereas the second part $\epsilon$ captures correlations between all the variables of interest. We show that in many scenarios the error in the estimates is dominated by $\gamma$ via a linear dependence, whereas the dependence on the correlations $\epsilon$ only constitutes a lower-order term. As a result, we expand the understanding of scenarios where measuring model fairness via proxies can be an effective approach. Finally, we compare, via simulations, the theoretical upper bounds to the distribution of simulated estimation errors and show that assuming some structure on the data, even weak, is key to significantly improving both theoretical guarantees and empirical results.
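To make "controlling for a covariate" concrete, the sketch below computes one possible covariate-conditioned group metric: the gap in positive-prediction rate between two groups within each covariate stratum, averaged with stratum-size weights. The metric choice, the function name `conditional_parity_gap`, and the toy data are assumptions for illustration and are not taken from the paper. Passing proxy values in place of true covariates shows where proxy error would enter the estimate.

```python
from collections import defaultdict


def conditional_parity_gap(predictions, groups, covariates):
    """Stratum-size-weighted mean of |rate(group 0) - rate(group 1)|,
    where the rate is the share of positive predictions within one
    covariate stratum. Strata missing either group are skipped."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for yhat, g, z in zip(predictions, groups, covariates):
        strata[z][g].append(yhat)

    total, weighted = 0, 0.0
    for cells in strata.values():
        if not cells[0] or not cells[1]:
            continue  # the gap is undefined when one group is absent
        n = len(cells[0]) + len(cells[1])
        rate0 = sum(cells[0]) / len(cells[0])
        rate1 = sum(cells[1]) / len(cells[1])
        weighted += n * abs(rate0 - rate1)
        total += n
    if total == 0:
        raise ValueError("no stratum contains both groups")
    return weighted / total


if __name__ == "__main__":
    # Toy data, invented for illustration.
    predictions = [1, 0, 1, 1, 0, 0, 0, 0]
    groups = [0, 0, 1, 1, 0, 0, 1, 1]
    covariates = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(conditional_parity_gap(predictions, groups, covariates))
```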
Multi-Perspective Anomaly Detection
Madan, Manav, Jakob, Peter, Schmid-Schirling, Tobias, Valada, Abhinav
Multi-view classification is inspired by the behavior of humans, especially when fine-grained features or, in our case, rarely occurring anomalies are to be detected. Current contributions point to the problem of how high-dimensional data can be fused. In this work, we build upon the deep support vector data description algorithm and address multi-perspective anomaly detection using three different fusion techniques, i.e., early fusion, late fusion, and late fusion with multiple decoders. We employ different augmentation techniques with a denoising process to deal with scarce one-class data, which further improves the performance (ROC AUC = 80%). Furthermore, we introduce the dices dataset, which consists of over 2000 grayscale images of falling dice from multiple perspectives, with 5% of the images containing rare anomalies (e.g., drill holes, sawing, or scratches). We evaluate our approach on the new dices dataset using images from two different perspectives and also benchmark on the standard MNIST dataset. Extensive experiments demonstrate that our proposed approach exceeds the state of the art on both the MNIST and dices datasets. To the best of our knowledge, this is the first work that focuses on addressing multi-perspective anomaly detection in images by jointly using different perspectives together with a single objective function for anomaly detection.
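The scoring rule behind support vector data description, and the difference between fusing views early or late, can be sketched on plain vectors. This is a simplified sketch under loud assumptions: the embeddings are taken as given (the neural encoder is omitted), early fusion is read as concatenation of per-view vectors, and late fusion is read as averaging per-view scores. The paper's actual architectures may fuse at a different point, and every function name here is hypothetical.

```python
def center(embeddings):
    """Mean embedding of the normal training samples (the hypersphere centre)."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]


def svdd_score(embedding, c):
    """Squared Euclidean distance to the centre; larger means more anomalous."""
    return sum((x - ci) ** 2 for x, ci in zip(embedding, c))


def early_fusion(views):
    """Assumed reading of early fusion: concatenate the per-view vectors
    so a single centre and a single score cover all perspectives."""
    return [x for view in views for x in view]


def late_fusion_score(view_embeddings, centers):
    """Assumed reading of late fusion: score each view against its own
    centre, then average the per-view scores."""
    scores = [svdd_score(e, c) for e, c in zip(view_embeddings, centers)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy two-view embeddings of normal samples, invented for illustration.
    view_a_train = [[0.0, 0.0], [2.0, 2.0]]
    view_b_train = [[1.0, 1.0], [1.0, 1.0]]
    centers = [center(view_a_train), center(view_b_train)]
    sample = [[3.0, 1.0], [1.0, 1.0]]
    print(late_fusion_score(sample, centers))
    print(svdd_score(early_fusion(sample), early_fusion(centers)))
```

With squared distances, scoring the concatenated vector equals the sum of the per-view scores, so in this toy setting the two fusion readings differ only by the averaging factor. In a trained network they differ more, because early fusion lets the encoder learn cross-view features.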
The State of AI Ethics Report (January 2021)
Gupta, Abhishek, Royer, Alexandrine, Wright, Connor, Khan, Falaah Arif, Heath, Victoria, Galinkin, Erick, Khurana, Ryan, Ganapini, Marianna Bergamaschi, Fancy, Muriam, Sweidan, Masa, Akif, Mo, Butalid, Renjie
The 3rd edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in AI Ethics since October 2020. It aims to help anyone, from machine learning experts to human rights activists and policymakers, quickly digest and understand the field's ever-changing developments. Through research and article summaries, as well as expert commentary, this report distills the research and reporting surrounding various domains related to the ethics of AI, including: algorithmic injustice, discrimination, ethical AI, labor impacts, misinformation, privacy, risk and security, social media, and more. In addition, The State of AI Ethics includes exclusive content written by world-class AI Ethics experts from universities, research institutes, consulting firms, and governments. Unique to this report is "The Abuse and Misogynoir Playbook," written by Dr. Katlyn Turner (Research Scientist, Space Enabled Research Group, MIT), Dr. Danielle Wood (Assistant Professor, Program in Media Arts and Sciences; Assistant Professor, Aeronautics and Astronautics; Lead, Space Enabled Research Group, MIT) and Dr. Catherine D'Ignazio (Assistant Professor, Urban Science and Planning; Director, Data + Feminism Lab, MIT). The piece (and accompanying infographic) is a deep dive into the historical and systematic silencing, erasure, and revision of Black women's contributions to knowledge and scholarship in the United States and globally. Exposing and countering this Playbook has become increasingly important following the firing of AI Ethics expert Dr. Timnit Gebru (and several of her supporters) at Google. This report should be used not only as a point of reference and insight on the latest thinking in the field of AI Ethics, but also as a tool for introspection as we aim to foster a more nuanced conversation regarding the impacts of AI on the world.
Analyzing Machine Learning Approaches for Online Malware Detection in Cloud
Kimmell, Jeffrey C, Abdelsalam, Mahmoud, Gupta, Maanak
The variety of services and functionality offered by various cloud service providers (CSPs) has exploded lately. Utilizing such services has created numerous opportunities for enterprise infrastructure to become cloud-based and, in turn, has helped enterprises to easily and flexibly offer services to their customers. The practice of renting out access to servers to clients for computing and storage purposes is known as Infrastructure as a Service (IaaS). The popularity of IaaS has led to serious and critical concerns with respect to cyber security and privacy. In particular, malware is often leveraged by malicious entities against cloud services to compromise sensitive data or to obstruct their functionality. In response to this growing menace, malware detection for cloud environments has become a widely researched topic with numerous methods being proposed and deployed. In this paper, we present online malware detection based on process-level performance metrics, and analyze the effectiveness of different baseline machine learning models, including Support Vector Classifier (SVC), Random Forest Classifier (RFC), K-Nearest Neighbor (KNN), Gradient Boosted Classifier (GBC), Gaussian Naive Bayes (GNB), and Convolutional Neural Networks (CNN). Our analysis concludes that neural network models can most accurately detect the impact malware has on the process-level features of virtual machines in the cloud, and therefore are best suited to detect it. Our models were trained, validated, and tested using a dataset of 40,680 malicious and benign samples. The dataset was compiled by running different families of malware (collected from VirusTotal) in a live cloud environment and collecting the process-level features.