Accuracy
Evan Fournier's debut was delayed by a false positive COVID test
It was nothing more than what Brad Stevens termed "a curveball," as it turned out. After an initial false positive COVID test, Evan Fournier turned in a string of negative tests, leading to his first-time availability for the Celtics Monday night against New Orleans. "He will play significant minutes, as he will all the rest of the year," Stevens said of how he planned to begin with the talented wing player, acquired from Orlando at the trade deadline for the since-waived Jeff Teague and two second-round draft picks. "We had an obvious need for another wing that can do what he does, and we're fortunate he's with us, and he's on our team," said the Celtics coach. "So I got a chance to go over to the gym (Sunday) while he was shooting around when we got back and then this morning we went through some stuff prior to our shootaround, we shot around as a team for 30 minutes, so he's gotten the crash course in a very short amount of time. He's been there, done that. He's played against us, you know, tons of times, probably knows our plays as well as anybody, and certainly we just want him to play to his strengths and not worry about anything else."
Deep Learning in current Neuroimaging: a multivariate approach with power and type I error control but arguable generalization ability
Jiménez-Mesa, Carmen, Ramírez, Javier, Suckling, John, Vöglein, Jonathan, Levin, Johannes, Górriz, Juan Manuel, ADNI, Alzheimer's Disease Neuroimaging Initiative, DIAN, Dominantly Inherited Alzheimer Network
Discriminative analysis in neuroimaging by means of deep/machine learning techniques is usually tested with validation techniques, whereas the associated statistical significance remains largely under-developed due to their computational complexity. In this work, a non-parametric framework is proposed that estimates the statistical significance of classifications using deep learning architectures. In particular, a combination of autoencoders (AE) and support vector machines (SVM) is applied to: (i) a one-condition, within-group designs often of normal controls (NC) and; (ii) a two-condition, between-group designs which contrast, for example, Alzheimer's disease (AD) patients with NC (the extension to multi-class analyses is also included). A random-effects inference based on a label permutation test is proposed in both studies using cross-validation (CV) and resubstitution with upper bound correction (RUB) as validation methods. This allows both false positives and classifier overfitting to be detected as well as estimating the statistical power of the test. Several experiments were carried out using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the Dominantly Inherited Alzheimer Network (DIAN) dataset, and a MCI prediction dataset. We found in the permutation test that CV and RUB methods offer a false positive rate close to the significance level and an acceptable statistical power (although lower using cross-validation). A large separation between training and test accuracies using CV was observed, especially in one-condition designs. This implies a low generalization ability as the model fitted in training is not informative with respect to the test set. We propose as solution by applying RUB, whereby similar results are obtained to those of the CV test set, but considering the whole set and with a lower computational cost per iteration.
Text Classification Using Hybrid Machine Learning Algorithms on Big Data
Asogwa, D. C., Anigbogu, S. O., Onyenwe, I. E., Sani, F. A.
Recently, there are unprecedented data growth originating from different online platforms which contribute to big data in terms of volume, velocity, variety and veracity (4Vs). Given this nature of big data which is unstructured, performing analytics to extract meaningful information is currently a great challenge to big data analytics. Collecting and analyzing unstructured textual data allows decision makers to study the escalation of comments/posts on our social media platforms. Hence, there is need for automatic big data analysis to overcome the noise and the non-reliability of these unstructured dataset from the digital media platforms. However, current machine learning algorithms used are performance driven focusing on the classification/prediction accuracy based on known properties learned from the training samples. With the learning task in a large dataset, most machine learning models are known to require high computational cost which eventually leads to computational complexity. In this work, two supervised machine learning algorithms are combined with text mining techniques to produce a hybrid model which consists of Na\"ive Bayes and support vector machines (SVM). This is to increase the efficiency and accuracy of the results obtained and also to reduce the computational cost and complexity. The system also provides an open platform where a group of persons with a common interest can share their comments/messages and these comments classified automatically as legal or illegal. This improves the quality of conversation among users. The hybrid model was developed using WEKA tools and Java programming language. The result shows that the hybrid model gave 96.76% accuracy as against the 61.45% and 69.21% of the Na\"ive Bayes and SVM models respectively.
Human Activity Analysis and Recognition from Smartphones using Machine Learning Techniques
Rabbi, Jakaria, Fuad, Md. Tahmid Hasan, Awal, Md. Abdul
Human Activity Recognition (HAR) is considered a valuable research topic in the last few decades. Different types of machine learning models are used for this purpose, and this is a part of analyzing human behavior through machines. It is not a trivial task to analyze the data from wearable sensors for complex and high dimensions. Nowadays, researchers mostly use smartphones or smart home sensors to capture these data. In our paper, we analyze these data using machine learning models to recognize human activities, which are now widely used for many purposes such as physical and mental health monitoring. We apply different machine learning models and compare performances. We use Logistic Regression (LR) as the benchmark model for its simplicity and excellent performance on a dataset, and to compare, we take Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN). Additionally, we select the best set of parameters for each model by grid search. We use the HAR dataset from the UCI Machine Learning Repository as a standard dataset to train and test the models. Throughout the analysis, we can see that the Support Vector Machine performed (average accuracy 96.33%) far better than the other methods. We also prove that the results are statistically significant by employing statistical significance test methods.
Artificial Intelligence and IoT: Naive Bayes
A project-based course to build an AIoT system from theory to prototype. Artificial Intelligence and Automation with Zang Cloud Sample codes are provided for every project in this course. You will receive a certificate of completion when finishing this course. There is also Udemy 30 Day Money Back Guarantee, if you are not satisfied with this course. This course teaches you how to build an AIoT system from theory to prototype particularly using Naive Bayes algorithm.
Face Recognition as a Method of Authentication in a Web-Based System
Mugalu, Ben Wycliff, Wamala, Rodrick Calvin, Serugunda, Jonathan, Katumba, Andrew
Online information systems currently heavily rely on the username and password traditional method for protecting information and controlling access. With the advancement in biometric technology and popularity of fields like AI and Machine Learning, biometric security is becoming increasingly popular because of the usability advantage. This paper reports how machine learning based face recognition can be integrated into a web-based system as a method of authentication to reap the benefits of improved usability. This paper includes a comparison of combinations of detection and classification algorithms with FaceNet for face recognition. The results show that a combination of MTCNN for detection, Facenet for generating embeddings, and LinearSVC for classification outperforms other combinations with a 95% accuracy. The resulting classifier is integrated into the web-based system and used for authenticating users.
SQAPlanner: Generating Data-Informed Software Quality Improvement Plans
Rajapaksha, Dilini, Tantithamthavorn, Chakkrit, Jiarpakdee, Jirayus, Bergmeir, Christoph, Grundy, John, Buntine, Wray
Software Quality Assurance (SQA) planning aims to define proactive plans, such as defining maximum file size, to prevent the occurrence of software defects in future releases. To aid this, defect prediction models have been proposed to generate insights as the most important factors that are associated with software quality. Such insights that are derived from traditional defect models are far from actionable-i.e., practitioners still do not know what they should do or avoid to decrease the risk of having defects, and what is the risk threshold for each metric. A lack of actionable guidance and risk threshold can lead to inefficient and ineffective SQA planning processes. In this paper, we investigate the practitioners' perceptions of current SQA planning activities, current challenges of such SQA planning activities, and propose four types of guidance to support SQA planning. We then propose and evaluate our AI-Driven SQAPlanner approach, a novel approach for generating four types of guidance and their associated risk thresholds in the form of rule-based explanations for the predictions of defect prediction models. Finally, we develop and evaluate an information visualization for our SQAPlanner approach. Through the use of qualitative survey and empirical evaluation, our results lead us to conclude that SQAPlanner is needed, effective, stable, and practically applicable. We also find that 80% of our survey respondents perceived that our visualization is more actionable. Thus, our SQAPlanner paves a way for novel research in actionable software analytics-i.e., generating actionable guidance on what should practitioners do and not do to decrease the risk of having defects to support SQA planning.
Local Explanations via Necessity and Sufficiency: Unifying Theory and Practice
Watson, David, Gultchin, Limor, Taly, Ankur, Floridi, Luciano
Necessity and sufficiency are the building blocks of all successful explanations. Yet despite their importance, these notions have been conceptually underdeveloped and inconsistently applied in explainable artificial intelligence (XAI), a fast-growing research area that is so far lacking in firm theoretical foundations. Building on work in logic, probability, and causality, we establish the central role of necessity and sufficiency in XAI, unifying seemingly disparate methods in a single formal framework. We provide a sound and complete algorithm Figure 1: We describe minimal sufficient factors (here, sets for computing explanatory factors with respect to of features) for a given input (top row), with the aim of a given context, and demonstrate its flexibility and preserving or flipping the original prediction. We report a competitive performance against state of the art alternatives sufficiency score for each set and a cumulative necessity on various tasks.
A PSO Strategy of Finding Relevant Web Documents using a New Similarity Measure
C, Dr. Ramya, S, Dr. Shreedhara K
In the world of the Internet and World Wide Web, which offers a tremendous amount of information, an increasing emphasis is being given to searching services and functionality. Currently, a majority of web portals offer their searching utilities, be it better or worse. These can search for the content within the sites, mainly text the textual content of documents. In this paper a novel similarity measure called SMDR (Similarity Measure for Documents Retrieval) is proposed to help retrieve more similar documents from the repository thus contributing considerably to the effectiveness of Web Information Retrieval (WIR) process. Bio-inspired PSO methodology is used with the intent to reduce the response time of the system and optimizes WIR process, hence contributes to the efficiency of the system. This paper also demonstrates a comparative study of the proposed system with the existing method in terms of accuracy, sensitivity, F-measure and specificity. Finally, extensive experiments are conducted on CACM collections. Better precision-recall rates are achieved than the existing system. Experimental results demonstrate the effectiveness and efficiency of the proposed system.
Characterizing and Detecting Mismatch in Machine-Learning-Enabled Systems
Lewis, Grace A., Bellomo, Stephany, Ozkaya, Ipek
Increasing availability of machine learning (ML) frameworks and tools, as well as their promise to improve solutions to data-driven decision problems, has resulted in popularity of using ML techniques in software systems. However, end-to-end development of ML-enabled systems, as well as their seamless deployment and operations, remain a challenge. One reason is that development and deployment of ML-enabled systems involves three distinct workflows, perspectives, and roles, which include data science, software engineering, and operations. These three distinct perspectives, when misaligned due to incorrect assumptions, cause ML mismatches which can result in failed systems. We conducted an interview and survey study where we collected and validated common types of mismatches that occur in end-to-end development of ML-enabled systems. Our analysis shows that how each role prioritizes the importance of relevant mismatches varies, potentially contributing to these mismatched assumptions. In addition, the mismatch categories we identified can be specified as machine readable descriptors contributing to improved ML-enabled system development. In this paper, we report our findings and their implications for improving end-to-end ML-enabled system development.