Accuracy
Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications
Ghosh, Nimisha, Paul, Rourab, Maity, Satyabrata, Maity, Krishanu, Saha, Sayantan
Fault detection in sensor nodes is a pertinent issue that has long been an important area of research, but it has not yet been explored much in the context of the Internet of Things. The Internet of Things works with a massive amount of data, so the responsibility for guaranteeing the accuracy of the data also lies with it. Moreover, many important and critical decisions are made based on these data, so ensuring their correctness and accuracy is also very important. The detection also needs to be as precise as possible to avoid negative alerts. For this purpose, this work adopts the Dempster-Shafer Theory of Evidence, a popular learning method, to collate the information from sensors and reach a decision regarding the faulty status of a sensor node. To verify the validity of the proposed method, simulations have been performed on a benchmark data set and on data collected through a test bed in a laboratory set-up. For the different types of faults, the proposed method shows very competent accuracy for both the benchmark (99.8%) and laboratory (99.9%) data sets when compared to other state-of-the-art machine learning techniques.
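As a rough illustration of how evidence from two sensors can be collated, the following is a minimal sketch of Dempster's rule of combination, the standard combination rule in Dempster-Shafer theory. It is not the authors' implementation: the frame of discernment {faulty, normal}, the mass values, and the function name combine are assumptions chosen for illustration only.

```python
from itertools import product

def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass;
    the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += wa * wb
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Normalise by 1 - K so the combined masses sum to 1 again.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical evidence from two sensors (illustrative values only).
FAULTY = frozenset({"faulty"})
NORMAL = frozenset({"normal"})
EITHER = FAULTY | NORMAL  # mass left on the whole frame expresses ignorance

sensor_a = {FAULTY: 0.6, EITHER: 0.4}
sensor_b = {FAULTY: 0.7, NORMAL: 0.1, EITHER: 0.2}

fused = combine(sensor_a, sensor_b)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

Here the two sources mostly agree, so the fused mass on "faulty" ends up higher than either source assigned alone; how the paper maps sensor readings to masses is not stated in the abstract.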
Scientists develop artificial intelligence system to detect cardiac arrest in sleep
Washington: Scientists have developed a new artificial intelligence (AI) system to monitor people for cardiac arrest while they are asleep without touching them. People experiencing cardiac arrest will suddenly become unresponsive and either stop breathing or gasp for air, a sign known as agonal breathing, said researchers at the University of Washington (UW) in the US. A new skill for a smart speaker -- like Google Home and Amazon Alexa -- or smartphone lets the device detect the gasping sound of agonal breathing and call for help. Immediate cardiopulmonary resuscitation (CPR) can double or triple someone's chance of survival, but that requires a bystander to be present. CPR is an emergency procedure that combines chest compressions, often with artificial ventilation, in an effort to manually preserve intact brain function. Recent research suggests that one of the most common locations for an out-of-hospital cardiac arrest is in a patient's bedroom, where no one is likely around or awake to respond and provide care.
7 Steps to Mastering Intermediate Machine Learning with Python -- 2019 Edition
Are you interested in learning more about machine learning with Python? I recently wrote 7 Steps to Mastering Basic Machine Learning with Python -- 2019 Edition, a first step in an attempt to update a pair of posts I wrote some time back (7 Steps to Mastering Machine Learning With Python and 7 More Steps to Mastering Machine Learning With Python), which are getting stale at this point, having been around for a few years. It's time to add on to the "basic" post with a set of steps for learning "intermediate" level machine learning with Python. We're talking "intermediate" in a relative sense, however, so do not expect to be a research-caliber machine learning engineer after getting through this post. The learning path is aimed at those with some understanding of programming, computer science concepts, and/or machine learning in an abstract sense, who want to use the implementations of machine learning algorithms in the prevalent Python libraries to build their own machine learning models.
Nested Cavity Classifier: performance and remedy
Mustafa, Waleed A., Yousef, Waleed A.
Many articles and books have considered the assessment of classifiers using simulated and real-world datasets (e.g., (Raudys and Pikelis, 1980; Efron and Tibshirani, 1997; Hastie et al., 2001)), but none of them considered a systematic assessment of NCC. Inselberg and Avidan (2000) compared NCC with other classifiers, but on only a few real high-dimensional datasets; that study reported the superiority of NCC over the other classifiers. NCC, as described below, builds decision regions geometrically using convex hulls. This partitioning mechanism has a drawback for the performance of NCC (as explained in Section 3). NCC classifies any testing observation, regardless of whether its true class is "class 1" or "class 2", as one fixed class, say "class 2", whenever it does not lie inside the range of the training data set, i.e., within the minimum and maximum values of each dimension. Since this assignment is not always correct, the present article proposes combining NCC with LDA to classify observations outside the range of the training set.
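The range test described above, and the idea of handing out-of-range observations to a second classifier, can be sketched as follows. This is a hedged illustration only, not the authors' method: the names feature_ranges, inside_range, and hybrid_predict are hypothetical, and the two classifiers are stand-in callables rather than real NCC or LDA implementations.

```python
def feature_ranges(train_X):
    """Return per-dimension minima and maxima of the training observations."""
    columns = list(zip(*train_X))
    return [min(c) for c in columns], [max(c) for c in columns]

def inside_range(x, mins, maxs):
    """True if x lies within [min, max] of the training data in every dimension."""
    return all(lo <= v <= hi for v, lo, hi in zip(x, mins, maxs))

def hybrid_predict(x, mins, maxs, inside_clf, outside_clf):
    """Use inside_clf within the training range and outside_clf elsewhere."""
    if inside_range(x, mins, maxs):
        return inside_clf(x)
    return outside_clf(x)

# Hypothetical training data and stand-in classifiers (assumptions).
train_X = [(0.0, 1.0), (2.0, 3.0), (1.0, 5.0)]
mins, maxs = feature_ranges(train_X)

ncc_stub = lambda x: "decided by NCC"   # placeholder for a trained NCC
lda_stub = lambda x: "decided by LDA"   # placeholder for a trained LDA

print(hybrid_predict((1.0, 2.0), mins, maxs, ncc_stub, lda_stub))
print(hybrid_predict((3.0, 2.0), mins, maxs, ncc_stub, lda_stub))
```

The first point falls inside the training range in both dimensions and is routed to the NCC stand-in; the second exceeds the maximum of the first dimension and is routed to the LDA stand-in.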
MercurialMonkey/Harvard-University-Capstone-Project-Data-Science
I have submitted my own project using a dataset of my choosing. My project has been reviewed both by my peers and the professor. I chose to work with Credit Card Fraud Detection. It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. The dataset contains transactions made by credit cards in September 2013 by European cardholders. Due to the imbalanced nature of the data, many observations could be predicted as false negatives, in this case as legal transactions instead of fraudulent transactions.
Comparing Classifiers: Decision Trees, K-NN & Naive Bayes
A myriad of options exist for classification. That said, three popular classification methods (Decision Trees, k-NN, and Naive Bayes) can be tweaked for practically every situation. Naive Bayes and k-NN are both examples of supervised learning (where the data comes already labeled). Decision trees are easy to use for small numbers of classes. If you're trying to decide between the three, your best option is to take all three for a test drive on your data, and see which produces the best results.
Proof-of-concept system uses smart speakers to catch signs of cardiac arrest
In an effort to tackle in-home cardiac arrest, University of Washington researchers have devised a novel contactless system that uses smartphones or voice-based personal assistants to identify telltale breathing patterns that accompany an attack. The proof-of-concept strategy, described in an NPJ Digital Medicine paper published this morning, involved a supervised machine learning model called a support-vector machine that was trained for use in the bedroom, a controlled environment in which the majority of in-home cardiac arrests occur. "Sometimes reported as 'gasping' breaths, agonal respirations may hold potential as an audible diagnostic biomarker, particularly in unwitnessed cardiac arrests that occur in a private residence, the location of [two-thirds] of all [out-of-hospital cardiac arrests]," the researchers wrote. "The widespread adoption of smartphones and smart speakers (projected to be in 75% of US households by 2020) presents a unique opportunity to identify this audible biomarker and connect unwitnessed cardiac arrest victims to emergency medical services (EMS) or others who can administer cardiopulmonary resuscitation." Cross-validation analysis of the trained classifier yielded an overall sensitivity and specificity of 97.24% and 99.51%, respectively.
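For readers unfamiliar with the two figures quoted above, the sketch below shows how sensitivity (true positive rate) and specificity (true negative rate) are computed from binary labels and predictions. The labels are made-up toy data, not the study's, and the function name is an assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = positive."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1          # positive correctly detected
        elif truth and not pred:
            fn += 1          # positive missed
        elif not truth and pred:
            fp += 1          # false alarm
        else:
            tn += 1          # negative correctly rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)

# Toy example (assumed data): 4 positive clips and 6 negative clips.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sensitivity:.2%}, specificity={specificity:.2%}")
```

In a detection setting like this one, high specificity matters because each false positive would trigger an unnecessary call for help, while high sensitivity limits missed events.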
Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems
Ghandeharioun, Asma, Shen, Judy Hanwen, Jaques, Natasha, Ferguson, Craig, Jones, Noah, Lapedriza, Agata, Picard, Rosalind
Building an open-domain conversational agent is a challenging problem. Current evaluation methods, mostly post-hoc judgments of single-turn evaluation, do not capture conversation quality in a realistic interactive context. In this paper, we investigate interactive human evaluation and provide evidence for its necessity; we then introduce a novel, model-agnostic, and dataset-agnostic method to approximate it. In particular, we propose a self-play scenario where the dialog system talks to itself and we calculate a combination of proxies such as sentiment and semantic coherence on the conversation trajectory. We show that this metric is capable of capturing the human-rated quality of a dialog model better than any automated metric known to date, achieving a significant Pearson correlation (r>.7, p<.05). To investigate the strengths of this novel metric and interactive evaluation in comparison to state-of-the-art metrics and one-turn evaluation, we perform extended experiments with a set of models, including several that make novel improvements to recent hierarchical dialog generation architectures through sentiment and semantic knowledge distillation on the utterance level. Finally, we open-source the interactive evaluation platform we built and the dataset we collected to allow researchers to efficiently deploy and evaluate generative dialog models.
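The Pearson correlation mentioned above measures how linearly an automated score tracks human ratings across models. The following is a minimal sketch of the coefficient itself; the score lists are invented for illustration and do not come from the paper, and pearson_r is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, each without the common 1/n factor,
    # which cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-model scores (assumed values, one entry per dialog model).
self_play_metric = [0.21, 0.35, 0.30, 0.52, 0.48]
human_rating = [2.1, 2.9, 2.6, 3.8, 3.4]

r = pearson_r(self_play_metric, human_rating)
print(f"r = {r:.3f}")
```

This computes only the coefficient; the significance level reported in the abstract would additionally require a hypothesis test on r given the number of models.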
Joint Detection of Malicious Domains and Infected Clients
Prasse, Paul, Knaebel, Rene, Machlica, Lukas, Pevny, Tomas, Scheffer, Tobias
Detection of malware-infected computers and detection of malicious web domains based on their encrypted HTTPS traffic are challenging problems, because only addresses, timestamps, and data volumes are observable. The detection problems are coupled, because infected clients tend to interact with malicious domains. Traffic data can be collected at a large scale, and antivirus tools can be used to identify infected clients in retrospect. Domains, by contrast, have to be labeled individually after forensic analysis. We explore transfer learning based on sluice networks; this allows the detection models to bootstrap each other. In a large-scale experimental study, we find that the model outperforms known reference models and detects previously unknown malware, previously unknown malware families, and previously unknown malicious domains.