 Performance Analysis


A comparison of apartment rent price prediction using a large dataset: Kriging versus DNN

arXiv.org Machine Learning

The hedonic approach based on a regression model has been widely adopted for the prediction of real estate property price and rent. In particular, a spatial regression technique called Kriging, a method of interpolation that was advanced in the field of spatial statistics, is known to enable high-accuracy prediction in light of the spatial dependence of real estate property data. Meanwhile, there has been a rapid increase in machine learning-based prediction using large (big) datasets, and its effectiveness has been demonstrated in previous studies. However, no studies have shown the extent to which predictive accuracy differs between Kriging and machine learning techniques using big data. Thus, this study compares the predictive accuracy of apartment rent price in Japan between the nearest neighbor Gaussian processes (NNGP) model, which enables application of Kriging to big data, and the deep neural network (DNN), a representative machine learning technique, with a particular focus on the data sample size (n = 10^4, 10^5, 10^6) and differences in predictive performance. Our analysis showed that, with an increase in sample size, the out-of-sample predictive accuracy of DNN approached that of NNGP and they were nearly equal on the order of n = 10^6. Furthermore, it is suggested that, for both higher and lower end properties whose rent price deviates from the median, DNN may have a higher predictive accuracy than NNGP.


A Review of Statistical Learning Machines from ATR to DNA Microarrays: design, assessment, and advice for practitioners

arXiv.org Machine Learning

Statistical Learning is the process of estimating an unknown probabilistic input-output relationship of a system using a limited number of observations; and a statistical learning machine (SLM) is the machine that learned such a process. While their roots grow deeply in Probability Theory, SLMs are ubiquitous in the modern world. Automatic Target Recognition (ATR) in military applications, Computer Aided Diagnosis (CAD) in medical imaging, DNA microarrays in Genomics, Optical Character Recognition (OCR), Speech Recognition (SR), spam email filtering, stock market prediction, etc., are a few examples and applications of SLMs; diverse fields but one theory. The field of Statistical Learning can be decomposed into two basic subfields, Design and Assessment. Three main groups of specializations, namely statisticians, engineers, and computer scientists (ordered ascendingly by programming capabilities and descendingly by mathematical rigor), exist on the venue of this field and each takes its elephant bite. Exaggerated rigorous analysis of statisticians sometimes keeps them from considering new ML techniques and methods that as yet have no "complete" mathematical theory. On the other hand, immoderate ad hoc simulations of computer scientists sometimes drive them towards unjustified and immature results. A prudent approach is needed that has enough flexibility to utilize simulations and trial and error without sacrificing any rigor. If this prudent attitude is necessary for this field, it is necessary, as well, in other fields of Engineering.


Test-Driven Machine Learning

#artificialintelligence

First, before I start, I want to say something about what that is, or what I understand from this. So, here is one interpretation. It is about using data, obviously. So, it has relationships to analytics and data science, and it is, obviously, part of AI in some way. This is my little taxonomy, how I see things linking together. You have computer science, and that has subfields like AI and software engineering, and machine learning is typically considered to be a subfield of AI, but a lot of principles of software engineering apply in this area. This is what I want to talk about today. It's heavily used in data science. So, the difference between AI and data science is somewhat fluid if you like, but data science tries to understand what's in data and tries to understand questions about data. But then it tries to use this to make decisions, and then we are back at AI, artificial intelligence, where it's mostly about automating decision making. We have a couple of definitions. AI means using intelligence, making machines intelligent, and that means you can somehow function appropriately in an environment with foresight. Machine learning is a field that looks for algorithms that can automatically improve their performance without explicit programming, but by observing relevant data. And yes, I've thrown in data science as well for good measure, the scientific process of turning data into insight for making better decisions. If you have opened any newspaper, you must have seen the discussion around the ethical dimensions of artificial intelligence, machine learning or data science. Testing touches on that as well because there are quite a few problems in that space, and I'm just listing two here. So, you use data, obviously, to do machine learning. Where does this data come from, and are you allowed to use it? Do you violate any privacy laws, or are you building models that you use to make decisions about people?
If you do that, then the General Data Protection Regulation (GDPR) in the EU says you have to be able to explain to an individual if you're making a decision based on an algorithm or a machine, if this decision is of any kind of significant impact. That means, in machine learning, a lot of models are already out of the door because you can't do that. You can't explain why a certain decision comes out of a machine learning model if you use particular models.


Applying AI: getting underneath machine learning – Avira Insights

#artificialintelligence

If you attended last year's RSA conference, you may have left with the idea that all you needed to build a complete cyber-security solution was a machine learning engine (or better yet, "advanced next-gen Artificial Intelligence"). Every cyber-security company uses machine learning (or AI) because it is a powerful technique for malware analysis. But it is by no means the only one. Applied naïvely, it may not even work effectively. Sometimes, a powerful scanning engine is all that is required (it's 'cheap'), or even just a great database of known malware hashes (it's fast).


LIAAD at SemDeep-5 Challenge: Word-in-Context (WiC)

arXiv.org Artificial Intelligence

In particular, it focuses on polysemous words which have been hard to represent as embeddings due to the meaning conflation deficiency (Camacho-Collados and Pilehvar, 2018). The task's objective is to detect if target words occurring in a pair of sentences carry the same meaning.

LMMS has two useful properties: 1) uses contextual word embeddings to produce sense embeddings, and 2) covers a large set of over 117K senses from WordNet 3.0. The first property allows for comparing precomputed sense embeddings against contextual word embeddings generated ...
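The idea of comparing precomputed sense embeddings against contextual word embeddings can be illustrated with a nearest-neighbor lookup by cosine similarity. The following is only a minimal sketch under stated assumptions: the sense labels and the three-dimensional vectors are invented toy values, and the excerpt does not say that the LIAAD system decides the Word-in-Context task exactly this way.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def nearest_sense(context_vec, sense_vecs):
    """Return the label of the sense embedding closest to the context embedding."""
    return max(sense_vecs, key=lambda label: cosine(context_vec, sense_vecs[label]))

# Hypothetical precomputed sense embeddings (toy values, not real LMMS vectors).
sense_vecs = {
    "bank%finance": [0.9, 0.1, 0.0],
    "bank%river": [0.1, 0.9, 0.2],
}

# Hypothetical contextual embeddings of the target word in two sentences.
ctx_1 = [0.8, 0.2, 0.1]
ctx_2 = [0.2, 0.7, 0.3]

# One possible decision rule: same meaning if both contexts map to the same sense.
same_meaning = nearest_sense(ctx_1, sense_vecs) == nearest_sense(ctx_2, sense_vecs)
```

With these toy vectors the two contexts map to different senses, so the pair would be labeled as not carrying the same meaning.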


Assessing the Applicability of Authorship Verification Methods

arXiv.org Machine Learning

Authorship verification (AV) is a research subject in the field of digital text forensics that concerns itself with the question of whether two documents have been written by the same person. During the past two decades, an increasing number of proposed AV approaches can be observed. However, a closer look at the respective studies reveals that the underlying characteristics of these methods are rarely addressed, which raises doubts regarding their applicability in real forensic settings. The objective of this paper is to fill this gap by proposing clear criteria and properties that aim to improve the characterization of existing and future AV approaches. Based on these properties, we conduct three experiments using 12 existing AV approaches, including the current state of the art. The examined methods were trained, optimized and evaluated on three self-compiled corpora, where each corpus focuses on a different aspect of applicability. Our results indicate that some of the methods are able to cope with very challenging verification cases such as 250-character-long informal chat conversations (72.7% accuracy) or cases in which two scientific documents were written at different times with an average difference of 15.6 years (> 75% accuracy). However, we also identified that all involved methods are prone to cross-topic verification cases.


Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications

arXiv.org Artificial Intelligence

Fault detection in sensor nodes is a pertinent issue that has been an important area of research for a very long time. But it has not yet been explored much in the context of the Internet of Things. The Internet of Things works with a massive amount of data, so the responsibility for guaranteeing the accuracy of the data also lies with it. Moreover, a lot of important and critical decisions are made based on these data, so ensuring their correctness and accuracy is also very important. Also, the detection needs to be as precise as possible to avoid negative alerts. For this purpose, this work has adopted the Dempster-Shafer Theory of Evidence, which is a popular learning method, to collate the information from sensors and come up with a decision regarding the faulty status of a sensor node. To verify the validity of the proposed method, simulations have been performed on a benchmark data set and on data collected through a test bed in a laboratory set-up. For the different types of faults, the proposed method shows very competent accuracy for both the benchmark (99.8%) and laboratory data sets (99.9%) when compared to the other state-of-the-art machine learning techniques.
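To make the fusion step more concrete, here is a minimal sketch of the standard Dempster rule of combination applied to two sensors' beliefs about a node's status. It is an illustration only: the hypothesis names and the mass values are invented, and the abstract does not describe how the paper itself assigns masses or structures its fusion.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses in each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; combination is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}

# Hypothetical frame of discernment for one sensor node.
FAULTY = frozenset({"faulty"})
OK = frozenset({"ok"})
EITHER = FAULTY | OK  # mass left uncommitted ("don't know")

# Invented evidence from two sources about the same node.
sensor_a = {FAULTY: 0.6, OK: 0.1, EITHER: 0.3}
sensor_b = {FAULTY: 0.7, OK: 0.1, EITHER: 0.2}

fused = combine(sensor_a, sensor_b)
```

In this toy case both sources lean towards "faulty", so the fused mass on that hypothesis ends up higher than either source assigned on its own, while the uncommitted mass shrinks.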


Scientists develop artificial intelligence system to detect cardiac arrest in sleep

#artificialintelligence

Washington: Scientists have developed a new artificial intelligence (AI) system to monitor people for cardiac arrest while they are asleep without touching them. People experiencing cardiac arrest will suddenly become unresponsive and either stop breathing or gasp for air, a sign known as agonal breathing, said researchers at the University of Washington (UW) in the US. A new skill for a smart speaker -- like Google Home and Amazon Alexa -- or smartphone lets the device detect the gasping sound of agonal breathing and call for help. Immediate cardiopulmonary resuscitation (CPR) can double or triple someone's chance of survival, but that requires a bystander to be present. CPR is an emergency procedure that combines chest compressions often with artificial ventilation in an effort to manually preserve intact brain function. Recent research suggests that one of the most common locations for an out-of-hospital cardiac arrest is in a patient's bedroom, where no one is likely around or awake to respond and provide care.


7 Steps to Mastering Intermediate Machine Learning with Python -- 2019 Edition

#artificialintelligence

Are you interested in learning more about machine learning with Python? I recently wrote 7 Steps to Mastering Basic Machine Learning with Python -- 2019 Edition, a first step in an attempt to update a pair of posts I wrote some time back (7 Steps to Mastering Machine Learning With Python and 7 More Steps to Mastering Machine Learning With Python), a pair of posts which are getting stale at this point, having been around for a few years. It's time to add on to the "basic" post with a set of steps for learning "intermediate" level machine learning with Python. We're talking "intermediate" in a relative sense, however, so do not expect to be a research-caliber machine learning engineer after getting through this post. The learning path is aimed at those with some understanding of programming, computer science concepts, and/or machine learning in an abstract sense, who want to be able to use the implementations of machine learning algorithms of the prevalent Python libraries to build their own machine learning models.

