AITopics | Accuracy

Accuracy

Implementing your own k-nearest neighbour algorithm using Python

#artificialintelligence Mar-23-2016, 04:15:43 GMT

In machine learning, you may often wish to build predictors that classify things into categories based on some set of associated values. For example, it is possible to provide a diagnosis to a patient based on data from previous patients. Many algorithms have been developed for automated classification; common ones include random forests, support vector machines, Naïve Bayes classifiers, and many types of neural networks. To get a feel for how classification works, we take a simple example of a classification algorithm, k-Nearest Neighbours (kNN), and build it from scratch in Python 2. If you are just starting with Python, you can keep things simple by using a mostly imperative style of coding rather than a declarative/functional one with lambda functions and list comprehensions. Here, we provide an introduction to the latter approach.
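As a rough illustration of the idea, here is a minimal kNN sketch in the functional style the summary mentions (a lambda as the sort key, generator expressions for the distance and the vote). It is not the article's code: the function names `euclidean` and `knn_predict`, the choice of Euclidean distance, the default `k=3`, and the toy data are all assumptions made for this example, and it is written for Python 3 rather than Python 2.

```python
from collections import Counter
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    training points. `train` is a list of (features, label) pairs."""
    # Sort training points by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda pair: euclidean(pair[0], query))[:k]
    # Count the labels of those neighbours and return the most common one.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labelled "a" and "b".
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.1, 5.0)))
```

Sorting the whole training set is the simplest thing that works for small data; for larger sets a partial selection such as `heapq.nsmallest` would avoid the full sort.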

artificial intelligence, machine learning, neighbour, (15 more...)

Genre: Overview (0.69)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Predicting litigation likelihood and time to litigation for patents

Wongchaisuwat, Papis, Klabjan, Diego, McGinnis, John O.

arXiv.org Machine Learning Mar-23-2016

Patent lawsuits are costly and time-consuming. The ability to forecast patent litigation and the time to litigation allows companies to better allocate budget and time in managing their patent portfolios. We develop predictive models for estimating the likelihood of litigation for patents and the expected time to litigation based on both textual and non-textual features. Our work focuses on improving the state of the art by relying on a different set of features and employing more sophisticated algorithms with more realistic data. The rate of patent litigation is very low, which makes the problem difficult. The initial model for predicting the likelihood is further modified to capture a time-to-litigation perspective.

data mining, machine learning, patent, (18 more...)

arXiv.org Machine Learning

1603.07394

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry:

Law > Litigation (1.00)
Government > Regional Government > North America Government > United States Government (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.97)
Information Technology > Data Science > Data Mining (0.90)

Is your Classification Model making lucky guesses?

#artificialintelligence Mar-22-2016, 21:15:50 GMT

At the heart of a classification model is the ability to assign a class to an object based on its description or features. When we build a classification model, we often have to prove that the model we built is significantly better than random guessing. How do we know if our machine learning model performs better than a classifier built by assigning labels or classes arbitrarily (through random guessing, weighted guessing, etc.)? I will call the latter non-machine-learning classifiers, as they do not learn from the data. A machine learning classifier should be smarter and should not be making just lucky guesses!
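To make the comparison concrete, the sketch below computes the expected accuracy of a few non-learning classifiers from the class frequencies alone. It is an illustration under stated assumptions, not the article's method: the function name `baseline_accuracies` and the toy labels are hypothetical, and the majority-class baseline is an addition not named in the summary. A uniform random guess over k classes is right with probability 1/k, and a guess weighted by the class proportions p_i is right with probability equal to the sum of p_i squared.

```python
from collections import Counter


def baseline_accuracies(labels):
    """Expected accuracy of three non-learning classifiers on `labels`."""
    n = len(labels)
    proportions = [count / n for count in Counter(labels).values()]
    return {
        # Pick each class with equal probability.
        "random_guess": 1.0 / len(proportions),
        # Pick each class with probability equal to its observed frequency.
        "weighted_guess": sum(p * p for p in proportions),
        # Always predict the most frequent class (assumption: not in the summary).
        "majority_class": max(proportions),
    }


# Hypothetical imbalanced data set: 90 negatives, 10 positives.
labels = ["neg"] * 90 + ["pos"] * 10
for name, accuracy in baseline_accuracies(labels).items():
    print(name, round(accuracy, 2))
```

On imbalanced data like this, a trained model that reports 85% accuracy would still be worse than always predicting the majority class, which is why the baseline matters.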

artificial intelligence, machine learning, probability, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)

A Gentle Guide to Machine Learning | MonkeyLearn Blog

#artificialintelligence Mar-20-2016, 16:16:01 GMT

Machine Learning is a subfield within Artificial Intelligence that builds algorithms that allow computers to learn to perform tasks from data instead of being explicitly programmed. We can make machines learn to do things! The first time I heard that, it blew my mind. That means that we can program computers to learn things by themselves! The ability to learn is one of the most important aspects of intelligence. Translating that power to machines sounds like a huge step towards making them more intelligent. And in fact, Machine Learning is the area that is making most of the progress in Artificial Intelligence today; it is a trendy topic right now and is advancing the possibility of more intelligent machines.

artificial intelligence, inductive learning, machine learning, (14 more...)

Industry: Leisure & Entertainment > Games (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Extracting Predictive Information from Heterogeneous Data Streams using Gaussian Processes

Ghoshal, Sid, Roberts, Stephen

arXiv.org Machine Learning Mar-20-2016

Financial markets are notoriously complex environments, presenting vast amounts of noisy, yet potentially informative data. We consider the problem of forecasting financial time series from a wide range of information sources using online Gaussian Processes with Automatic Relevance Determination (ARD) kernels. We measure the performance gain, quantified in terms of Normalised Root Mean Square Error (NRMSE), Median Absolute Deviation (MAD) and Pearson correlation, from fusing each of four separate data domains: time series technicals, sentiment analysis, options market data and broker recommendations. We show evidence that ARD kernels produce meaningful feature rankings that help retain salient inputs and reduce input dimensionality, providing a framework for sifting through financial complexity. We measure the performance gain from fusing each domain's heterogeneous data streams into a single probabilistic model. In particular, our findings highlight the critical value of options data in mapping out the curvature of price space and inspire an intuitive, novel direction for research in financial prediction.

correlation, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

1603.06202

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.47)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

A Probabilistic Machine Learning Approach to Detect Industrial Plant Faults

Xiao, Wei

arXiv.org Machine Learning Mar-18-2016

Fault detection in industrial plants is a hot research area as more and more sensor data are being collected throughout the industrial process. Automatic data-driven approaches are widely needed and seen as a promising area of investment. This paper proposes an effective machine learning algorithm to predict industrial plant faults based on classification methods such as penalized logistic regression, random forest and gradient boosted tree. A fault's start time and end time are predicted sequentially in two steps by formulating the original prediction problems as classification problems. The algorithms described in this paper won first place in the Prognostics and Health Management Society 2015 Data Challenge.

artificial intelligence, machine learning, start time, (18 more...)

arXiv.org Machine Learning

1603.0577

Country: North America > United States (1.00)

Genre: Research Report (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Combining Human and Artificial Intelligence for Analyzing Health Data

Duhaime, Erik P. (Massachusetts Institute of Technology)

AAAI Conferences Mar-16-2016

Artificial intelligence (AI) systems are increasingly capable of analyzing health data such as medical images (e.g., skin lesions) and test results (e.g., ECGs). However, because it can be difficult to determine when an AI-generated diagnosis should be trusted and acted upon—especially when it conflicts with a human-generated one—many AI systems are not utilized effectively, if at all. Similarly, advances in information technology have made it possible to quickly solicit multiple diagnoses from diverse groups of people throughout the world, but these technologies are underutilized because it is difficult to determine which of multiple diagnoses should be trusted and acted upon. Here, I propose a method of soliciting and combining multiple diagnoses that will harness the collective intelligence of both human and artificial intelligence for analyzing health data.

accuracy, combining human and artificial intelligence, diagnosis, (9 more...)

AAAI Conferences

2016 AAAI Spring Symposium Series

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.05)
North America > United States > Massachusetts > Middlesex County > Reading (0.05)

Industry:

Health & Medicine > Diagnostic Medicine (0.50)
Health & Medicine > Therapeutic Area > Dermatology (0.35)

Technology:

Information Technology > Biomedical Informatics > Clinical Informatics (0.83)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)

Bias Correction for Regularized Regression and its Application in Learning with Streaming Data

Wu, Qiang

arXiv.org Machine Learning Mar-15-2016

We propose an approach to reduce the bias of ridge regression and the regularization kernel network. When applied to a single data set, the new algorithms have learning performance comparable to that of the original ones. When applied to incremental learning with block-wise streaming data, the new algorithms are more efficient due to bias reduction. Both theoretical characterizations and simulation studies are used to verify the effectiveness of these new algorithms.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1603.04882

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)

Novel Feature Extraction, Selection and Fusion for Effective Malware Family Classification

Ahmadi, Mansour, Ulyanov, Dmitry, Semenov, Stanislav, Trofimov, Mikhail, Giacinto, Giorgio

arXiv.org Artificial Intelligence Mar-10-2016

Modern malware is designed with mutation characteristics, namely polymorphism and metamorphism, which cause an enormous growth in the number of variants of malware samples. Categorization of malware samples on the basis of their behaviors is essential for the computer security community, because it receives a huge number of malware samples every day, and the signature extraction process is usually based on malicious parts characterizing malware families. Microsoft released a malware classification challenge in 2015 with a huge dataset of nearly 0.5 terabytes of data, containing more than 20K malware samples. The analysis of this dataset inspired the development of a novel paradigm that is effective in categorizing malware variants into their actual family groups. This paradigm is presented and discussed in the present paper, where emphasis has been given to the phases related to the extraction and selection of a set of novel features for the effective representation of malware samples. Features can be grouped according to different characteristics of malware behavior, and their fusion is performed according to a per-class weighting paradigm. The proposed method achieved a very high accuracy ($\approx$ 0.998) on the Microsoft Malware Challenge dataset.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1511.04317

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Selective Inference Approach for Statistically Sound Predictive Pattern Mining

Suzumura, Shinya, Nakagawa, Kazuya, Sugiyama, Mahito, Tsuda, Koji, Takeuchi, Ichiro

arXiv.org Machine Learning Mar-9-2016

Discovering statistically significant patterns from databases is an important and challenging problem. The main obstacle is the difficulty of taking into account the selection bias, i.e., the bias arising from the fact that patterns are selected from an extremely large number of candidates in databases. In this paper, we introduce a new approach for predictive pattern mining problems that can address the selection bias issue. Our approach is built on a recently popularized statistical inference framework called selective inference. In selective inference, statistical inferences (such as statistical hypothesis testing) are conducted based on sampling distributions conditional on a selection event. If the selection event is characterized in a tractable way, statistical inferences can be made without concern for the selection bias issue. However, in pattern mining problems, it is difficult to characterize the entire selection process of mining algorithms. Our main contribution in this paper is to solve this challenging problem for a class of predictive pattern mining problems by introducing a novel algorithmic framework. We demonstrate that our approach is useful for finding statistically significant patterns from databases.

data mining, machine learning, pattern recognition, (16 more...)

arXiv.org Machine Learning

1602.04601

Genre: Research Report > Experimental Study (0.32)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)
