AITopics

1610.01712

Country: Asia > India > Maharashtra > Mumbai (0.24)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.71)

Industry: Health & Medicine > Therapeutic Area > Oncology > Head & Neck Cancer (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bax, Eric, Kooti, Farshad

Ensemble Validation: Selectivity has a Price, but Variety is Free

arXiv.org Machine Learning, Oct-4-2016

If classifiers are selected from a hypothesis class to form an ensemble, bounds on average error rate over the selected classifiers include a component for selectivity, which grows as the fraction of hypothesis classifiers selected for the ensemble shrinks, and a component for variety, which grows with the size of the hypothesis class or in-sample data set. We show that the component for selectivity asymptotically dominates the component for variety, meaning that variety is essentially free.

artificial intelligence, classifier, machine learning, (16 more...)

1610.01234

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.41)

#artificialintelligence, Oct-2-2016, 00:30:43 GMT

Predicting CTRs on Criteo's display ads – Experiments with Machine Learning

Before we dive into exploring and building various models to achieve our objective, we must zero in on a quality metric that'll help us compare them. The most natural choice for a quality metric in the case of a classification problem seems to be the 0–1 classification error/accuracy, i.e., the percentage of instances where our model predicted an incorrect/correct label. In our case, the labels would be click and no-click. The alternatives are the area under the ROC curve (AUC) and the log-loss. Since log-loss is the official metric recommended on Kaggle's website for this dataset, we're going to use it for our analysis.
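To make the difference between the two metrics concrete, here is a minimal sketch in plain Python. The function names, the 0.5 threshold, and the toy labels and probabilities are assumptions for illustration and do not come from the article or the Criteo dataset. It shows two sets of predictions that have identical 0–1 accuracy but different log-loss, because log-loss also rewards confidence in the correct label.

```python
import math

def accuracy(labels, probs, threshold=0.5):
    """Fraction of instances whose thresholded prediction matches the label."""
    correct = sum(1 for y, p in zip(labels, probs) if (p >= threshold) == (y == 1))
    return correct / len(labels)

def log_loss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

# Toy data (hypothetical): 1 = click, 0 = no-click.
labels = [1, 0, 0, 1]
confident = [0.9, 0.1, 0.2, 0.8]    # correct and confident
hesitant = [0.6, 0.4, 0.45, 0.55]   # correct but barely

for name, probs in [("confident", confident), ("hesitant", hesitant)]:
    print(name, accuracy(labels, probs), log_loss(labels, probs))
```

Both models classify every instance correctly, so accuracy cannot separate them, while the hesitant model gets the higher (worse) log-loss.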

artificial intelligence, machine learning, quality metric, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.42)

#artificialintelligence, Oct-1-2016, 19:15:54 GMT

MonkeyLearn - Explore the confusion matrix

The confusion matrix is a great way to visualize the performance of a classifier and detect false positives and false negatives within your data. Now you can click on the confusion matrix and check out which samples are causing the confusions, making it much easier to clean and curate the training data to improve classifiers.
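As a rough illustration of the idea (not MonkeyLearn's API, which is not shown here), the sketch below counts (actual, predicted) label pairs and then lists the samples behind one cell of the matrix. The function names and the toy samples are hypothetical.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

def confused_samples(samples, actual, predicted, actual_label, predicted_label):
    """Return the samples that fall into one cell of the confusion matrix."""
    return [s for s, a, p in zip(samples, actual, predicted)
            if a == actual_label and p == predicted_label]

# Toy data (hypothetical).
samples = ["great product", "terrible support", "not bad at all", "awful"]
actual = ["pos", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "neg"]

matrix = confusion_matrix(actual, predicted)
false_negatives = confused_samples(samples, actual, predicted, "pos", "neg")
print(matrix[("pos", "neg")], false_negatives)
```

Inspecting the samples in an off-diagonal cell, as in the last two lines, is the step that helps with cleaning and curating training data.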

artificial intelligence, confusion matrix, machine learning, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Ypsilantis, Petros-Pavlos, Montana, Giovanni

Recurrent Convolutional Networks for Pulmonary Nodule Detection in CT Imaging

arXiv.org Machine Learning, Sep-30-2016

Computed tomography (CT) generates a stack of cross-sectional images covering a region of the body. The visual assessment of these images for the identification of potential abnormalities is a challenging and time-consuming task due to the large amount of information that needs to be processed. In this article we propose a deep artificial neural network architecture, ReCTnet, for the fully automated detection of pulmonary nodules in CT scans. The architecture learns to distinguish nodules and normal structures at the pixel level and generates three-dimensional probability maps highlighting areas that are likely to harbour the objects of interest. Convolutional and recurrent layers are combined to learn expressive image representations exploiting the spatial dependencies across axial slices. We demonstrate that leveraging intra-slice dependencies substantially increases the sensitivity to detect pulmonary nodules without inflating the false positive rate. On the publicly available LIDC/IDRI dataset consisting of 1,018 annotated CT scans, ReCTnet reaches a detection sensitivity of 90.5% with an average of 4.5 false positives per scan. Comparisons with a competing multi-channel convolutional neural network for multi-slice segmentation and other published methodologies using the same dataset provide evidence that ReCTnet offers significant performance gains.

artificial intelligence, machine learning, nodule, (19 more...)

1609.09143

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, Sep-30-2016

Turing learning: a metric-free approach to inferring behavior and its application to swarms

Li, Wei, Gauci, Melvin, Gross, Roderich

We propose Turing Learning, a novel system identification method for inferring the behavior of natural or artificial systems. Turing Learning simultaneously optimizes two populations of computer programs, one representing models of the behavior of the system under investigation, and the other representing classifiers. By observing the behavior of the system as well as the behaviors produced by the models, two sets of data samples are obtained. The classifiers are rewarded for discriminating between these two sets, that is, for correctly categorizing data samples as either genuine or counterfeit. Conversely, the models are rewarded for 'tricking' the classifiers into categorizing their data samples as genuine. Unlike other methods for system identification, Turing Learning does not require predefined metrics to quantify the difference between the system and its models. We present two case studies with swarms of simulated robots and prove that the underlying behaviors cannot be inferred by a metric-based system identification method. By contrast, Turing Learning infers the behaviors with high accuracy. It also produces a useful by-product - the classifiers - that can be used to detect abnormal behavior in the swarm. Moreover, we show that Turing Learning also successfully infers the behavior of physical robot swarms. The results show that collective behaviors can be directly inferred from motion trajectories of individuals in the swarm, which may have significant implications for the study of animal collectives. Furthermore, Turing Learning could prove useful whenever a behavior is not easily characterizable using metrics, making it suitable for a wide range of applications.

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

doi: 10.1007/s11721-016-0126-1

1603.04904

Country: North America > United States > New Jersey (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Blalock, Davis W., Guttag, John V.

EXTRACT: Strong Examples from Weakly-Labeled Sensor Data

arXiv.org Machine Learning, Sep-29-2016

Thanks to the rise of wearable and connected devices, sensor-generated time series comprise a large and growing fraction of the world's data. Unfortunately, extracting value from this data can be challenging, since sensors report low-level signals (e.g., acceleration), not the high-level events that are typically of interest (e.g., gestures). We introduce a technique to bridge this gap by automatically extracting examples of real-world events in low-level data, given only a rough estimate of when these events have taken place. By identifying sets of features that repeat in the same temporal arrangement, we isolate examples of such diverse events as human actions, power consumption patterns, and spoken words with up to 96% precision and recall. Our method is fast enough to run in real time and assumes only minimal knowledge of which variables are relevant or the lengths of events. Our evaluation uses numerous publicly available datasets and over 1 million samples of manually labeled sensor data.

artificial intelligence, machine learning, time series, (17 more...)

1609.09196

Country: North America > United States > New York (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

#artificialintelligence, Sep-27-2016, 18:32:45 GMT

Kaggle Ensembling Guide

Model ensembling is a very powerful technique to increase accuracy on a variety of ML tasks. In this article I will share my ensembling approaches for Kaggle Competitions. For the first part we look at creating ensembles from submission files. The second part will look at creating ensembles through stacked generalization/blending. I answer why ensembling reduces the generalization error. Finally I show different methods of ensembling, together with their results and code to try it out for yourself. "This is how you win ML competitions: you take other peoples' work and ensemble them together." The most basic and convenient way to ensemble is to ensemble Kaggle submission CSV files. You only need the predictions on the test set for these methods -- no need to retrain a model. This makes it a quick way to ensemble already existing model predictions, ideal when teaming up. Let's see why model ensembling reduces error rate and why it works better to ensemble low-correlated model ...
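One simple way to combine existing predictions is a majority vote. The sketch below is not the guide's own code: the function names are hypothetical, submissions are represented as in-memory dictionaries rather than CSV files, and the accuracy calculation assumes an odd number of fully independent models, which is an idealization.

```python
from collections import Counter
from math import comb

def majority_vote(submissions):
    """Combine label predictions; each submission maps a test id to a label."""
    ensemble = {}
    for test_id in submissions[0]:
        votes = Counter(sub[test_id] for sub in submissions)
        ensemble[test_id] = votes.most_common(1)[0][0]
    return ensemble

def majority_accuracy(p, n):
    """Accuracy of a majority vote over n independent models (n odd),
    each correct with probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# Three toy submissions (hypothetical) over test ids 1..3.
sub_a = {1: 1, 2: 0, 3: 1}
sub_b = {1: 1, 2: 1, 3: 0}
sub_c = {1: 0, 2: 0, 3: 1}
ensemble = majority_vote([sub_a, sub_b, sub_c])
print(ensemble)  # {1: 1, 2: 0, 3: 1}

# Three independent models at 70% accuracy: 0.7**3 + 3 * 0.7**2 * 0.3
print(round(majority_accuracy(0.7, 3), 3))  # 0.784
```

Under the independence assumption the vote is right whenever at least two of the three models are right, which is why the ensemble beats each member. Correlated models make the same mistakes together, so the gain shrinks as correlation rises.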

artificial intelligence, machine learning, prediction, (18 more...)

Country: North America > United States > Hawaii (0.04)

Genre: Contests & Prizes (0.34)

Industry: Media > Film (0.47)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

#artificialintelligence, Sep-27-2016, 11:35:31 GMT

How machine learning can help the security industry

Machine learning (ML) is such a hot area in security right now. At the 2016 RSA Conference, you would be hard pressed to find a company that is not claiming to use ML for security. To the layperson, ML seems like the magic solution to all security problems. Take a bunch of unlabeled data, pump it through a system with some ML magic inside, and it can somehow identify patterns even human experts can't find -- all while learning and adapting to new behaviors and threats. Rather than having to code the rules, these systems can discover the rules all by themselves.

artificial intelligence, authentication, machine learning, (9 more...)

Industry: Information Technology > Security & Privacy (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.33)

#artificialintelligence, Sep-26-2016, 04:50:33 GMT

Machine Learning: Filtering Email for Spam or Ham - Code School Blog

You may have seen our previous posts on machine learning -- specifically, how to let your code learn from text and working with stop words, stemming, and spam. So today, we're going to build our machine learning-based spam filter, using the tools we walked through in those posts: a tokenizer, a stemmer, and a naive Bayes classifier. We are going to work with the bluebird promise library here, so if you are not used to promises, please take a look at the bluebird API reference. Before we begin, it's important to have good training data. You can download some here -- we are interested in two.
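The pipeline described above can be sketched as follows. This is not the post's code, which is written in JavaScript with bluebird: it is a Python illustration with a hypothetical regex tokenizer, no stemming, add-one (Laplace) smoothing, and invented training messages.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (no stemming here)."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    def __init__(self):
        self.word_counts = {}        # label -> Counter of token counts
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus log likelihood with add-one smoothing.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data (hypothetical).
clf = NaiveBayes()
clf.train("win free money now", "spam")
clf.train("free prize claim now", "spam")
clf.train("meeting at noon tomorrow", "ham")
clf.train("lunch with the team tomorrow", "ham")

print(clf.classify("claim your free money"))
print(clf.classify("team meeting tomorrow"))
```

Working in log space avoids numeric underflow on long messages, and the smoothing keeps a single unseen word from driving a class probability to zero.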

artificial intelligence, email, machine learning, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.56)