AITopics | Performance Analysis

Performance Analysis

Tighter bounds lead to improved classifiers

arXiv.org Machine Learning, Dec-28-2016

The standard approach to supervised classification involves the minimization of a log-loss as an upper bound to the classification error. While this is a tight bound early on in the optimization, it overemphasizes the influence of incorrectly classified examples far from the decision boundary. Updating the upper bound during the optimization leads to improved classification rates while transforming the learning into a sequence of minimization problems. In addition, in the context where the classifier is part of a larger system, this modification makes it possible to link the performance of the classifier to that of the whole system, allowing the seamless introduction of external constraints.
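
The bound mentioned in this abstract can be checked numerically. The sketch below is an illustration only, not the paper's method: it assumes a binary classifier whose signed margin is positive when the prediction is correct, and it uses the base-2 logistic loss, which equals 1 at a margin of 0. The function names are hypothetical.

```python
import math

def zero_one_loss(margin: float) -> float:
    """Classification error: 1 if misclassified (margin <= 0), else 0."""
    return 1.0 if margin <= 0 else 0.0

def log_loss_base2(margin: float) -> float:
    """Logistic loss log2(1 + exp(-margin)); equals 1 at margin 0."""
    # Adequate for the moderate margins used in this illustration.
    return math.log1p(math.exp(-margin)) / math.log(2)

# Negative margins are misclassified examples; large negative values
# are far from the decision boundary.
margins = [-10.0, -2.0, 0.0, 2.0, 10.0]
rows = [(m, zero_one_loss(m), log_loss_base2(m)) for m in margins]
for m, z, l in rows:
    print(f"margin={m:6.1f}  zero_one={z:.0f}  log_loss={l:.4f}")
```

The log-loss never drops below the zero-one loss, but for a badly misclassified example it grows roughly linearly with the distance to the boundary while the classification error stays at 1. That gap is the overemphasis the abstract describes.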

artificial intelligence, machine learning, optimization problem, (15 more...)

arXiv.org Machine Learning

1606.09202

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

A Sparse Nonlinear Classifier Design Using AUC Optimization

Kakkar, Vishal, Shevade, Shirish K., Sundararajan, S, Garg, Dinesh

arXiv.org Machine Learning, Dec-27-2016

AUC (Area under the ROC curve) is an important performance measure for applications where the data is highly imbalanced. Learning to maximize AUC performance is thus an important research problem. Using a max-margin based surrogate loss function, the AUC optimization problem can be approximated as a pairwise rankSVM learning problem. Batch learning methods for solving the kernelized version of this problem scale poorly and may not result in sparse classifiers. Recent years have witnessed an increased interest in the development of online or single-pass online learning algorithms that design a classifier by maximizing the AUC performance. On many real-world datasets, the AUC performance of nonlinear classifiers designed using online methods is not comparable with that of nonlinear classifiers designed using batch learning algorithms. Motivated by these observations, we design a scalable algorithm for maximizing AUC performance by greedily adding the required number of basis functions into the classifier model. The resulting sparse classifiers perform faster inference. Our experimental results show that the resulting model can be an order of magnitude smaller than the Kernel RankSVM model without much loss in AUC performance.
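
The pairwise view of AUC used in this abstract can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' algorithm: AUC is computed as the fraction of (positive, negative) pairs ranked correctly, and a hinge loss on the score difference serves as the max-margin surrogate. The function names and the example scores are hypothetical.

```python
from itertools import product

def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pairs = list(product(pos_scores, neg_scores))
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)

def pairwise_hinge(pos_scores, neg_scores, margin=1.0):
    """Mean hinge loss on score differences, a max-margin surrogate for 1 - AUC."""
    pairs = list(product(pos_scores, neg_scores))
    return sum(max(0.0, margin - (p - n)) for p, n in pairs) / len(pairs)

pos = [2.0, 0.5]   # classifier scores of positive examples (made up)
neg = [1.0, -1.0]  # classifier scores of negative examples (made up)

auc = pairwise_auc(pos, neg)
surrogate = pairwise_hinge(pos, neg)
print(auc, surrogate)  # 0.75 0.375
```

Three of the four pairs are ordered correctly, so the AUC is 0.75, and the hinge surrogate (0.375) upper-bounds the pairwise ranking error of 0.25. Enumerating all pairs costs time proportional to the product of the two class sizes, which hints at why batch pairwise methods scale poorly.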

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1612.08633

Country: Asia > India (0.15)

Genre: Research Report > New Finding (0.66)

Industry: Education (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

Distributed Real-Time Sentiment Analysis for Big Data Social Streams

Rahnama, Amir Hossein Akhavan

arXiv.org Machine Learning, Dec-27-2016

The big data trend has forced data-centric systems to handle continuous, fast data streams. In recent years, real-time analytics on stream data has grown into a new research field, which aims to answer queries about what is happening now with negligible delay. The real challenge with real-time stream data processing is that it is impossible to store instances of data, and therefore online analytical algorithms are used. To perform real-time analytics, data should be pre-processed so that only a short summary of the stream is stored in main memory. In addition, because instances arrive at high speed, the average processing time per instance must be low enough that incoming instances are not lost before being captured. Lastly, the learner needs to provide high analytical accuracy. Sentinel is a distributed system written in Java that aims to solve this challenge by distributing both the processing and the learning. Sentinel is built on top of Apache Storm, a distributed computing platform. Sentinel's learner, Vertical Hoeffding Tree, is a parallel decision tree learning algorithm based on the VFDT that enables parallel classification in distributed environments. Sentinel also uses SpaceSaving to keep a summary of the data stream and stores this summary in a synopsis data structure. The application of Sentinel to the Twitter Public Stream API is shown and the results are discussed.
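
The idea of keeping only a short summary of a stream can be sketched with the textbook form of the SpaceSaving algorithm. This is a minimal single-machine Python illustration, not Sentinel's Java implementation; the function name, the counter budget `k`, and the example stream are assumptions made for the sketch.

```python
def space_saving(stream, k):
    """Approximate item counts using at most k counters (textbook SpaceSaving)."""
    counts = {}
    for item in stream:
        if item in counts:
            # Already monitored: just increment its counter.
            counts[item] += 1
        elif len(counts) < k:
            # Free counter available: start monitoring the item.
            counts[item] = 1
        else:
            # Evict the item with the smallest counter; the newcomer
            # inherits that count plus one, so counts may overestimate.
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    return counts

stream = list("abacabadaeaf")  # "a" occurs 6 times out of 12
summary = space_saving(stream, k=3)
print(summary)
```

Memory stays bounded by `k` counters regardless of stream length. Counters never underestimate the true frequency of a monitored item, and any item occurring more than `len(stream) / k` times is guaranteed to remain in the summary.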

data mining, machine learning, real time system, (20 more...)

arXiv.org Machine Learning

doi: 10.1109/CoDIT.2014.6996998

1612.08543

Country: Europe > Finland (0.14)

Genre: Research Report (0.51)

Industry:

Information Technology (0.89)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(3 more...)

Data Science Has Been Using Rebel Statistics for a Long Time

@machinelearnbot, Dec-26-2016, 10:25:03 GMT

Many of those who call themselves statisticians just won't admit that data science heavily relies on and uses (heretical, rule-breaking) statistical science, or they don't recognize the true statistical nature of these data science techniques (some are 15 years old), or they are opposed to the modernization of their statistical arsenal. They already missed the train when machine learning became a popular discipline (also heavily based on statistics) more than 15 years ago. Now machine learning professionals, who are statistical practitioners working on problems such as clustering, far outnumber statisticians. Many times, I have interacted with statisticians who think that anyone not calling himself a statistician knows little or nothing about statistics; see my recent bio published here, or visit the LinkedIn profiles of many data scientists, to debunk this myth. Any statistical technique that is not in their old books is considered heretical at best, or non-statistical at worst, or, most of the time, not understood.

artificial intelligence, data mining, machine learning, (17 more...)

@machinelearnbot

Genre: Research Report > Experimental Study (0.49)

Industry: Information Technology (0.97)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
(2 more...)

Multi-Region Neural Representation: A novel model for decoding visual stimuli in human brains

Yousefnezhad, Muhammad, Zhang, Daoqiang

arXiv.org Machine Learning, Dec-26-2016

Multivariate Pattern (MVP) classification holds enormous potential for decoding visual stimuli in the human brain by employing task-based fMRI data sets. MVP techniques face a wide range of challenges, including decreasing noise and sparsity, defining effective regions of interest (ROIs), visualizing results, and the cost of brain studies. To overcome these challenges, this paper proposes a novel model of neural representation, which can automatically detect the active regions for each visual stimulus and then utilize these anatomical regions for visualizing and analyzing the functional activities. This model therefore lets neuroscientists ask what effect a stimulus has on each of the detected regions, instead of just studying the fluctuation of voxels in manually selected ROIs. Moreover, our method analyzes snapshots of brain images to decrease sparsity, rather than using the whole fMRI time series. Further, a new Gaussian smoothing method is proposed for removing voxel noise at the level of ROIs. The proposed method enables us to combine different fMRI data sets to reduce the cost of brain studies. Experimental studies on 4 visual categories (words, consonants, objects and nonsense photos) confirm that the proposed method achieves superior performance to state-of-the-art methods.

artificial intelligence, machine learning, snapshot, (17 more...)

arXiv.org Machine Learning

1612.08392

Country: Asia > China (0.14)

Genre: Research Report > Promising Solution (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

How Blockchains could transform Artificial Intelligence - Dataconomy

#artificialintelligence, Dec-25-2016, 00:05:11 GMT

This essay has described how blockchain technology can help AI, by drawing on my personal experiences in both AI and blockchain research.

artificial intelligence, blockchain, machine learning, (18 more...)

#artificialintelligence

Country:

Asia > China (0.04)
Europe (0.04)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance > Trading (0.94)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

4 Reasons Your Machine Learning Model is Wrong (and How to Fix It)

#artificialintelligence, Dec-23-2016, 20:55:36 GMT

There are a number of machine learning models to choose from. We can use Linear Regression to predict a value, Logistic Regression to classify distinct outcomes, and Neural Networks to model non-linear behaviors. When we build these models, we always use a set of historical data to help our machine learning algorithms learn the relationship between a set of input features and a predicted output. But even if a model can accurately predict a value from historical data, how do we know it will work as well on new data? Or more plainly, how do we evaluate whether a machine learning model is actually "good"?
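
One common way to approach the question of performance on new data is to hold some data out of training and score the model on it. The sketch below is a minimal illustration under stated assumptions, not the article's procedure: the data is synthetic (a line with slope 2 and intercept 1 plus a fixed alternating offset), every third point is held out, and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(model, xs, ys):
    """Mean squared error of a (slope, intercept) model on the given points."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic "historical" data: y = 2x + 1 with a fixed alternating offset.
xs = list(range(10))
ys = [2 * x + 1 + (0.5 if x % 2 == 0 else -0.5) for x in xs]

# Hold out every third point; the model never sees these during fitting.
train = [(x, y) for x, y in zip(xs, ys) if x % 3 != 0]
test = [(x, y) for x, y in zip(xs, ys) if x % 3 == 0]

model = fit_line(*zip(*train))
train_mse = mse(model, *zip(*train))
test_mse = mse(model, *zip(*test))
print(f"train MSE={train_mse:.3f}  held-out MSE={test_mse:.3f}")
```

A held-out error much larger than the training error would suggest the model has fit quirks of the historical data rather than the underlying relationship.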

artificial intelligence, machine learning, positive class, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.38)

Fast and Adaptive Sparse Precision Matrix Estimation in High Dimensions

Liu, Weidong, Luo, Xi

arXiv.org Machine Learning, Dec-22-2016

This paper proposes a new method for estimating sparse precision matrices in the high dimensional setting. It has been popular to study fast computation and adaptive procedures for this problem. We propose a novel approach, called Sparse Column-wise Inverse Operator, to address these two issues. We analyze an adaptive procedure based on cross validation, and establish its convergence rate under the Frobenius norm. The convergence rates under other matrix norms are also established. This method also enjoys the advantage of fast computation for large-scale problems, via a coordinate descent algorithm. Numerical merits are illustrated using both simulated and real datasets. In particular, it performs favorably on an HIV brain tissue dataset and an ADHD resting-state fMRI dataset.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Machine Learning

doi: 10.1016/J.Jmva.2014.11.005

1203.3896

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.37)

Robust Contextual Outlier Detection: Where Context Meets Sparsity

Liang, Jiongqian, Parthasarathy, Srinivasan

arXiv.org Artificial Intelligence, Dec-22-2016

Outlier detection is a fundamental data science task with applications ranging from data cleaning to network security. Given the fundamental nature of the task, this has been the subject of much research. Recently, a new class of outlier detection algorithms has emerged, called contextual outlier detection, and has shown improved performance when studying anomalous behavior in a specific context. However, as we point out in this article, such approaches have limited applicability in situations where the context is sparse (i.e. lacking a suitable frame of reference). Moreover, approaches developed to date do not scale to large datasets. To address these problems, here we propose a novel and robust alternative to the state of the art called RObust Contextual Outlier Detection (ROCOD). We utilize a local and global behavioral model based on the relevant contexts, which is then integrated in a natural and robust fashion. We also present several optimizations to improve the scalability of the approach. We run ROCOD on both synthetic and real-world datasets and demonstrate that it outperforms other competitive baselines on the axes of efficacy and efficiency (40X speedup compared to modern contextual outlier detection methods). We also drill down and perform a fine-grained analysis to shed light on the rationale for the performance gains of ROCOD and reveal its effectiveness when handling objects with sparse contexts.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1607.08329

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Beginners Tutorial on XGBoost and Parameter Tuning in R

#artificialintelligence, Dec-21-2016, 16:40:13 GMT

Last week, we learned about the Random Forest algorithm. Now we know that it helps us reduce a model's variance by building models on resampled data, thereby increasing its generalization capability.

artificial intelligence, classification, machine learning, (16 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.30)
