AITopics | Performance Analysis

Many aspects of the design of efficient crowdsourcing processes, such as defining workers' bonuses, fair prices and time limits of the tasks, involve knowledge of the likely duration of the task at hand. In this work we introduce a new time-sensitive Bayesian aggregation method that simultaneously estimates a task's duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, uses latent variables to represent the uncertainty about the workers' completion time, the tasks' duration and the workers' accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task is completed within a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., no spammers) are expected to submit their judgments. In contrast, workers with a lower propensity to valid labelling, such as spammers, bots or lazy labellers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity to valid labelling of each worker, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task's duration compared to state-of-the-art methods.

confusion matrix, dataset, judgment, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.5175

AI Access Foundation

11016

Country:

Europe > Switzerland (0.05)
Asia > India (0.04)
North America > United States > Washington > King County > Redmond (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

Identifying Depression on Twitter

Nadeem, Moin

arXiv.org Machine Learning | Jul-25-2016

Social media has recently emerged as a premier method to disseminate information online. Through these online networks, tens of millions of individuals communicate their thoughts, personal experiences, and social ideals. We therefore explore the potential of social media to predict, even prior to onset, Major Depressive Disorder (MDD) in online personas. We employ a crowdsourced method to compile a list of Twitter users who profess to being diagnosed with depression. Using up to a year of prior social media postings, we utilize a Bag of Words approach to quantify each tweet. Lastly, we leverage several statistical classifiers to provide estimates of the risk of depression. Our work posits a new methodology for constructing our classifier by treating the problem as one of text classification, rather than behavioral analysis, on social media platforms. By using a corpus of 2.5M tweets, we achieved an 81% accuracy rate in classification, with a precision score of 0.86. We believe that this method may be helpful in developing tools that estimate the risk of an individual being depressed and that can be employed by physicians, concerned individuals, and healthcare agencies to aid in diagnosis, possibly even enabling those suffering from depression to be more proactive about recovering their mental health.
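
As a rough illustration of the Bag of Words step mentioned in the abstract, the sketch below turns short texts into word-count vectors over a shared vocabulary. The tokenizer, the function names and the example tweets are assumptions made for illustration; they are not taken from the paper and may differ from the author's pipeline.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed, deliberately simple tokenizer: lowercase, keep runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(docs):
    # Count the words in each document, then project the counts onto a sorted vocabulary.
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = [[c[w] for w in vocab] for c in counts]
    return vocab, vectors


# Hypothetical example tweets.
tweets = ["Feeling tired again today", "Great day, feeling great"]
vocab, vectors = bag_of_words(tweets)
```

Each tweet becomes a fixed-length vector that a statistical classifier can consume; word order is discarded, which is the defining simplification of the approach.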

artificial intelligence, machine learning, social media, (15 more...)

arXiv.org Machine Learning

1607.07384

Country: North America > United States > Maryland (0.14)

Genre: Research Report > New Finding (0.47)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Scalable Link Prediction in Dynamic Networks via Non-Negative Matrix Factorization

Zhu, Linhong, Guo, Dong, Yin, Junming, Ver Steeg, Greg, Galstyan, Aram

arXiv.org Artificial Intelligence | Jul-23-2016

We propose a scalable temporal latent space model for link prediction in dynamic social networks, where the goal is to predict links over time based on a sequence of previous graph snapshots. The model assumes that each user lies in an unobserved latent space and interactions are more likely to form between similar users in the latent space representation. In addition, the model allows each user to gradually move its position in the latent space as the network structure evolves over time. We present a global optimization algorithm to effectively infer the temporal latent space, with a quadratic convergence rate. Two alternative optimization algorithms with local and incremental updates are also proposed, allowing the model to scale to larger networks without compromising prediction accuracy. Empirically, we demonstrate that our model, when evaluated on a number of real-world dynamic networks, significantly outperforms existing approaches for temporal link prediction in terms of both scalability and predictive power.

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TKDE.2016.2591009

1411.3675

Country:

North America > United States (0.46)
Asia (0.46)

Genre: Research Report > New Finding (0.67)

Industry:

Education (0.46)
Health & Medicine (0.46)
Information Technology > Services (0.36)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(3 more...)

8 Tactics to Combat Imbalanced Classes in Your Machine Learning Dataset - Machine Learning Mastery

#artificialintelligence | Jul-22-2016, 15:20:14 GMT

You are working on your dataset. You create a classification model and get 90% accuracy immediately. You dive a little deeper and discover that 90% of the data belongs to one class. This is an example of an imbalanced dataset and the frustrating results it can cause. In this post you will discover the tactics that you can use to deliver great results on machine learning datasets with imbalanced data.
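
The 90% figure above can be reproduced with a tiny sketch: a "model" that always predicts the majority class scores 90% accuracy on a 90/10 split while never finding a single minority example. The data, the function names and the choice of recall as the second metric are assumptions for illustration, not taken from the post.

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    # Fraction of the actual `positive` examples that were predicted as `positive`.
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical imbalanced dataset: 90 examples of class 0, 10 of class 1.
y_true = [0] * 90 + [1] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, positive=1)
```

Here `acc` comes out at 0.9 while `minority_recall` is 0.0, which is why accuracy alone is a misleading score on imbalanced data.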

artificial intelligence, data mining, machine learning, (16 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.32)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.30)

Latent Variable Discovery Using Dependency Patterns

Zhang, Xuhui, Korb, Kevin B., Nicholson, Ann E., Mascaro, Steven

arXiv.org Machine Learning | Jul-22-2016

The causal discovery of Bayesian networks is an active and important research area, and it is based upon searching the space of causal models for those which can best explain a pattern of probabilistic dependencies shown in the data. However, some of those dependencies are generated by causal structures involving variables which have not been measured, i.e., latent variables. Some such patterns of dependency "reveal" themselves, in that no model based solely upon the observed variables can explain them as well as a model using a latent variable. That is what latent variable discovery is based upon. Here we search for such patterns systematically, so that they may be applied in latent variable discovery in a more rigorous fashion.

artificial intelligence, latent, machine learning, (14 more...)

arXiv.org Machine Learning

1607.06617

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Untangling AdaBoost-based Cost-Sensitive Classification. Part II: Empirical Analysis

Landesa-Vázquez, Iago, Alba-Castro, José Luis

arXiv.org Artificial Intelligence | Jul-22-2016

A lot of approaches, each following a different strategy, have been proposed in the literature to provide AdaBoost with cost-sensitive properties. In the first part of this series of two papers, we have presented these algorithms in a homogeneous notational framework, proposed a clustering scheme for them and performed a thorough theoretical analysis of those approaches with a fully theoretical foundation. The present paper, in order to complete our analysis, is focused on the empirical study of all the algorithms previously presented over a wide range of heterogeneous classification problems. The results of our experiments, confirming the theoretical conclusions, seem to reveal that the simplest approach, just based on cost-sensitive weight initialization, is the one showing the best and soundest results, despite having been recurrently overlooked in the literature.
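
To make the idea of cost-sensitive weight initialization concrete, here is a minimal sketch in which the initial example weights are made proportional to a per-class misclassification cost, instead of the uniform 1/n weights of standard AdaBoost. The cost values, labels and function name are hypothetical, and the exact scheme analysed in the paper may differ from this simplified version.

```python
def initial_weights(labels, cost):
    # Assumed scheme: each example's weight is proportional to the cost of
    # misclassifying its class, normalized so that the weights sum to 1.
    raw = [cost[y] for y in labels]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical data: one positive and three negatives; an error on a
# positive is assumed to cost three times as much as one on a negative.
labels = [1, -1, -1, -1]
cost = {1: 3.0, -1: 1.0}
weights = initial_weights(labels, cost)
```

With these costs the single positive example starts with half of the total weight, so the first weak learner is pushed to classify it correctly; the boosting rounds that follow are left unchanged.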

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1507.04126

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

An experiment in trying to predict Google rankings

#artificialintelligence | Jul-21-2016, 11:50:31 GMT

Machine learning is quickly becoming an indispensable tool for many large companies. Everyone has, for sure, heard about Google's AI algorithm beating the World Champion in Go, as well as technologies like RankBrain, but machine learning does not have to be a mystical subject relegated to the domain of math researchers. There are many approachable libraries and technologies that show promise of being very useful to any industry that has data to play with. Machine learning also has the ability to turn traditional website marketing and SEO on its head. Late last year, my colleagues and I (rather naively) began an experiment in which we threw several popular machine learning algorithms at the task of predicting ranking in Google. We ended up with an ensemble that achieved 41 percent true positive and 41 percent true negative on our data set.

algorithm, artificial intelligence, machine learning, (14 more...)

#artificialintelligence

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.05)
South America > Brazil (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

An experiment in trying to predict Google rankings

#artificialintelligence | Jul-21-2016, 01:51:18 GMT

Machine learning is quickly becoming an indispensable tool for many large companies. Everyone has, for sure, heard about Google's AI algorithm beating the World Champion in Go, as well as technologies like RankBrain, but machine learning does not have to be a mystical subject relegated to the domain of math researchers. There are many approachable libraries and technologies that show promise of being very useful to any industry that has data to play with. Machine learning also has the ability to turn traditional website marketing and SEO on its head. Late last year, my colleagues and I (rather naively) began an experiment in which we threw several popular machine learning algorithms at the task of predicting ranking in Google. We ended up with an ensemble that achieved 41 percent true positive and 41 percent true negative on our data set. In the following paragraphs, I will take you through our experiment, and I will also discuss a few important libraries and technologies that are important for SEOs to begin understanding.

algorithm, artificial intelligence, machine learning, (14 more...)

#artificialintelligence

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.05)
South America > Brazil (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Left/Right Hand Segmentation in Egocentric Videos

Betancourt, Alejandro, Morerio, Pietro, Barakova, Emilia, Marcenaro, Lucio, Rauterberg, Matthias, Regazzoni, Carlo

arXiv.org Artificial Intelligence | Jul-21-2016

Wearable cameras allow people to record their daily activities from a user-centered (First Person Vision) perspective. Due to their favorable location, wearable cameras frequently capture the hands of the user, and may thus represent a promising user-machine interaction tool for different applications. Existent First Person Vision methods handle hand segmentation as a background-foreground problem, ignoring two important facts: i) hands are not a single "skin-like" moving element, but a pair of interacting cooperative entities, ii) close hand interactions may lead to hand-to-hand occlusions and, as a consequence, create a single hand-like segment. These facts complicate a proper understanding of hand movements and interactions. Our approach extends traditional background-foreground strategies, by including a hand-identification step (left-right) based on a Maxwell distribution of angle and position. Hand-to-hand occlusions are addressed by exploiting temporal superpixels. The experimental results show that, in addition to a reliable left/right hand-segmentation, our approach considerably improves the traditional background-foreground hand-segmentation.

Keywords: Hand-Segmentation, Hand-identification, Egocentric Vision, First Person Vision

1. Introduction

The recent widespread availability of wearable devices has quickly attracted the interest of researchers, computer scientists and high-tech companies [1]. The 90's idea of a body-worn device that is always ready to be used is nowadays possible, and its potential applicability to real problems is evident. In general, the wearable sensor that most attracted researchers' attention is the video camera: while enjoying a unique position to record what the user is seeing, it suffers from important issues and technical challenges [2]. Images and videos recorded from this perspective are commonly referred to as First-Person Vision (FPV) or Egocentric videos [2].

artificial intelligence, machine learning, occlusion, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.cviu.2016.09.005

1607.06264

Country:

Europe (1.00)
North America > United States (0.68)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Media (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Introduction to Naive Bayes

#artificialintelligence | Jul-20-2016, 17:11:23 GMT

I think there's a rule somewhere that says "You can't call yourself a data scientist until you've used a Naive Bayes classifier". This article is my attempt at laying the groundwork for Naive Bayes in a practical and intuitive fashion. Let's start with a problem to motivate our formulation of Naive Bayes. Suppose we own a professional networking site similar to LinkedIn. Users sign up, type some information about themselves, and then roam the network looking for jobs/connections/etc. Until recently, we only required users to enter their current job title, but now we're asking them what industry they work in.
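
For the motivating problem above (guessing a user's industry from their job title), a minimal multinomial Naive Bayes sketch with add-one smoothing might look as follows. The training pairs, the industry names and the function names are made up for illustration and are not taken from the article.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    # examples: list of (job_title, industry) pairs.
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for title, industry in examples:
        words = title.lower().split()
        class_counts[industry] += 1
        word_counts[industry].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def predict(title, model):
    class_counts, word_counts, vocab = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c in class_counts:
        total = sum(word_counts[c].values())
        # Log prior plus the sum of log likelihoods (the "naive" independence assumption).
        score = math.log(class_counts[c] / n)
        for w in title.lower().split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical training data.
data = [
    ("software engineer", "tech"),
    ("data engineer", "tech"),
    ("staff nurse", "healthcare"),
    ("nurse practitioner", "healthcare"),
]
model = train(data)
guess = predict("senior nurse", model)
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and the add-one smoothing keeps a word that is missing from one class from driving that class's probability to zero.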

artificial intelligence, frequency, machine learning, (15 more...)

#artificialintelligence

Technology: