 Performance Analysis


Blockchains for Artificial Intelligence – The BigchainDB Blog

#artificialintelligence

[... it was first published on Dataconomy on Dec 21, 2016; I'm reposting here for ease of access. In May 2017 I gave an updated talk; here are the slides & video.] In recent years, AI (artificial intelligence) researchers have finally cracked problems that they've worked on for decades, from Go to human-level speech recognition. A key piece was the ability to gather and learn on mountains of data, which pulled error rates past the success line. In short, big data has transformed AI, to an almost unreasonable level. Blockchain technology could transform AI too, in its own particular ways. Some applications of blockchains to AI are mundane, like audit trails on AI models. Some appear almost unreasonable, like AI that can own itself -- AI DAOs. All of them are opportunities. This article will explore these applications. Before we discuss applications, let's first review what's different about blockchains compared to traditional big-data distributed databases like MongoDB.


Practical Naive Bayes -- Classification of Amazon Reviews

@machinelearnbot

If you search around the internet for how to apply Naive Bayes classification to text, you'll find a ton of articles that talk about the intuition behind the algorithm, maybe some slides from a lecture about the math and some notation behind it, and a bunch of articles I'm not going to link here that pretty much just paste some code and call it an explanation. So I'm going to try to do a little more here: by writing and explaining enough, I hope to let you write a working Naive Bayes classifier yourself. There are three sections here. First is setup, and what format I'm expecting your text to be in for the classification. Second, I'll talk about how to run naive Bayes on your own, using slow Python data structures.
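As a rough illustration of what "naive Bayes with slow Python data structures" can look like, here is a minimal sketch using only dictionaries and counters. It is not the article's code: the function names `train` and `predict`, the whitespace tokenizer, the Laplace (add-one) smoothing, and the four toy reviews are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count labels and per-label word frequencies from (label, text) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in docs:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # Log prior: fraction of training documents carrying this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Log likelihood with add-one (Laplace) smoothing.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data standing in for labelled reviews.
reviews = [
    ("pos", "great product works great"),
    ("pos", "love it excellent quality"),
    ("neg", "terrible broke after a week"),
    ("neg", "poor quality waste of money"),
]
model = train(reviews)
```

Calling `predict(model, "terrible waste")` on this toy model picks the label whose smoothed word probabilities best explain the text. Working in log space avoids underflow when multiplying many small probabilities.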


A Statistical Approach to Increase Classification Accuracy in Supervised Learning Algorithms

arXiv.org Machine Learning

Probabilistic mixture models have been widely used for different machine learning and pattern recognition tasks such as clustering, dimensionality reduction, and classification. In this paper, we focus on trying to solve the most common challenges related to supervised learning algorithms by using mixture probability distribution functions. With this modeling strategy, we identify sub-labels and generate synthetic data in order to reach better classification accuracy. It means we focus on increasing the training data synthetically to increase the classification accuracy.


Cross-Validation: Concept and Example in R

@machinelearnbot

In machine learning, cross-validation is a resampling method used for model evaluation to avoid testing a model on the same dataset on which it was trained. Testing on the training data is a common mistake, especially since a separate testing dataset is not always available. However, it usually leads to inaccurate performance measures (the model will have an almost perfect score since it is being tested on the same data it was trained on). To avoid this kind of mistake, cross-validation is usually preferred.
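The idea can be sketched as a k-fold loop: split the data into k folds, train on k-1 of them, and score on the held-out fold. The original article works in R; the sketch below uses plain Python instead, and the helper names (`k_fold_indices`, `cross_validate`), the majority-class "model", and the toy labels are assumptions chosen only to keep the example self-contained.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, k, fit, score, seed=0):
    """Train on k-1 folds and score on the held-out fold, for each fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(
            score(model, [xs[i] for i in held_out], [ys[i] for i in held_out])
        )
    return sum(scores) / k, scores


def fit_majority(xs, ys):
    """Toy 'model': always predict the most common training label."""
    return max(set(ys), key=ys.count)


def accuracy(model, xs, ys):
    """Fraction of held-out labels equal to the constant prediction."""
    return sum(1 for y in ys if y == model) / len(ys)


# Hypothetical data: ten samples, seven of class 0 and three of class 1.
xs = list(range(10))
ys = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
mean_score, fold_scores = cross_validate(
    xs, ys, k=5, fit=fit_majority, score=accuracy
)
```

Each sample is held out exactly once, so no score is computed on data the model was fitted on. Any `fit` and `score` pair with the same signatures can be swapped in for the toy ones.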


Balancing Interpretability and Predictive Accuracy for Unsupervised Tensor Mining

arXiv.org Machine Learning

Very frequently, tensor mining is done in an entirely unsupervised way, since ground truth and labels are either very expensive or hard to obtain. Our problem, thus, is: given a potentially very large and sparse tensor, and its R-component decomposition, compute a quality measure for that decomposition. Subsequently, using that quality metric, we would like to identify a "good" number R of components, and ultimately minimize human intervention and trial-and-error fine tuning. This problem is extremely hard. In fact, even computing the rank of a tensor has been shown to be an NP-hard problem, in stark contrast to the matrix rank, which can be easily computed in polynomial time. Fortunately, there exist heuristics that are able to assist with the above problem and have been shown to work well in practice in the field of Chemometrics. One such powerful and intuitive heuristic is the so-called "Core Consistency Diagnostic" [1], which, given a tensor and its PARAFAC decomposition, provides a quality measure that we can in turn use as a proxy for how interpretable our results are.


Salient Object Detection: A Survey

arXiv.org Artificial Intelligence

Detecting and segmenting salient objects in natural scenes, often referred to as salient object detection, has attracted a lot of interest in computer vision. While many models have been proposed and several applications have emerged, a deep understanding of achievements and issues is lacking. We aim to provide a comprehensive review of the recent progress in salient object detection and situate this field among other closely related areas such as generic scene segmentation, object proposal generation, and saliency for fixation prediction. Covering 228 publications, we survey i) roots, key concepts, and tasks, ii) core techniques and main modeling trends, and iii) datasets and evaluation metrics in salient object detection. We also discuss open problems such as evaluation metrics and dataset bias in model performance and suggest future research directions.


How Do Machine Learning Programs "Learn"?

#artificialintelligence

With all the hype surrounding self-driving cars and video-game-playing AI robots, it's worth taking a step back and reminding ourselves how machine learning programs actually "learn". In this article, we look at two machine learning (ML) techniques, spam filters and neural networks, and demystify how they work. And if you're not sure what machine learning even is, read about the difference between artificial intelligence, machine learning, and deep learning. One common machine learning algorithm is the Naive Bayes classifier, which is used for filtering spam emails.


Using a Customized Cost Function to Deal With Unbalanced Data - DZone AI

#artificialintelligence

As pointed out in this KDnuggets article, we often only have a few examples of the thing that we want to predict in our data. The use cases are countless: only a small part of our website visitors purchase eventually, only a few of our transactions are fraudulent, etc. This is a real problem when using machine learning. That's because the algorithms usually need many examples of each class to extract the general rules in your data, and the instances in minority classes can be discarded as noise, causing some useful rules to never be found. The KDnuggets article explained several techniques that can be used to address this problem.
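One way to picture a cost function that compensates for class imbalance is a weighted log loss, in which mistakes on the rare class are penalized more heavily. This is a generic sketch, not necessarily the technique used in the article: the function names, the "balanced" weighting rule, and the toy labels and probabilities are assumptions for illustration.

```python
import math


def balanced_weights(y_true):
    """Weights inversely proportional to class frequency (both classes must occur)."""
    n = len(y_true)
    n_pos = sum(y_true)
    n_neg = n - n_pos
    return n / (2 * n_pos), n / (2 * n_neg)


def weighted_log_loss(y_true, y_prob, pos_weight=1.0, neg_weight=1.0):
    """Binary cross-entropy where each class contributes with its own weight."""
    eps = 1e-12  # clip probabilities to keep log() finite
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        if y == 1:
            total += -pos_weight * math.log(p)
            weight_sum += pos_weight
        else:
            total += -neg_weight * math.log(1 - p)
            weight_sum += neg_weight
    return total / weight_sum


# Hypothetical data: nine negatives and one positive (e.g. one fraudulent transaction).
y_true = [0] * 9 + [1]
# A "lazy" model that gives every instance a low probability of being positive.
lazy_probs = [0.1] * 10

pos_w, neg_w = balanced_weights(y_true)
plain_loss = weighted_log_loss(y_true, lazy_probs)
balanced_loss = weighted_log_loss(y_true, lazy_probs, pos_w, neg_w)
```

With equal weights the lazy model looks acceptable because it is right about the nine negatives. With balanced weights the single missed positive carries as much total weight as all the negatives combined, so the same predictions receive a noticeably higher loss.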


As final number emerges, Showtime calls Mayweather-McGregor "massive" pay-per-view success

Los Angeles Times

The "one-time-only" boxing match between a 40-year-old who retired two years ago and an Irishman making his pro debut in the sport is positioned to become the greatest-selling pay-per-view fight of all time Friday. Showtime Executive Vice President Stephen Espinoza said "it's too early to declare a hard number" but Saturday's Floyd Mayweather Jr.-Conor McGregor fight is "tracking in the mid-to-high 4 million pay-per-view buys." "If we don't reach the record, we're going to be very, very close," he said, and "we consider it a massive success." "It was an exciting, entertaining fight and there was massive interest" in it, Espinoza told The Times, crediting strong digital sales with boosting the overall domestic sales. The bout is also expected to surpass the $600 million generated in total revenue by Mayweather's less-entertaining unanimous-decision triumph over seven-division champion Manny Pacquiao, with final pay-per-view numbers expected by next week.