AITopics

doi: 10.1038/srep38580

1607.03502

Country: North America > United States (0.68)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Im, Daniel Jiwoong, Taylor, Graham W.

Learning a metric for class-conditional KNN

arXiv.org Machine Learning Jul-11-2016

Naive Bayes Nearest Neighbour (NBNN) is a simple and effective framework which addresses many of the pitfalls of K-Nearest Neighbour (KNN) classification. It has yielded competitive results on several computer vision benchmarks. Its central tenet is that during NN search, a query is not compared to every example in a database, ignoring class information. Instead, NN searches are performed within each class, generating a score per class. A key problem with NN techniques, including NBNN, is that they fail when the data representation does not capture perceptual (e.g. class-based) similarity. NBNN circumvents this by using independent engineered descriptors (e.g. SIFT). To extend its applicability outside of image-based domains, we propose to learn a metric which captures perceptual similarity. Similar to how Neighbourhood Components Analysis optimizes a differentiable form of KNN classification, we propose "Class Conditional" metric learning (CCML), which optimizes a soft form of the NBNN selection rule. Typical metric learning algorithms learn either a global or local metric. However, our proposed method can be adjusted to a particular level of locality by tuning a single parameter. An empirical evaluation on classification and retrieval tasks demonstrates that our proposed method clearly outperforms existing learned distance metrics across a variety of image and non-image datasets.
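The per-class search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the plain squared Euclidean distance, and the toy descriptors are all assumptions made for illustration, and the learned metric (CCML) is not shown. Each query descriptor is matched to its nearest neighbour within every class separately, and the class with the smallest total distance is selected.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nbnn_classify(query_descriptors, class_descriptors):
    """Hypothetical helper: score each class by summed within-class NN distance.

    class_descriptors maps a class label to the pool of descriptors
    collected from all training examples of that class.
    """
    scores = {}
    for label, pool in class_descriptors.items():
        # NN search is restricted to this class's pool, one search per descriptor.
        scores[label] = sum(
            min(sq_dist(d, p) for p in pool) for d in query_descriptors
        )
    # The predicted class is the one with the smallest total distance.
    best = min(scores, key=scores.get)
    return best, scores


# Toy data, invented for illustration only.
classes = {
    "a": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "b": [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)],
}
query = [(0.2, 0.1), (0.9, 0.2)]
label, scores = nbnn_classify(query, classes)
print(label, scores)
```

Because the search never mixes classes, a class with many training descriptors does not crowd out a smaller class the way it can in a pooled KNN vote.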

artificial intelligence, machine learning, neighbour, (16 more...)

1607.0305

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

#artificialintelligence Jul-10-2016, 23:25:33 GMT

A Gentle Introduction to Bloom Filter

Bloom filters are space-efficient probabilistic data structures. They resemble hash tables, but they are used exclusively to test whether an element is a member of a set. Their key property is that they let you trade space against the false-positive rate of those membership tests, which is why they are called probabilistic data structures. Let's look at the space efficiency in a little more detail.
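As a rough illustration of the idea, here is a minimal Bloom filter sketch in Python. It is not taken from the article: the class name, the choice of seeded SHA-256 as the hash family, and the sizes used are assumptions. The two constructor parameters are the knobs behind the space versus false-positive trade-off: more bits lower the false-positive rate at the cost of memory.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter backed by a fixed-size bit array."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing the item with a seed
        # (an assumed scheme; any family of independent hashes would do).
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(seed.to_bytes(4, "big") + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly in the set"; False means "definitely not".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter(num_bits=1024, num_hashes=3)
for word in ("apple", "banana", "cherry"):
    bf.add(word)

print("apple" in bf)  # added items are always reported as present
```

The filter stores only bits, never the items themselves, so a lookup can return a false positive but never a false negative.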

artificial intelligence, bloom filter, machine learning, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Richman, Oran, Mannor, Shie

How to Allocate Resources For Features Acquisition?

arXiv.org Machine Learning Jul-10-2016

We study classification problems where features are corrupted by noise and where the magnitude of the noise in each feature is influenced by the resources allocated to its acquisition. This is the case, for example, when multiple sensors share a common resource (power, bandwidth, attention, etc.). We develop a method for computing the optimal resource allocation for a variety of scenarios and derive theoretical bounds concerning the benefit that may arise by non-uniform allocation. We further demonstrate the effectiveness of the developed method in simulations.

artificial intelligence, classifier, machine learning, (19 more...)

1607.02763

Genre: Research Report (0.50)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

#artificialintelligence Jul-9-2016, 21:35:22 GMT

Detecting Fraudulent Skype Users via Machine Learning

As part of my Data Science class with General Assembly, we each gave a presentation about a real-world application of data science. My talk was about using machine learning to detect fraud on Skype, and was based upon an excellent paper by Microsoft Research published in November 2013. Although Skype already had measures in place to detect fraud (e.g., credit card fraud, spam instant messages), the research team's goal was to improve the detection of "stealthy fraudulent users" that evade Skype's defenses for a prolonged period. They built a machine learning classifier that flagged potentially fraudulent users, and was able to detect 68% of these users with a false positive rate of 5%. The novelty in their approach was the fusing of disparate data types (profile information, Skype product usage, and Skype social activity) into a single classifier.

artificial intelligence, detecting fraudulent skype user, machine learning, (6 more...)

Industry: Law Enforcement & Public Safety > Fraud (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

International Business Times Jul-8-2016, 21:35:54 GMT

Daniel Cormier vs. Anderson Silva: Actual Start Time, Betting Odds, PPV Info, Prediction For UFC 200 Fight

It was supposed to be the perfect main event Dana White originally envisioned for UFC 200, but light heavyweight champion Daniel Cormier's impromptu square-off with the legendary Anderson Silva still has all the fixings of a major mixed martial arts showdown. Originally, Cormier was to rematch Jon Jones in a unification bout Saturday night at Las Vegas' T-Mobile Arena. However, earlier this week it was revealed Jones failed a drug test and was pulled from the main event pending an investigation into whether or not he used performance-enhancing drugs. Silva, who last fought in February but hasn't claimed a victory since 2012, stepped up and essentially saved what was intended to be UFC's biggest PPV event since the 100th edition seven years ago. Still, the bout was bumped down two slots, with women's bantamweight champion Miesha Tate's face-off with challenger Amanda Nunes leapfrogging it, and Brock Lesnar and Mark Hunt's battle standing as the main event.

artificial intelligence, machine learning, silva, (11 more...)

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.26)
North America > United States > Louisiana (0.06)

Industry: Leisure & Entertainment > Sports > Martial Arts (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.64)

#artificialintelligence Jul-7-2016, 23:11:00 GMT

Bitly

This is the third in a series of posts on how to build a Data Science Portfolio. If you like this and want to know when the next post in the series is released, you can subscribe at the bottom of the page. Data science companies are increasingly looking at portfolios when making hiring decisions. One of the reasons for this is that a portfolio is the best way to judge someone's real-world skills. The good news for you is that a portfolio is entirely within your control. If you put some work in, you can make a great portfolio that companies are impressed by. The first step in making a high-quality portfolio is to know what skills to demonstrate. Any good portfolio will be composed of multiple projects, each of which may demonstrate 1-2 of the above points. This is the third post in a series that will cover how to make a well-rounded data science portfolio.

artificial intelligence, dataset, machine learning, (16 more...)

Country: North America > United States (0.17)

Genre: Workflow (0.49)

Industry:

Banking & Finance > Loans (0.32)
Banking & Finance > Real Estate (0.31)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.31)

#artificialintelligence Jul-7-2016, 14:31:30 GMT

A Gentle Guide to Machine Learning | MonkeyLearn Blog

Machine Learning is a subfield of Artificial Intelligence that builds algorithms that allow computers to learn to perform tasks from data instead of being explicitly programmed. We can make machines learn to do things! The first time I heard that, it blew my mind. That means that we can program computers to learn things by themselves! The ability to learn is one of the most important aspects of intelligence. Translating that power to machines sounds like a huge step towards making them more intelligent. And in fact, Machine Learning is the area that is making most of the progress in Artificial Intelligence today; it is a trendy topic right now and is pushing forward the possibility of more intelligent machines.

artificial intelligence, inductive learning, machine learning, (14 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Industry: Leisure & Entertainment > Games (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Zeng, Jiaming, Ustun, Berk, Rudin, Cynthia

Interpretable Classification Models for Recidivism Prediction

arXiv.org Machine Learning Jul-7-2016

We investigate a long-debated question, which is how to create predictive models of recidivism that are sufficiently accurate, transparent, and interpretable to use for decision-making. This question is complicated as these models are used to support different decisions, from sentencing, to determining release on probation, to allocating preventative social services. Each use case might have an objective other than classification accuracy, such as a desired true positive rate (TPR) or false positive rate (FPR). Each (TPR, FPR) pair is a point on the receiver operator characteristic (ROC) curve. We use popular machine learning methods to create models along the full ROC curve on a wide range of recidivism prediction problems. We show that many methods (SVM, Ridge Regression) produce equally accurate models along the full ROC curve. However, methods that are designed for interpretability (CART, C5.0) cannot be tuned to produce models that are accurate and/or interpretable. To handle this shortcoming, we use a new method known as SLIM (Supersparse Linear Integer Models) to produce accurate, transparent, and interpretable models along the full ROC curve. These models can be used for decision-making for many different use cases, since they are just as accurate as the most powerful black-box machine learning models, but completely transparent, and highly interpretable.
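To make the "(TPR, FPR) pair is a point on the ROC curve" statement concrete, the following Python sketch computes one such point per decision threshold. The function name, the scores, and the labels are invented for illustration and have no connection to the paper's data or models; the convention that a score at or above the threshold counts as a positive prediction is also an assumption.

```python
def roc_point(scores, labels, threshold):
    """Return (TPR, FPR) when predicting positive for score >= threshold.

    labels are 1 for the positive class and 0 for the negative class.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg


# Toy classifier scores and ground-truth labels (made up).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

# Lowering the threshold moves along the curve towards higher TPR and FPR.
curve = [roc_point(scores, labels, t) for t in (0.85, 0.5, 0.2)]
for tpr, fpr in curve:
    print(f"TPR={tpr:.2f}  FPR={fpr:.2f}")
```

A use case with a required TPR or a capped FPR corresponds to picking the threshold whose point satisfies that constraint.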

age 1st, artificial intelligence, machine learning, (14 more...)

doi: 10.1111/rssa.12227

1503.0781

Country:

Europe (0.92)
North America > United States > Illinois (0.28)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.93)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law Enforcement & Public Safety > Corrections (1.00)
(4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Machine Learning Jul-5-2016

How to Evaluate the Quality of Unsupervised Anomaly Detection Algorithms?

Goix, Nicolas

When sufficient labeled data are available, classical criteria based on Receiver Operating Characteristic (ROC) or Precision-Recall (PR) curves can be used to compare the performance of unsupervised anomaly detection algorithms. However, in many situations, few or no data are labeled. This calls for alternative criteria one can compute on non-labeled data. In this paper, two criteria that do not require labels are empirically shown to discriminate accurately (w.r.t. ROC or PR based criteria) between algorithms. These criteria are based on existing Excess-Mass (EM) and Mass-Volume (MV) curves, which generally cannot be well estimated in large dimension. A methodology based on feature sub-sampling and aggregating is also described and tested, extending the use of these criteria to high-dimensional datasets and solving major drawbacks inherent to standard EM and MV curves.

artificial intelligence, data mining, machine learning, (17 more...)

1607.01152

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)