The Hitchhiker's Guide to Feature Extraction

#artificialintelligence

Good features are the backbone of any machine learning model, and good feature creation often requires domain knowledge, creativity, and a lot of time. TL;DR: this post is about useful feature engineering methods and tricks that I have learned and end up using often, along with some other ideas to think about when creating features. Have you read about featuretools yet? If not, then you are going to be delighted.
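For readers who have not tried featuretools, here is a minimal sketch of automated feature synthesis with its 1.x API. The customers/transactions tables and their column names are made up for illustration and are not taken from the post.

```python
import pandas as pd
import featuretools as ft

# Hypothetical example data: customers and their transactions.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "join_date": pd.to_datetime(["2021-01-01", "2021-02-15", "2021-03-10"]),
})
transactions = pd.DataFrame({
    "transaction_id": [10, 11, 12, 13],
    "customer_id": [1, 1, 2, 3],
    "amount": [25.0, 40.0, 10.0, 70.0],
    "time": pd.to_datetime(["2021-04-01", "2021-04-03", "2021-04-05", "2021-04-07"]),
})

# Register both tables in an EntitySet and link them by customer_id.
es = ft.EntitySet(id="shop")
es = es.add_dataframe(dataframe_name="customers", dataframe=customers,
                      index="customer_id", time_index="join_date")
es = es.add_dataframe(dataframe_name="transactions", dataframe=transactions,
                      index="transaction_id", time_index="time")
es = es.add_relationship("customers", "customer_id",
                         "transactions", "customer_id")

# Deep Feature Synthesis: automatically build aggregate features per customer.
feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_dataframe_name="customers",
    agg_primitives=["sum", "mean", "count"],
)
print(feature_matrix.head())
```

The point of the sketch is that once the relational structure is declared, aggregate features such as the mean or count of each customer's transactions are generated automatically rather than written by hand.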


K-Means Clustering: Unsupervised Learning for Recommender Systems

#artificialintelligence

Unsupervised learning has been called the closest thing we have to "actual" artificial intelligence, in the sense of general AI, with K-Means clustering one of its simplest but most powerful applications. I am not here to discuss whether those claims are true or not, as I am not an expert nor a philosopher. I will, however, state that I am often amazed by how well unsupervised learning techniques, even the most rudimentary, capture patterns in the data that I would expect only people to find. Today we'll apply unsupervised learning to a dataset I gathered myself: a database of professional Magic: The Gathering decks that I crawled from mtgtop8.com, an awesome website if you're into Magic: The Gathering.
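As a rough sketch of the approach described here, K-Means can be run on a deck-by-card count matrix. The `decks.csv` file, its columns, and the number of clusters are assumptions for illustration, not the author's actual dataset or settings.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical input: one row per deck, one column per card name,
# values are how many copies of that card the deck plays.
decks = pd.read_csv("decks.csv", index_col="deck_id")

# Fit K-Means; the number of clusters (deck archetypes) is a modelling choice.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=42)
labels = kmeans.fit_predict(decks.values)

# Inspect which cards dominate each cluster centroid.
centroids = pd.DataFrame(kmeans.cluster_centers_, columns=decks.columns)
for cluster_id, centroid in centroids.iterrows():
    top_cards = centroid.sort_values(ascending=False).head(5)
    print(f"Cluster {cluster_id}: {list(top_cards.index)}")
```

Looking at the highest-weighted cards in each centroid is a quick way to check whether the clusters correspond to recognizable deck archetypes.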


Central Clustering of Categorical Data with Automated Feature Weighting

AAAI Conferences

The ability to cluster high-dimensional categorical data is essential for many machine learning applications such as bioinformatics. Currently, central clustering of categorical data is a difficult problem due to the lack of a geometrically interpretable definition of a cluster center. In this paper, we propose a novel kernel-density-based definition using a Bayes-type probability estimator. Then, a new algorithm called k-centers is proposed for central clustering of categorical data, incorporating a new feature weighting scheme by which each attribute is automatically assigned a weight measuring its individual contribution to the clusters. Experimental results on real-world data show outstanding performance of the proposed algorithm, especially in recognizing the biological patterns in DNA sequences.
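The abstract does not give the k-centers formulation itself, so the following is only a rough, hypothetical sketch of the general idea of central clustering of categorical data with automated attribute weights. It uses a simple k-modes-style update (modal centers, weighted Hamming dissimilarity, weights from within-cluster modal frequency) rather than the paper's kernel-density-based centers and Bayes-type estimator.

```python
import numpy as np

def weighted_k_modes(X, k, n_iter=20, seed=0):
    """Simplified sketch: central clustering of categorical data with
    per-attribute weights. NOT the k-centers algorithm from the paper."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    centers = X[rng.choice(n, size=k, replace=False)]  # initial centers
    weights = np.ones(d) / d                            # uniform attribute weights

    for _ in range(n_iter):
        # Assign each object to the nearest center (weighted mismatch count).
        dist = ((X[:, None, :] != centers[None, :, :]) * weights).sum(axis=2)
        labels = dist.argmin(axis=1)

        # Update centers (per-attribute mode) and attribute weights
        # (attributes whose modal category dominates its cluster get more weight).
        modal_freq = np.zeros(d)
        for c in range(k):
            members = X[labels == c]
            if len(members) == 0:
                continue
            for j in range(d):
                vals, counts = np.unique(members[:, j], return_counts=True)
                centers[c, j] = vals[counts.argmax()]
                modal_freq[j] += counts.max() / len(members)
        weights = modal_freq / modal_freq.sum()

    return labels, centers, weights

# Toy categorical data: 6 objects, 3 attributes.
X = np.array([
    ["a", "x", "p"],
    ["a", "x", "q"],
    ["a", "y", "p"],
    ["b", "y", "q"],
    ["b", "y", "q"],
    ["b", "x", "q"],
])
labels, centers, weights = weighted_k_modes(X, k=2)
print(labels, weights)
```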


Chimps can gesticulate with the best of them

Popular Science

You've probably been doing it since you were a baby. But not every animal is as gesturally gifted as you.