kaggle


New Poll: Which Data Science / Machine Learning methods and tools did you use?

#artificialintelligence

A new KDnuggets poll asks: which data science / machine learning methods and tools did you use in the past 12 months for work or a real-world project? Please vote below; we will summarize the results and examine the trends in early December. A Kaggle survey asked "What data science methods are used at work?" and the top answers were Gradient Boosted Machines


Why India's data scientists make a fraction of their US counterparts · FactorDaily

@machinelearnbot

Data scientists and machine learning engineers in India make about one-tenth of what their counterparts in the United States do, a leading global survey shows. The median annual salary in India, based on 450 responses, is $11,715 (Rs 7.5 lakhs), a fraction of the comparable figure in the US ($110,000). The median for all respondents from the 52 countries whose data was considered in the calculations is $55,441. Kaggle, the world's largest online community of data scientists, statisticians and machine learning engineers, published its annual survey, The State of Data Science & Machine Learning, earlier this week, drawing on 16,000 respondents from across the data science and machine learning industry. The Google-owned platform currently boasts over a million members and is known for attracting the world's smartest data scientists by holding public and private data science competitions.


[D] How to build a Portfolio as a Machine Learning/Data Science Engineer in industry? • r/MachineLearning

@machinelearnbot

I have this portfolio of Jupyter notebooks that I wrote myself. Several of them need to be reworked or deleted, but most of them are okay. One of them is similar to work I did at a bank. As for the first project, it is my attempt to build a site with a handwritten digit recognition system with online training. This portfolio really helped me when I was looking for a job.


Implementing MaLSTM on Kaggle's Quora Question Pairs competition

@machinelearnbot

In the past few years, deep learning has been all the fuss in the tech industry. To keep up, I like to get my hands dirty implementing interesting network architectures I come across in my reading. A few months ago I came across a very nice article called Siamese Recurrent Architectures for Learning Sentence Similarity. It offers a pretty straightforward approach to the common problem of sentence similarity. Named MaLSTM ("Ma" for Manhattan distance), its architecture is depicted in figure 1 (the diagram excludes the sentence preprocessing part). Notice that since this is a Siamese network, it is easier to train because it shares weights on both sides.
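As a rough illustration of the "Manhattan distance" part of the name, the sketch below scores two sentence vectors with exp(-L1 distance), which is the similarity function usually associated with MaLSTM. The function name `malstm_similarity` and the example vectors are assumptions made for illustration; in the real model the vectors would be the final hidden states of the shared LSTM, which is not shown here.

```python
import math


def malstm_similarity(h_left, h_right):
    """Return exp(-L1 distance) between two equal-length vectors.

    The result lies in (0, 1]: 1.0 for identical vectors, and closer
    to 0 as the Manhattan (L1) distance grows.
    """
    if len(h_left) != len(h_right):
        raise ValueError("both sentence vectors must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    return math.exp(-l1_distance)


# Hypothetical sentence encodings (stand-ins for LSTM hidden states).
identical = malstm_similarity([0.2, -0.1, 0.5], [0.2, -0.1, 0.5])
different = malstm_similarity([0.2, -0.1, 0.5], [-0.9, 0.8, -0.4])
```

Because both sentences pass through the same encoder, only this scoring step compares the two sides; the score can be trained directly against a 0-to-1 similarity label.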


How (and why) to create a good validation set · fast.ai

#artificialintelligence

An all-too-common scenario: a seemingly impressive machine learning model is a complete failure when implemented in production. The fallout includes leaders who are now skeptical of machine learning and reluctant to try it again. One of the most likely culprits for this disconnect between results in development vs results in production is a poorly chosen validation set (or even worse, no validation set at all). Depending on the nature of your data, choosing a validation set can be the most important step. Although sklearn offers a train_test_split method, this method takes a random subset of the data, which is a poor choice for many real-world problems.
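One situation where a random subset can mislead is time-ordered data, where the model will be used on dates later than anything it was trained on. The sketch below is a minimal example under that assumption: it holds out the most recent rows instead of a random sample. The helper `time_based_split`, the `valid_fraction` parameter, and the toy rows are hypothetical and not taken from the article.

```python
from datetime import date, timedelta


def time_based_split(rows, date_key, valid_fraction=0.2):
    """Split rows so the validation set holds the most recent records.

    rows: list of records; date_key: function returning a sortable date
    for a record; valid_fraction: share of rows held out for validation.
    """
    ordered = sorted(rows, key=date_key)
    n_valid = max(1, round(len(ordered) * valid_fraction))
    cut = len(ordered) - n_valid
    return ordered[:cut], ordered[cut:]


# Toy data: ten consecutive days of made-up sales figures.
rows = [
    {"day": date(2017, 1, 1) + timedelta(days=i), "sales": 100 + i}
    for i in range(10)
]
train, valid = time_based_split(rows, date_key=lambda r: r["day"])
```

With this split, every validation row is later than every training row, so the validation score reflects forecasting forward in time rather than filling in gaps between known days.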


You Could Become an AI Master Before You Know It. Here's How.

MIT Technology Review

At first blush, Scot Barton might not seem like an AI pioneer. He isn't building self-driving cars or teaching computers to thrash humans at computer games. But within his role at Farmers Insurance, he is blazing a trail for the technology. Barton leads a team that analyzes data to answer questions about customer behavior and the design of different policies. His group is now using all sorts of cutting-edge machine-learning techniques, from deep neural networks to decision trees.


The biggest headache in machine learning? Cleaning dirty data off the spreadsheets

@machinelearnbot

If you imagine the life of a machine learning researcher, you might think it's quite glamorous. You'll program self-driving cars, work for the biggest names in tech, and your software could even lead to the downfall of humanity. But, as a new survey of data scientists and machine learners shows, those expectations need adjusting, because the biggest challenge in these professions is something quite mundane: cleaning dirty data. This comes from a survey conducted by data science community Kaggle (which was acquired by Google earlier this year). Some 16,700 of the site's 1.3 million members responded to the questionnaire, and when asked about the biggest barriers faced at work, the most common answer was "dirty data," followed by a lack of talent in the field.


Solve These Tough Data Problems and Watch Job Offers Roll In

WIRED

Late in 2015, Gilberto Titericz, an electrical engineer at Brazil's state oil company Petrobras, told his boss he planned to resign, after seven years maintaining sensors and other hardware in oil plants. By devoting hundreds of hours of leisure time to the obscure world of competitive data analysis, Titericz had recently become the world's top-ranked data scientist, by one reckoning. "Only when I wanted to quit did they realize they had the number-one data scientist," he says. Petrobras held on to its champ for a time by moving Titericz into a position that used his data skills. But since topping the rankings that October he'd received a stream of emails from recruiters around the globe, including representatives of Tesla and Google.


AI Algorithms Are Starting to Teach AI Algorithms

#artificialintelligence
