Machine Learning: Filtering Email for Spam or Ham - Code School Blog


You may have seen our previous posts on machine learning -- specifically, how to let your code learn from text, and how to work with stop words, stemming, and spam. Today we're going to build our machine learning-based spam filter using the tools we walked through in those posts: a tokenizer, a stemmer, and a naive Bayes classifier. We are going to work with the Bluebird promise library here, so if you are not used to promises, please take a look at the Bluebird API reference. Before we begin, it's important to have good training data. You can download some here -- we are interested in two.
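To make the pipeline described above concrete, here is a minimal sketch of a tokenizer, a stemmer, and a naive Bayes classifier wired together. It is written in Python rather than the JavaScript and Bluebird setup the post uses, and it is not the post's code: the stop-word list, the crude suffix-stripping `stem` function, the `NaiveBayes` class, and the four training sentences are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the post's list).
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "is", "you", "for", "in"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def features(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, text, label):
        words = features(text)
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = features(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training examples, for illustration only.
clf = NaiveBayes()
clf.train("Win free money now", "spam")
clf.train("Free prize, claim your winnings", "spam")
clf.train("Meeting moved to Monday morning", "ham")
clf.train("Lunch on Friday?", "ham")

spam_guess = clf.classify("claim your free money")
ham_guess = clf.classify("see you Monday morning for lunch")
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the add-one smoothing keeps a word never seen in one class from forcing that class's probability to zero.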

Weekly Digest, December 19


Data Science for IoT vs Classic Data Science: 10 Differences
Enterprise AI insights from the AI Europe event in London
Is it time to consider data in motion in your big data projects?

IBM hits new AI milestone with new industry record for speech recognition - Computer Business Review


The company has created technology that recognises spoken words with accuracy ever closer to human parity. IBM reached a new AI milestone in speech recognition, achieving an industry record of 5.5% word error rate using the Switchboard linguistic corpus. The company broke the industry record by extending its deep learning technologies and incorporating an acoustic model that learns from positive examples while taking advantage of negative ones. The model gets smarter and performs better when similar speech patterns are repeated. IBM achieved another major AI milestone in conversational speech recognition last year with a computer system that reached a word error rate of 6.9%.
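For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognised transcript into the reference transcript, divided by the number of words in the reference. The sketch below computes it with a word-level edit distance. It illustrates the standard definition only and is not IBM's evaluation tooling; the function name and the example sentences are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

On this definition, a 5.5% word error rate corresponds to roughly 5.5 word-level errors per 100 reference words.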

Random forest explained in simple terms - Listen Data


If omitted, randomForest will run in unsupervised mode.

Arguments
mtry: number of variables selected at each split; default sqrt(number of variables) for classification
ntree: number of trees to grow; default 500
nodesize: minimum size of terminal nodes; default 1

Step III: Find the number of trees at which the out-of-bag (OOB) error rate stabilizes and reaches its minimum.
Step IV: Find the optimal number of variables selected at each split. Select the mtry value with the minimum OOB error. It returns the optimal value of mtry (a parameter used in the randomForest package).
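The mtry selection rule above can be sketched in a few lines. This is plain Python rather than R, it does not call the randomForest package, and the OOB error values are made-up numbers used only to show the selection logic; `default_mtry` and `best_mtry` are hypothetical helper names.

```python
import math


def default_mtry(n_variables):
    """Classification default described above: floor(sqrt(number of variables))."""
    return max(1, math.isqrt(n_variables))


def best_mtry(oob_error_by_mtry):
    """Pick the mtry with the lowest OOB error; ties go to the smaller mtry."""
    return min(sorted(oob_error_by_mtry), key=lambda m: oob_error_by_mtry[m])


# Hypothetical OOB error rates for three candidate mtry values (illustrative only).
candidate_errors = {2: 0.21, 3: 0.18, 4: 0.19}

start = default_mtry(10)              # starting point for a 10-variable data set
chosen = best_mtry(candidate_errors)  # candidate with the minimum OOB error
```

In practice each OOB error would come from fitting a forest with that mtry value; the selection step itself is just a minimum over the candidates.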

Sophos Adds Advanced Machine Learning to Its Next-Generation Endpoint Protection Portfolio with Acquisition of Invincea


Sophos (LSE: SOPH), a global leader in network and endpoint security, today announced it has entered into an agreement to acquire Invincea, a visionary provider of next-generation malware protection. Invincea's endpoint security portfolio is designed to detect and prevent unknown malware and sophisticated attacks via its patented deep learning neural-network algorithms. It has been consistently ranked among the best-performing machine learning, signature-less next-generation endpoint technologies in third-party testing, and rated highly for both high detection and low false-positive rates. Headquartered in Fairfax, Va., Invincea was founded by chief executive officer Anup Ghosh to address the rapidly growing zero-day security threat from nation states, cyber criminals and rogue actors. Invincea's flagship product, X by Invincea, uses deep learning neural networks and behavioral monitoring to detect previously unseen malware and stop attacks before damage occurs.