Results


What is hardcore data science – in practice?

@machinelearnbot

For example, for personalized recommendations, we have been working with learning-to-rank methods that learn individual rankings over item sets. A typical data science workflow starts with raw data that is turned into features and fed into learning algorithms, resulting in a model that is applied to future data (Figure 1). This pipeline is iterated and improved many times, trying out different features, different forms of preprocessing, different learning methods, or maybe even going back to the source and trying to add more data sources. Probably the main difference between production systems and data science systems is that production systems are real-time systems that are continuously running.
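As a minimal sketch of such a pipeline (assuming scikit-learn; the data source, column names, and learner are hypothetical stand-ins, not the recommendation system described in the article), one iteration of the feature-and-model loop might look like this:

```python
# Minimal sketch of the iterate-and-improve workflow described above.
# Assumes scikit-learn; the CSV path and column names are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

raw = pd.read_csv("interactions.csv")          # hypothetical raw data source
X, y = raw.drop(columns=["clicked"]), raw["clicked"]

# Raw data -> features: preprocessing lives inside the pipeline so the same
# transformations are applied when the model scores future data.
features = ColumnTransformer([
    ("numeric", StandardScaler(), ["price", "position"]),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["category"]),
])

model = Pipeline([
    ("features", features),
    ("learner", LogisticRegression(max_iter=1000)),
])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
# Each iteration of the workflow swaps in different features, preprocessing,
# or learners and compares held-out performance before the next round.
```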


Don't fall for the AI hype: Here are the ingredients you need to build an actual useful thing

#artificialintelligence

Artificial intelligence these days is sold as if it were a magic trick. Data is fed into a neural net – or black box – as a stream of jumbled numbers, and voilà! It comes out the other side completely transformed, like a rabbit pulled from a hat. That's possible in a lab, or even on a personal dev machine, with carefully cleaned and tuned data. However, it takes a lot, an awful lot, of effort to scale machine-learning algorithms up to something resembling a multiuser service – something useful, in other words.


10 Ways Machine Learning Is Revolutionizing Manufacturing

#artificialintelligence

Bottom line: Every manufacturer has the potential to integrate machine learning into their operations and become more competitive by gaining predictive insights into production. Machine learning's core technologies align well with the complex problems manufacturers face daily. From striving to keep supply chains operating efficiently to producing customized, built-to-order products on time, machine learning algorithms have the potential to bring greater predictive accuracy to every phase of production. Many of the algorithms being developed are iterative, designed to learn continually and seek optimized outcomes. These algorithms iterate in milliseconds, enabling manufacturers to seek optimized outcomes in minutes versus months.


Artificial intelligence is now Intel's major focus

#artificialintelligence

At the forefront of these AI ambitions is a new platform called Nervana, which follows Intel's acquisition of deep-learning startup Nervana Systems earlier this year. Setting its sights on an area currently dominated by Nvidia's graphics processing unit (GPU) technology, one of the Nervana platform's main focuses will be deep learning and training neural networks – the software process behind machine learning that is based on a set of algorithms that attempt to model high-level abstractions in data. Google, for instance, is investing heavily in research exploring virtually all aspects of machine learning, including deep learning and more classical algorithms, something it calls "Machine Intelligence". One way is in manufacturing, as intelligent computer systems replace certain human-operated jobs.


Artificial Intelligence is now Intel's major focus

#artificialintelligence

With technology governing almost every aspect of our lives, industry experts are defining these modern times as the "platinum age of innovation", verging on the threshold of discoveries that could change human society irreversibly, for better or worse. At the forefront of this revolution is the field of artificial intelligence (AI), a technology that is more vibrant than ever due to the acceleration of technological progress in machine learning - the process of giving computers the ability to learn without being explicitly programmed - as well as the realisation by big tech vendors of its potential. One major tech behemoth fuelling the fire of this fast-moving juggernaut called AI is Intel, a company that has long invested in the science and engineering of making computers more intelligent. The Californian company held an 'AI Day' in San Francisco showcasing its new strategy dedicated solely to AI, with the introduction of new AI-specific products, as well as investments for the development of specific AI-related tech. And Alphr were in town to hear all about it.


How To Get Better Machine Learning Performance

#artificialintelligence

This Machine Learning Performance Improvement Cheat Sheet is designed to give you ideas to lift performance on your machine learning problem. Outcome: You should now have a short list of highly tuned algorithms for your machine learning problem, maybe even just one. In fact, you can often get good performance by combining the predictions from multiple "good enough" models rather than from multiple highly tuned (and fragile) models.
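As a small illustration of combining predictions from several "good enough" models (a sketch assuming scikit-learn and a stand-in dataset, not an example from the cheat sheet itself), a soft-voting ensemble averages the predicted probabilities of a few diverse learners:

```python
# Sketch: combining predictions from several "good enough" models via
# soft voting. Assumes scikit-learn; the dataset is a stand-in.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=5000)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",  # average predicted probabilities instead of hard labels
)

# Averaging tends to smooth out the individual models' errors, which is why
# a handful of decent models can rival a single heavily tuned one.
print("ensemble CV accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
```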


Questions To Ask When Moving Machine Learning From Practice to Production

#artificialintelligence

With growing interest in neural networks and deep learning, individuals and companies are claiming ever-increasing adoption rates of artificial intelligence into their daily workflows and product offerings. Coupled with breakneck speed in AI research, the new wave of popularity shows a lot of promise for solving some of the harder problems out there. That said, I feel that this field suffers from a gulf between appreciating these developments and subsequently deploying them to solve "real-world" tasks. A number of frameworks, tutorials and guides have popped up to democratize machine learning, but the steps that they prescribe often don't align with the fuzzier problems that need to be solved. This post is a collection of questions (with some, possibly even incorrect, answers) that are worth thinking about when applying machine learning in production.


Moving machine learning from practice to production

#artificialintelligence

With growing interest in neural networks and deep learning, individuals and companies are claiming ever-increasing adoption rates of artificial intelligence into their daily workflows and product offerings. Spending some time on planning your infrastructure, standardizing setup and defining workflows early on can save valuable time with each additional model that you build. After building, training and deploying your models to production, the task is still not complete unless you have monitoring systems in place. Periodically saving production statistics (data samples, predicted results, outlier specifics) has proven invaluable in performing analytics (and error postmortems) over deployments.
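As a minimal sketch of that kind of monitoring (the function, field names, and file path here are hypothetical, not taken from the post), a serving endpoint can append a small statistics record for every prediction it makes:

```python
# Sketch: periodically saving production statistics for later analytics and
# error postmortems. Field names and the log path are hypothetical.
import json
import time

LOG_PATH = "prediction_log.jsonl"

def log_prediction(features: dict, prediction: float, score_threshold: float = 0.99) -> None:
    """Append one prediction record; flag extreme scores as potential outliers."""
    record = {
        "timestamp": time.time(),
        "features": features,       # a sampled copy of the model input
        "prediction": prediction,   # the predicted result served to the caller
        "outlier": prediction > score_threshold or prediction < 1 - score_threshold,
    }
    with open(LOG_PATH, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")

# Example usage inside a serving endpoint:
log_prediction({"price": 12.5, "category": "books"}, prediction=0.97)
```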


Laplace noising versus simulated out of sample methods (cross frames)

#artificialintelligence

Please read on for my discussion of some of the limitations of the technique, how we solve the problem for impact coding (also called "effects codes"), and a worked example in R. We define a nested model as any model where the results of a sub-model are used as inputs for a later model. And I now think such a theorem would actually have a fairly unsatisfying statement, as one possible "bad real world data" situation violates the usual "no re-use" requirements of differential privacy: duplicated or related columns or variables break the Laplace noising technique. But library code needs to work in the limit (as you don't know ahead of time what users will throw at it), and there are a lot of mechanisms that produce duplicate, near-duplicate, and related columns in the data sources used for data science (one of the differences between data science and classical statistics is that data science tends to apply machine learning techniques on very under-curated data sets). On our artificial "each column five times" data set, the Laplace noising technique's test performance is significantly degraded (performance on held-out test data usually being a better simulation of future model performance than performance on the training set).
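As a rough illustration of the Laplace noising idea discussed above (a Python sketch with made-up data and a hand-picked noise scale, not the post's R implementation), impact coding replaces each categorical level with the conditional deviation of the outcome from the grand mean, and Laplace noise is added to that code to limit how much training-outcome information leaks back into the model:

```python
# Sketch: impact (effects) coding of a categorical variable with Laplace
# noise added to the encoded values. Data and noise scale are made up;
# this is not the implementation discussed in the post.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

df = pd.DataFrame({
    "level": rng.choice(list("abcd"), size=200),
    "y": rng.normal(size=200),
})

grand_mean = df["y"].mean()
# Impact code: per-level deviation of the outcome mean from the grand mean.
impact = df.groupby("level")["y"].mean() - grand_mean

# Laplace noising: perturb each level's code so the encoding leaks less
# information about the training outcomes it was built from.
scale = 1.0 / np.sqrt(df["level"].value_counts())   # smaller groups get more noise
noisy_impact = impact + rng.laplace(loc=0.0, scale=scale[impact.index])

df["level_impact"] = df["level"].map(noisy_impact)
print(df.head())
```

Duplicating such an encoded column several times gives a downstream model several noisy views of the same leaked signal, which is the failure mode the excerpt describes for the "each column five times" data set.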