Machine Learning

Graph neural networks: a review of methods and applications


It's another graph neural networks survey paper today! Clearly, this covers much of the same territory as we looked at earlier in the week, but when we're lucky enough to get two surveys published in short succession it can add a lot to compare the two perspectives and their sense of what's important. In particular here, Zhou et al. have a different formulation for describing the core GNN problem, and a nice approach to splitting out the various components. Rather than make this a standalone write-up, I'm going to lean heavily on the graph neural network survey we looked at on Wednesday and try to enrich my understanding starting from there. For this survey, the GNN problem is framed based on the formulation in the original GNN paper, 'The graph neural network model,' Scarselli 2009.

The Best Free Books for Learning Data Science


The Elements of Statistical Learning - Another valuable statistics text that covers just about everything you might want to know, and then some (it's over 750 pages long). Make sure you get the most updated version of the book from here (as of this writing, that's the 2017 edition). Data Mining and Analysis - This Cambridge University Press text will take you deep into the statistics and algorithms used for various types of data analysis. Do you need books to learn data science?

Māori loanwords project becomes easier with machine learning


A machine learning model was used by researchers from the University of Waikato, in New Zealand, to narrow down a massive 8 million tweets to a more manageable 1.2 million in order to look at how te reo Māori is being used in the genre. According to a recent press release, the team focused on 77 Māori loanwords, or te reo Māori words used in an English context, and used them as training data for their machine learning model. Machine learning allows data scientists to provide a computer with a large data set, and teach it to make predictions based on that data. The initial 8 million tweets contained a fair bit of distracting data, or 'noise'. The irrelevant tweets are those that are not used in a New Zealand English context, or were otherwise unrelated.

Artificial Intelligence to boost Earth system science


A study by German scientists from Jena and Hamburg, published today in the journal Nature, shows that artificial intelligence (AI) can substantially improve our understanding of the climate and the Earth system. The potential of deep learning especially has only partly been tapped so far. In particular, complex dynamic processes such as hurricanes, fire propagation, and vegetation dynamics can be better described with the help of AI. As a result, climate and Earth system models will be improved, with new models combining artificial intelligence and physical modeling. In past decades, mainly static attributes, such as the distribution of soil properties from the local to the global scale, have been investigated using machine learning approaches.

China's tech companies are taking a more American approach to international expansion


Shanghai-based artificial intelligence company Yitu Technology announced this month that it is launching its first R&D center outside of China in Singapore. The move is part of a larger trend among Chinese tech companies hoping to achieve two goals: Access top foreign engineering and scientific talent by setting up R&D centers in key global knowledge hubs, and embed themselves deeper in local ecosystems to spur new long-term growth engines -- most notably in Southeast Asia. The ultimate goal for many of China's leading tech companies is to become true multinationals. Their strategy is to build a significant presence in their huge home market and then leverage that to branch out internationally. However, they face a steep learning curve: The free-for-all ethos and Darwinian natural selection that guide their modus operandi in China often prove to be counter-productive in smaller, more insulated markets.

Beginning with Machine Learning - Part 1


This question pops into the head of almost everyone who wants to play with this new technology. I myself wondered where I should begin, what I should cover, and how I could learn quickly! I am not here to give you a list of articles to read or explore. But I will help you through it, giving you a basic understanding of almost every important concept so that you can dig into each one as well.

Mozilla to use machine learning to find code bugs before they ship


In a bid to cut the number of coding errors made in its Firefox browser, Mozilla is deploying Clever-Commit, a machine-learning-driven coding assistant developed in conjunction with game developer Ubisoft. Clever-Commit analyzes code changes as developers commit them to the Firefox codebase. It compares them to all the code it has seen before to see if they look similar to code that the system knows to be buggy. If the assistant thinks that a commit looks suspicious, it warns the developer. Presuming its analysis is correct, it means that the bug can be fixed before it gets committed into the source repository.

Data Management Experts Share Best Practices for Machine Learning


Machine learning is on the rise at businesses hungry for greater automation and intelligence, with use cases spreading across industries. At the same time, most projects are still in their early phases as companies learn how to deal with everything from selecting data sets and data platforms to architecting and optimizing data pipelines. DBTA recently held a webinar with Gaurav Deshpande, VP of marketing, TigerGraph, and Prakash Chokalingam, product manager, Databricks, who discussed key technologies and strategies for dealing with machine learning. There are several trends affecting machine learning, according to Chokalingam. Companies deal with data challenges such as data corruption, read scan inefficiency, slow ingestion, schema management, data versioning and rollbacks.

Should I Open-Source My Model? – Towards Data Science


I have worked on the problem of open-sourcing Machine Learning versus sensitivity for a long time, especially in disaster response contexts: when is it right/wrong to release data or a model publicly? This article is a list of frequently asked questions, the answers that are best practice today, and some examples of where I have encountered them. The criticism of OpenAI's decision included how it limits the research community's ability to replicate the results, and how the action in itself contributes to media fear of AI that is hyperbolic right now. It was this tweet that first caught my eye. Anima Anandkumar has a lot of experience bridging the gap between research and practical applications of Machine Learning.

Cloudera's Hilary Mason: To make AI useful, make it more "boring"


When it comes to Artificial Intelligence, the industry is at a crossroads of fascination versus function. We're awed by the technology, but a number of forces are conspiring to minimize the progress we're making, especially in the Enterprise. There are a few people out there who are adamantly trying to address this. One of them is Hilary Mason, Cloudera's GM of Machine Learning. Mason was previously Chief Data Scientist at Bitly, then founder and CEO of Fast Forward Labs, which Cloudera acquired in 2017.