A survey of cross-lingual embedding models

#artificialintelligence

In past blog posts, we discussed different models, objective functions, and hyperparameter choices that allow us to learn accurate word embeddings. However, these models are generally restricted to capturing representations of words in the language they were trained on. The availability of resources, training data, and benchmarks in English leads to a disproportionate focus on the English language and a neglect of the plethora of other languages that are spoken around the world. In our globalised society, where national borders increasingly blur and where the Internet gives everyone equal access to information, it is thus imperative that we not only seek to eliminate bias pertaining to gender or race inherent in our representations, but also aim to address our bias towards language. To remedy this and level the linguistic playing field, we would like to leverage our existing knowledge in English to equip our models with the capability to process other languages. Perfect machine translation (MT) would allow this. However, we do not need to actually translate examples, as long as we are able to project examples into a common subspace such as the one in Figure 1. Ultimately, our goal is to learn a shared embedding space between words in all languages. Equipped with such a vector space, we are able to train our models on data in any language. By projecting examples available in one language into this space, our model simultaneously obtains the capability to perform predictions in all other languages (we are glossing over some considerations here; for these, refer to this section). This is the promise of cross-lingual embeddings. Over the course of this blog post, I will give an overview of models and algorithms that have been used to come closer to this elusive goal of capturing the relations between words in multiple languages in a common embedding space.
Note: While neural MT approaches implicitly learn a shared cross-lingual embedding space by optimizing for the MT objective, we will focus on models that explicitly learn cross-lingual word representations throughout this blog post. These methods generally do so at a much lower cost than MT and can be considered to be to MT what word embedding models (word2vec, GloVe, etc.) are to language modelling.


Implementing your own k-nearest neighbour algorithm using Python

#artificialintelligence

In machine learning, you may often wish to build predictors that classify things into categories based on some set of associated values. For example, it is possible to provide a diagnosis to a patient based on data from previous patients. Many algorithms have been developed for automated classification, and common ones include random forests, support vector machines, Naïve Bayes classifiers, and many types of neural networks. To get a feel for how classification works, we take a simple example of a classification algorithm, k-Nearest Neighbours (kNN), and build it from scratch in Python 2. You can use a mostly imperative style of coding, rather than a declarative/functional one with lambda functions and list comprehensions, to keep things simple if you are starting with Python. Here, we will provide an introduction to the latter approach.
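As a rough illustration of the idea (not the article's own code), a kNN classifier can be sketched in a few lines using list comprehensions and generator expressions. The sketch is written for Python 3 rather than Python 2. The function names, the Euclidean distance metric, and the toy "patient" data are all assumptions made for this example.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair each training label with its distance to the query, closest first.
    distances = sorted(
        (euclidean_distance(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep only the labels of the k nearest neighbours.
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote over those labels.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two measurements per previous patient, plus a diagnosis.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["healthy", "healthy", "healthy", "ill", "ill", "ill"]

prediction = knn_predict(train_points, train_labels, (1.1, 1.0), k=3)
```

The choice of k and of the distance metric both affect the result. An odd k is a common way to avoid tied votes when there are two classes.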


When is sparse dictionary learning well-posed?

arXiv.org Machine Learning

The emergence of response properties of neurons in the mammalian visual cortex from the optimization of dictionaries for sparse coding of natural images marked an exciting development in computational neuroscience [1]-[4]. Many dictionary learning algorithms have since been developed and applied to a variety of problems in signal processing and machine learning (see [5] for a comprehensive review). A popular formulation of the idea is to encode each of N data points as a linear combination of at most k n-dimensional vectors from an inferred dictionary of size m, where k < m ≪ N. Certain applications to data analysis call for a unique such "sparse structure". For instance, detecting forgeries by analysis of local painting style [6], [7] requires that all dictionaries consistent with training data do not differ appreciably in their ability to sparsely encode new samples. Recently, algorithms with proven convergence under certain conditions have been proposed (see [8, Sec.
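The sparse coding formulation described in this excerpt can be written compactly as follows. This is a sketch under assumptions: the symbols for the data points (x_i), the dictionary matrix (A), and the coefficient vectors (a_i) are introduced here for illustration, and exact (noiseless) reconstruction is assumed.

```latex
\mathbf{x}_i = A\,\mathbf{a}_i,
\qquad A \in \mathbb{R}^{n \times m},
\quad \mathbf{a}_i \in \mathbb{R}^{m},
\quad \|\mathbf{a}_i\|_0 \le k,
\qquad i = 1, \dots, N
```

Here the columns of A are the m dictionary vectors, and the constraint on the number of nonzero entries of a_i expresses that each data point uses at most k of them.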


A Survey of Computational Treatments of Biomolecules by Robotics-Inspired Methods Modeling Equilibrium Structure and Dynamics

Journal of Artificial Intelligence Research

More than fifty years of research in molecular biology have demonstrated that the ability of small and large molecules to interact with one another and propagate the cellular processes in the living cell lies in the ability of these molecules to assume and switch between specific structures under physiological conditions. Elucidating biomolecular structure and dynamics at equilibrium is therefore fundamental to furthering our understanding of biological function, molecular mechanisms in the cell, our own biology, disease, and disease treatments. By now, there is a wealth of methods designed to elucidate biomolecular structure and dynamics contributed from diverse scientific communities. In this survey, we focus on recent methods contributed from the Robotics community that promise to address outstanding challenges regarding the disparate length and time scales that characterize dynamic molecular processes in the cell. In particular, we survey robotics-inspired methods designed to obtain efficient representations of structure spaces of molecules in isolation or in assemblies for the purpose of characterizing equilibrium structure and dynamics. While an exhaustive review is an impossible endeavor, this survey balances the description of important algorithmic contributions with a critical discussion of outstanding computational challenges. The objective is to spur further research to address outstanding challenges in modeling equilibrium biomolecular structure and dynamics.


Identity-sensitive Word Embedding through Heterogeneous Networks

arXiv.org Machine Learning

Most existing word embedding approaches do not distinguish the same words in different contexts, therefore ignoring their contextual meanings. As a result, the learned embeddings of these words are usually a mixture of multiple meanings. In this paper, we acknowledge multiple identities of the same word in different contexts and learn identity-sensitive word embeddings. Based on an identity-labeled text corpus, a heterogeneous network of words and word identities is constructed to model different levels of word co-occurrences. The heterogeneous network is further embedded into a low-dimensional space through a principled network embedding approach, through which we are able to obtain the embeddings of words and the embeddings of word identities. We study three different types of word identities including topics, sentiments and categories. Experimental results on real-world data sets show that the identity-sensitive word embeddings learned by our approach indeed capture different meanings of words and outperform competitive methods on tasks including text classification and word similarity computation.


IBMVoice: Beyond AI: Human Machine Collaboration For The Advancement Of Humankind

#artificialintelligence

It seems like almost every day a new headline warns us that artificial intelligence (AI) will soon take over the world, or at the very least steal jobs. Even when AI is not in the news, Hollywood offers up a steady stream of entertainment that depicts a very near future in which life as we know it is threatened by super-intelligent machines. These scenarios have something in common: they oversimplify and misrepresent an important and broader set of transformative technologies that hold great promise for business and society. They indulge in fantasy rather than take into account a rational and better-informed dialogue currently underway in the scientific, policy and business communities about what we consider the third age of computing -- the cognitive era. Cognitive computing -- of which AI is but one part -- refers to an entirely new class of technologies whose purpose is to deepen human engagement, scale and elevate expertise, enable new products and services, and enhance exploration and discovery.


Three key trends for 2017 - The Advisor

#artificialintelligence

This piece, written by MWD Advisors' lead analysts Angela Ashenden, Neil Ward-Dutton and Craig Wentworth, provides an overview of the key trends we see organisations facing in 2017 (and beyond). If you're in a technology leadership role, you should be exploring how these trends will impact the ways you plan, design, deliver and support services and capabilities. At MWD Advisors, our research program is focused, at a high level, on how digital technology changes work. The core of our scope of research is the systems and platforms organisations use to share knowledge, make decisions and co-ordinate work; we look at how new technologies are changing the picture, and what distinguishes successful organisations that reap the benefits of new technologies from those that struggle to drive meaningful change. The three trends we outline below are fundamentally changing the relationship between technology and business.


Machine Learning on Human Connectome Data from MRI

arXiv.org Machine Learning

Functional MRI (fMRI) and diffusion MRI (dMRI) are non-invasive imaging modalities that allow in-vivo analysis of a patient's brain network (known as a connectome). Use of these technologies has enabled faster and better diagnoses and treatments of neurological disorders and a deeper understanding of the human brain. Recently, researchers have been exploring the application of machine learning models to connectome data in order to predict clinical outcomes and analyze the importance of subnetworks in the brain. Connectome data has unique properties, which present both special challenges and opportunities when used for machine learning. The purpose of this work is to review the literature on the topic of applying machine learning models to MRI-based connectome data. This field is growing rapidly and now encompasses a large body of research. To summarize the research done to date, we provide a comparative, structured summary of 77 relevant works, tabulated according to different criteria, that represent the majority of the literature on this topic. (We also published a living version of this table online at http://connectomelearning.cs.sfu.ca that the community can continue to contribute to.) After giving an overview of how connectomes are constructed from dMRI and fMRI data, we discuss the variety of machine learning tasks that have been explored with connectome data. We then compare the advantages and drawbacks of different machine learning approaches that have been employed, discussing different feature selection and feature extraction schemes, as well as the learning models and regularization penalties themselves. Throughout this discussion, we focus particularly on how the methods are adapted to the unique nature of graphical connectome data. Finally, we conclude by summarizing the current state of the art and by outlining what we believe are strategic directions for future research.


Quantum Enhanced Inference in Markov Logic Networks

arXiv.org Machine Learning

Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited both at the first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches and discuss their advantages, theoretical limitations, and appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.


This survey drone takes safety seriously

PCWorld

New Zealand-based drone manufacturer Altus Intelligence wants to make sure its US$39,000 survey drones don't end up as rubble. Most of Altus' customers use its flagship drone, the Long Range Extreme Weather (LRX), for construction and engineering surveying and mapping, and expect a rugged, dependable vehicle to get the job done. That's where the LRX's three separate fail-safe systems come in. The first is a triple autopilot design, meaning that if anything goes wrong with one of the GPS streams, the other two will take over. The LRX is also armed with eight staggered propellers.