Learning Embeddings into Entropic Wasserstein Spaces

arXiv.org Machine Learning

Euclidean embeddings of data are fundamentally limited in their ability to capture latent semantic structures, which need not conform to Euclidean spatial assumptions. Here we consider an alternative, which embeds data as discrete probability distributions in a Wasserstein space, endowed with an optimal transport metric. Wasserstein spaces are much larger and more flexible than Euclidean spaces, in that they can successfully embed a wider variety of metric structures. We exploit this flexibility by learning an embedding that captures semantic information in the Wasserstein distance between embedded distributions. We examine empirically the representational capacity of our learned Wasserstein embeddings, showing that they can embed a wide variety of metric structures with smaller distortion than an equivalent Euclidean embedding. We also investigate an application to word embedding, demonstrating a unique advantage of Wasserstein embeddings: We can visualize the high-dimensional embedding directly, since it is a probability distribution on a low-dimensional space.
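As an illustration of the kind of distance involved, the sketch below computes an entropy-regularized transport cost between two discrete distributions using Sinkhorn iterations, a common way to approximate optimal transport. It is a minimal sketch, not the authors' implementation: the function name `sinkhorn_cost`, the regularization strength `eps`, and the iteration count are assumptions chosen for the example.

```python
import math


def sinkhorn_cost(a, b, cost, eps=0.05, n_iters=500):
    """Entropy-regularized transport cost between histograms a and b.

    a, b : lists of nonnegative weights, each summing to 1.
    cost : cost[i][j] is the ground cost between support points i and j.
    eps, n_iters : illustrative defaults (assumptions, not from the paper).
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals a and b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v); return its total cost <P, cost>.
    return sum(
        u[i] * K[i][j] * v[j] * cost[i][j] for i in range(n) for j in range(m)
    )


# Two support points at unit distance from each other.
ground_cost = [[0.0, 1.0], [1.0, 0.0]]
same = sinkhorn_cost([0.5, 0.5], [0.5, 0.5], ground_cost)
moved = sinkhorn_cost([1.0, 0.0], [0.0, 1.0], ground_cost)
```

In an embedding setting, each data item would be mapped to such a discrete distribution, and a quantity like this would stand in for the distance between two embedded items.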


We don't see AI opportunity

#artificialintelligence

If a picture tells a thousand words, these are the two jostling foremost in a patient's mind when a radiologist scans their body for a better image of that suspicious lump or mass. But there is so much more a picture can tell us about cancer, particularly if we consider the possibilities of artificial intelligence. In 2017, US scientists announced they had developed an algorithm, or a computerised tool, to identify skin cancers through analysis of photographs. The algorithm scans a photo of a patch of skin to look for common forms of skin cancer, performing on par with board-certified dermatologists in identifying malignant melanomas (the third most common cancer in Australia) and keratinocyte carcinoma. This technology might enable skin cancer detection in country clinics and suburban GPs' offices at the highest accuracy available.


AI-Powered Gun Detection Is Coming to Mosques Worldwide Following Christchurch Shootings

#artificialintelligence

In March, a gunman walked into two mosques in Christchurch, New Zealand, opened fire, and killed dozens of worshippers. According to a police official, the suspected gunman was arrested 36 minutes after police were called to the scene. Now, a tech company believes its smart security cameras can prevent attacks like the tragedy in Christchurch, and says it plans to install its AI-powered systems in mosques around the world. Athena Security, the tech company behind the security system, and Al-Ameri International Trading announced the Keep Mosques Safe initiative last week. Al-Ameri International Trading, along with several Islamic non-profit groups, will fund the Keep Mosques Safe effort.


Feature Selection and Feature Extraction in Pattern Analysis: A Literature Review

arXiv.org Machine Learning

Pattern analysis often requires a pre-processing stage for extracting or selecting features in order to help the classification, prediction, or clustering stage discriminate or represent the data in a better way. The reason for this requirement is that the raw data are complex and difficult to process without extracting or selecting appropriate features beforehand. This paper reviews theory and motivation of different common methods of feature selection and extraction and introduces some of their applications. Some numerical implementations are also shown for these methods. Finally, the methods in feature selection and extraction are compared.
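The difference between the two families can be sketched in a few lines. Selection keeps a subset of the original columns, while extraction builds new features from combinations of all columns. The example below is an assumption-laden illustration rather than a method taken from the review: it uses a variance filter for selection and the first principal component (found by power iteration) for extraction, and the function names are hypothetical.

```python
import math


def column_means(X):
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]


def select_top_k(X, k):
    """Feature selection: keep the k original columns with the largest variance."""
    n, d = len(X), len(X[0])
    means = column_means(X)
    var = [sum((row[j] - means[j]) ** 2 for row in X) / n for j in range(d)]
    ranked = sorted(range(d), key=lambda j: var[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


def first_principal_component(X, n_iters=200):
    """Feature extraction: one new feature, a linear combination of all columns."""
    n, d = len(X), len(X[0])
    means = column_means(X)
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]
    cov = [[sum(r[i] * r[j] for r in Xc) / n for j in range(d)] for i in range(d)]
    w = [1.0] * d
    for _ in range(n_iters):
        # Power iteration: repeatedly apply the covariance matrix and renormalize.
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    scores = [sum(r[j] * w[j] for j in range(d)) for r in Xc]
    return w, scores


# Toy data: column 1 dominates the variance, column 2 is constant.
X = [[1.0, 10.0, 0.0], [2.0, 20.0, 0.0], [3.0, 30.0, 0.0], [4.0, 40.0, 0.0]]
kept, X_selected = select_top_k(X, 1)
direction, X_extracted = first_principal_component(X)
```

Here `X_selected` is still an original column of the data, whereas `X_extracted` is a new coordinate that mixes the first two columns.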


Toybox: A Suite of Environments for Experimental Evaluation of Deep Reinforcement Learning

arXiv.org Machine Learning

Evaluation of deep reinforcement learning (RL) is inherently challenging. In particular, learned policies are largely opaque, and hypotheses about the behavior of deep RL agents are difficult to test in black-box environments. Considerable effort has gone into addressing opacity, but almost no effort has been devoted to producing high-quality environments for experimental evaluation of agent behavior. While ALE has enabled demonstration and evaluation of much more complex behaviors of deep RL agents, it presents challenges as a suite of evaluation environments for topics on the frontier of deep RL. Challenge: Limited variation within games. Very little about individual games can be systematically altered, so ALE is poorly suited to testing how changes in the environment affect training and performance. New benchmarks such as OpenAI's Sonic the Hedgehog emulator and CoinRun inject environmental variation into the training schedule, while ...


Adversarial Variational Embedding for Robust Semi-supervised Learning

arXiv.org Machine Learning

Semi-supervised learning leverages unlabelled data when labelled data is difficult or expensive to acquire. Deep generative models (e.g., the Variational Autoencoder (VAE)) and semi-supervised Generative Adversarial Networks (GANs) have recently shown promising performance in semi-supervised classification thanks to their strong discriminative representation ability. However, the latent code learned by the traditional VAE is not exclusive (repeatable) for a specific input sample, which limits its classification performance. In particular, the learned latent representation depends on a non-exclusive component that is stochastically sampled from the prior distribution. Moreover, semi-supervised GAN models generate data from a pre-defined distribution (e.g., Gaussian noise) that is independent of the input data distribution, which may obstruct convergence and makes the distribution of the generated data difficult to control. To address these issues, we propose a novel Adversarial Variational Embedding (AVAE) framework for robust and effective semi-supervised learning that leverages the advantages of both the GAN as a high-quality generative model and the VAE as a posterior distribution learner. The proposed approach first produces an exclusive latent code with a model we call VAE++ and, meanwhile, provides a meaningful prior distribution for the generator of the GAN. The proposed approach is evaluated on four different real-world applications, and we show that our method outperforms the state-of-the-art models, which confirms that the combination of VAE++ and GAN can provide significant improvements in semi-supervised classification.
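The non-repeatability of a standard VAE latent code can be seen in the usual reparameterized sampling step, z = mu + sigma * eps with eps drawn from a standard normal. The sketch below illustrates only that step; it is not VAE++ or AVAE, and the function name `sample_latent` and the example values of `mu` and `sigma` are assumptions made for illustration.

```python
import random


def sample_latent(mu, sigma, rng):
    """Standard VAE sampling step: z = mu + sigma * eps, with eps ~ N(0, 1).

    mu and sigma would normally come from an encoder network; here they are
    passed in directly as plain lists (an assumption for this sketch).
    """
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


rng = random.Random(0)
mu = [0.2, -1.0]      # hypothetical encoder mean for one input sample
sigma = [0.5, 0.1]    # hypothetical encoder standard deviation

# The same input (same mu, sigma) yields a different code on each draw.
z_first = sample_latent(mu, sigma, rng)
z_second = sample_latent(mu, sigma, rng)

# With the noise switched off, the code is repeatable and equals the mean.
z_deterministic = sample_latent(mu, [0.0, 0.0], rng)
```

The random term `eps` is the stochastically sampled component the abstract refers to: because it changes on every draw, one input does not map to one fixed code.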


IBM's AI can detect glaucoma from eye scans

#artificialintelligence

Glaucoma is frighteningly common: 3.5% of the population aged 40 years or older (about 60.5 million in 2010) has been diagnosed with the disease, and the number is expected to rise steeply in the next year. Early detection and treatment are essential -- glaucoma progresses irreversibly and almost imperceptibly. Toward that end, scientists at IBM Research and New York University describe in a paper a noninvasive technique that uses AI to detect patterns characteristic of glaucoma in retina imaging data. It's scheduled to be presented at the Association for Research in Vision and Ophthalmology later this month in Vancouver. "From a biological point of view, we know there are associations between visual function and retinal structure," wrote senior research scientist and manager at IBM Research Australia Rahil Garnavi in a blog post.


Learning Causality: Synthesis of Large-Scale Causal Networks from High-Dimensional Time Series Data

arXiv.org Machine Learning

There is an abundance of complex dynamic systems that are critical to our daily lives and our society but that are hardly understood. Even with today's ability to sense and collect large amounts of experimental data, they are so complex and continuously evolving that it is unlikely that their dynamics will ever be understood in full detail. Nevertheless, through computational tools we can try to make the best possible use of current technologies and available data. We believe that the most useful models will have to take into account the imbalance between system complexity and available data in the context of limited knowledge or multiple hypotheses. The biological cell is a prime example of such a complex system; it is studied in systems biology and has motivated the methods presented in this paper. They were developed as part of the DARPA Rapid Threat Assessment (RTA) program, which is concerned with understanding the mechanism of action (MoA) of toxins or drugs affecting human cells. Using a combination of Gaussian processes and abstract network modeling, we present three fundamentally different machine-learning-based approaches to learn causal relations and synthesize causal networks from high-dimensional time series data. While other types of data are available and have been analyzed and integrated in our RTA work, in this paper we focus on transcriptomics (that is, gene expression) data obtained from high-throughput microarray experiments to illustrate the capabilities and limitations of our algorithms. Our algorithms make different but overall relatively few biological assumptions, so they are applicable to other types of biological data and potentially even to other complex systems that exhibit high dimensionality but are not of a biological nature.



On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability

Journal of Artificial Intelligence Research

This paper provides an analysis of the tradeoff between asymptotic bias (suboptimality with unlimited data) and overfitting (additional suboptimality due to limited data) in the context of reinforcement learning with partial observability. Our theoretical analysis formally shows that, while potentially increasing the asymptotic bias, a smaller state representation decreases the risk of overfitting. This analysis relies on expressing the quality of a state representation by bounding $L_1$ error terms of the associated belief states. Theoretical results are empirically illustrated when the state representation is a truncated history of observations, both on synthetic POMDPs and on a large-scale POMDP in the context of smart grids, with real-world data. Finally, similarly to known results in the fully observable setting, we also briefly discuss and empirically illustrate how using function approximators and adapting the discount factor may improve the tradeoff between asymptotic bias and overfitting in the partially observable context.
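A truncated history of observations can be sketched as a simple mapping from the full observation sequence to its last h entries. The example below is illustrative only: the function names, the toy trajectories, and the use of a distinct-state count as a rough proxy for representation size are assumptions, not the paper's experimental setup.

```python
def truncated_history(observations, h):
    """State representation: the last h observations, as a hashable tuple."""
    if h <= 0:
        return ()
    return tuple(observations[-h:])


def count_states(trajectories, h):
    """Number of distinct truncated-history states seen across all time steps."""
    states = set()
    for traj in trajectories:
        for t in range(1, len(traj) + 1):
            states.add(truncated_history(traj[:t], h))
    return len(states)


# Hypothetical observation sequences from a binary-observation environment.
trajectories = [[0, 1, 0, 1], [1, 1, 0, 0]]
states_h1 = count_states(trajectories, 1)
states_h2 = count_states(trajectories, 2)
```

A shorter history merges more situations into the same state, giving fewer states to estimate from the same batch of data (less room to overfit) at the price of discarding information that may matter (more asymptotic bias).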