Real life tips and learnings of applying artificial intelligence in marketing

#artificialintelligence

This post is by Anton Buchner, a senior consultant with TrinityP3. Anton is one of Australia's leaders in data-driven marketing, helping navigate through the bells, whistles and hype to identify genuine marketing value when it comes to technology, digital activity, and the resulting data footprint. The Artificial Intelligence (AI) space is no exception. Over the past few years we've seen the rise and rise of AI discussion and solutions in marketing. I have spent the past month talking to a wide variety of industry thought leaders and experts in the AI space, from business, agency, and tech vendor perspectives, with the aim of identifying how Australian marketers are using AI solutions to enhance and anticipate consumer interaction. In this post, I would like to share some of their experiences and learnings to date. However, before we jump in, as I'm sure most of you know, AI dates back decades. Let's take a quick look back at how AI emerged.



Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator

arXiv.org Machine Learning

Mutual information is widely applied to learn latent representations of observations, whilst its implications for classification neural networks remain to be better explained. In this paper, we show that optimising the parameters of classification neural networks with softmax cross-entropy is equivalent to maximising the mutual information between inputs and labels under the balanced data assumption. Through experiments on synthetic and real datasets, we show that softmax cross-entropy can estimate mutual information approximately. When applied to image classification, this relation helps approximate the point-wise mutual information between an input image and a label without modifying the network structure. To this end, we propose infoCAM, an informative class activation map, which highlights the regions of the input image that are most relevant to a given label based on differences in information. The activation map helps localise the target object in an image. Through experiments on the semi-supervised object localisation task with two real-world datasets, we evaluate the effectiveness of the information-theoretic approach.
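The relation between softmax outputs and mutual information can be illustrated with a minimal sketch. It relies on the standard identity PMI(x, y) = log p(y|x) - log p(y), with p(y) = 1/K for K balanced classes, so the average PMI at the true labels equals log K minus the cross-entropy. This is an illustration under that assumption, not necessarily the exact estimator used in the paper; the function names are hypothetical.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def pointwise_mi(logits, label):
    # PMI(x, y) = log p(y|x) - log p(y), assuming a uniform prior p(y) = 1/K.
    k = len(logits)
    return log_softmax(logits)[label] + math.log(k)

def mi_estimate(batch_logits, labels):
    # Average PMI over a batch; equals log K minus the mean cross-entropy.
    pmis = [pointwise_mi(l, y) for l, y in zip(batch_logits, labels)]
    return sum(pmis) / len(pmis)
```

With uninformative (uniform) logits the estimate is zero, and for a confidently correct prediction over K classes it approaches log K, the entropy of a balanced label distribution.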


Merging Deterministic Policy Gradient Estimations with Varied Bias-Variance Tradeoff for Effective Deep Reinforcement Learning

arXiv.org Machine Learning

Deep reinforcement learning (DRL) on Markov decision processes (MDPs) with continuous action spaces is often approached by directly updating parametric policies along the direction of estimated policy gradients (PGs). Previous research revealed that the performance of these PG algorithms depends heavily on the bias-variance tradeoff involved in estimating and using PGs. A notable approach towards balancing this tradeoff is to merge both on-policy and off-policy gradient estimations for the purpose of training stochastic policies. However, this method cannot be utilized directly by sample-efficient off-policy PG algorithms such as Deep Deterministic Policy Gradient (DDPG) and twin-delayed DDPG (TD3), which have been designed to train deterministic policies. It is hence important to develop new techniques to merge multiple off-policy estimations of deterministic PG (DPG). Driven by this research question, this paper introduces elite DPG, which will be estimated differently from conventional DPG to emphasize the variance reduction effect at the expense of increased learning bias. To mitigate the extra bias, policy consolidation techniques will be developed to distill policy behavioral knowledge from elite trajectories and use the distilled generative model to further regularize policy training. Moreover, we will study both theoretically and experimentally two different DPG merging methods, i.e., interpolation merging and two-step merging, with the aim of inducing varied bias-variance tradeoffs through the combined use of both conventional DPG and elite DPG. Experiments on six benchmark control tasks confirm that these two merging methods can noticeably improve the learning performance of TD3, significantly outperforming several state-of-the-art DRL algorithms.
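The abstract does not give the merging formulas. As a rough illustration only, one plausible reading of interpolation merging is a convex combination of the two gradient estimates, sketched below. The function name, the `weight` parameter, and the convex-combination form itself are assumptions made for this sketch, not the paper's definition.

```python
def interpolate_gradients(grad_conventional, grad_elite, weight):
    """Convex combination of two gradient estimates (hypothetical sketch).

    weight = 0.0 returns the conventional estimate unchanged;
    weight = 1.0 returns the elite estimate unchanged.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(grad_conventional) != len(grad_elite):
        raise ValueError("gradient estimates must have the same length")
    # Element-wise blend of the two estimates.
    return [
        (1.0 - weight) * gc + weight * ge
        for gc, ge in zip(grad_conventional, grad_elite)
    ]
```

Under this reading, a larger `weight` leans on the lower-variance but more biased elite estimate, which is the tradeoff the abstract describes.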


Coupling Matrix Manifolds and Their Applications in Optimal Transport

arXiv.org Machine Learning

Optimal transport (OT) is a powerful tool for measuring the distance between two defined probability distributions. In this paper, we develop a new manifold named the coupling matrix manifold (CMM), where each point on CMM can be regarded as the transportation plan of the OT problem. We first explore the Riemannian geometry of CMM with the metric expressed by the Fisher information. These geometrical features of CMM pave the way for developing numerical Riemannian optimization algorithms such as Riemannian gradient descent and Riemannian trust-region algorithms, forming a uniform optimization method for all types of OT problems. The proposed method is then applied to solve several OT problems studied in previous literature. The results of the numerical experiments illustrate that the optimization algorithms based on the method proposed in this paper are comparable to the classic ones, for example, the Sinkhorn algorithm, while outperforming other state-of-the-art algorithms that do not consider the geometry information, especially in the case of non-entropy optimal transport.
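For context, the classic Sinkhorn baseline mentioned above can be sketched in a few lines. It solves the entropy-regularised OT problem and returns a coupling matrix, that is, a non-negative transport plan whose row and column sums match the two input distributions. This is a minimal sketch of the textbook algorithm, not the Riemannian method proposed in the paper; the regularisation strength `eps` and iteration count are arbitrary illustrative choices.

```python
import math

def sinkhorn(a, b, cost, eps=0.5, n_iters=200):
    """Entropy-regularised OT via Sinkhorn iterations (textbook sketch).

    a, b: source and target probability vectors (each sums to 1).
    cost: len(a) x len(b) cost matrix as nested lists.
    Returns an approximate transport plan with marginals a and b.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-C / eps).
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Plan P = diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

The returned plan is one point in the set of coupling matrices for the given marginals; smaller values of `eps` bring it closer to the unregularised optimum at the cost of slower convergence.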


What can artificial intelligence do for physics? And what will it do to physics?

#artificialintelligence

What can artificial intelligence do for physics? And what will it do to physics? In the past two years, governments all over the world have launched research initiatives for Artificial Intelligence (AI). Canada, China, the United States, the European Commission, Australia, France, Denmark, the UK, Germany – everyone suddenly has a strategy for "AI made in" whatever happens to be their own part of the planet. In the coming decades, it is now foreseeable, tens of billions of dollars will flow into the field.


Large expert-curated database for benchmarking document similarity detection in biomedical literature search

#artificialintelligence

Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations.



Automated detection and quantification of breast cancer brain metastases in an animal model using democratized machine learning tools

#artificialintelligence

Advances in digital whole-slide imaging and machine learning (ML) provide new opportunities for automated examination and quantification of histopathological slides to support pathologists and biologists. However, implementation of ML tools often requires advanced skills in computer science that may not be immediately available in the traditional wet-lab environment. Here, we propose a simple and accessible workflow to automate detection and quantification of brain epithelial metastases on digitized histological slides. The Trainable Weka Segmentation (TWS) plugin from Fiji was trained in a supervised manner on annotated whole-slide images (WSIs). Upon comparison with manually drawn regions, it is apparent that the algorithm learned to identify and segment cancer cell-specific nuclei and normal brain tissue.


CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning

arXiv.org Artificial Intelligence

Joint extraction of entities and relations has received significant attention due to its potential to provide higher performance for both tasks. Among existing methods, CopyRE is effective and novel: it uses a sequence-to-sequence framework and a copy mechanism to directly generate the relation triplets. However, it suffers from two fatal problems. The model is extremely weak at distinguishing the head and tail entities, resulting in inaccurate entity extraction. It also cannot predict multi-token entities (e.g. Steven Jobs). To address these problems, we give a detailed analysis of the reasons behind the inaccurate entity extraction problem, and then propose a simple but extremely effective model structure to solve this problem. In addition, we propose a multi-task learning framework equipped with a copy mechanism, called CopyMTL, to allow the model to predict multi-token entities. Experiments reveal the problems of CopyRE and show that our model achieves significant improvement over the current state-of-the-art method by 9% in NYT and 16% in WebNLG (F1 score). Our code is available at https://github.com/WindChimeRan/CopyMTL