Oceania
Convolutional Conditional Neural Processes
Gordon, Jonathan, Bruinsma, Wessel P., Foong, Andrew Y. K., Requeima, James, Dubois, Yann, Turner, Richard E.
We introduce the Convolutional Conditional Neural Process (ConvCNP), a new member of the Neural Process family that models translation equivariance in the data. Translation equivariance is an important inductive bias for many learning problems including time series modelling, spatial data, and images. The model embeds data sets into an infinite-dimensional function space as opposed to a finite-dimensional vector space. To formalize this notion, we extend the theory of neural representations of sets to include functional representations, and demonstrate that any translation-equivariant embedding can be represented using a convolutional deep set. We evaluate ConvCNPs in several settings, demonstrating that they achieve state-of-the-art performance compared to existing NPs. We demonstrate that building in translation equivariance enables zero-shot generalization to challenging, out-of-domain tasks.
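As a rough illustration of embedding a data set into a function rather than a fixed-length vector, the sketch below maps a context set to two "channels" evaluated on a grid: a density channel and a kernel-weighted data channel. The function name, the Gaussian kernel, and the `length_scale` parameter are assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np

def conv_deep_set_embedding(x_ctx, y_ctx, grid, length_scale=0.5):
    """Embed a 1-D context set as a function evaluated at `grid`.

    Returns an array of shape (len(grid), 2):
    column 0 is a density channel, column 1 a data channel.
    The Gaussian kernel and length_scale are illustrative assumptions.
    """
    # Differences between every grid location and every context input.
    diff = grid[:, None] - x_ctx[None, :]
    # The kernel depends only on differences, so shifting the inputs
    # and the grid together leaves the weights unchanged.
    weights = np.exp(-0.5 * (diff / length_scale) ** 2)
    density = weights.sum(axis=1)   # where context points are located
    data = weights @ y_ctx          # kernel-weighted sum of observations
    return np.stack([density, data], axis=1)
```

Because the weights depend only on differences between locations, translating the context inputs and the grid by the same amount yields the same embedding, and reordering the context points does not change it.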
Deep Integro-Difference Equation Models for Spatio-Temporal Forecasting
Zammit-Mangion, Andrew, Wikle, Christopher K.
Integro-difference equation (IDE) models describe the conditional dependence between the spatial process at a future time point and the process at the present time point through an integral operator. Nonlinearity or temporal dependence in the dynamics is often captured by allowing the operator parameters to vary temporally, or by re-fitting a model with a temporally-invariant linear operator at each time point in a sliding window. Both procedures tend to be excellent for prediction purposes over small time horizons, but are generally time-consuming and, crucially, do not provide a global prior model for the temporally-varying dynamics that is realistic. Here, we tackle these two issues by using a deep convolution neural network (CNN) in a hierarchical statistical IDE framework, where the CNN is designed to extract process dynamics from the process' most recent behaviour. Once the CNN is fitted, probabilistic forecasting can be done extremely quickly online using an ensemble Kalman filter with no requirement for repeated parameter estimation. We conduct an experiment where we train the model using 13 years of daily sea-surface temperature data in the North Atlantic Ocean. Forecasts are seen to be accurate and calibrated. A key advantage of our approach is that the CNN provides a global prior model for the dynamics that is realistic, interpretable, and computationally efficient. We show the versatility of the approach by successfully producing 10-minute nowcasts of weather radar reflectivities in Sydney using the same model that was trained on daily sea-surface temperature data in the North Atlantic Ocean.
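To make the integral operator concrete, here is a minimal sketch of one deterministic IDE step on a 1-D grid, where the integral over space is approximated by a Riemann sum. The Gaussian kernel and the fixed `shift` and `width` parameters are assumptions for illustration only; in the approach described above the dynamics are extracted by a CNN, and the noise term and ensemble Kalman filter are omitted here.

```python
import numpy as np

def ide_step(y_t, grid, shift=0.0, width=0.1):
    """One deterministic IDE step on an evenly spaced 1-D grid.

    Approximates y_next(s) = integral of k(s, r) * y_t(r) dr
    with a Riemann sum. `shift` and `width` are hypothetical
    kernel parameters (advection and diffusion, respectively).
    """
    ds = grid[1] - grid[0]
    # kernel[i, j] = k(s_i, r_j): influence of location r_j on s_i.
    offsets = grid[:, None] - grid[None, :] - shift
    kernel = np.exp(-0.5 * (offsets / width) ** 2) / (width * np.sqrt(2 * np.pi))
    return kernel @ y_t * ds

grid = np.linspace(0.0, 1.0, 101)
y_t = np.exp(-0.5 * ((grid - 0.3) / 0.05) ** 2)  # bump centred at 0.3
y_next = ide_step(y_t, grid, shift=0.2)          # bump moves towards 0.5
```

With a positive `shift` the kernel transports the bump to the right, while `width` controls how much it spreads out in one step.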
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Lewis, Mike, Liu, Yinhan, Goyal, Naman, Ghazvininejad, Marjan, Mohamed, Abdelrahman, Levy, Omer, Stoyanov, Ves, Zettlemoyer, Luke
BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Transformer-based neural machine translation architecture which, despite its simplicity, can be seen as generalizing BERT (due to the bidirectional encoder), GPT (with the left-to-right decoder), and many other more recent pretraining schemes. We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token. BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 6 ROUGE. BART also provides a 1.1 BLEU increase over a back-translation system for machine translation, with only target language pretraining. We also report ablation experiments that replicate other pretraining schemes within the BART framework, to better measure which factors most influence end-task performance.
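The two noising operations singled out above can be sketched in a few lines. The helper names, the `<mask>` string, and the convention of passing spans as explicit `(start, length)` pairs are assumptions for illustration; how spans are sampled is left out, and this is not the authors' code.

```python
import random

MASK = "<mask>"  # placeholder mask token; the real vocabulary entry may differ

def permute_sentences(sentences, rng):
    """Return the sentences in a random order (sentence shuffling)."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled

def infill_spans(tokens, spans):
    """Replace each (start, length) span with a single MASK token.

    `spans` must be sorted by start and non-overlapping. However many
    tokens a span covers, it collapses to exactly one mask, so the
    model must also infer how many tokens are missing.
    """
    out = []
    pos = 0
    for start, length in spans:
        out.extend(tokens[pos:start])  # copy the untouched tokens
        out.append(MASK)               # one mask for the whole span
        pos = start + length
    out.extend(tokens[pos:])
    return out

tokens = "the quick brown fox jumps over the lazy dog".split()
corrupted = infill_spans(tokens, [(1, 2), (6, 1)])
# corrupted == ['the', '<mask>', 'fox', 'jumps', 'over', '<mask>', 'lazy', 'dog']
```

A denoising model would then be trained to map `corrupted` back to `tokens`.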
Deep Learning Emulation of Multi-Angle Implementation of Atmospheric Correction (MAIAC)
Duffy, Kate, Vandal, Thomas, Wang, Weile, Nemani, Ramakrishna, Ganguly, Auroop R.
New generation geostationary satellites make solar reflectance observations available at a continental scale with unprecedented spatiotemporal resolution and spectral range. Generating quality land monitoring products requires correction of the effects of atmospheric scattering and absorption, which vary in time and space according to geometry and atmospheric composition. Many atmospheric radiative transfer models, including that of Multi-Angle Implementation of Atmospheric Correction (MAIAC), are too computationally complex to be run in real time, and rely on precomputed look-up tables. Additionally, uncertainty in measurements and models for remote sensing receives insufficient attention, in part due to the difficulty of obtaining sufficient ground measurements. In this paper, we present an adaptation of Bayesian Deep Learning (BDL) to emulation of the MAIAC atmospheric correction algorithm. Emulation approaches learn a statistical model as an efficient approximation of a physical model, while machine learning methods have demonstrated performance in extracting spatial features and learning complex, nonlinear mappings. We demonstrate stable surface reflectance (SR) retrieval by emulation (R2 values between MAIAC and emulator SR are 0.63, 0.75, 0.86, 0.84, 0.95, and 0.91 for the Blue, Green, Red, NIR, SWIR1, and SWIR2 bands, respectively), accurate cloud detection (86%), and well-calibrated, geolocated uncertainty estimates. Our results support BDL-based emulation as an accurate and efficient (up to 6x speedup) method for approximating atmospheric correction, where built-in uncertainty estimates stand to open new opportunities for model assessment and support informed use of SR-derived quantities in multiple domains.
Knowledge Tracing with Sequential Key-Value Memory Networks
Abdelrahman, Ghodai, Wang, Qing
Can machines trace human knowledge like humans? Knowledge tracing (KT) is a fundamental task in a wide range of applications in education, such as massive open online courses (MOOCs), intelligent tutoring systems, educational games, and learning management systems. It models dynamics in a student's knowledge states in relation to different learning concepts through their interactions with learning activities. Recently, several attempts have been made to use deep learning models for tackling the KT problem. Although these deep learning models have shown promising results, they have limitations: they either lack the ability to go deeper to trace how specific concepts in a knowledge state are mastered by a student, or fail to capture long-term dependencies in an exercise sequence. In this paper, we address these limitations by proposing a novel deep learning model for knowledge tracing, namely Sequential Key-Value Memory Networks (SKVMN). This model unifies the strengths of recurrent modelling capacity and memory capacity of the existing deep learning KT models for modelling student learning. We have extensively evaluated our proposed model on five benchmark datasets. The experimental results show that (1) SKVMN outperforms the state-of-the-art KT models on all datasets, (2) SKVMN can better discover the correlation between latent concepts and questions, and (3) SKVMN can trace the knowledge state of students dynamically, and leverage sequential dependencies in an exercise sequence for improved prediction accuracy.
Bayesian Optimization with Unknown Search Space
Ha, Huong, Rana, Santu, Gupta, Sunil, Nguyen, Thanh, Tran-The, Hung, Venkatesh, Svetha
Applying Bayesian optimization in problems wherein the search space is unknown is challenging. To address this problem, we propose a systematic volume expansion strategy for Bayesian optimization. We devise a strategy to guarantee that in iterative expansions of the search space, our method can find a point whose function value is within epsilon of the objective function maximum. Without the need to specify any parameters, our algorithm automatically and iteratively triggers the minimal expansion required. We derive analytic expressions for when to trigger the expansion and by how much to expand. We also provide theoretical analysis to show that our method achieves epsilon-accuracy after a finite number of iterations. We evaluate our method on both benchmark test functions and machine learning hyper-parameter tuning tasks and show that it outperforms baselines.
Australia introducing AI in healthcare – Biopharmapress
HIMSS is partnering with the Australian Digital Health Agency (ADHA) to organise the upcoming HIMSS Australia Digital Health Summit (ADHS), taking place in Sydney, Australia, on 20-21 November this year. The event is expected to bring together delegates from ADHA and public and private healthcare leaders from Australia, as well as from the wider APAC region. The main theme of the Summit is "Interoperability and Connected Care", which is particularly relevant given the rollout across the country of My Health Record (MHR), the online electronic summary of a person's key health information. ADHA has been progressively upgrading MHR, for example by working with software vendors so that information can be shared securely across different software products, and by improving its clinical workflow capabilities. The Data track will address the potential benefits of building a network of shared data across Australia, along with case studies of how data analytics tools can bring about better health outcomes.
AI targets insider threats by analysing employee writing for malice
Data security threats from malicious insiders have already been recognised as a big problem for businesses – but an IBM Australia-built proof of concept could go a long way towards solving it with an artificial intelligence (AI) based solution that can spot disgruntled workers before they have acted. The tool grew out of an AI-themed internal hackathon run at IBM's Gold Coast-based Australian Security Development Lab, where developers are encouraged to come up with novel solutions. A team of IBM Security engineers realised that businesses are collecting masses of data about network performance and user behaviour, QRadar flows product owner Holly Wright told CSO Australia, and set about looking for ways this information could be meaningfully paired with other data and analysed to give greater insight about users' state of mind. "QRadar gives us deep visibility into the messages, views and emails going across the network," explained Wright, who shared details of the project with attendees at AISA's recent Australian Cyber Conference. "We decided to look at users from a risk perspective. We're essentially leveraging that information that's on the network, that nobody has really done anything with."
Bringing Precision Driven Health to the UK
Kevin Ross is the chief executive of Precision Driven Health, a research partnership that is applying machine learning models to health data in order to improve outcomes and personalise medicine. Kevin will be speaking at this year's Orion Health UK and Ireland Customer Conference 2019, and ahead of the event he spoke about its work, the tools that are now being made available to Orion Health customers, and why he is looking for like-minded organisations to "tune" models for their own populations. What is Precision Driven Health? Precision Driven Health is a research partnership that applies machine learning to data sets in order to improve health outcomes by delivering a more personalised experience for patients. It was formed when Orion Health, the University of Auckland and Waitematā District Health Board recognised that we all had capabilities that we could bring together.