How the internet is affecting the human brain: Multitasking and relying on Google to jog memories
Spending time on the internet is reducing our ability to focus on one task at a time - and it means we no longer store facts in our brains. Our lives have been forever changed by gaining access to infinite amounts of information at the touch of a button, and so has the way our brains work. A new review by researchers in the UK, US and Australia, looking into the effect of the online world on our brain functions, has drawn a number of surprising conclusions. The review focused on the world wide web's influence in three areas: attention spans, memory, and social cognition. It notes that the internet is now 'unavoidable, ubiquitous, and a highly functional aspect of modern living' before diving into how it has changed our society.
Tesla big battery paves way for artificial intelligence to dominate energy trades
Around the world, and particularly in Australia, energy traders are trying to get their minds, and their algorithms, around the complexities of trading in variable wind and solar projects and super-fast battery storage installations. Maybe they should give up now, and hand it over to artificial intelligence. US-based software-as-a-service platform provider AMS says automated trading systems for batteries and renewable energy projects using deep learning and artificial intelligence can out-compete the best human traders, by around a factor of five. With the deployment of large-scale energy storage systems occurring at an ever-increasing rate, this is critical – not just for the ability to make money out of the markets, but also for the ongoing operation of the National Electricity Market itself. Traditional generators only need to maximise their generation during periods of sufficiently high energy prices.
Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue
Balakrishnan, Anusha, Rao, Jinfeng, Upasani, Kartikeya, White, Michael, Subba, Rajen
Generating fluent natural language responses from structured semantic representations is a critical step in task-oriented conversational systems. Avenues like the E2E NLG Challenge have encouraged the development of neural approaches, particularly sequence-to-sequence (Seq2Seq) models for this problem. The semantic representations used, however, are often underspecified, which places a higher burden on the generation model for sentence planning, and also limits the extent to which generated responses can be controlled in a live system. In this paper, we (1) propose using tree-structured semantic representations, like those used in traditional rule-based NLG systems, for better discourse-level structuring and sentence-level planning; (2) introduce a challenging dataset using this representation for the weather domain; (3) introduce a constrained decoding approach for Seq2Seq models that leverages this representation to improve semantic correctness; and (4) demonstrate promising results on our dataset and the E2E dataset.
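The idea of using the tree structure to check semantic correctness can be illustrated with a much-simplified sketch. It is a post-hoc filter on finished outputs, not the constraint applied during decoding that the abstract describes, and it only compares the multiset of structural tokens, ignoring how they are nested. The bracketed token format, the non-terminal names (INFORM, CONDITION, DAY) and all function names are hypothetical and chosen for illustration only.

```python
def nonterminal_skeleton(tokens):
    """Keep only structural tokens: openers such as "[INFORM" and closers "]"."""
    return [t for t in tokens if t.startswith("[") or t == "]"]


def is_balanced(skeleton):
    """Return True if every opener has a matching closer, in order."""
    depth = 0
    for t in skeleton:
        depth += -1 if t == "]" else 1
        if depth < 0:
            return False
    return depth == 0


def satisfies_constraints(input_tokens, output_tokens):
    """Accept an output only if its brackets balance and it contains the
    same multiset of non-terminals as the input representation."""
    inp = nonterminal_skeleton(input_tokens)
    out = nonterminal_skeleton(output_tokens)
    return is_balanced(out) and sorted(out) == sorted(inp)


# Hypothetical weather-domain example.
mr = "[INFORM [CONDITION rain ] [DAY today ] ]".split()
good = "[INFORM it will [CONDITION rain ] [DAY today ] ]".split()
bad = "[INFORM it will [CONDITION rain ] ]".split()  # drops the DAY node

accepted_good = satisfies_constraints(mr, good)
accepted_bad = satisfies_constraints(mr, bad)
```

Here the first candidate is accepted and the second is rejected because it omits a non-terminal present in the input. A decoder-side version would apply a similar test to partial hypotheses during beam search rather than to completed outputs.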
MixUp as Directional Adversarial Training
Archambault, Guillaume P., Mao, Yongyi, Guo, Hongyu, Zhang, Richong
In this work, we explain the working mechanism of MixUp in terms of adversarial training. We introduce a new class of adversarial training schemes, which we refer to as directional adversarial training, or DAT. In a nutshell, a DAT scheme perturbs a training example in the direction of another example but keeps its original label as the training target. We prove that MixUp is equivalent to a special subclass of DAT, in that it has the same expected loss function and corresponds to the same optimization problem asymptotically. This understanding not only serves to explain the effectiveness of MixUp, but also reveals a more general family of MixUp schemes, which we call Untied MixUp. We prove that the family of Untied MixUp schemes is equivalent to the entire class of DAT schemes. We establish empirically the existence of Untied MixUp schemes which improve upon MixUp.
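The difference between the two schemes can be seen in a minimal sketch, assuming list-valued inputs, one-hot labels and a fixed mixing coefficient lam (standard MixUp typically draws it from a Beta distribution). The function names are illustrative, and the sketch does not reproduce the equivalence proof or the Untied MixUp family.

```python
def mixup(x_i, y_i, x_j, y_j, lam):
    """Standard MixUp: interpolate both the inputs and the one-hot labels."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


def dat_perturb(x_i, y_i, x_j, lam):
    """DAT-style step: move x_i toward x_j but keep the original label y_i."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    return x, list(y_i)


x_a, y_a = [0.0, 0.0], [1.0, 0.0]
x_b, y_b = [1.0, 2.0], [0.0, 1.0]
lam = 0.75  # assumed fixed here for clarity

mixed_x, mixed_y = mixup(x_a, y_a, x_b, y_b, lam)
dat_x, dat_y = dat_perturb(x_a, y_a, x_b, lam)
```

Both calls produce the same perturbed input; only the training target differs, with MixUp blending the labels and the DAT-style step keeping the label of the first example.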
Losing Confidence in Quality: Unspoken Evolution of Computer Vision Services
Cummaudo, Alex, Vasa, Rajesh, Grundy, John, Abdelrazek, Mohamed, Cain, Andrew
Recent advances in artificial intelligence (AI) and machine learning (ML), such as computer vision, are now available as intelligent services and their accessibility and simplicity are compelling. Multiple vendors now offer this technology as cloud services and developers want to leverage these advances to provide value to end-users. However, there is no firm investigation into the maintenance and evolution risks arising from use of these intelligent services; in particular, their behavioural consistency and transparency of their functionality. We evaluated the responses of three different intelligent services (specifically computer vision) over 11 months using 3 different data sets, verifying responses against the respective documentation and assessing evolution risk. We found that there are: (1) inconsistencies in how these services behave; (2) evolution risk in the responses; and (3) a lack of clear communication that documents these risks and inconsistencies. We propose a set of recommendations to both developers and intelligent service providers to inform risk and assist maintainability.
Bridging the Gap between Training and Inference for Neural Machine Translation
Zhang, Wen, Feng, Yang, Meng, Fandong, You, Di, Liu, Qun
Neural Machine Translation (NMT) generates target words sequentially by predicting the next word conditioned on the context words. At training time, it predicts with the ground truth words as context, while at inference it has to generate the entire sequence from scratch. This discrepancy in the fed context leads to error accumulation along the way. Furthermore, word-level training requires strict matching between the generated sequence and the ground truth sequence, which leads to overcorrection of different but reasonable translations. In this paper, we address these issues by sampling context words not only from the ground truth sequence but also from the sequence predicted by the model during training, where the predicted sequence is selected with a sentence-level optimum. Experimental results on Chinese->English and WMT'14 English->German translation tasks demonstrate that our approach can achieve significant improvements on multiple datasets.
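The context-sampling step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the predicted sequence is assumed to be already chosen and to have the same length as the ground truth, and the function name and the parameter p_truth (the probability of feeding the ground-truth word) are hypothetical.

```python
import random


def sample_context(ground_truth, predicted, p_truth, rng):
    """For each position, feed the ground-truth word with probability
    p_truth, otherwise the word from the model's predicted sequence."""
    if len(ground_truth) != len(predicted):
        raise ValueError("sequences must have equal length in this sketch")
    return [
        g if rng.random() < p_truth else o
        for g, o in zip(ground_truth, predicted)
    ]


rng = random.Random(0)  # seeded only to make the sketch repeatable
gt = ["the", "cat", "sat"]
pred = ["a", "cat", "sits"]

all_truth = sample_context(gt, pred, 1.0, rng)   # always ground truth
all_model = sample_context(gt, pred, 0.0, rng)   # always model output
mixed = sample_context(gt, pred, 0.5, rng)       # a mixture of both
```

With p_truth set to 1.0 this reduces to ordinary training on ground-truth context; lowering it exposes the model to its own predictions during training, which is the discrepancy the abstract aims to reduce.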
REBA: A Refinement-Based Architecture for Knowledge Representation and Reasoning in Robotics
Sridharan, Mohan, Gelfond, Michael, Zhang, Shiqi, Wyatt, Jeremy
This article describes REBA, a knowledge representation and reasoning architecture for robots that is based on tightly-coupled transition diagrams of the domain at two different levels of granularity. An action language is extended to support non-boolean fluents and non-deterministic causal laws, and used to describe the domain's transition diagrams, with the fine-resolution transition diagram being defined as a refinement of the coarse-resolution transition diagram. The coarse-resolution system description, and a history that includes prioritized defaults, are translated into an Answer Set Prolog (ASP) program. For any given goal, inference in the ASP program provides a plan of abstract actions. To implement each such abstract action, the robot automatically zooms to the part of the fine-resolution transition diagram relevant to this action. The zoomed fine-resolution system description, and a probabilistic representation of the uncertainty in sensing and actuation, are used to construct a partially observable Markov decision process (POMDP). The policy obtained by solving the POMDP is invoked repeatedly to implement the abstract action as a sequence of concrete actions. The fine-resolution outcomes of executing these concrete actions are used to infer coarse-resolution outcomes that are added to the coarse-resolution history and used for subsequent coarse-resolution reasoning. The architecture thus combines the complementary strengths of declarative programming and probabilistic graphical models to represent and reason with non-monotonic logic-based and probabilistic descriptions of uncertainty and incomplete domain knowledge. In addition, we describe a general methodology for the design of software components of a robot based on these knowledge representation and reasoning tools, and provide a path for proving the correctness of these components. 
The architecture is evaluated in simulation and on a mobile robot finding and moving target objects to desired locations in indoor domains, to show that the architecture supports reliable and efficient reasoning with violation of defaults, noisy observations and unreliable actions, in complex domains.
Learning Interpretable Models Using an Oracle
Ghose, Abhishek, Ravindran, Balaraman
As Machine Learning (ML) becomes pervasive in various real world systems, the need for models to be interpretable or explainable has increased. We focus on interpretability, noting that models often need to be constrained in size for them to be considered understandable, e.g., a decision tree of depth 5 is easier to interpret than one of depth 50. This suggests a trade-off between interpretability and accuracy. We propose a technique to minimize this trade-off. Our strategy is to first learn a powerful, possibly black-box, probabilistic model on the data, which we refer to as the oracle. We use this to adaptively sample the training dataset to present data to our model of interest to learn from. Determining the sampling strategy is formulated as an optimization problem that, independent of the dimensionality of the data, uses only seven variables. We empirically show that this often significantly increases the accuracy of our model. Our technique is model agnostic, in that both the interpretable model and the oracle may come from any model family. Results using multiple real world datasets, using Linear Probability Models and Decision Trees as interpretable models, and Gradient Boosted Model and Random Forest as oracles are presented. Additionally, we discuss an interesting example of using a sentence-embedding based text classifier as an oracle to improve the accuracy of a term-frequency based bag-of-words linear classifier.
Dealing with the database variability problem in learning from medical data: an ensemble-based approach using convolutional neural networks and a case of study applied to automatic sleep scoring
Alvarez-Estevez, Diego, Fernández-Varela, Isaac
In this work we examine the problems associated with developing machine learning models that achieve robust generalization capabilities in common-task, multiple-database scenarios. Referring to this as the "database variability problem", we focus on a specific medical domain (sleep staging in Sleep Medicine) to show the non-triviality of translating a model's estimated local generalization capabilities to independent external databases. We analyze some of the scalability problems that arise when data from multiple databases are used as input to train a single learning model. Then, we introduce a novel approach based on an ensemble of local models, and we show its advantages in terms of inter-database generalization performance and data scalability. Further on, we analyze different model configurations and data pre-processing techniques to evaluate their effects on overall generalization performance. For this purpose we carry out experiments involving several sleep databases, evaluating different machine learning models based on Convolutional Neural Networks.
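An ensemble of local models can be pictured with a minimal sketch in which each model is trained on one database and the ensemble output is a majority vote. Majority voting is an assumption made here for illustration; the abstract does not state how the local predictions are combined. The threshold-based stand-in models, the single numeric feature and the stage labels "W" and "N2" are likewise hypothetical.

```python
from collections import Counter


def ensemble_predict(local_models, sample):
    """Collect one prediction per local model and return the most common label."""
    votes = [model(sample) for model in local_models]
    return Counter(votes).most_common(1)[0][0]


# Stand-ins for models trained on three different databases.
def model_a(x):
    return "W" if x > 0.5 else "N2"


def model_b(x):
    return "W" if x > 0.3 else "N2"


def model_c(x):
    return "W" if x > 0.7 else "N2"


models = [model_a, model_b, model_c]
stage_high = ensemble_predict(models, 0.6)  # votes: W, W, N2
stage_mid = ensemble_predict(models, 0.4)   # votes: N2, W, N2
stage_low = ensemble_predict(models, 0.2)   # votes: N2, N2, N2
```

Adding a new database in this arrangement means training one more local model and appending it to the list, rather than retraining a single model on the pooled data.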