Testing Neural Program Analyzers

arXiv.org Machine Learning

Deep neural networks have been increasingly used in software engineering and program analysis tasks. They usually take a program and make some predictions about it, e.g., bug prediction. We call these models neural program analyzers. The reliability of neural programs can impact the reliability of the encompassing analyses. In this paper, we describe our ongoing efforts to develop effective techniques for testing neural programs. We discuss the challenges involved in developing such tools and our future plans. In our preliminary experiment on a neural model recently proposed in the literature, we found that the model is very brittle, and simple perturbations in the input can cause the model to make mistakes in its prediction.


Calibration of Deep Probabilistic Models with Decoupled Bayesian Neural Networks

arXiv.org Machine Learning

Deep Neural Networks (DNNs) have achieved state-of-the-art accuracy in many tasks. However, recent works have pointed out that the outputs provided by these models are not well-calibrated, seriously limiting their use in critical decision scenarios. In this work, we propose to use a decoupled Bayesian stage, implemented with a Bayesian Neural Network (BNN), to map the uncalibrated probabilities provided by a DNN to calibrated ones, consistently improving calibration. Our results show that incorporating uncertainty provides more reliable probabilistic models, a critical condition for achieving good calibration. We report an extensive collection of experimental results using high-accuracy DNNs on standardized image classification benchmarks, showing the good performance, flexibility and robust behavior of our approach with respect to several state-of-the-art calibration methods. Code for reproducibility is provided.
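To make "well-calibrated" concrete, the sketch below computes the expected calibration error (ECE), a commonly used summary of the gap between a model's confidence and its accuracy. The abstract does not say which calibration metric the authors report, so the choice of ECE, the function name `expected_calibration_error`, and the equal-width binning are all assumptions made for illustration.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - confidence| over equal-width bins.

    confidences: predicted probability of the chosen class, one per sample.
    correct:     1 if the prediction was right, 0 otherwise.
    Illustrative helper, not taken from the paper.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Half-open bins (lo, hi]; a confidence of exactly 0 is ignored.
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples
    return ece
```

For example, four predictions made with confidence 0.9 of which three are correct give an ECE of about 0.15, while the same outcomes reported with confidence 0.75 give an ECE of 0.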


EEG-Based Driver Drowsiness Estimation Using Feature Weighted Episodic Training

arXiv.org Artificial Intelligence

Drowsy driving is pervasive, and also a major cause of traffic accidents. Estimating a driver's drowsiness level by monitoring the electroencephalogram (EEG) signal and taking preventative actions accordingly may improve driving safety. However, individual differences among different drivers make this task very challenging. A calibration session is usually required to collect some subject-specific data and tune the model parameters before applying it to a new subject, which is very inconvenient and not user-friendly. Many approaches have been proposed to reduce the calibration effort, but few can completely eliminate it. This paper proposes a novel approach, feature weighted episodic training (FWET), to completely eliminate the calibration requirement. It integrates two techniques: feature weighting to learn the importance of different features, and episodic training for domain generalization. Experiments on EEG-based driver drowsiness estimation demonstrated that both feature weighting and episodic training are effective, and their integration can further improve the generalization performance. FWET does not need any labelled or unlabelled calibration data from the new subject, and hence could be very useful in plug-and-play brain-computer interfaces.


Task-Oriented Conversation Generation Using Heterogeneous Memory Networks

arXiv.org Artificial Intelligence

Incorporating external knowledge into a neural dialogue model is critically important for dialogue systems to behave like real humans. Memory networks are a common and promising way to handle this problem. However, existing memory networks do not perform well when leveraging heterogeneous information from different sources. In this paper, we propose a novel and versatile external memory network called Heterogeneous Memory Networks (HMNs), which simultaneously utilizes user utterances, dialogue history and background knowledge tuples. In our method, historical sequential dialogues are encoded and stored in a context-aware memory enhanced by a gating mechanism, while grounding knowledge tuples are encoded and stored in a context-free memory. During decoding, the decoder augmented with HMNs recurrently selects each word in one response utterance from these two memories and a general vocabulary. Experimental results on multiple real-world datasets show that HMNs significantly outperform the state-of-the-art data-driven task-oriented dialogue models in most domains.


Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

arXiv.org Artificial Intelligence

An increasingly popular approach to alleviate this issue is to first learn general language representations on unlabeled data, which are then integrated in task-specific downstream systems. This approach was first popularized by word embeddings (Mikolov et al., 2013b; Pennington et al., 2014), but has recently been superseded by sentence-level representations (Peters et al., 2018; Devlin et al., 2019). Nevertheless, all these works learn a separate model for each language and are thus unable to leverage information across different languages, greatly limiting their potential performance for low-resource languages. In this work, we are interested in universal language-agnostic sentence embeddings, that is, vector representations of sentences that are general with respect to two dimensions: the input language and the NLP task. (This work was performed during an internship at Facebook AI Research.)


AI engine Athena is a modern-day teacher

#artificialintelligence

TEACH a man how to fish, and you can feed him for a lifetime. Homegrown artificial intelligence (AI) company Xjera Labs aims to do just that, with the aid of neural networks and deep learning algorithms. Its AI engine Athena can train software to perform specific tasks, whether these pertain to recognising sushi, teaching drones to fly or spotting intruders. The company's co-founder and chief executive Ethan Chu compares Athena to a school for its "pupils" - in Xjera's case, these are its products named XIntelligence, XTransport and XHound. Taken together, the trio offers a bevy of functions, including facial recognition, emotional analysis and identifying vehicle types.


RegTech and corporate disclosure Vantage Asia

#artificialintelligence

In recent times, regulators have begun to explore the use of technology to help them perform their regulatory and supervisory functions. Known as RegTech (a contraction of the terms "regulatory" and "technology") and also SupTech (a contraction of the terms "supervision" and "technology"), innovation in this area includes the use of natural language processing (NLP), a form of artificial intelligence, to facilitate and enhance the review of documents by regulators to assess compliance with disclosure requirements. There is a broad range of documents to which such technology might be applied, including corporate accounts, corporate announcements, company prospectuses and financial product disclosure documents. Developments in RegTech have accompanied developments in FinTech (for a discussion about FinTech and smart contracts, see China Business Law Journal volume 7 issue 8: FinTech and smart contracts). This column explores the potential that NLP offers in the area of corporate disclosure, and the legal and regulatory implications that arise as a result. These implications include the following: (1) whether technology will change the way in which the language of corporate disclosure and disclosure standards are interpreted by regulators; (2) whether regulators will be able to maintain transparency in relation to how technology is used to monitor and review corporate disclosure; and (3) how to maintain an appropriate degree of human involvement and guarantee trust in the process.


Switched linear projections and inactive state sensitivity for deep neural network interpretability

arXiv.org Machine Learning

We introduce switched linear projections for expressing the activity of a neuron in a ReLU-based deep neural network in terms of a single linear projection in the input space. The method works by isolating the active subnetwork, a series of linear transformations, that completely determines the entire computation of the deep network for a given input instance. We also propose that for interpretability it is more instructive and meaningful to focus on the patterns that deactivate the neurons in the network, which are ignored by the existing methods that implicitly track only the active aspect of the network's computation. We introduce a novel interpretability method for the inactive state sensitivity (Insens). Comparison against existing methods shows that Insens is more robust (in the presence of noise), more complete (in terms of patterns that affect the computation) and a very effective interpretability method for deep neural networks.
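The idea that a ReLU network reduces to a single linear map for a given input can be illustrated with a small NumPy sketch. It is not the authors' implementation and does not cover Insens. It only shows that once the on/off pattern of the ReLUs is fixed by an input `x`, the whole network collapses to one affine map (a linear projection plus a bias term). The layer sizes, random weights, and function names are assumptions chosen for the example.

```python
import numpy as np

def forward(x, weights, biases):
    """Plain forward pass: ReLU on hidden layers, linear output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]

def effective_affine_map(x, weights, biases):
    """Collapse the network into one affine map valid for x's activation pattern."""
    W_eff = np.eye(x.shape[0])    # running map: h = W_eff @ x + b_eff
    b_eff = np.zeros(x.shape[0])
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        pre = W @ h + b
        mask = (pre > 0).astype(float)       # 1 for active units, 0 for inactive
        W_eff = mask[:, None] * (W @ W_eff)  # zero out rows of inactive units
        b_eff = mask * (W @ b_eff + b)
        h = pre * mask                       # same as ReLU(pre)
    return weights[-1] @ W_eff, weights[-1] @ b_eff + biases[-1]

# Hypothetical 4-8-8-3 network with random parameters.
rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]
x = rng.standard_normal(4)
W_eff, b_eff = effective_affine_map(x, weights, biases)
```

Each row of `W_eff` is a direction in input space, so `W_eff @ x + b_eff` reproduces `forward(x, weights, biases)` for this particular `x`. A different input with a different activation pattern generally yields a different map.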


Online Semi-Supervised Concept Drift Detection with Density Estimation

arXiv.org Machine Learning

Concept drift is formally defined as the change in the joint distribution of a set of input variables X and a target variable y. The two types of drift that are extensively studied are real drift and virtual drift, where the former is the change in the posterior probabilities p(y|X) while the latter is the change in the distribution of X without affecting the posterior probabilities. Many approaches to concept drift detection either assume full availability of the data labels y or handle only virtual drift. In a streaming environment, the assumption of full availability of the data labels y is questionable. On the other hand, approaches that deal with virtual drift fail to address real drift. Rather than improving the state-of-the-art methods, this paper presents a semi-supervised framework to deal with the challenges above. The objective of the proposed framework is to learn from a streaming environment with limited data labels y and detect real drift concurrently. This paper proposes a novel concept drift detection method utilizing the densities of posterior probabilities in partially labeled streaming environments. Experimental results on both synthetic and real-world datasets show that our proposed semi-supervised framework enables the detection of concept drift in such environments while achieving comparable prediction performance to the state-of-the-art methods.
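The definitions in this abstract can be written compactly. The time indices t_0 and t_1 and the subscripted notation p_t are assumptions introduced here for illustration; the abstract itself gives the definitions only in words.

```latex
% Joint distribution at time t, factored into posterior and input distribution
\[ p_t(X, y) = p_t(y \mid X)\, p_t(X) \]

% Concept drift between times t_0 and t_1
\[ p_{t_0}(X, y) \neq p_{t_1}(X, y) \]

% Real drift: the posterior changes
\[ p_{t_0}(y \mid X) \neq p_{t_1}(y \mid X) \]

% Virtual drift: the input distribution changes, the posterior does not
\[ p_{t_0}(X) \neq p_{t_1}(X), \qquad p_{t_0}(y \mid X) = p_{t_1}(y \mid X) \]
```

Because the joint distribution factors into these two terms, any drift must show up in at least one of them, which is why a detector that monitors only p(X) can miss real drift.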


WATTNet: Learning to Trade FX via Hierarchical Spatio-Temporal Representation of Highly Multivariate Time Series

arXiv.org Machine Learning

Finance is a particularly challenging application area for deep learning models due to low signal-to-noise ratio, non-stationarity, and partial observability. The non-deliverable forward (NDF), a derivatives contract used in foreign exchange (FX) trading, presents additional difficulty in the form of the long-term planning required for an effective selection of the start and end dates of the contract. In this work, we focus on tackling the problem of NDF tenor selection by leveraging high-dimensional sequential data consisting of spot rates, technical indicators and expert tenor patterns. To this end, we construct a dataset from the Depository Trust & Clearing Corporation (DTCC) NDF data that includes a comprehensive list of NDF volumes and daily spot rates for 64 FX pairs. We introduce WaveATTentionNet (WATTNet), a novel temporal convolution (TCN) model for spatio-temporal modeling of highly multivariate time series, and validate it across NDF markets with varying degrees of dissimilarity between the training and test periods in terms of volatility and general market regimes. The proposed method achieves a significant positive return on investment (ROI) in all NDF markets under analysis, outperforming recurrent and classical baselines by a wide margin. Finally, we propose two orthogonal interpretability approaches to verify noise stability and detect the driving factors of the learned tenor selection strategy.