Goto

Collaborating Authors

 Oceania


Nothing Artificial About How AI Is Transforming MRO

#artificialintelligence

The loudest industry buzz has been about using big data and artificial intelligence (AI) for predictive maintenance, or turning unscheduled events into scheduled ones by forecasting likely failures. But surprise events still occur, and AI can also help troubleshoot them faster and more effectively. Any tool that enables predictive maintenance also helps troubleshooting, as it often points to causes of likely failures. But to provide maximum diagnostic benefits, some AI techniques can also be used in different ways. For example, natural language processing can translate mechanics' plain-spoken inquiries into text that helps find answers.


Learning to compress and search visual data in large-scale systems

arXiv.org Machine Learning

The problem of high-dimensional and large-scale representation of visual data is addressed from an unsupervised learning perspective. The emphasis is put on discrete representations, where the description length can be measured in bits and hence the model capacity can be controlled. The algorithmic infrastructure is developed based on the synthesis and analysis prior models whose rate-distortion properties, as well as capacity vs. sample complexity trade-offs are carefully optimized. These models are then extended to multi-layers, namely the RRQ and the ML-STC frameworks, where the latter is further evolved as a powerful deep neural network architecture with fast and sample-efficient training and discrete representations. For the developed algorithms, three important applications are developed. First, the problem of large-scale similarity search in retrieval systems is addressed, where a double-stage solution is proposed leading to faster query times and shorter database storage. Second, the problem of learned image compression is targeted, where the proposed models can capture more redundancies from the training images than the conventional compression codecs. Finally, the proposed algorithms are used to solve ill-posed inverse problems. In particular, the problems of image denoising and compressive sensing are addressed with promising results.


Model-based Deep Reinforcement Learning for Dynamic Portfolio Optimization

arXiv.org Machine Learning

Dynamic portfolio optimization is the process of sequentially allocating wealth to a collection of assets in some consecutive trading periods, based on investors' return-risk profile. Automating this process with machine learning remains a challenging problem. Here, we design a deep reinforcement learning (RL) architecture with an autonomous trading agent such that, investment decisions and actions are made periodically, based on a global objective, with autonomy. In particular, without relying on a purely model-free RL agent, we train our trading agent using a novel RL architecture consisting of an infused prediction module (IPM), a generative adversarial data augmentation module (DAM) and a behavior cloning module (BCM). Our model-based approach works with both on-policy or off-policy RL algorithms. We further design the back-testing and execution engine which interact with the RL agent in real time. Using historical {\em real} financial market data, we simulate trading with practical constraints, and demonstrate that our proposed model is robust, profitable and risk-sensitive, as compared to baseline trading strategies and model-free RL agents from prior work.


FANDA: A Novel Approach to Perform Follow-up Query Analysis

arXiv.org Artificial Intelligence

Recent work on Natural Language Interfaces to Databases (NLIDB) has attracted considerable attention. NLIDB allow users to search databases using natural language instead of SQL-like query languages. While saving the users from having to learn query languages, multi-turn interaction with NLIDB usually involves multiple queries where contextual information is vital to understand the users' query intents. In this paper, we address a typical contextual understanding problem, termed as follow-up query analysis. In spite of its ubiquity, follow-up query analysis has not been well studied due to two primary obstacles: the multifarious nature of follow-up query scenarios and the lack of high-quality datasets. Our work summarizes typical follow-up query scenarios and provides a new FollowUp dataset with $1000$ query triples on 120 tables. Moreover, we propose a novel approach FANDA, which takes into account the structures of queries and employs a ranking model with weakly supervised max-margin learning. The experimental results on FollowUp demonstrate the superiority of FANDA over multiple baselines across multiple metrics.


When Can Neural Networks Learn Connected Decision Regions?

arXiv.org Machine Learning

Previous work has questioned the conditions under which the decision regions of a neural network are connected and further showed the implications of the corresponding theory to the problem of adversarial manipulation of classifiers. It has been proven that for a class of activation functions including leaky ReLU, neural networks having a pyramidal structure, that is no layer has more hidden units than the input dimension, produce necessarily connected decision regions. In this paper, we advance this important result by further developing the sufficient and necessary conditions under which the decision regions of a neural network are connected. We then apply our framework to overcome the limits of existing work and further study the capacity to learn connected regions of neural networks for a much wider class of activation functions including those widely used, namely ReLU, sigmoid, tanh, softlus, and exponential linear function.


Machine Learning and Deep Learning Algorithms for Bearing Fault Diagnostics - A Comprehensive Review

arXiv.org Machine Learning

In this survey paper, we systematically summarize the current literature on studies that apply machine learning (ML) and data mining techniques to bearing fault diagnostics. Conventional ML methods, including artificial neural network (ANN), principal component analysis (PCA), support vector machines (SVM), etc., have been successfully applied to detecting and categorizing bearing faults since the last decade, while the application of deep learning (DL) methods has sparked great interest in both the industry and academia in the last five years. In this paper, we will first review the conventional ML methods, before taking a deep dive into the latest developments in DL algorithms for bearing fault applications. Specifically, the superiority of the DL based methods over the conventional ML methods are analyzed in terms of metrics directly related to fault feature extraction and classifier performances; the new functionalities offered by DL techniques that cannot be accomplished before are also summarized. In addition, to obtain a more intuitive insight, a comparative study is performed on the classifier performance and accuracy for a number of papers utilizing the open source Case Western Reserve University (CWRU) bearing data set. Finally, based on the nature of the time-series 1-D data obtained from sensors monitoring the bearing conditions, recommendations and suggestions are provided to applying DL algorithms on bearing fault diagnostics based on specific applications, as well as future research directions to further improve its performance.


Location reference identification from tweets during emergencies: A deep learning approach

arXiv.org Machine Learning

Twitter is recently being used during crises to communicate with officials and provide rescue and relief operation in real time. The geographical location information of the event, as well as users, are vitally important in such scenarios. The identification of geographic location is one of the challenging tasks as the location information fields, such as user location and place name of tweets are not reliable. The extraction of location information from tweet text is difficult as it contains a lot of nonstandard English, grammatical errors, spelling mistakes, nonstandard abbreviations, and so on. This research aims to extract location words used in the tweet using a Convolutional Neural Network (CNN) based model. We achieved the exact matching score of 0.929, Hamming loss of 0.002, and F Our model was able to extract even three-to four-word long location references which is also evident from the exact matching score of over 92%. The findings of this paper can help in early event localization, emergency situations, real-time road traffic management, localized advertisement, and in various location-based services. Keywords: Location references, Tweets, Geo-locations, Named entity recognition, Gazetteer, Convolutional Neural Network 1. Introduction Tweets are very responsive to real-world events, and are sometimes even more immediate than traditional news channels. Therefore, it is possible to keep track of the latest information by following tweets. Several examples were seen when the news was first reported on Twitter, such as an airplane crash over the Hudson River in New York in the year 2009 (Sakaki et al., 2013), the death of former British Prime Minister Margaret Thatcher in April 2013 Preprint submitted to Elsevier January 25, 2019 Sakaki et al., 2013; Singh et al., 2017; Yuan & Liu, 2018). In an American Red Cross survey, a question was asked to individuals that "whom they contacted in an emergency?" The estimation and detection of location information of events and users from tweets are a major concern in relation to the above-mentioned tasks. Twitter provides three location information fields for sharing a user's location: (1) User location; (2) Place name; and (3) Geo-coordinate. The user location field has 140 character spaces (previously it was limited to 30 characters) in which the user can write his/her home location information while creating their profile. This field is optional to the user and the user can write any arbitrary words or leave it blank.


A Universally Optimal Multistage Accelerated Stochastic Gradient Method

arXiv.org Machine Learning

We study the problem of minimizing a strongly convex and smooth function when we have noisy estimates of its gradient. We propose a novel multistage accelerated algorithm that is universally optimal in the sense that it achieves the optimal rate both in the deterministic and stochastic case and operates without knowledge of noise characteristics. The algorithm consists of stages that use a stochastic version of Nesterov's accelerated algorithm with a specific restart and parameters selected to achieve the fastest reduction in the bias-variance terms in the convergence rate bounds.


Fairness risk measures

arXiv.org Machine Learning

Ensuring that classifiers are non-discriminatory or fair with respect to a sensitive feature (e.g., race or gender) is a topical problem. Progress in this task requires fixing a definition of fairness, and there have been several proposals in this regard over the past few years. Several of these, however, assume either binary sensitive features (thus precluding categorical or real-valued sensitive groups), or result in non-convex objectives (thus adversely affecting the optimisation landscape). In this paper, we propose a new definition of fairness that generalises some existing proposals, while allowing for generic sensitive features and resulting in a convex objective. The key idea is to enforce that the expected losses (or risks) across each subgroup induced by the sensitive feature are commensurate. We show how this relates to the rich literature on risk measures from mathematical finance. As a special case, this leads to a new convex fairness-aware objective based on minimising the conditional value at risk (CVaR).


When is it right and good for an intelligent autonomous vehicle to take over control (and hand it back)?

arXiv.org Artificial Intelligence

There is much debate in machine ethics about the most appropriate way to introduce ethical reasoning capabilities into intelligent autonomous machines. Recent incidents involving autonomous vehicles in which humans have been killed or injured have raised questions about how we ensure that such vehicles have an ethical dimension to their behaviour and are therefore trustworthy. The main problem is that hardwiring such machines with rules not to cause harm or damage is not consistent with the notion of autonomy and intelligence. Also, such ethical hardwiring does not leave intelligent autonomous machines with any course of action if they encounter situations or dilemmas for which they are not programmed or where some harm is caused no matter what course of action is taken. Teaching machines so that they learn ethics may also be problematic given recent findings in machine learning that machines pick up the prejudices and biases embedded in their learning algorithms or data. This paper describes a fuzzy reasoning approach to machine ethics. The paper shows how it is possible for an ethics architecture to reason when taking over from a human driver is morally justified. The design behind such an ethical reasoner is also applied to an ethical dilemma resolution case. One major advantage of the approach is that the ethical reasoner can generate its own data for learning moral rules (hence, autometric) and thereby reduce the possibility of picking up human biases and prejudices. The results show that a new type of metric-based ethics appropriate for autonomous intelligent machines is feasible and that our current concept of ethical reasoning being largely qualitative in nature may need revising if want to construct future autonomous machines that have an ethical dimension to their reasoning so that they become moral machines.