Transform-Invariant Convolutional Neural Networks for Image Classification and Search

arXiv.org Machine Learning

Convolutional neural networks (CNNs) have achieved state-of-the-art results on many visual recognition tasks. However, current CNN models are still poorly invariant to spatial transformations of images. Intuitively, with sufficient layers and parameters, hierarchical combinations of convolution (matrix multiplication and non-linear activation) and pooling operations should be able to learn a robust mapping from transformed input images to transform-invariant representations. In this paper, we propose randomly transforming (rotating, scaling, and translating) the feature maps of CNNs during the training stage. This prevents CNN models from developing complex dependencies on the specific rotation, scale, and translation levels of the training images. Rather, each convolutional kernel learns to detect a feature that is generally helpful for producing the transform-invariant answer given the combinatorially large variety of transform levels of its input feature maps. In this way, we do not require any extra training supervision or modification to the optimization process and training images. We show that random transformation significantly improves CNNs on many benchmark tasks, including small-scale image recognition, large-scale image recognition, and image retrieval. The code is available at https://github.com/jasonustc/caffe-multigpu/tree/TICNN.
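The idea of randomly rotating, scaling, and translating a feature map during training can be sketched as follows. This is only an illustration under stated assumptions, not the authors' Caffe implementation: the function names, the parameter ranges (`max_angle`, `scale_range`, `max_shift`), the nearest-neighbour sampling, and the zero fill for out-of-range pixels are all hypothetical choices made here.

```python
import math
import random


def transform_feature_map(fmap, angle=0.0, scale=1.0, dx=0.0, dy=0.0):
    """Rotate, scale, and translate a 2D feature map about its centre.

    Uses inverse mapping with nearest-neighbour sampling (an assumption);
    output positions whose source falls outside the map are set to 0.
    """
    h, w = len(fmap), len(fmap[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Undo the translation, then the rotation and the scaling.
            u, v = x - cx - dx, y - cy - dy
            src_x = (cos_a * u + sin_a * v) / scale + cx
            src_y = (-sin_a * u + cos_a * v) / scale + cy
            ix, iy = round(src_x), round(src_y)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = fmap[iy][ix]
    return out


def random_transform(fmap, training=True, rng=random,
                     max_angle=math.pi / 6, scale_range=(0.8, 1.2), max_shift=2.0):
    """Apply a freshly sampled transform in training; pass through otherwise."""
    if not training:
        return fmap
    return transform_feature_map(
        fmap,
        angle=rng.uniform(-max_angle, max_angle),
        scale=rng.uniform(*scale_range),
        dx=rng.uniform(-max_shift, max_shift),
        dy=rng.uniform(-max_shift, max_shift),
    )
```

Because a new transform is sampled on every call, the layers that follow see the same feature at many rotation, scale, and translation levels, which is the effect the abstract describes.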


Anti-Alignments -- Measuring The Precision of Process Models and Event Logs

arXiv.org Artificial Intelligence

Processes are a crucial artefact in organizations, since they coordinate the execution of activities so that products and services are provided. The use of models to analyse the underlying processes is a well-known practice. However, due to the complexity and continuous evolution of their processes, organizations need an effective way of analysing the relation between processes and models. Conformance checking techniques assess the suitability of a process model in representing an underlying process, observed through a collection of real executions. One important metric in conformance checking assesses the precision of the model with respect to the observed executions, i.e., it characterizes the ability of the model to produce behavior unrelated to the one observed. In this paper we present the notion of anti-alignment as a concept to help unveil runs in the model that may deviate significantly from the observed behavior. Using anti-alignments, a new metric for precision is proposed. In contrast to existing metrics, anti-alignment based precision metrics satisfy most of the required axioms highlighted in a recent publication. Moreover, a complexity analysis of the problem of computing anti-alignments is provided, which sheds light on the practicability of using anti-alignments to estimate precision. Experiments are provided that witness the validity of the concepts introduced in this paper.
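A small sketch may help make the notion concrete: an anti-alignment is a model run that lies as far as possible from every observed trace. The code below assumes, purely for illustration, that the model's runs have already been enumerated as a finite list and that distance is measured by edit distance; the abstract fixes neither choice, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of activity labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def anti_alignment(model_runs, log):
    """Return the model run farthest from the log, with its distance.

    The distance of a run to the log is its distance to the closest trace.
    """
    def distance_to_log(run):
        return min(edit_distance(run, trace) for trace in log)

    best = max(model_runs, key=distance_to_log)
    return best, distance_to_log(best)


log = [("a", "b", "c"), ("a", "c", "b")]
model_runs = [("a", "b", "c"), ("a", "c", "b"), ("a", "d", "d", "d")]
run, distance = anti_alignment(model_runs, log)
```

Here the run `("a", "d", "d", "d")` is allowed by the model but never observed, so it is returned with distance 3. A large distance of this kind signals that the model permits behavior unrelated to the log, which is what a precision metric should penalize.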


Inducing Relational Knowledge from BERT

arXiv.org Artificial Intelligence

One of the most remarkable properties of word embeddings is the fact that they capture certain types of semantic and syntactic relationships. Recently, pre-trained language models such as BERT have achieved groundbreaking results across a wide range of Natural Language Processing tasks. However, it is unclear to what extent such models capture relational knowledge beyond what is already captured by standard word embeddings. To explore this question, we propose a methodology for distilling relational knowledge from a pre-trained language model. Starting from a few seed instances of a given relation, we first use a large text corpus to find sentences that are likely to express this relation. We then use a subset of these extracted sentences as templates. Finally, we fine-tune a language model to predict whether a given word pair is likely to be an instance of some relation, when given an instantiated template for that relation as input.
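The template-extraction steps can be sketched in a few lines. This is a simplified illustration, not the authors' pipeline: the placeholder tokens `<H>` and `<T>`, the exact-word matching, and the function names are assumptions, and the final fine-tuning step (training a language model to classify instantiated templates) is omitted.

```python
import re


def extract_templates(sentences, seed_pairs):
    """Turn sentences that mention both words of a seed pair into templates."""
    templates = []
    for sentence in sentences:
        for head, tail in seed_pairs:
            head_re = re.compile(rf"\b{re.escape(head)}\b")
            tail_re = re.compile(rf"\b{re.escape(tail)}\b")
            if head_re.search(sentence) and tail_re.search(sentence):
                # Replace the seed words with placeholders.
                template = tail_re.sub("<T>", head_re.sub("<H>", sentence))
                if template not in templates:
                    templates.append(template)
    return templates


def instantiate(template, head, tail):
    """Fill a template with a candidate word pair."""
    return template.replace("<H>", head).replace("<T>", tail)


corpus = [
    "Paris is the capital of France.",
    "I visited Paris last year.",
    "Tokyo is the capital of Japan.",
]
templates = extract_templates(corpus, [("Paris", "France")])
candidate = instantiate(templates[0], "Canberra", "Australia")
```

Each instantiated sentence, such as `candidate` above, would then be passed to the fine-tuned language model, which predicts whether the word pair is an instance of the relation.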


Richer priors for infinitely wide multi-layer perceptrons

arXiv.org Machine Learning

It is well-known that the distribution over functions induced through a zero-mean iid prior distribution over the parameters of a multi-layer perceptron (MLP) converges to a Gaussian process (GP), under mild conditions. We extend this result firstly to independent priors with general zero or non-zero means, and secondly to a family of partially exchangeable priors which generalise iid priors. We discuss how the second prior arises naturally when considering an equivalence class of functions in an MLP and through training processes such as stochastic gradient descent. The model resulting from partially exchangeable priors is a GP, with an additional level of inference in the sense that the prior and posterior predictive distributions require marginalisation over hyperparameters. We derive the kernels of the limiting GP in deep MLPs, and show empirically that these kernels avoid certain pathologies present in previously studied priors. We empirically evaluate our claims of convergence by measuring the maximum mean discrepancy between finite width models and limiting models. We compare the performance of our new limiting model to some previously discussed models on synthetic regression problems. We observe increasing ill-conditioning of the marginal likelihood and hyper-posterior as the depth of the model increases, drawing parallels with finite width networks which require notoriously involved optimisation tricks.
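The convergence check based on maximum mean discrepancy (MMD) can be illustrated with a short estimator. The sketch below assumes scalar samples (for example, network outputs at one fixed input), a Gaussian (RBF) kernel, and the simple biased estimator; the abstract does not state which kernel or estimator the authors used, so these are assumptions.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    """
    def mean_kernel(a, b):
        total = sum(rbf_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

In the setting of the abstract, `xs` would hold outputs drawn from finite-width networks and `ys` outputs drawn from the limiting GP; an estimate that shrinks towards zero as the width grows is consistent with convergence.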


Spatiotemporal deep learning model for citywide air pollution interpolation and prediction

arXiv.org Machine Learning

Air pollution has recently become one of the greatest concerns for big cities. Predicting air quality for any region at any time is a critical need of urban citizens. However, predicting air pollution for a whole city is challenging, because many spatiotemporal factors affect air pollution throughout the city. Collecting as many of these factors as possible could help us forecast air pollution better. In this research, we present several spatiotemporal datasets collected over Seoul, Korea, a city that currently suffers heavily from air pollution. These datasets include air pollution data, meteorological data, traffic volume, average driving speed, and air pollution indexes of external areas that are known to affect Seoul's air pollution. To the best of our knowledge, traffic volume and average driving speed are two datasets new to air pollution research. Recent research has tried to build models to interpolate and predict air pollution in a city. However, these models mostly focused on predicting air quality at discrete locations or used hand-crafted spatial and temporal features. In this paper, we propose using the Convolutional Long Short-Term Memory (ConvLSTM) model \cite{b16}, a combination of Convolutional Neural Networks and Long Short-Term Memory, which automatically handles both the spatial and temporal features of the data. Specifically, we describe how to transform the air pollution data into sequences of images, which lets the ConvLSTM model interpolate and predict air quality for the entire city at the same time. We show that our approach is suitable for spatiotemporal air pollution problems and that it outperforms other related research.
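One plausible way to turn station readings into a sequence of images is to rasterize each time step onto a regular grid over the city. The sketch below is an assumption-laden illustration, not the authors' preprocessing: the `(lat, lon, value)` input format, the averaging of readings that share a cell, and the 0.0 fill for cells without a station are all hypothetical choices.

```python
def readings_to_frames(readings, bounds, grid_shape):
    """Rasterize per-time-step station readings into 2D grids.

    readings:   list of time steps, each a list of (lat, lon, value) tuples,
                assumed to lie inside `bounds`
    bounds:     (lat_min, lat_max, lon_min, lon_max)
    grid_shape: (rows, cols); row 0 holds the lowest latitudes
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    rows, cols = grid_shape
    frames = []
    for step in readings:
        sums = [[0.0] * cols for _ in range(rows)]
        counts = [[0] * cols for _ in range(rows)]
        for lat, lon, value in step:
            r = min(int((lat - lat_min) / (lat_max - lat_min) * rows), rows - 1)
            c = min(int((lon - lon_min) / (lon_max - lon_min) * cols), cols - 1)
            sums[r][c] += value
            counts[r][c] += 1
        # Average readings per cell; cells with no station stay at 0.0.
        frames.append([[sums[r][c] / counts[r][c] if counts[r][c] else 0.0
                        for c in range(cols)] for r in range(rows)])
    return frames


steps = [
    [(0.5, 0.5, 10.0), (0.6, 0.4, 20.0), (1.5, 1.5, 30.0)],
    [(0.5, 0.5, 12.0)],
]
frames = readings_to_frames(steps, bounds=(0.0, 2.0, 0.0, 2.0), grid_shape=(2, 2))
```

The resulting list of grids has the shape a ConvLSTM expects (time, rows, columns), and the empty cells are the locations where the model would have to interpolate.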


Continuous Dropout

arXiv.org Machine Learning

Dropout has been proven to be an effective algorithm for training robust deep networks because of its ability to prevent overfitting by avoiding the co-adaptation of feature detectors. Current explanations of dropout include bagging, naive Bayes, regularization, and sex in evolution. According to the activation patterns of neurons in the human brain, when faced with different situations, the firing rates of neurons are random and continuous, not binary as in current dropout. Inspired by this phenomenon, we extend the traditional binary dropout to continuous dropout. On the one hand, continuous dropout is considerably closer to the activation characteristics of neurons in the human brain than traditional binary dropout. On the other hand, we demonstrate that continuous dropout has the property of avoiding the co-adaptation of feature detectors, which suggests that we can extract more independent feature detectors for model averaging in the test stage. We introduce the proposed continuous dropout to a feedforward neural network and comprehensively compare it with binary dropout, adaptive dropout, and DropConnect on MNIST, CIFAR-10, SVHN, NORB, and ILSVRC-12. Thorough experiments demonstrate that our method performs better in preventing the co-adaptation of feature detectors and improves test performance. The code is available at: https://github.com/jasonustc/caffe-multigpu/tree/dropout.
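The difference between a binary and a continuous mask can be sketched as follows. This is a minimal illustration, not the authors' Caffe code: the choice of a uniform mask on [0, 1) and the rescaling during training (so that the expected activation is unchanged) are assumptions made here, since the abstract does not specify the mask distribution or the scaling scheme.

```python
import random


def binary_dropout(activations, keep_prob=0.5, rng=random):
    """Standard dropout: each unit is kept (and rescaled) or zeroed."""
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


def continuous_dropout(activations, rng=random):
    """Continuous dropout sketch: each unit is scaled by a mask drawn
    from U(0, 1). The mask has mean 0.5, so dividing by 0.5 keeps the
    expected activation unchanged (a hypothetical scaling choice).
    """
    return [a * rng.random() / 0.5 for a in activations]
```

With a binary mask every unit is either fully on or fully off, whereas with a continuous mask each unit is attenuated by a random amount, which matches the abstract's description of firing rates as random and continuous.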


FairPrep: Promoting Data to a First-Class Citizen in Studies on Fairness-Enhancing Interventions

arXiv.org Machine Learning

The importance of incorporating ethics and legal compliance into machine-assisted decision-making is broadly recognized. Further, several lines of recent work have argued that critical opportunities for improving data quality and representativeness, controlling for bias, and allowing humans to oversee and impact computational processes are missed if we do not consider the lifecycle stages upstream from model training and deployment. Yet, very little has been done to date to provide system-level support to data scientists who wish to develop and deploy responsible machine learning methods. We aim to fill this gap and present FairPrep, a design and evaluation framework for fairness-enhancing interventions. FairPrep is based on a developer-centered design, and helps data scientists follow best practices in software engineering and machine learning. As part of our contribution, we identify shortcomings in existing empirical studies for analyzing fairness-enhancing interventions. We then show how FairPrep can be used to measure the impact of sound best practices, such as hyperparameter tuning and feature scaling. In particular, our results suggest that the high variability of the outcomes of fairness-enhancing interventions observed in previous studies is often an artifact of a lack of hyperparameter tuning. Further, we show that the choice of a data cleaning method can impact the effectiveness of fairness-enhancing interventions.


Sam George: The State of IoT, Cloud, Edge, and AI - Connected World

#artificialintelligence

Peggy Smedley: For you, what are the most interesting trends that you see? You and I have talked in the past about the IoT (Internet of Things), and I know that you have a lot of vision, a lot of examples that you look at when you think about cloud and edge, and we talk about manufacturing and all these things in vertical markets. But for listeners right now, based on investments you guys [Microsoft] are making, what do you see as the most interesting trends?
Sam George: Well, I think if you zoom the telescope way back out and look at the very big picture, what we're seeing across all of these vertical markets, whether it's manufacturing or agriculture, smart cities, smart energy, if you take a look at what's happening with all of these, there's a set of disruptive technologies that are fundamentally transforming how those industries function. Cloud was a big catalyst for that and, I'd say, very well established at this point. And then IoT, a couple of years ago, really started hitting the scene, building on top of cloud and giving these businesses unprecedented visibility if they were able to take advantage of it back in the early days. Virtually all aspects of their business are able to sense things in the physical world, in real time, that they weren't able to before. And then while the IoT was happening, edge computing started happening too, which was a normal and natural optimization: as I connect and start collecting data from these billions of devices that are sensing things across all of these different industries, it's natural to start taking some of the computing that you were doing in the cloud and some of the services that you were taking advantage of and pushing and distributing those right out to the devices themselves for a variety of reasons, whether that's latency concerns or security concerns or anything else.
We see this wonderful trend of AI that is powering really new breakthrough capabilities across all of these industries. AI is a great example, as it takes advantage of those preceding waves: edge computing, the IoT, and cloud. AI can now run in a distributed fashion as well.


Why we need a "Secretary of Digital"

#artificialintelligence

With software eating the world, it's time for the United States to create a cabinet-level position focused on digital. Whether it was the flawed rollout of the ACA website, the identified need for cybersecurity protection of our energy grid, or AI on the battlefield, the executive branch needs digital leadership and expertise to help guide the country. Here are 10 issues that require coordination, oversight, and/or financial support at the national level with links to relevant content. How do you ensure that machine learning algorithms aren't biased – by race, by age, or by gender? When can you trust an AI black box algorithm and when isn't it suitable?


Artificial Intelligence (AI) in Retail Market - increasing demand with Industry Professionals: IBM, Microsoft, Nvidia, Amazon Web Services, Oracle, SAP - Med News Ledger

#artificialintelligence

A new research study on the global Artificial Intelligence (AI) in Retail market was conducted across a variety of industries in various regions, producing a report of more than 150 pages. The study blends qualitative and quantitative information, highlighting key market developments, the challenges that the industry and competitors face, gap analysis, and new opportunities and trends in the Artificial Intelligence (AI) in Retail market. The core and emerging players profiled as part of the coverage include IBM, Microsoft, Nvidia, Amazon Web Services, Oracle, SAP, Intel, Google, Sentient Technologies, Salesforce, and Visenze. The study includes chapters on import and export (EXIM) policies that can have an immediate impact on the global Artificial Intelligence (AI) in Retail market, covering all relevant companies dealing with that market together with related profiles, and it provides valuable data on finances, product portfolio, investment planning, and marketing and business strategy. The study is a collection of primary and secondary data that contains valuable information from the major suppliers of the market.