AITopics

Convolutional neural networks (CNNs) have achieved state-of-the-art results on many visual recognition tasks. However, current CNN models are still poorly invariant to spatial transformations of images. Intuitively, with sufficient layers and parameters, hierarchical combinations of convolution (matrix multiplication and non-linear activation) and pooling operations should be able to learn a robust mapping from transformed input images to transform-invariant representations. In this paper, we propose randomly transforming (rotating, scaling, and translating) the feature maps of CNNs during the training stage. This prevents complex dependencies on specific rotation, scale, and translation levels of the training images from forming in CNN models. Rather, each convolutional kernel learns to detect a feature that is generally helpful for producing the transform-invariant answer given the combinatorially large variety of transform levels of its input feature maps. In this way, we do not require any extra training supervision or any modification to the optimization process or the training images. We show that random transformation significantly improves CNNs on many benchmark tasks, including small-scale image recognition, large-scale image recognition, and image retrieval. The code is available at https://github.com/jasonustc/caffe-multigpu/tree/TICNN.
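The idea of randomly rotating, scaling, and translating a feature map during training can be illustrated with a minimal sketch. This is not the authors' Caffe implementation: the function name `random_transform`, the parameter ranges, the nearest-neighbour sampling, and the zero fill for out-of-range cells are all assumptions made for illustration, and the abstract does not say how the transform is sampled or interpolated.

```python
import math
import random


def random_transform(fmap, rng, training=True,
                     max_angle_deg=30.0, scale_range=(0.8, 1.2), max_shift=2.0):
    """Rotate, scale, and translate one 2-D feature map by randomly drawn amounts.

    fmap is a list of rows. Each output cell is filled by mapping its
    coordinates back into the input and copying the nearest input cell
    (0.0 if the source falls outside the map). All ranges are illustrative.
    """
    if not training:
        # At test time the map is passed through unchanged.
        return fmap

    h, w = len(fmap), len(fmap[0])
    angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    scale = rng.uniform(*scale_range)
    shift_y = rng.uniform(-max_shift, max_shift)
    shift_x = rng.uniform(-max_shift, max_shift)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Undo the translation, then the rotation and scaling about the centre.
            dy, dx = y - cy - shift_y, x - cx - shift_x
            src_x = round((cos_a * dx + sin_a * dy) / scale + cx)
            src_y = round((-sin_a * dx + cos_a * dy) / scale + cy)
            if 0 <= src_x < w and 0 <= src_y < h:
                out[y][x] = fmap[src_y][src_x]
    return out
```

Calling `random_transform(fmap, random.Random(0))` on each training pass draws a fresh transform every time, so a kernel in the next layer never sees its input at one fixed rotation, scale, or offset.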

cnn model, feature map, transformation, (16 more...)

1912.01447

Country:

Asia > China > Anhui Province > Hefei (0.05)
Oceania > Australia (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Chatain, Thomas, Boltenhagen, Mathilde, Carmona, Josep

Anti-Alignments -- Measuring The Precision of Process Models and Event Logs

arXiv.org Artificial Intelligence Nov-28-2019

Processes are a crucial artefact in organizations, since they coordinate the execution of activities so that products and services are provided. The use of models to analyse the underlying processes is a well-known practice. However, due to the complexity and continuous evolution of their processes, organizations need an effective way of analysing the relation between processes and models. Conformance checking techniques assess the suitability of a process model in representing an underlying process, observed through a collection of real executions. One important metric in conformance checking is the precision of the model with respect to the observed executions, i.e., a characterization of the ability of the model to produce behavior unrelated to the one observed. In this paper we present the notion of anti-alignment as a concept to help unveil runs in the model that may deviate significantly from the observed behavior. Using anti-alignments, a new metric for precision is proposed. In contrast to existing metrics, anti-alignment based precision metrics satisfy most of the required axioms highlighted in a recent publication. Moreover, a complexity analysis of the problem of computing anti-alignments is provided, which sheds light on the practicability of using anti-alignments to estimate precision. Experiments are provided that witness the validity of the concepts introduced in this paper.

dist, precision, process model, (16 more...)

arXiv.org Artificial Intelligence

1912.05907

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.04)
Europe > France > Île-de-France > Val-de-Marne > Cachan (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Bouraoui, Zied, Camacho-Collados, Jose, Schockaert, Steven

Inducing Relational Knowledge from BERT

arXiv.org Artificial Intelligence Nov-28-2019

One of the most remarkable properties of word embeddings is the fact that they capture certain types of semantic and syntactic relationships. Recently, pre-trained language models such as BERT have achieved groundbreaking results across a wide range of Natural Language Processing tasks. However, it is unclear to what extent such models capture relational knowledge beyond what is already captured by standard word embeddings. To explore this question, we propose a methodology for distilling relational knowledge from a pre-trained language model. Starting from a few seed instances of a given relation, we first use a large text corpus to find sentences that are likely to express this relation. We then use a subset of these extracted sentences as templates. Finally, we fine-tune a language model to predict whether a given word pair is likely to be an instance of some relation, when given an instantiated template for that relation as input.

language model, relation, template, (16 more...)

arXiv.org Artificial Intelligence

1911.12753

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > France > Île-de-France > Paris > Paris (0.05)
Europe > Italy > Lazio > Rome (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Tsuchida, Russell, Roosta, Fred, Gallagher, Marcus

Richer priors for infinitely wide multi-layer perceptrons

It is well-known that the distribution over functions induced through a zero-mean iid prior distribution over the parameters of a multi-layer perceptron (MLP) converges to a Gaussian process (GP), under mild conditions. We extend this result firstly to independent priors with general zero or non-zero means, and secondly to a family of partially exchangeable priors which generalise iid priors. We discuss how the second prior arises naturally when considering an equivalence class of functions in an MLP and through training processes such as stochastic gradient descent. The model resulting from partially exchangeable priors is a GP, with an additional level of inference in the sense that the prior and posterior predictive distributions require marginalisation over hyperparameters. We derive the kernels of the limiting GP in deep MLPs, and show empirically that these kernels avoid certain pathologies present in previously studied priors. We empirically evaluate our claims of convergence by measuring the maximum mean discrepancy between finite width models and limiting models. We compare the performance of our new limiting model to some previously discussed models on synthetic regression problems. We observe increasing ill-conditioning of the marginal likelihood and hyper-posterior as the depth of the model increases, drawing parallels with finite width networks which require notoriously involved optimisation tricks.
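The maximum mean discrepancy (MMD) used above to compare finite-width and limiting models can be estimated from two sets of samples. The sketch below is a generic biased estimator with a Gaussian kernel; the kernel choice, the `bandwidth` parameter, and the function names are assumptions, since the abstract does not state which kernel or estimator the authors used.

```python
import math


def rbf(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples.

    xs and ys are lists of vectors, for example outputs of finite-width
    networks and draws from a limiting model at the same inputs.
    """
    def mean_kernel(a, b):
        total = sum(rbf(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

An estimate near zero indicates that the two samples are hard to tell apart under the chosen kernel, which is the sense in which a finite-width model would be said to approach its limit as the width grows.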

hyperparameter, kernel, mlp, (14 more...)

1911.12927

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia > Queensland > Brisbane (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Le, Van-Duc, Bui, Tien-Cuong, Cha, Sang Kyun

Spatiotemporal deep learning model for citywide air pollution interpolation and prediction

Air pollution has recently become one of the biggest concerns for large cities. Predicting air quality for any region at any time is a critical requirement for urban citizens. However, predicting air pollution for a whole city is challenging because many spatiotemporal factors affect air pollution throughout the city. Collecting as many of these factors as possible could help us forecast air pollution better. In this research, we present many spatiotemporal datasets collected over the city of Seoul, Korea, which currently suffers heavily from air pollution. These datasets include air pollution data, meteorological data, traffic volume, average driving speed, and air pollution indexes of external areas that are known to affect Seoul's air pollution. To the best of our knowledge, traffic volume and average driving speed are two new datasets in air pollution research. In addition, recent research has tried to build models that interpolate and predict air pollution in a city. However, these models mostly focus on predicting air quality at discrete locations or use hand-crafted spatial and temporal features. In this paper, we propose using the Convolutional Long Short-Term Memory (ConvLSTM) model, a combination of Convolutional Neural Networks and Long Short-Term Memory, which automatically handles both the spatial and temporal features of the data. Specifically, we show how to transform the air pollution data into sequences of images, which lets the ConvLSTM model interpolate and predict air quality for the entire city at the same time. We show that our approach is suitable for spatiotemporal air pollution problems and that it outperforms other related research.
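One way to picture the transformation of station readings into image sequences is to rasterise each time step onto a regular grid over the city. The sketch below is only an assumed scheme, not the authors' preprocessing: the function names, the bounding-box format, the averaging of stations that share a cell, and the 0.0 placeholder for empty cells are all hypothetical.

```python
def readings_to_frame(readings, bounds, grid_shape):
    """Rasterise one time step of station readings into a 2-D grid ("image").

    readings:   list of (lat, lon, value) tuples
    bounds:     (lat_min, lat_max, lon_min, lon_max) of the covered area
    grid_shape: (rows, cols); row 0 is the northern edge
    Cells holding several stations store their mean; empty cells hold 0.0.
    """
    rows, cols = grid_shape
    lat_min, lat_max, lon_min, lon_max = bounds
    sums = [[0.0] * cols for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]
    for lat, lon, value in readings:
        r = int((lat_max - lat) / (lat_max - lat_min) * rows)
        c = int((lon - lon_min) / (lon_max - lon_min) * cols)
        # Clamp readings on or beyond the boundary into the edge cells.
        r = max(0, min(r, rows - 1))
        c = max(0, min(c, cols - 1))
        sums[r][c] += value
        counts[r][c] += 1
    return [[sums[r][c] / counts[r][c] if counts[r][c] else 0.0
             for c in range(cols)] for r in range(rows)]


def to_sequence(readings_by_time, bounds, grid_shape):
    """Turn {timestamp: readings} into a time-ordered list of frames."""
    return [readings_to_frame(readings_by_time[t], bounds, grid_shape)
            for t in sorted(readings_by_time)]
```

A list of such frames has the time-by-height-by-width layout that convolutional recurrent models consume, and the empty cells are the locations a citywide model would have to interpolate.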

air pollution, convlstm, pollution, (13 more...)

1911.12919

Country:

Asia > South Korea > Seoul > Seoul (0.46)
Asia > China > Beijing > Beijing (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry:

Transportation (0.70)
Government > Regional Government > North America Government > United States Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Continuous Dropout

Shen, Xu, Tian, Xinmei, Liu, Tongliang, Xu, Fang, Tao, Dacheng

Dropout has been proven to be an effective algorithm for training robust deep networks because of its ability to prevent overfitting by avoiding the co-adaptation of feature detectors. Current explanations of dropout include bagging, naive Bayes, regularization, and sex in evolution. According to the activation patterns of neurons in the human brain, when faced with different situations, the firing rates of neurons are random and continuous, not binary as in current dropout. Inspired by this phenomenon, we extend the traditional binary dropout to continuous dropout. On the one hand, continuous dropout is considerably closer to the activation characteristics of neurons in the human brain than traditional binary dropout. On the other hand, we demonstrate that continuous dropout has the property of avoiding the co-adaptation of feature detectors, which suggests that we can extract more independent feature detectors for model averaging in the test stage. We introduce the proposed continuous dropout to a feedforward neural network and comprehensively compare it with binary dropout, adaptive dropout, and DropConnect on MNIST, CIFAR-10, SVHN, NORB, and ILSVRC-12. Thorough experiments demonstrate that our method performs better in preventing the co-adaptation of feature detectors and improves test performance. The code is available at: https://github.com/jasonustc/caffe-multigpu/tree/dropout.
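The difference between a binary and a continuous mask can be shown in a few lines. This sketch is not taken from the linked code: it assumes a mask drawn uniformly from [0, 1) for the continuous case and rescales at training time so the expected activation is unchanged, whereas the abstract does not specify the mask distribution or where the rescaling happens.

```python
import random


def binary_dropout(activations, rng, keep_prob=0.5):
    """Standard dropout: each unit is kept (mask 1) or dropped (mask 0).

    Dividing by keep_prob keeps the expected activation unchanged.
    """
    return [a * (1.0 if rng.random() < keep_prob else 0.0) / keep_prob
            for a in activations]


def continuous_dropout(activations, rng):
    """Continuous dropout sketch: each unit is scaled by a mask in [0, 1).

    The mask is drawn from Uniform(0, 1), an assumed choice with mean 0.5,
    so dividing by 0.5 keeps the expected activation unchanged.
    """
    return [a * rng.random() / 0.5 for a in activations]
```

With `rng = random.Random(0)`, `binary_dropout([1.0], rng)` yields either 0.0 or 2.0, while `continuous_dropout([1.0], rng)` yields a value anywhere in [0, 2), which is the "random and continuous" behaviour the abstract contrasts with a binary mask.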

continuous dropout, dropout, gaussian dropout, (12 more...)

1911.12675

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China (0.04)
Oceania > Australia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Schelter, Sebastian, He, Yuxuan, Khilnani, Jatin, Stoyanovich, Julia

FairPrep: Promoting Data to a First-Class Citizen in Studies on Fairness-Enhancing Interventions

The importance of incorporating ethics and legal compliance into machine-assisted decision-making is broadly recognized. Further, several lines of recent work have argued that critical opportunities for improving data quality and representativeness, controlling for bias, and allowing humans to oversee and impact computational processes are missed if we do not consider the lifecycle stages upstream from model training and deployment. Yet, very little has been done to date to provide system-level support to data scientists who wish to develop and deploy responsible machine learning methods. We aim to fill this gap and present FairPrep, a design and evaluation framework for fairness-enhancing interventions. FairPrep is based on a developer-centered design, and helps data scientists follow best practices in software engineering and machine learning. As part of our contribution, we identify shortcomings in existing empirical studies for analyzing fairness-enhancing interventions. We then show how FairPrep can be used to measure the impact of sound best practices, such as hyperparameter tuning and feature scaling. In particular, our results suggest that the high variability of the outcomes of fairness-enhancing interventions observed in previous studies is often an artifact of a lack of hyperparameter tuning. Further, we show that the choice of a data cleaning method can impact the effectiveness of fairness-enhancing interventions.

fairness-enhancing intervention, fairprep, intervention, (16 more...)

1911.12587

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry: Law (1.00)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

#artificialintelligence Nov-27-2019, 23:08:39 GMT

Sam George: The State of IoT, Cloud, Edge, and AI - Connected World

Peggy Smedley: For you, what are the most interesting trends that you see? You and I have talked in the past about the IoT (Internet of Things) and I know that you have a lot of vision, a lot of examples that you look at when you think about cloud and edge and we talk about manufacturing and all these things in vertical markets, but for listeners right now, based on investments you guys [Microsoft] are making, what do you see are the most interesting trends?

Sam George: Well, I think if you zoom the telescope way back out and look at the very big picture, what we're seeing across all of these vertical markets, whether it's manufacturing or agriculture, smart cities, smart energy. If you take a look at what's happening with all of these, there's a set of disruptive technologies that are fundamentally transforming how those industries function. Cloud was a big catalyst for that and I'd say, very well established at this point. And then IoT, a couple of years ago, really started hitting the scene, building on top of cloud and giving these businesses unprecedented visibility if they were able to take advantage of it back in the early days. Virtually all aspects of their business are able to sense things in the physical world, in real time, that they weren't able to before. And then while the IoT was happening, edge computing started happening too, which was a normal and natural optimization, where, as I connect and start collecting data from these billions of devices that are sensing things that are happening across all of these different industries, it's natural to start taking some of the computing that you were doing in the cloud and some of the services that you were taking advantage of and pushing those right out and distributing those right out to the devices themselves for a variety of reasons, whether that's latency concerns or security concerns or anything else.

We see this wonderful trend of AI that is powering really new breakthrough capabilities across all of these industries. AI is a great example, as it takes advantage of those preceding waves: edge computing, the IoT, and cloud. AI can now run in a distributed fashion as well.

iot, microsoft, smedley, (15 more...)

#artificialintelligence

Country:

Oceania > New Zealand (0.04)
Europe > Switzerland (0.04)

Genre: Personal > Interview (0.47)

Industry:

Information Technology > Security & Privacy (1.00)
Food & Agriculture (0.88)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence (1.00)

#artificialintelligence Nov-27-2019, 14:36:47 GMT

Why we need a "Secretary of Digital"

With software eating the world, it's time for the United States to create a cabinet-level position focused on digital. Whether it was the flawed rollout of the ACA website, the identified need for cybersecurity protection of our energy grid, or AI on the battlefield, the executive branch needs digital leadership and expertise to help guide the country. Here are 10 issues that require coordination, oversight, and/or financial support at the national level with links to relevant content. How do you ensure that machine learning algorithms aren't biased – by race, by age, or by gender? When can you trust an AI black box algorithm and when isn't it suitable?

executive branch, government, secretary, (7 more...)

#artificialintelligence

Country:

North America > United States (1.00)
Oceania > Australia (0.06)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.92)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.52)

#artificialintelligence Nov-27-2019, 11:13:34 GMT

Artificial Intelligence (AI) in Retail Market - increasing demand with Industry Professionals: IBM, Microsoft, Nvidia, Amazon Web Services, Oracle, SAP - Med News Ledger

A new research study on the global Artificial Intelligence (AI) in Retail market was conducted across a variety of industries and regions, producing a report of more than 150 pages. The study blends qualitative and quantitative information, highlighting key market developments, the challenges facing the industry and its competitors, gap analysis, and new opportunities and trends in the Artificial Intelligence (AI) in Retail market. The core and emerging players profiled in the coverage include IBM, Microsoft, Nvidia, Amazon Web Services, Oracle, SAP, Intel, Google, Sentient Technologies, Salesforce, and Visenze. The study covers import and export policies that can have an immediate impact on the global Artificial Intelligence (AI) in Retail market. It includes EXIM-related chapters for all relevant companies dealing with the Artificial Intelligence (AI) in Retail market, along with related profiles, and provides valuable data on finances, product portfolios, investment planning, and marketing and business strategy. The study is a collection of primary and secondary data that contains valuable information from the major suppliers of the market.

artificial intelligence, global artificial intelligence, intelligence, (14 more...)

#artificialintelligence

Country:

South America > Chile (0.05)
South America > Brazil (0.05)
South America > Argentina (0.05)
(23 more...)

Genre: Research Report > New Finding (0.37)

Industry:

Retail (1.00)
Information Technology > Services (0.62)
Information Technology > Hardware (0.62)
Banking & Finance > Trading (0.49)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Web (0.62)