AITopics

2011.14486

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > California > San Diego County > Carlsbad (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning Nov-27-2020

A Study on the Uncertainty of Convolutional Layers in Deep Neural Networks

Shen, Haojing, Chen, Sihong, Wang, Ran

This paper shows a Min-Max property in the connection weights of the convolutional layers of a neural network, namely LeNet. Specifically, the Min-Max property means that, during backpropagation-based training of LeNet, the weights of the convolutional layers move far from the centers of their intervals, i.e., they decrease toward their minimum or increase toward their maximum. From the perspective of uncertainty, we demonstrate through a simplified formulation of convolution that the Min-Max property corresponds to minimizing the fuzziness of the model parameters. It is experimentally confirmed that a model with the Min-Max property has stronger adversarial robustness, so this property can be incorporated into the design of the loss function. This paper points out a changing tendency of uncertainty in the convolutional layers of the LeNet structure and offers some insight into the interpretability of convolution.

adversarial robustness, min-max property, robustness, (14 more...)

2011.13719

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France (0.04)
(10 more...)

Genre: Research Report (0.41)

Industry: Information Technology > Security & Privacy (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Gopalakrishnan, Anand, van Steenkiste, Sjoerd, Schmidhuber, Jürgen

Unsupervised Object Keypoint Learning using Local Spatial Predictability

arXiv.org Artificial Intelligence Nov-25-2020

Hence, which layer(s) we choose as our feature embedding will have an effect on the outcome of the local spatial prediction problem. While more abstract high-level features are expected to better capture the internal predictive structure of an object, it will be more difficult to attribute the error of the prediction network to the exact image location. On the other hand, while more low-level features can be localized more accurately, they may lack the expressiveness to capture high-level properties of objects. Nonetheless, in practice we find that a spatial feature embedding based on earlier layers of the encoder works well (see also Section 5.3 for an ablation). Local Spatial Prediction Task. Using the learned spatial feature embedding, we seek out salient regions of the input image that correspond to object parts. Our approach is based on the idea that objects correspond to local regions in feature space that have high internal predictive structure, which allows us to formulate the following local spatial prediction (LSP) task. For each location in the learned spatial feature embedding, we seek to predict the value of the features (across the feature maps) from its neighbouring feature values. When neighbouring areas correspond to the same object (part), i.e. they regularly appear together, we expect that this prediction problem is easy (green arrow in Figure 3).
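The local spatial prediction task described above can be illustrated with a minimal sketch. It assumes a feature embedding of shape (H, W, C) and, purely for illustration, replaces the learned prediction network with the plain average of the eight neighbouring feature vectors. The function name `lsp_error_map` and the toy input are hypothetical and not taken from the paper.

```python
import numpy as np

def lsp_error_map(features):
    """features: array of shape (H, W, C). Returns an (H, W) map of prediction errors."""
    h, w, _ = features.shape
    # Pad by repeating edge values so every location has eight neighbours.
    padded = np.pad(features, ((1, 1), (1, 1), (0, 0)), mode="edge")
    neighbour_sum = np.zeros_like(features, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # the centre value itself is what we try to predict
            neighbour_sum += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    # Stand-in predictor (an assumption): the mean of the eight neighbours.
    prediction = neighbour_sum / 8.0
    # Squared error summed across feature maps: low where neighbours predict the centre well.
    return np.sum((features - prediction) ** 2, axis=-1)

feats = np.zeros((5, 5, 2))
feats[2, 2] = [1.0, 1.0]          # one location that its neighbours cannot explain
errors = lsp_error_map(feats)
print(np.unravel_index(np.argmax(errors), errors.shape))
```

In this toy input the error is zero wherever a location and all of its neighbours agree, and it peaks at the single location whose features differ from its surroundings.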

keypoint, representation, transporter, (13 more...)

2011.1293

Country:

Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.74)

Puranik, Bhagyashree, Madhow, Upamanyu, Pedarsani, Ramtin

Adversarially Robust Classification based on GLRT

arXiv.org Machine Learning Nov-16-2020

Machine learning models are vulnerable to adversarial attacks that can often cause misclassification by introducing small but well-designed perturbations. In this paper, we explore, in the setting of classical composite hypothesis testing, a defense strategy based on the generalized likelihood ratio test (GLRT), which jointly estimates the class of interest and the adversarial perturbation. We evaluate the GLRT approach for the special case of binary hypothesis testing in white Gaussian noise under $\ell_{\infty}$ norm-bounded adversarial perturbations, a setting for which a minimax strategy optimizing for the worst-case attack is known. We show that the GLRT approach yields performance competitive with that of the minimax approach under the worst-case attack, and observe that it yields a better robustness-accuracy trade-off under weaker attacks, depending on the values of signal components relative to the attack budget. We also observe that the GLRT defense generalizes naturally to more complex models for which optimal minimax classifiers are not known.
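A minimal sketch of the GLRT idea for the binary Gaussian case follows. It assumes the two classes have means +mu and -mu, the noise is white Gaussian, and the adversary adds a perturbation e with every component bounded by eps in absolute value. Under these assumptions, maximizing the likelihood jointly over the class and e reduces, coordinate by coordinate, to shrinking the residual by at most eps. The function names and the symmetric setup are illustrative assumptions, not the paper's code.

```python
def glrt_cost(x, mean, eps):
    """Smallest squared residual once the best l_inf-bounded perturbation is removed."""
    total = 0.0
    for xi, mi in zip(x, mean):
        # The best e_i in [-eps, eps] shrinks |x_i - mean_i| by at most eps.
        residual = max(abs(xi - mi) - eps, 0.0)
        total += residual ** 2
    return total

def glrt_classify(x, mu, eps):
    """Return +1 or -1: the class whose mean explains x best under the attack budget."""
    cost_pos = glrt_cost(x, mu, eps)
    cost_neg = glrt_cost(x, [-m for m in mu], eps)
    return 1 if cost_pos <= cost_neg else -1

mu = [2.0, 0.1, -1.5]                 # hypothetical class mean
x = [2.3, -0.2, -1.2]                 # class +1 mean shifted by an attack of size 0.3
print(glrt_classify(x, mu, eps=0.3))
```

Coordinates whose distance to a class mean is within the budget contribute nothing to that class's cost, so weak signal components (here the second one) are effectively ignored, which matches the abstract's remark that the trade-off depends on signal components relative to the attack budget.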

detector, probability, robustness, (17 more...)

2011.07835

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
Europe > Sweden > Stockholm > Stockholm (0.05)
North America > Canada > Quebec > Montreal (0.05)
(4 more...)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (0.35)
Government > Military (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Strömfelt, Harald, Dickens, Luke, Garcez, Artur d'Avila, Russo, Alessandra

On the Transferability of VAE Embeddings using Relational Knowledge with Semi-Supervision

arXiv.org Artificial Intelligence Nov-13-2020

When dealing with complex data, the effectiveness of a classifier/predictor is limited by its ability to extract useful information. As such, representations that clearly expose the semantics of the data should then be most amenable to downstream learning [1, 2]. This is often referred to as a challenge of acquiring a disentangled representation over the factors of the data [3]. A popular recent trend that has had significant success in this regard uses semi-supervised Variational AutoEncoders (VAE) [4, 5, 6, 7, 8, 9]. Whilst fully unsupervised VAE methods have been shown to require strong inductive bias [10], semi-supervised methods achieve disentanglement by training additional auxiliary tasks that are defined on the factors, alongside the standard VAE objective (see Appendix Eqn. 3).

november 17, relation, representation, (15 more...)

2011.07137

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > Canada > Quebec > Montreal (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning Nov-11-2020

CAN: Revisiting Feature Co-Action for Click-Through Rate Prediction

Zhou, Guorui, Bian, Weijie, Wu, Kailun, Ren, Lejian, Pi, Qi, Zhang, Yujing, Xiao, Can, Sheng, Xiang-Rong, Mou, Na, Luo, Xinchen, Zhang, Chi, Qiao, Xianjie, Xiang, Shiming, Gai, Kun, Zhu, Xiaoqiang, Xu, Jian

Inspired by the success of deep learning, recent industrial Click-Through Rate (CTR) prediction models have made the transition from traditional shallow approaches to deep approaches. Deep Neural Networks (DNNs) are known for their ability to learn non-linear interactions from raw features automatically; however, these interactions are learned only implicitly. Such non-linear interactions may be hard to capture, and explicitly modeling the co-action of raw features is beneficial for CTR prediction. Co-action refers to the collective effect of features on the final prediction. In this paper, we argue that current CTR models do not fully explore the potential of feature co-action. We conduct experiments and show that the effect of feature co-action is seriously underestimated. Motivated by this observation, we propose the feature Co-Action Network (CAN) to explore the potential of feature co-action. The proposed model captures feature co-action efficiently and effectively, which improves model performance while reducing storage and computation costs. Experimental results on public and industrial datasets show that CAN outperforms state-of-the-art CTR models by a large margin. CAN has been deployed in the Alibaba display advertisement system, yielding an average improvement of 12% in CTR and 8% in RPM.

coaction, coaction modeling, feature coaction, (16 more...)

2011.05625

Country:

Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Marketing (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Kim, Yoon-Yeong, Song, Kyungwoo, Jang, JoonHo, Moon, Il-Chul

LADA: Look-Ahead Data Acquisition via Augmentation for Active Learning

arXiv.org Artificial Intelligence Nov-9-2020

Active learning effectively collects data instances for training deep learning models when the labeled dataset is limited and the annotation cost is high. Besides active learning, data augmentation is also an effective technique for enlarging a limited set of labeled instances. However, the potential gain from virtual instances generated by data augmentation has not yet been considered in the acquisition process of active learning. Looking ahead at the effect of data augmentation during acquisition would select and generate data instances that are informative for training the model. Hence, this paper proposes Look-Ahead Data Acquisition via augmentation, or LADA, to integrate data acquisition and data augmentation. LADA considers both 1) the unlabeled data instance to be selected and 2) the virtual data instance to be generated by data augmentation, in advance of the acquisition process. Moreover, to enhance the informativeness of the virtual data instances, LADA optimizes the data augmentation policy to maximize the predictive acquisition score, resulting in the proposal of InfoMixup and InfoSTN. As LADA is a generalizable framework, we experiment with various combinations of acquisition and augmentation methods. LADA shows a significant improvement over recent augmentation and acquisition baselines that were applied independently to the benchmark datasets.

acquisition, augmentation, data augmentation, (11 more...)

2011.04194

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Labhsetwar, Shreyas Rajesh, Salvi, Raj Sunil, Kolte, Piyush Arvind, Venkatesh, Veerasai Subramaniam, Baretto, Alistair Michael

Predictive Analysis of Diabetic Retinopathy with Transfer Learning

arXiv.org Artificial Intelligence Nov-8-2020

With the prevalence of diabetes, Diabetes Mellitus Retinopathy (DR) is becoming a major health problem across the world. The long-term medical complications arising from DR have a significant impact on patients as well as society, as the disease mostly affects individuals in their most productive years. Early detection and treatment can help reduce the extent of damage to patients. The rise of Convolutional Neural Networks for predictive analysis in the medical field paves the way for a robust solution to DR detection. This paper studies the performance of several highly efficient and scalable CNN architectures for Diabetic Retinopathy classification with the help of Transfer Learning. The research focuses on the VGG16, ResNet50 V2 and EfficientNet B0 models. The classification performance is analyzed using several performance metrics, including True Positive Rate, False Positive Rate, Accuracy, etc. Several performance graphs, including the Confusion Matrix, ROC Curve, etc., are also plotted to visualize the performance of each architecture. The results indicate that Transfer Learning with ImageNet weights using the VGG16 model demonstrates the best classification performance, with a best Accuracy of 95%. It is closely followed by the ResNet50 V2 architecture, with a best Accuracy of 93%. This paper shows that predictive analysis of DR from retinal images can be achieved with Transfer Learning on Convolutional Neural Networks.
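For reference, the metrics named above follow directly from the cells of a confusion matrix. The sketch below assumes a binary setting (DR present versus absent), which is a simplification that the abstract does not confirm. The function name and the counts are hypothetical and are not results from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute TPR, FPR and accuracy from the four cells of a binary confusion matrix."""
    tpr = tp / (tp + fn)          # sensitivity: share of diseased cases detected
    fpr = fp / (fp + tn)          # share of healthy cases flagged by mistake
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"tpr": tpr, "fpr": fpr, "accuracy": accuracy}

# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=45, fp=5, tn=45, fn=5)
print(metrics)
```

Sweeping the decision threshold and recording each (FPR, TPR) pair produces the ROC curve mentioned in the abstract.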

classification, dataset, transfer learning, (12 more...)

2011.04052

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > India > Maharashtra > Mumbai (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning Nov-7-2020

Overcoming Negative Transfer: A Survey

Zhang, Wen, Deng, Lingfei, Zhang, Lei, Wu, Dongrui

Transfer learning (TL) tries to utilize data or knowledge from one or more source domains to facilitate learning in a target domain. It is particularly useful when the target domain has few or no labeled data, due to annotation expense, privacy concerns, etc. Unfortunately, the effectiveness of TL is not always guaranteed. Negative transfer (NT), i.e., source domain data/knowledge reducing learning performance in the target domain, has been a long-standing and challenging problem in TL. Various approaches to overcome NT have been proposed in the literature. However, there has not been a systematic survey on overcoming NT. This paper fills the gap by categorizing and reviewing nearly 100 approaches for combating NT, from four perspectives: source data quality, target data quality, domain divergence, and integrated algorithms. NT in related fields, e.g., multi-task learning, multilingual models, and lifelong learning, is also discussed.

proc, source domain, target domain, (14 more...)

2009.00909

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
(36 more...)

Genre:

Research Report (1.00)
Overview (0.66)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.46)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence Nov-5-2020

Belief-Grounded Networks for Accelerated Robot Learning under Partial Observability

Nguyen, Hai, Daley, Brett, Song, Xinchao, Amato, Christopher, Platt, Robert

Many important robotics problems are partially observable in the sense that a single visual or force-feedback measurement is insufficient to reconstruct the state. Standard approaches involve learning a policy over beliefs or observation-action histories. However, both of these have drawbacks; it is expensive to track the belief online, and it is hard to learn policies directly over histories. We propose a method for policy learning under partial observability called the Belief-Grounded Network (BGN) in which an auxiliary belief-reconstruction loss incentivizes a neural network to concisely summarize its input history. Since the resulting policy is a function of the history rather than the belief, it can be executed easily at runtime. We compare BGN against several baselines on classic benchmark tasks as well as three novel robotic touch-sensing tasks. BGN outperforms all other tested methods and its learned policies work well when transferred onto a physical robot.

agent, bgn, representation, (12 more...)

2010.0917

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.51)