AITopics

Recommender systems have become an essential component of commercial websites. Collaborative filtering approaches, and Matrix Factorization (MF) techniques in particular, are widely used in recommender systems. However, their performance is limited by the natural data sparsity problem: users generally interact with very few items in the system. Consequently, multiple hybrid models were proposed recently to optimize MF performance by incorporating additional contextual information in its learning process. Although these models improve the recommendation quality, two primary aspects leave room for further improvement: (1) multiple models focus only on some portion of the available contextual information and neglect other portions; (2) learning the feature space of the side contextual information needs to be further enhanced. In this paper, we propose a Collaborative Dual Attentive Autoencoder (CATA++) for recommending scientific articles. CATA++ utilizes an article's content and learns its latent space via two parallel autoencoders. We use an attention mechanism to capture the most pertinent part of the information in order to make more relevant recommendations. Comprehensive experiments on three real-world datasets have shown that our dual-way learning strategy significantly improves MF performance in comparison with other state-of-the-art MF-based models according to various experimental evaluations. The source code of our methods is available at: https://github.com/jianlin-cheng/CATA.

autoencoder, dataset, recommendation, (14 more...)

2002.12277

Country: North America > United States > Missouri > Boone County > Columbia (0.14)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Mhammedi, Zakaria, Koolen, Wouter M.

Lipschitz and Comparator-Norm Adaptivity in Online Learning

We study Online Convex Optimization in the unbounded setting where neither predictions nor gradients are constrained. The goal is to simultaneously adapt to both the sequence of gradients and the comparator. We first develop parameter-free and scale-free algorithms for a simplified setting with hints. We present two versions: the first adapts to the squared norms of both comparator and gradients separately using $O(d)$ time per round, the second adapts to their squared inner products (which measure variance only in the comparator direction) in time $O(d^3)$ per round. We then generalize two prior reductions to the unbounded setting; one to not need hints, and a second to deal with the range ratio problem (which already arises in prior work). We discuss their optimality in light of prior and new lower bounds. We apply our methods to obtain sharper regret bounds for scale-invariant online prediction with linear models.

algorithm, linear model, sequence, (15 more...)

2002.12242

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)

Woodbury Transformations for Deep Generative Flows

Lu, You, Huang, Bert

Normalizing flows are deep generative models that allow efficient likelihood calculation and sampling. The core requirement for this advantage is that they are constructed using functions that can be efficiently inverted and for which the determinant of the function's Jacobian can be efficiently computed. Researchers have introduced various such flow operations, but few of these allow rich interactions among variables without incurring significant computational costs. In this paper, we introduce Woodbury transformations, which achieve efficient invertibility via the Woodbury matrix identity and efficient determinant calculation via Sylvester's determinant identity. In contrast with other operations used in state-of-the-art normalizing flows, Woodbury transformations enable (1) high-dimensional interactions, (2) efficient sampling, and (3) efficient likelihood evaluation. Other similar operations, such as 1x1 convolutions, emerging convolutions, or periodic convolutions allow at most two of these three advantages. In our experiments on multiple image datasets, we find that Woodbury transformations allow learning of higher-likelihood models than other flow architectures while still enjoying their efficiency advantages.
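The two identities named in the abstract can be checked numerically. The following is a minimal sketch, assuming a linear map of the form $W = I_n + UV$ with $U \in \mathbb{R}^{n \times d}$ and $V \in \mathbb{R}^{d \times n}$; this parameterization and the function names are illustrative assumptions, not taken from the authors' code. The Woodbury identity gives $(I_n + UV)^{-1} = I_n - U (I_d + VU)^{-1} V$, and Sylvester's identity gives $\det(I_n + UV) = \det(I_d + VU)$, so both operations only require work on a $d \times d$ matrix.

```python
import numpy as np

def woodbury_inverse(U, V):
    """Inverse of (I_n + U V) via the Woodbury identity, using only a d x d solve."""
    n, d = U.shape
    small = np.eye(d) + V @ U                      # d x d instead of n x n
    return np.eye(n) - U @ np.linalg.solve(small, V)

def sylvester_logabsdet(U, V):
    """log|det(I_n + U V)| via Sylvester's identity: det(I_n + U V) = det(I_d + V U)."""
    d = U.shape[1]
    _, logabsdet = np.linalg.slogdet(np.eye(d) + V @ U)
    return logabsdet

# Hypothetical sizes: n plays the role of the data dimension, d << n the bottleneck.
rng = np.random.default_rng(0)
n, d = 64, 4
U = 0.1 * rng.standard_normal((n, d))
V = 0.1 * rng.standard_normal((d, n))
W = np.eye(n) + U @ V

# Compare against the direct n x n computations.
inverse_error = np.max(np.abs(woodbury_inverse(U, V) @ W - np.eye(n)))
logdet_error = abs(sylvester_logabsdet(U, V) - np.linalg.slogdet(W)[1])
print(inverse_error, logdet_error)
```

Both errors should be at the level of floating-point round-off, while the small-matrix route avoids any $n \times n$ factorization.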

convolution, transformation, woodbury transformation, (12 more...)

2002.12229

Country: North America > United States > Virginia (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

On Isometry Robustness of Deep 3D Point Cloud Models under Adversarial Attacks

Zhao, Yue, Wu, Yuwei, Chen, Caihua, Lim, Andrew

While deep learning in the 3D domain has achieved revolutionary performance in many tasks, the robustness of these models has not been sufficiently studied or explored. Regarding 3D adversarial samples, most existing works focus on manipulation of local points, which may fail to invoke global geometric properties, such as robustness under linear projections that preserve the Euclidean distance, i.e., isometries. In this work, we show that existing state-of-the-art deep 3D models are extremely vulnerable to isometry transformations. Armed with Thompson Sampling, we develop a black-box attack with a success rate over 95% on the ModelNet40 data set. Incorporating the Restricted Isometry Property, we propose a novel framework of white-box attack on top of spectral-norm-based perturbation. In contrast to previous works, our adversarial samples are experimentally shown to be strongly transferable. Evaluated on a sequence of prevailing 3D models, our white-box attack achieves success rates from 98.88% to 100%. It maintains a successful attack rate over 95% even within an imperceptible rotation range $[\pm 2.81^{\circ}]$.
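To make the notion of an isometry concrete, the sketch below rotates a random point cloud by the 2.81 degree angle quoted in the abstract and checks that all pairwise Euclidean distances are unchanged. It only illustrates the geometric property; it is not the attack from the paper, and the choice of the z-axis, the random cloud, and the helper names are assumptions made for illustration.

```python
import numpy as np

def rotation_z(angle_deg):
    """3 x 3 rotation matrix about the z-axis (an orthogonal matrix, hence an isometry)."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def pairwise_distances(points):
    """Matrix of Euclidean distances between all pairs of rows in `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(0)
cloud = rng.standard_normal((128, 3))       # stand-in for a 3D point cloud
R = rotation_z(2.81)
rotated = cloud @ R.T                       # apply the rotation to every point

orthogonality_error = np.max(np.abs(R.T @ R - np.eye(3)))
distance_error = np.max(np.abs(pairwise_distances(rotated) - pairwise_distances(cloud)))
max_displacement = np.max(np.linalg.norm(rotated - cloud, axis=1))
print(orthogonality_error, distance_error, max_displacement)
```

The points themselves move (nonzero displacement) while the shape, as measured by pairwise distances, is preserved up to round-off.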

isometry, point cloud, proceedings, (12 more...)

2002.12222

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (0.65)
Transportation > Ground > Road (0.46)
Government > Military (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Fiok, Krzysztof, Karwowski, Waldemar, Wilamowski, Maciej

Prediction of adverse events in Afghanistan: regression analysis of time series data grouped not by geographic dependencies

The aim of this study was to approach a difficult regression task on highly unbalanced data regarding the active theater of war in Afghanistan. Our focus was on predicting the number of negative events, without distinguishing the precise nature of the events, given historical data on investment and negative events for each of 400 predefined Afghanistan districts. In contrast with previous research on the matter, we propose an approach to the analysis of time series data that benefits from non-conventional aggregation of these territorial entities. By carrying out initial exploratory data analysis, we demonstrate that dividing the data according to our proposal makes it possible to identify strong trend and seasonal components in the selected target variable. Utilizing this approach, we also tried to estimate which investment data is most important for prediction performance. Based on our exploratory analysis and previous research, we prepared 5 sets of independent variables that were fed to 3 machine learning regression models. The results, expressed as mean absolute and mean squared errors, indicate that leveraging historical data on the target variable allows for reasonable performance; however, the other proposed independent variables do not seem to improve prediction quality.

information, ml task, negative event, (14 more...)

2002.12211

Country:

Asia > Afghanistan (0.81)
Europe > Poland > Masovia Province > Warsaw (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)

Genre: Research Report > Experimental Study (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.69)

Jia, Hengrui, Choquette-Choo, Christopher A., Papernot, Nicolas

Entangled Watermarks as a Defense against Model Extraction

Machine learning involves expensive data collection and training procedures. Model owners may be concerned that valuable intellectual property can be leaked if adversaries mount model extraction attacks. Because it is difficult to defend against model extraction without sacrificing significant prediction accuracy, watermarking leverages unused model capacity to have the model overfit to outlier input-output pairs, which are not sampled from the task distribution and are only known to the defender. The defender then demonstrates knowledge of the input-output pairs to claim ownership of the model at inference. The effectiveness of watermarks remains limited because they are distinct from the task distribution and can thus be easily removed through compression or other forms of knowledge transfer. We introduce Entangled Watermarking Embeddings (EWE). Our approach encourages the model to learn common features for classifying data that is sampled from the task distribution, but also data that encodes watermarks. An adversary attempting to remove watermarks that are entangled with legitimate data is also forced to sacrifice performance on legitimate data. Experiments on MNIST, Fashion-MNIST, and Google Speech Commands validate that the defender can claim model ownership with 95% confidence after fewer than 10 queries to the stolen copy, at a modest cost of 1% accuracy in the defended model's performance.

dataset, task distribution, watermark, (15 more...)

2002.122

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Afshar, Majid, Usefi, Hamid

High-Dimensional Feature Selection for Genomic Datasets

In the presence of high-dimensional datasets that contain many irrelevant features (variables), dimensionality reduction algorithms have proven useful in removing features with low variance and combining features with high correlation. In this paper, we propose a new feature selection method which uses singular value decomposition of a matrix and the method of least squares to remove the irrelevant features and detect correlations between the remaining features. The effectiveness of our method has been verified by performing a series of comparisons with state-of-the-art feature selection methods over ten genetic datasets ranging from 9,117 to 267,604 features. The results show that our method is favorable in various aspects compared to state-of-the-art feature selection methods.

dataset, feature selection, selection, (14 more...)

2002.12104

Country:

North America > Canada > Newfoundland and Labrador > Newfoundland > St. John's (0.04)
North America > United States > New Hampshire (0.04)
North America > Greenland (0.04)
Asia > Middle East > Republic of Türkiye > Bingoel Province > Bingol (0.04)

Genre: Research Report > Experimental Study (0.47)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Kuutti, Sampo, Fallah, Saber, Bowden, Richard

Training Adversarial Agents to Exploit Weaknesses in Deep Control Policies

Deep learning has become an increasingly common technique for various control problems, such as robotic arm manipulation, robot navigation, and autonomous vehicles. However, the downside of using deep neural networks to learn control policies is their opaque nature and the difficulties of validating their safety. As the networks used to obtain state-of-the-art results become increasingly deep and complex, the rules they have learned and how they operate become more challenging to understand. This presents an issue, since in safety-critical applications the safety of the control policy must be ensured to a high confidence level. In this paper, we propose an automated black box testing framework based on adversarial reinforcement learning. The technique uses an adversarial agent, whose goal is to degrade the performance of the target model under test. We test the approach on an autonomous vehicle problem, by training an adversarial reinforcement learning agent, which aims to cause a deep neural network-driven autonomous vehicle to collide. Two neural networks trained for autonomous driving are compared, and the results from the testing are used to compare the robustness of their learned control policies. We show that the proposed framework is able to find weaknesses in both control policies that were not evident during online testing and therefore demonstrates a significant benefit over manual testing methods.

agent, control policy, vehicle, (14 more...)

2002.12078

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Surrey > Guildford (0.04)
Europe > Germany (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.89)
Information Technology > Robotics & Automation (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Horak, Danijela, Yu, Simiao, Salimi-Khorshidi, Gholamreza

Topology Distance: A Topology-Based Approach For Evaluating Generative Adversarial Networks

Automatic evaluation of the goodness of Generative Adversarial Networks (GANs) has been a challenge for the field of machine learning. In this work, we propose a distance complementary to existing measures: Topology Distance (TD), the main idea behind which is to compare the geometric and topological features of the latent manifold of real data with those of generated data. More specifically, we build a Vietoris-Rips complex on image features, and define TD based on the differences in persistent-homology groups of the two manifolds. We compare TD with the most commonly used and relevant measures in the field, including Inception Score (IS), Frechet Inception Distance (FID), Kernel Inception Distance (KID) and Geometry Score (GS), in a range of experiments on various datasets. We demonstrate the unique advantage and superiority of our proposed approach over the aforementioned metrics. A combination of our empirical results and the theoretical argument we propose in favour of TD strongly supports the claim that TD is a powerful candidate metric that researchers can employ when aiming to automatically evaluate the goodness of GANs' learning.

dataset, homology group, manifold, (15 more...)

2002.12054

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Harris, Ethan, Marcu, Antonia, Painter, Matthew, Niranjan, Mahesan, Prügel-Bennett, Adam, Hare, Jonathon

Understanding and Enhancing Mixed Sample Data Augmentation

Mixed Sample Data Augmentation (MSDA) has received increasing attention in recent years, with many successful variants such as MixUp and CutMix. Following insight on the efficacy of CutMix in particular, we propose FMix, an MSDA that uses binary masks obtained by applying a threshold to low frequency images sampled from Fourier space. FMix improves performance over MixUp and CutMix for a number of state-of-the-art models across a range of data sets and problem settings. We go on to analyse MixUp, CutMix, and FMix from an information theoretic perspective, characterising learned models in terms of how they progressively compress the input with depth. Ultimately, our analyses allow us to decouple two complementary properties of augmentations, and present a unified framework for reasoning about MSDA. Code for all experiments is available at https://github.com/ecs-vlc/FMix.
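The mask construction described in the abstract (thresholding a low-frequency image sampled from Fourier space) can be sketched as follows. This is a rough illustration under stated assumptions, not the reference implementation from the linked repository: the $1/f^{\text{decay}}$ spectral weighting, the `decay` value, the top-`lam` thresholding rule, and all function names are hypothetical choices made here.

```python
import numpy as np

def low_frequency_image(size, decay=3.0, rng=None):
    """Sample a smooth greyscale image by damping high frequencies of random noise."""
    rng = np.random.default_rng() if rng is None else rng
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    freq = np.sqrt(fx ** 2 + fy ** 2)
    freq[0, 0] = 1.0 / size                      # avoid division by zero at the DC term
    spectrum = rng.standard_normal((size, size)) + 1j * rng.standard_normal((size, size))
    return np.real(np.fft.ifft2(spectrum / freq ** decay))

def binary_mask(image, lam):
    """Set the top `lam` fraction of pixels (by value) to 1 and the rest to 0."""
    k = int(round(lam * image.size))
    mask = np.zeros(image.size)
    if k > 0:
        mask[np.argsort(image.ravel())[-k:]] = 1.0
    return mask.reshape(image.shape)

def mix(x1, x2, mask):
    """Combine two samples: pixels come from x1 where mask == 1, otherwise from x2."""
    return mask * x1 + (1.0 - mask) * x2

rng = np.random.default_rng(0)
grey = low_frequency_image(32, rng=rng)
mask = binary_mask(grey, lam=0.25)

# Toy inputs: an all-ones and an all-zeros image, so the mixed result equals the mask.
a = np.ones((32, 32))
b = np.zeros((32, 32))
mixed = mix(a, b, mask)
print(int(mask.sum()), mask.shape)
```

Because the underlying image is dominated by low frequencies, the thresholded mask tends to form a few contiguous regions rather than scattered pixels, in contrast with the rectangular masks of CutMix.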

experiment, information, msda, (15 more...)

2002.12047

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Hampshire > Southampton (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)