AITopics

To overcome the high dimensionality of data, learning latent feature representations for clustering has been widely studied recently. However, it is still challenging to learn "cluster-friendly" latent representations due to the unsupervised fashion of clustering. In this paper, we propose Disentangling Latent Space Clustering (DLS-Clustering), a new clustering mechanism that directly learning cluster assignment during the disentanglement of latent spacing without constructing the "cluster-friendly" latent representation and additional clustering methods. We achieve the bidirectional mapping by enforcing an inference network (i.e. encoder) and the generator of GAN to form a deterministic encoder-decoder pair with a maximum mean discrepancy (MMD)-based regularization. We utilize a weight-sharing procedure to disentangle latent space into the one-hot discrete latent variables and the continuous latent variables. The disentangling process is actually performing the clustering operation. Eventually the one-hot discrete latent variables can be directly expressed as clusters, and the continuous latent variables represent remaining unspecified factors. Experiments on six benchmark datasets of different types demonstrate that our method outperforms existing state-of-the-art methods. We further show that the latent representations from DLS-Clustering also maintain the ability to generate diverse and high-quality images, which can support more promising application scenarios.

latent space, latent variable, representation, (14 more...)

1911.0521

Country: North America > United States > California > Alameda County > Oakland (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

CHEETAH: An Ultra-Fast, Approximation-Free, and Privacy-Preserved Neural Network Framework based on Joint Obscure Linear and Nonlinear Computations

Zhang, Qiao, Wang, Cong, Xin, Chunsheng, Wu, Hongyi

Machine Learning as a Service (MLaaS) is enabling a wide range of smart applications on end devices. However, such convenience comes with a cost of privacy because users have to upload their private data to the cloud. This research aims to provide effective and efficient MLaaS such that the cloud server learns nothing about user data and the users cannot infer the proprietary model parameters owned by the server. This work makes the following contributions. First, it unveils the fundamental performance bottleneck of existing schemes due to the heavy permutations in computing linear transformation and the use of communication intensive Garbled Circuits for nonlinear transformation. Second, it introduces an ultra-fast secure MLaaS framework, CHEETAH, which features a carefully crafted secret sharing scheme that runs significantly faster than existing schemes without accuracy loss. Third, CHEETAH is evaluated on the benchmark of well-known, practical deep networks such as AlexNet and VGG-16 on the MNIST and ImageNet datasets. The results demonstrate more than 100x speedup over the fastest GAZELLE (Usenix Security'18), 2000x speedup over MiniONN (ACM CCS'17) and five orders of magnitude speedup over CryptoNets (ICML'16). This significant speedup enables a wide range of practical applications based on privacy-preserved deep neural networks.

ciphertext, computation, gazelle, (17 more...)

1911.05184

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Aberdeen, Douglas, Baxter, Jonathan, Edwards, Robert

92c/MFlops/s, Ultra-Large-Scale Neural-Network Training on a PIII Cluster

Artificial neural networks with millions of adjustable parameters and a similar number of training examples are a potential solution for difficult, large-scale pattern recognition problems in areas such as speech and face recognition, classification of large volumes of web data, and finance. The bottleneck is that neural network training involves iterative gradient descent and is extremely computationally intensive. In this paper we present a technique for distributed training of Ultra Large Scale Neural Networks (ULSNN) on Bunyip, a Linux-based cluster of 196 Pentium III processors. To illustrate ULSNN training we describe an experiment in which a neural network with 1.73 million adjustable parameters was trained to recognize machine-printed Japanese characters from a database containing 9 million training patterns. The training runs with a average performance of 163.3 GFlops/s (single precision). With a machine cost of \$150,913, this yields a price/performance ratio of 92.4c/MFlops/s (single precision). For comparison purposes, training using double precision and the ATLAS DGEMM produces a sustained performance of 70 MFlops/s or \$2.16 / MFlop/s (double precision).

neural network, node, processor, (17 more...)

doi: 10.1109/SC.2000.10031

1911.05181

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Tennessee (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Nonconvex Stochastic Nested Optimization via Stochastic ADMM

Wang, Zhongruo

We consider the stochastic nested composition optimization problem where the objective is a composition of two expected-value functions. We proposed the stochastic ADMM to solve this complicated objective. In order to find an $\epsilon$ stationary point where the expected norm of the subgradient of corresponding augmented Lagrangian is smaller than $\epsilon$, the total sample complexity of our method is $\mathcal{O}(\epsilon^{-3})$ for the online case and $\mathcal{O} \Bigl((2N_1 + N_2) + (2N_1 + N_2)^{1/2}\epsilon^{-2}\Bigr)$ for the finite sum case. The computational complexity is consistent with proximal version proposed in \cite{zhang2019multi}, but our algorithm can solve more general problem when the proximal mapping of the penalty is not easy to compute.

algorithm, complexity, min 2, (15 more...)

1911.05167

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.34)

Chen, John, Shah, Vatsal, Kyrillidis, Anastasios

Negative sampling in semi-supervised learning

We introduce Negative Sampling in Semi-Supervised Learning (NS3L), a simple, fast, easy to tune algorithm for semi-supervised learning (SSL). NS3L is motivated by the success of negative sampling/contrastive estimation. We demonstrate that adding the NS3L loss to state-of-the-art SSL algorithms, such as the Virtual Adversarial Training (VAT), significantly improves upon vanilla VAT and its variant, VAT with Entropy Minimization. By adding the NS3L loss to MixMatch, the current state-of-the-art approach on semi-supervised tasks, we observe significant improvements over vanilla MixMatch. We conduct extensive experiments on the CIFAR10, CIFAR100, SVHN and STL10 benchmark datasets.

ns 3, semi-supervised learning, unlabeled data, (12 more...)

1911.05166

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Coordination Group Formation for OnLine Coordinated Routing Mechanisms

Peng, Wang, Du, Lili

This study considers that the collective route choices of travelers en route represent a resolution of their competition on network routes. Well understanding this competition and coordinating their route choices help mitigate urban traffic congestion. Even though existing studies have developed such mechanisms (e.g., the CRM [1]), we still lack the quantitative method to evaluate the coordination penitential and identify proper coordination groups (CG) to implement the CRM. Thus, they hit prohibitive computing difficulty when implemented with many opt-in travelers. Motived by this view, this study develops mathematical approaches to quantify the coordination potential between two and among multiple travelers. Next, we develop the adaptive centroid-based clustering algorithm (ACCA), which splits travelers en route in a local network into CGs, each with proper size and strong coordination potential. Moreover, the ACCA is statistically secured to stop at a local optimal clustering solution, which balances the inner-cluster and inter-cluster coordination potential. It can be implemented by parallel computation to accelerate its computing efficiency. Furthermore, we propose a clustering based coordinated routing mechanism (CB-CRM), which implements a CRM on each individual CG. The numerical experiments built upon both Sioux Falls and Hardee city networks show that the ACCA works efficiently to form proper coordination groups so that as compared to the CRM, the CB-CRM significantly improves computation efficiency with minor system performance loss in a large network. This merit becomes more apparent under high penetration and congested traffic condition. Last, the experiments validate the good features of the ACCA as well as the value of implementing parallel computation.

competition potential, crm, traveler, (16 more...)

1911.05159

Country:

North America > United States > Florida > Alachua County > Gainesville (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation (1.00)
Consumer Products & Services > Travel (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Data Science (0.89)
Information Technology > Communications > Networks (0.88)

Incentivized Exploration for Multi-Armed Bandits under Reward Drift

Liu, Zhiyuan, Wang, Huazheng, Shen, Fan, Liu, Kai, Chen, Lijun

We study incentivized exploration for the multi-armed bandit (MAB) problem where the players receive compensation for exploring arms other than the greedy choice and may provide biased feedback on reward. We seek to understand the impact of this drifted reward feedback by analyzing the performance of three instantiations of the incentivized MAB algorithm: UCB, $\varepsilon$-Greedy, and Thompson Sampling. Our results show that they all achieve $\mathcal{O}(\log T)$ regret and compensation under the drifted reward, and are therefore effective in incentivizing exploration. Numerical examples are provided to complement the theoretical analysis.

algorithm, compensation, exploration, (16 more...)

1911.05142

Country:

North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > United States > Virginia (0.04)
Europe > Denmark (0.04)
Asia > Middle East > Iran (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Detecting Patterns of Physiological Response to Hemodynamic Stress via Unsupervised Deep Learning

Gao, Chufan, Falck, Fabian, Goswami, Mononito, Wertz, Anthony, Pinsky, Michael R., Dubrawski, Artur

Monitoring physiological responses to hemodynamic stress can help in determining appropriate treatment and ensuring good patient outcomes. Physicians' intuition suggests that the human body has a number of physiological response patterns to hemorrhage which escalate as blood loss continues, however the exact etiology and phenotypes of such responses are not well known or understood only at a coarse level. Although previous research has shown that machine learning models can perform well in hemorrhage detection and survival prediction, it is unclear whether machine learning could help to identify and characterize the underlying physiological responses in raw vital sign data. We approach this problem by first transforming the high-dimensional vital sign time series into a tractable, lower-dimensional latent space using a dilated, causal convolutional encoder model trained purely unsupervised. Second, we identify informative clusters in the embeddings. By analyzing the clusters of latent embeddings and visualizing them over time, we hypothesize that the clusters correspond to the physiological response patterns that match physicians' intuition. Furthermore, we attempt to evaluate the latent embeddings using a variety of methods, such as predicting the cluster labels using explainable features.

algorithm, information, sequence, (13 more...)

1911.05121

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > Canada (0.04)
Europe > Sweden (0.04)
Europe > Denmark (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)

Gligorijevic, Djordje, Gligorijevic, Jelena, Flores, Aaron

Time-Aware Prospective Modeling of Users for Online Display Advertising

Prospective display advertising poses a great challenge for large advertising platforms as the strongest predictive signals of users are not eligible to be used in the conversion prediction systems. To that end efforts are made to collect as much information as possible about each user from various data sources and to design powerful models that can capture weaker signals ultimately obtaining good quality of conversion prediction probability estimates. In this study we propose a novel time-aware approach to model heterogeneous sequences of users' activities and capture implicit signals of users' conversion intents. On two real-world datasets we show that our approach outperforms other, previously proposed approaches, while providing interpretability of signal impact to conversion probability.

advertiser, conversion, information, (15 more...)

1911.051

Country:

North America > United States > Alaska > Anchorage Municipality > Anchorage (0.05)
North America > United States > California > Santa Clara County > Sunnyvale (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report > New Finding (0.88)

Industry:

Marketing (1.00)
Information Technology > Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Kamani, Mohammad Mahdi, Haddadpour, Farzin, Forsati, Rana, Mahdavi, Mehrdad

Efficient Fair Principal Component Analysis

The flourishing assessments of fairness measure in machine learning algorithms have shown that dimension reduction methods such as PCA treat data from different sensitive groups unfairly. In particular, by aggregating data of different groups, the reconstruction error of the learned subspace becomes biased towards some populations that might hurt or benefit those groups inherently, leading to an unfair representation. On the other hand, alleviating the bias to protect sensitive groups in learning the optimal projection, would lead to a higher reconstruction error overall. This introduces a trade-off between sensitive groups' sacrifices and benefits, and the overall reconstruction error. In this paper, in pursuit of achieving fairness criteria in PCA, we introduce a more efficient notion of Pareto fairness, cast the Pareto fair dimensionality reduction as a multi-objective optimization problem, and propose an adaptive gradient-based algorithm to solve it. Using the notion of Pareto optimality, we can guarantee that the solution of our proposed algorithm belongs to the Pareto frontier for all groups, which achieves the optimal trade-off between those aforementioned conflicting objectives. This framework can be efficiently generalized to multiple group sensitive features, as well. We provide convergence analysis of our algorithm for both convex and non-convex objectives and show its efficacy through empirical studies on different datasets, in comparison with the state-of-the-art algorithm.

disparity error, objective, subspace, (16 more...)

1911.04931

Country:

North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > Pennsylvania > Centre County > University Park (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.70)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.40)