General Matrix-Matrix Multiplication Using SIMD features of the PIII

arXiv.org Machine Learning

Generalised matrix-matrix multiplication forms the kernel of many mathematical algorithms. A faster matrix-matrix multiply immediately benefits these algorithms. In this paper we implement efficient matrix multiplication for large matrices using the floating point Intel Pentium SIMD (Single Instruction Multiple Data) architecture. A description of the issues and our solution is presented, paying attention to all levels of the memory hierarchy. Our results demonstrate an average performance of 2.09 times faster than the leading public domain matrix-matrix multiply routines.
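One common way to respect the memory hierarchy in matrix multiplication is loop blocking (tiling), where small sub-blocks are reused while they are still in fast memory. The sketch below illustrates only that general idea in pure Python; it is not the paper's SIMD kernel, and the function name `blocked_matmul` and the `block` parameter are assumptions made for illustration.

```python
def blocked_matmul(a, b, block=2):
    """Multiply a (n x k) by b (k x m), visiting the operands tile by tile."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer three loops walk over tiles; inner three loops work inside one tile.
    for i0 in range(0, n, block):
        for p0 in range(0, k, block):
            for j0 in range(0, m, block):
                for i in range(i0, min(i0 + block, n)):
                    row_a, row_c = a[i], c[i]
                    for p in range(p0, min(p0 + block, k)):
                        a_ip = row_a[p]
                        row_b = b[p]
                        # Innermost loop runs along contiguous rows of b and c,
                        # the access pattern a SIMD unit would vectorise.
                        for j in range(j0, min(j0 + block, m)):
                            row_c[j] += a_ip * row_b[j]
    return c


if __name__ == "__main__":
    print(blocked_matmul([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]]))
```

In a real implementation the tile size would be tuned to the cache and register sizes of the target processor, and the innermost loop would be written with vector instructions.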


A Multi-language Platform for Generating Algebraic Mathematical Word Problems

arXiv.org Machine Learning

Existing approaches for automatically generating mathematical word problems lack customizability and creativity due to the inherent nature of the template-based mechanisms they employ. We present a solution to this problem with the use of deep neural language generation mechanisms. Our approach uses a Character Level Long Short-Term Memory Network (LSTM) to generate word problems, and uses POS (Part of Speech) tags to resolve the constraints found in the generated problems. Our approach is capable of generating Mathematics Word Problems in both English and Sinhala languages with an accuracy over 90%. A Mathematical word problem (MWP) is a mathematical problem expressed in natural language. Unlike other knowledge-based question types such as travel or history related questions, MWPs require problem solving ability. In particular, algebraic questions involve sentences to make the questions more deep and inspective. Algebra is a major component of mathematics that is learnt by every student in Ordinary Level (O/L). Simple algebra problems mostly appear in a word format.


On the Vietnamese Name Entity Recognition: A Deep Learning Method Approach

arXiv.org Machine Learning

Named entity recognition (NER) plays an important role in text-based information retrieval. In this paper, we combine Bidirectional Long Short-Term Memory (Bi-LSTM) [7], [27] with Conditional Random Field (CRF) [9] to create a novel deep learning model for the NER problem. Each input word of the deep learning model is represented by a Word2vec-trained vector. The word embedding set was trained on about one million articles collected in 2018 from a Vietnamese news portal (baomoi.com). In addition, we concatenate a Word2Vec [18]-trained vector with a semantic feature vector (Part-Of-Speech (POS) tagging, chunk-tag) and a hidden syntactic feature vector (extracted by a Bi-LSTM network) to achieve the (so far best) result in a Vietnamese NER system.


DeStress: Deep Learning for Unsupervised Identification of Mental Stress in Firefighters from Heart-rate Variability (HRV) Data

arXiv.org Machine Learning

In this work we perform a study of various unsupervised methods to identify mental stress in firefighter trainees based on unlabeled heart rate variability data. We collect RR interval time series data from nearly 100 firefighter trainees that participated in a drill. We explore and compare three methods in order to perform unsupervised stress detection: 1) traditional K-Means clustering with engineered time and frequency domain features, 2) convolutional autoencoders, and 3) long short-term memory (LSTM) autoencoders, both trained on the raw RRI measurements combined with DBSCAN clustering and K-Nearest-Neighbors classification. We demonstrate that K-Means combined with engineered features is unable to capture meaningful structure within the data. On the other hand, convolutional and LSTM autoencoders tend to extract varying structure from the data, pointing to different clusters of different sizes. We attempt to identify the true stressed and normal clusters using the HRV markers of mental stress reported in the literature. We demonstrate that the clusters produced by the convolutional autoencoders consistently and successfully stratify stressed versus normal samples, as validated by several established physiological stress markers such as RMSSD, Max-HR, Mean-HR and LF-HF ratio.
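Two of the markers named above can be computed directly from an RR interval series using their standard definitions: RMSSD is the root mean square of successive RR differences, and mean heart rate follows from the mean RR interval. The sketch below assumes RR intervals in milliseconds; the function names are hypothetical and this is not the paper's own feature pipeline.

```python
import math


def rmssd(rr_ms):
    """Root mean square of successive differences of RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def mean_hr_bpm(rr_ms):
    """Mean heart rate in beats per minute from RR intervals (ms)."""
    if not rr_ms:
        raise ValueError("need at least one RR interval")
    mean_rr = sum(rr_ms) / len(rr_ms)
    return 60000.0 / mean_rr  # 60,000 ms per minute


if __name__ == "__main__":
    rr = [800, 810, 790, 800]  # illustrative values, not study data
    print(rmssd(rr), mean_hr_bpm(rr))
```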


Consistent recovery threshold of hidden nearest neighbor graphs

arXiv.org Machine Learning

Jian Ding, Yihong Wu, Jiaming Xu, and Dana Yang. November 20, 2019. Abstract: Motivated by applications such as discovering strong ties in social networks and assembling genome subsequences in biology, we study the problem of recovering a hidden $2k$-nearest neighbor (NN) graph in an $n$-vertex complete graph, whose edge weights are independent and distributed according to $P_n$ for edges in the hidden $2k$-NN graph and $Q_n$ otherwise. We focus on two types of asymptotic recovery guarantees as $n \to \infty$: (1) exact recovery: all edges are classified correctly with probability tending to one; (2) almost exact recovery: the expected number of misclassified edges is $o(nk)$. We show that the maximum likelihood estimator achieves (1) exact recovery for $2 \le k \le n^{o(1)}$ if $\liminf \frac{2\alpha_n}{\log n} > 1$; (2) almost exact recovery for $1 \le k \le o\left(\frac{\log n}{\log\log n}\right)$ if $\liminf \frac{k D(P_n \| Q_n)}{\log n} > 1$, where $\alpha_n \triangleq -2 \log \int \sqrt{dP_n \, dQ_n}$ is the Rényi divergence of order $\frac{1}{2}$ and $D(P_n \| Q_n)$ is the Kullback-Leibler divergence.
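To make the two divergences concrete, the sketch below evaluates them for Bernoulli edge weights, where the Rényi divergence of order 1/2 reduces to $-2\log\left(\sqrt{pq} + \sqrt{(1-p)(1-q)}\right)$. The Bernoulli choice, the function names, and the sample values of `p`, `q`, and `n` are assumptions for illustration only. The printed ratios are the finite-$n$ quantities one would compare against 1, reading the abstract's conditions as lower bounds on those ratios.

```python
import math


def renyi_half_bernoulli(p, q):
    """Renyi divergence of order 1/2 between Bernoulli(p) and Bernoulli(q)."""
    affinity = math.sqrt(p * q) + math.sqrt((1 - p) * (1 - q))
    return -2.0 * math.log(affinity)


def kl_bernoulli(p, q):
    """Kullback-Leibler divergence D(Bernoulli(p) || Bernoulli(q)), 0 < p, q < 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


if __name__ == "__main__":
    p, q, n, k = 0.9, 0.1, 1000, 2  # illustrative values only
    alpha = renyi_half_bernoulli(p, q)
    print("2*alpha/log n      =", 2 * alpha / math.log(n))
    print("k*D(P||Q)/log n    =", k * kl_bernoulli(p, q) / math.log(n))
```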


Deep Detector Health Management under Adversarial Campaigns

arXiv.org Machine Learning

Machine learning models are vulnerable to adversarial inputs that induce seemingly unjustifiable errors. As automated classifiers are increasingly used in industrial control systems and machinery, these adversarial errors could grow to be a serious problem. Despite numerous studies over the past few years, the field of adversarial ML is still considered alchemy, with no practical unbroken defenses demonstrated to date, leaving PHM practitioners with few meaningful ways of addressing the problem. We introduce turbidity detection as a practical superset of the adversarial input detection problem, coping with adversarial campaigns rather than statistically invisible one-offs. This perspective is coupled with ROC-theoretic design guidance that prescribes an inexpensive domain adaptation layer at the output of a deep learning model during an attack campaign. The result aims to approximate the Bayes optimal mitigation that ameliorates the detection model's degraded health. A proactively reactive type of prognostics is achieved via Monte Carlo simulation of various adversarial campaign scenarios, by sampling from the model's own turbidity distribution to quickly deploy the correct mitigation during a real-world campaign. A machine learning application often begins with a dataset of examples and the task is to find a classification model that will turn inputs into class-label predictions, while preserving some sense of minimum expected error. But less obviously, it is often possible to deterministically find input examples that force the model to misclassify (Szegedy et al., 2014). This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Outlier-Robust High-Dimensional Sparse Estimation via Iterative Filtering

arXiv.org Machine Learning

We study high-dimensional sparse estimation tasks in a robust setting where a constant fraction of the dataset is adversarially corrupted. Specifically, we focus on the fundamental problems of robust sparse mean estimation and robust sparse PCA. We give the first practically viable robust estimators for these problems. In more detail, our algorithms are sample and computationally efficient and achieve near-optimal robustness guarantees. In contrast to prior provable algorithms which relied on the ellipsoid method, our algorithms use spectral techniques to iteratively remove outliers from the dataset. Our experimental evaluation on synthetic data shows that our algorithms are scalable and significantly outperform a range of previous approaches, nearly matching the best error rate without corruptions.


Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning

arXiv.org Machine Learning

Multi-task learning (MTL) is a common paradigm that seeks to improve the generalization performance of task learning by training related tasks simultaneously. However, it is still challenging to search for a flexible and accurate architecture that can be shared among multiple tasks. In this paper, we propose a novel deep learning model called Task Adaptive Activation Network (TAAN) that can automatically learn the optimal network architecture for MTL. The main principle of TAAN is to derive flexible activation functions for different tasks from the data, with the other parameters of the network fully shared. We further propose two functional regularization methods that improve the MTL performance of TAAN. The improved performance of both TAAN and the regularization methods is demonstrated by comprehensive experiments.


Prestopping: How Does Early Stopping Help Generalization against Label Noise?

arXiv.org Machine Learning

Thus, it is challenging to train a DNN robustly even when noisy labels exist in the training data. A popular approach to dealing with noisy labels is "sample selection" that selects true-labeled samples from the noisy training data (Jiang et al., 2018; Ren et al., 2018; Han et al., 2018; Yu et al., 2019; Song et al., 2019). This loss-based separation is well known to be justified by the memorization effect (Arpit et al., 2017) that DNNs tend to learn easy patterns first and then gradually memorize all samples. Han et al. (2018) empirically proved that training on such small-loss samples yields a much better [...] Despite its great success, Song et al. (2019) have recently argued that the performance of the loss-based separation becomes considerably worse depending on the type of label noise. The memorization rate for false-labeled samples is faster with pair noise than with symmetric noise. Regardless of the noise type, the memorization of false-labeled samples significantly increases at a late stage of training.
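The loss-based separation described above can be sketched as keeping the fraction of a mini-batch with the smallest per-sample loss and treating those samples as likely true-labeled. This is a generic illustration of small-loss selection, not the Prestopping method itself; the function name `select_small_loss` and the `keep_ratio` parameter are hypothetical.

```python
def select_small_loss(losses, keep_ratio):
    """Return sorted indices of the keep_ratio fraction with the smallest loss."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, int(len(losses) * keep_ratio))
    # Rank sample indices by loss, smallest first.
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:n_keep])


if __name__ == "__main__":
    batch_losses = [0.1, 2.3, 0.4, 1.9]  # illustrative per-sample losses
    print(select_small_loss(batch_losses, keep_ratio=0.5))
```

Because memorization of false-labeled samples grows late in training, as the passage notes, the set selected this way becomes less reliable the longer training runs.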


Fair Learning-to-Rank from Implicit Feedback

arXiv.org Machine Learning

Addressing unfairness in rankings has become an increasingly important problem due to the growing influence of rankings in critical decision making, yet existing learning-to-rank algorithms suffer from multiple drawbacks when learning fair ranking policies from implicit feedback. Some algorithms suffer from extrinsic reasons of unfairness due to inherent selection biases in implicit feedback leading to rich-get-richer dynamics, while those that address the biased nature of implicit feedback suffer from intrinsic reasons of unfairness due to the lack of explicit control over the allocation of exposure based on merit (i.e., relevance). In both cases, the learned ranking policy can be unfair and lead to suboptimal results. To this end, we propose a novel learning-to-rank framework, FULTR, that is the first to address both intrinsic and extrinsic reasons of unfairness when learning ranking policies from logged implicit feedback. Considering the needs of various applications, we define a class of amortized fairness of exposure constraints with respect to items based on their merit, and propose corresponding counterfactual estimators of disparity (aka unfairness) and utility that are also robust to click noise. Furthermore, we provide an efficient algorithm that optimizes both utility and fairness via a policy-gradient approach. To show that our proposed algorithm learns accurate and fair ranking policies from biased and noisy feedback, we provide empirical results beyond the theoretical justification of the framework.