 Perceptrons


Domain-Independent turn-level Dialogue Quality Evaluation via User Satisfaction Estimation

arXiv.org Artificial Intelligence

An automated metric to evaluate dialogue quality is vital for optimizing data-driven dialogue management. The common approach of relying on explicit user feedback during a conversation is intrusive and sparse. Current models to estimate user satisfaction use limited feature sets and rely on annotation schemes with low inter-rater reliability, limiting generalizability to conversations spanning multiple domains. To address these gaps, we created a new Response Quality annotation scheme, based on which we developed a turn-level User Satisfaction metric. We introduced five new domain-independent feature sets and experimented with six machine learning models to estimate the new satisfaction metric. Using the Response Quality annotation scheme, across randomly sampled single and multi-turn conversations from 26 domains, we achieved high inter-annotator agreement (Spearman's rho 0.94). The Response Quality labels were highly correlated (0.76) with explicit turn-level user ratings. Gradient boosting regression achieved the best correlation of ~0.79 between predicted and annotated user satisfaction labels. Multi-Layer Perceptron and Gradient Boosting regression models generalized to an unseen domain better (linear correlation 0.67) than other models. Finally, our ablation study verified that our novel features significantly improved model performance.


The Partial Response Network

arXiv.org Machine Learning

We propose a method to open the black box of the Multi-Layer Perceptron by inferring from it a simpler and generally more accurate general additive model. The resulting model comprises non-linear univariate and bivariate partial responses derived from the original Multi-Layer Perceptron. The responses are combined using the Lasso and further optimised within a modular structure. The approach is generic and provides a constructive framework to simplify and explain the Multi-Layer Perceptron for any data set, opening the door for validation against prior knowledge. Experimental results on benchmarking datasets indicate that the partial responses are intuitive to interpret and the Area Under the Curve is competitive with Gradient Boosting, Support Vector Machines and Random Forests. The performance improvement compared with a fully connected Multi-Layer Perceptron is attributed to reduced confounding in the second stage of optimisation of the weights. The main limitation of the method is that it explicitly models only up to pairwise interactions. For many practical applications this will be optimal, but where that is not the case then this will be indicated by the performance difference compared to the original model. The streamlined model simultaneously interprets and optimises this frequently used flexible model.


A Reproducible Comparison of RSSI Fingerprinting Localization Methods Using LoRaWAN

arXiv.org Machine Learning

The use of fingerprinting localization techniques in outdoor IoT settings has started to gain popularity in recent years. Communication signals of Low Power Wide Area Networks (LPWAN), such as LoRaWAN, are used to estimate the location of low power mobile devices. In this study, a publicly available dataset of LoRaWAN RSSI measurements is utilized to compare different machine learning methods and their accuracy in producing location estimates. The tested methods are: the k Nearest Neighbours method, the Extra Trees method and a neural network approach using a Multilayer Perceptron. To facilitate the reproducibility of tests and the comparability of results, the code and the train/validation/test split of the dataset used in this study have been made available. The neural network approach was the method with the highest accuracy, achieving a mean error of 358 meters and a median error of 204 meters.

The proliferation of Internet-of-Things (IoT) technologies and Low Power Wide Area Networks (LPWAN), such as LoRaWAN or Sigfox, over the last decade has created a new landscape in the field of outdoor localization. Low power devices of LPWANs cannot afford the battery consumption of a chipset of a Global Navigation Satellite System (GNSS), such as the GPS. Therefore, an alternative approach is needed in order to localize these low power devices. The devices communicate with fixed base stations deployed in urban and rural areas through RF messages.
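As a rough illustration of the k Nearest Neighbours fingerprinting idea and of the mean/median error metrics quoted above, here is a minimal sketch. It is not the code released with the study: the function names `knn_locate` and `error_summary`, the toy fingerprints, and the use of planar coordinates in meters (instead of latitude/longitude) are all assumptions made for this example.

```python
import math
import statistics

def knn_locate(rssi, fingerprints, k=3):
    """Estimate a position as the average position of the k fingerprints
    whose RSSI vectors are closest (Euclidean distance) to `rssi`.

    fingerprints: list of (rssi_vector, (x, y)) pairs; planar meters assumed.
    """
    nearest = sorted(fingerprints, key=lambda fp: math.dist(rssi, fp[0]))[:k]
    xs = [pos[0] for _, pos in nearest]
    ys = [pos[1] for _, pos in nearest]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def error_summary(estimates, truths):
    """Return (mean error, median error) of the positioning errors in meters."""
    errors = [math.dist(e, t) for e, t in zip(estimates, truths)]
    return statistics.mean(errors), statistics.median(errors)

# Toy radio map: RSSI (dBm) seen at two hypothetical base stations.
toy_fingerprints = [
    ([-100, -120], (0, 0)),
    ([-90, -110], (100, 0)),
    ([-80, -100], (200, 0)),
    ([-60, -70], (1000, 1000)),
]

estimate = knn_locate([-90, -110], toy_fingerprints, k=3)
mean_err, median_err = error_summary([estimate], [(100, 0)])
```

Averaging the positions of the k nearest fingerprints is only one possible choice; a distance-weighted average is a common variant.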


Exploiting a Stimuli Encoding Scheme of Spiking Neural Networks for Stream Learning

arXiv.org Artificial Intelligence

Spiking Neural Networks are among the most promising techniques in stream learning, and some of them use an interesting population encoding scheme to transform the incoming stimuli into spikes. This study sheds light on the key component of this encoding scheme, the Gaussian receptive fields, and focuses on applying them as a pre-processing technique to any dataset in order to gain representativeness and to boost the predictive performance of stream learning methods. Experiments with synthetic and real data sets are presented, and confirm that our approach can be applied successfully as a general pre-processing technique in many real cases.

Keywords: Stream learning, Gaussian receptive fields, population encoding, spiking neural networks

1. Introduction

The continuous production of tremendous amounts of data in the form of fast streams upsets the traditional view in machine learning, thus giving rise to a new emerging paradigm called stream learning (SL). These streams of data generally evolve over time and may occasionally be affected by a change (concept drift) which impacts their input data distribution, violating the fundamental hypothesis of stationarity upon which the learning theory is based. Learning in non-stationary environments has attracted much attention in the SL community in

Corresponding author: jesus.lopez@tecnalia.com
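To make the idea of Gaussian receptive fields as a pre-processing step more concrete, the sketch below expands one numeric feature into several Gaussian activations. It is a simplified illustration, not the paper's implementation: the names `grf_encode`, `n_fields` and `beta`, the evenly spaced centers, and the width rule (center spacing divided by `beta`) are assumptions chosen for this example.

```python
import math

def grf_encode(x, lo, hi, n_fields=5, beta=1.5):
    """Encode scalar x in [lo, hi] as activations of n_fields Gaussian
    receptive fields with evenly spaced centers (assumed layout)."""
    step = (hi - lo) / (n_fields - 1)   # distance between neighbouring centers
    sigma = step / beta                 # assumed width: spacing scaled by beta
    centers = [lo + i * step for i in range(n_fields)]
    return [math.exp(-((x - c) ** 2) / (2 * sigma ** 2)) for c in centers]

def grf_encode_row(row, ranges, n_fields=5, beta=1.5):
    """Expand every feature of a sample into n_fields activations."""
    encoded = []
    for value, (lo, hi) in zip(row, ranges):
        encoded.extend(grf_encode(value, lo, hi, n_fields, beta))
    return encoded

# A sample with two features becomes a vector of 2 * 5 activations.
sample = grf_encode_row([0.5, 10.0], [(0.0, 1.0), (0.0, 20.0)])
```

The expanded vector can then be passed to any stream learner in place of the raw features, which is the sense in which the encoding acts as pre-processing.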


Classifying Multi-Gas Spectrums using Monte Carlo KNN and Multi-Resolution CNN

arXiv.org Machine Learning

A Monte Carlo k-nearest neighbours (KNN) and a multi-resolution convolutional neural network (CNN) were developed to detect the presence of multiple gases in near-infrared (IR) spectrums. The High Resolution Transmission database was used to synthesize the near-IR spectrums. Monte Carlo KNN determined the optimal kernel sizes and the optimal number of channels. The multi-resolution CNN, composed of multiple different kernels, was created using the optimal kernel sizes and the optimal number of channels. The multi-resolution CNN outperforms the multilayer perceptron and partial least squares.


Machine Learning based Prediction of Hierarchical Classification of Transposable Elements

arXiv.org Machine Learning

Transposable Elements (TEs), or jumping genes, are DNA sequences that have an intrinsic capability to move within a host genome from one genomic location to another. Studies show that the presence of a TE within or adjacent to a functional gene may alter its expression. TEs can also cause an increase in the rate of mutation and can even mediate duplications and large insertions and deletions in the genome, promoting gross genetic rearrangements. Thus, the proper classification of the identified jumping genes is essential to understand their genetic and evolutionary effects in the genome. While computational methods have been developed that perform either binary classification or multi-label classification of TEs, few studies have focused on their hierarchical classification. The state-of-the-art machine learning classification method utilizes a Multi-Layer Perceptron (MLP), a class of neural network, for hierarchical classification of TEs. However, the existing methods have limited accuracy in classifying TEs. A more effective classifier, which can explain the role of TEs in germline and somatic evolution, is needed. In this study, we examine the performance of a variety of machine learning (ML) methods and propose a robust approach for the hierarchical classification of TEs, with higher accuracy, using Support Vector Machines (SVM).


Searching for Interaction Functions in Collaborative Filtering

arXiv.org Machine Learning

The interaction function (IFC), which captures interactions among items and users, is of great importance in collaborative filtering (CF). The inner product is the most popular IFC due to its success in low-rank matrix factorization. However, interactions in real-world applications can be highly complex. Many other operations (such as plus and concatenation) have also been proposed, and can possibly offer better performance than the inner product. In this paper, motivated by the success of automated machine learning, we propose to search for proper interaction functions (SIF) for CF tasks. We first design an expressive search space for SIF by reviewing and generalizing existing CF approaches. We then propose to represent the search space as a structured multi-layer perceptron, and design a stochastic gradient descent algorithm which can simultaneously update both architectures and learning parameters. Experimental results demonstrate that the proposed method can be much more efficient than popular AutoML approaches, and also obtain much better prediction performance than state-of-the-art CF approaches.
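The three operations named in the abstract (inner product, plus, concatenation) can be written down for a single user/item embedding pair as in the sketch below. This only illustrates what an interaction function is, not the proposed search method; the function names, the toy embeddings and the linear readout `score` used to turn vector-valued interactions into a rating are assumptions made for this example.

```python
def inner_product(u, v):
    """Scalar interaction: sum of element-wise products."""
    return sum(a * b for a, b in zip(u, v))

def plus(u, v):
    """Vector interaction: element-wise sum."""
    return [a + b for a, b in zip(u, v)]

def concat(u, v):
    """Vector interaction: user embedding followed by item embedding."""
    return list(u) + list(v)

def score(features, w):
    """Assumed linear readout that maps a vector interaction to a rating."""
    return sum(wi * fi for wi, fi in zip(w, features))

user, item = [1, 2], [3, 4]          # toy embeddings
r_inner = inner_product(user, item)  # already a scalar prediction
r_plus = score(plus(user, item), [1, 1])
r_concat = score(concat(user, item), [1, 0, 0, 1])
```

The inner product yields a prediction directly, while plus and concatenation need an extra readout, which is one reason a search space over such operations has to be designed with some care.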


ANN-3-PART(1)-What is a perceptron in a neural network?

#artificialintelligence

You can visit our Website: https://www.mldawn.com/ You can follow us on Twitter: https://twitter.com/MLDawn2018 You can join us on Facebook: https://www.facebook.com/ml.dawn.3 Keep up the good work and good luck!


r/MachineLearning - [D] Can we minimize counting cost function for perceptron algorithm?

#artificialintelligence

The question is posted on Cross Validated (haven't figured out how to write formulas in reddit). Feel free to leave your comments either here or at Cross Validated.


Scaling Laws for the Principled Design, Initialization and Preconditioning of ReLU Networks

arXiv.org Machine Learning

In this work, we describe a set of rules for the design and initialization of well-conditioned neural networks, guided by the goal of naturally balancing the diagonal blocks of the Hessian at the start of training. Our design principle balances multiple sensible measures of the conditioning of neural networks. We prove that for a ReLU-based deep multilayer perceptron, a simple initialization scheme using the geometric mean of the fan-in and fan-out satisfies our scaling rule. For more sophisticated architectures, we show how our scaling principle can be used to guide design choices to produce well-conditioned neural networks, reducing guesswork.
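One way to read "an initialization scheme using the geometric mean of the fan-in and fan-out" is sketched below. The abstract does not give the exact formula, so everything here is an assumption made by analogy with He initialization (variance 2 / fan_in): the variance `gain / sqrt(fan_in * fan_out)`, the default `gain=2.0`, the Gaussian distribution, and the names `init_std` and `geometric_mean_init` are all hypothetical.

```python
import math
import random

def init_std(fan_in, fan_out, gain=2.0):
    """Standard deviation with the geometric mean sqrt(fan_in * fan_out)
    in place of fan_in (assumed form, not taken from the paper)."""
    return math.sqrt(gain / math.sqrt(fan_in * fan_out))

def geometric_mean_init(fan_in, fan_out, gain=2.0, seed=0):
    """Sample a fan_out x fan_in weight matrix from N(0, init_std ** 2)."""
    rng = random.Random(seed)
    std = init_std(fan_in, fan_out, gain)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

# Weights for a hypothetical layer with 4 inputs and 3 outputs.
W = geometric_mean_init(fan_in=4, fan_out=3)
```

For a square layer (fan_in equal to fan_out) this assumed rule coincides with He initialization; the two differ only when the layer widens or narrows.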