Modeling the real-world complexity of music is a challenge for machine learning. We address the task of modeling melodic sequences from the same music genre. We perform a comparative analysis of two probabilistic models: a Dirichlet Variable Length Markov Model (Dirichlet-VMM) and a Time Convolutional Restricted Boltzmann Machine (TC-RBM). We show that the TC-RBM learns descriptive music features, such as underlying chords and typical melody transitions and dynamics. We assess the models on future prediction and compare their performance to that of a VMM, which is the current state of the art in melody generation. We show that both models perform significantly better than the VMM, with the Dirichlet-VMM marginally outperforming the TC-RBM. Finally, we evaluate the short-order statistics of the models, using the Kullback-Leibler divergence between test sequences and model samples, and show that our proposed methods match the statistics of the music genre significantly better than the VMM.
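The short-order evaluation described above can be illustrated with a minimal sketch. It assumes that "short-order statistics" means n-gram frequencies over note symbols and that additive smoothing is used to keep the divergence finite; neither detail is stated in the abstract, and the function names and toy pitch values are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log


def ngram_distribution(sequences, n, alphabet, alpha=1.0):
    """Return an additively smoothed distribution over all n-grams of `alphabet`.

    `alpha` is an assumed smoothing constant, not a value from the paper.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    support = list(product(alphabet, repeat=n))
    total = sum(counts[g] for g in support) + alpha * len(support)
    return {g: (counts[g] + alpha) / total for g in support}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions on the same support."""
    return sum(p[g] * log(p[g] / q[g]) for g in p if p[g] > 0)


# Toy example with made-up MIDI pitches (illustrative only).
alphabet = [60, 62, 64]
test_sequences = [[60, 62, 64, 62, 60], [60, 62, 60, 62, 64]]
model_samples = [[64, 64, 64, 62, 60], [60, 60, 60, 62, 64]]

p_test = ngram_distribution(test_sequences, n=2, alphabet=alphabet)
q_model = ngram_distribution(model_samples, n=2, alphabet=alphabet)
divergence = kl_divergence(p_test, q_model)
```

A lower divergence indicates that the bigram statistics of the model samples are closer to those of the held-out melodies.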
The quality of training data is a crucial problem when a learning-centered approach is employed. This paper proposes a new method to investigate the quality of a large corpus designed for the recognizing textual entailment (RTE) task. The proposed method, which is inspired by a statistical hypothesis test, consists of two phases: the first phase introduces the predictability of textual entailment labels as a null hypothesis, which would be clearly untenable if the target corpus had no hidden bias, and the second phase tests the null hypothesis using a Naive Bayes model. The experimental result on the Stanford Natural Language Inference (SNLI) corpus does not reject the null hypothesis. This indicates that the SNLI corpus has a hidden bias that allows textual entailment labels to be predicted from hypothesis sentences even when no context information is given by a premise sentence. This paper also presents the impact of this hidden bias on the performance of neural network (NN) models for RTE.
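The idea of predicting entailment labels from hypothesis sentences alone can be sketched as follows. This is a minimal multinomial Naive Bayes with add-one smoothing, written under the assumption that whitespace tokens are the features; the paper's actual features and test statistic are not given in the abstract, and the four training sentences are invented for illustration, not drawn from SNLI.

```python
from collections import Counter, defaultdict
from math import log


def train_naive_bayes(texts, labels, alpha=1.0):
    """Fit per-label word counts; `alpha` is an assumed smoothing constant."""
    word_counts = defaultdict(Counter)
    label_counts = Counter(labels)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab, alpha


def predict(model, text):
    """Return the label with the highest smoothed log-posterior."""
    word_counts, label_counts, vocab, alpha = model
    n_examples = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        score = log(count / n_examples)
        for token in text.lower().split():
            if token in vocab:  # unseen words are skipped
                score += log((word_counts[label][token] + alpha) / total)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented hypothesis sentences; the premise is never shown to the model.
hypotheses = [
    "a person is outdoors",
    "someone is moving",
    "nobody is outdoors",
    "the man is not moving",
]
labels = ["entailment", "entailment", "contradiction", "contradiction"]

model = train_naive_bayes(hypotheses, labels)
guess = predict(model, "nobody is sleeping")
```

If such a premise-blind classifier scores well above chance on held-out data, the labels are predictable from the hypotheses alone, which is the kind of hidden bias the abstract describes.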
That means the pooling layer computes a feature vector of size 128, which is passed into the dense layers of the feedforward network mentioned above. The overall structure of the DNN can be understood as two parts: the first part is a preprocessor that is trained to map text sequences into feature vectors, and the second part is a classifier whose weights are trained on those feature vectors so that the overall network achieves optimal classification results. More details on the implementation and text preprocessing can be found in my GitHub repository for this project. I trained this network for 10 epochs with a batch size of 128 using an 80-20 training/hold-out split. A couple of notes on additional parameters: the vast majority of documents in this collection are 5,000 words long or shorter, so I chose 5,000 words as the maximum input sequence length for the DNN. There are roughly 100,000 unique words in this collection of documents. I arbitrarily limited the dictionary that the DNN can learn to 25% of that: 25,000 words. Finally, for the embedding dimension, I chose 300 simply because that is the default embedding dimension for both word2vec and GloVe.
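The two input limits above (a 25,000-word dictionary and a 5,000-word maximum sequence length) can be illustrated with a small standard-library sketch. It is not the code from the repository: the function names, the whitespace tokenizer, the reserved ids 0 and 1, and the choice to pad at the end are all assumptions made for illustration.

```python
from collections import Counter

MAX_WORDS = 25_000  # dictionary size from the text
MAX_LEN = 5_000     # maximum input sequence length from the text
PAD_ID, OOV_ID = 0, 1  # assumed reserved ids for padding and unknown words


def build_vocab(documents, max_words=MAX_WORDS):
    """Map the most frequent words to integer ids, keeping two ids reserved."""
    counts = Counter(tok for doc in documents for tok in doc.lower().split())
    most_common = counts.most_common(max_words - 2)
    return {word: idx + 2 for idx, (word, _) in enumerate(most_common)}


def encode(document, vocab, max_len=MAX_LEN):
    """Convert a document to a fixed-length id list (truncate, then pad)."""
    ids = [vocab.get(tok, OOV_ID) for tok in document.lower().split()]
    ids = ids[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Tiny demonstration with deliberately small limits.
docs = ["the cat sat", "the dog sat down"]
vocab = build_vocab(docs, max_words=4)
encoded = encode("the cat sat", vocab, max_len=5)
```

With the real limits, every document becomes a list of 5,000 integers in the range 0 to 24,999, which is the shape an embedding layer of dimension 300 would consume.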
"V for Vendetta" masks are a typical feature of many political protests since the eponymous dystopian movie came out in 2005 -- but what if facial recognition technology was able to identify the face behind the mask? SEE ALSO: Why the iPhone 8's facial recognition could be a privacy disaster We're not there yet, but researchers are slowly and steadily making highly-controversial steps in this direction. Academics from Cambridge University, India's National Institute of Technology, and the Indian Institute of Science used deep learning and a dataset of pictures of people in disguise to try to identify masked faces with an acceptable level of reliability. The research, published on the preprint server arXiv and shared in an AI newsletter, went viral after prominent academic and sociologist Zeynep Tufekci shared it on Twitter. Stressing that the paper "isn't that great", Tufekci nonetheless points out that it's the direction that's worrying, as oppressive and authoritarian states could use the tool to stifle dissent and expose anonymous protesters.
The Dynamic Boltzmann Machine (DyBM) has been shown to be highly efficient at predicting time-series data. The Gaussian DyBM is a DyBM that assumes the predicted data is generated by a Gaussian distribution whose first-order moment (mean) changes dynamically over time but whose second-order moment (variance) is fixed. However, in many financial applications, this assumption is limiting in two respects. First, even when the data follows a Gaussian distribution, its variance may change over time. Such variance is also related to important temporal economic indicators such as market volatility. Second, financial time-series data often requires learning from datasets generated by the generalized Gaussian distribution, which has an additional shape parameter that is important for approximating heavy-tailed distributions. Addressing those two aspects, we show how to extend DyBM in a way that yields significant performance improvement in predicting financial time-series data.
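To make the role of the shape parameter concrete, the sketch below evaluates the generalized Gaussian density in its common textbook parameterization, with location `mu`, scale `alpha`, and shape `beta`. The abstract does not say which parameterization the authors use, so the symbols and function names here are assumptions. Setting `beta = 2` with `alpha = sigma * sqrt(2)` recovers the ordinary Gaussian, while smaller `beta` gives heavier tails.

```python
from math import exp, gamma, pi, sqrt


def generalized_gaussian_pdf(x, mu, alpha, beta):
    """Density beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha) ** beta)."""
    norm = beta / (2.0 * alpha * gamma(1.0 / beta))
    return norm * exp(-((abs(x - mu) / alpha) ** beta))


def gaussian_pdf(x, mu, sigma):
    """Ordinary Gaussian density, used here only as a reference."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2.0 * pi))


# beta = 2 reproduces the Gaussian when alpha = sigma * sqrt(2).
sigma = 1.5
same_as_gaussian = generalized_gaussian_pdf(0.7, 0.0, sigma * sqrt(2.0), 2.0)
reference = gaussian_pdf(0.7, 0.0, sigma)

# beta = 1 (Laplace-like) puts far more mass in the tail than beta = 2.
heavy_tail = generalized_gaussian_pdf(5.0, 0.0, 1.0, 1.0)
light_tail = generalized_gaussian_pdf(5.0, 0.0, 1.0, 2.0)
```

In a time-varying setting, `alpha` (and hence the variance) would be allowed to change at each step instead of being held fixed, which corresponds to the first limitation mentioned in the abstract.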