Collaborating Authors

Capitalize on the Big Data boom with 6 courses of self-paced training


To master Big Data you need big training, and this week brings an opportunity to get it. Big Data is one of the fastest-growing markets in the tech industry. It covers methods for handling and analyzing extremely large data sets, from insurance figures to social media trends. Hadoop, Spark, and SAS are among the tools used for this work.

Adversarial Self-Paced Learning for Mixture Models of Hawkes Processes

Machine Learning

We propose a novel adversarial learning strategy for mixture models of Hawkes processes, leveraging data augmentation techniques for Hawkes processes in the framework of self-paced learning. Instead of learning a mixture model directly from a set of event sequences drawn from different Hawkes processes, the proposed method learns the target model iteratively: it generates "easy" sequences and uses them in an adversarial and self-paced manner. In each iteration, we first generate a set of augmented sequences from the original observed sequences. Based on the fact that an easy sample of the target model can be an adversarial sample of a misspecified model, we apply maximum likelihood estimation with an adversarial self-paced mechanism. In this manner, the target model is updated, and the augmented sequences that obey it are employed in the next learning iteration. Experimental results show that the proposed method consistently outperforms traditional methods.
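
The maximum likelihood step mentioned above needs a likelihood for a single Hawkes process. The abstract does not say which kernel the authors use, so the following is only a minimal sketch under an assumed exponential kernel, with intensity lambda(t) = mu + alpha * sum over earlier events of exp(-beta * (t - t_i)). The function name `hawkes_exp_loglik` and its parameters are illustrative, and the sketch covers one mixture component only, not the adversarial self-paced procedure.

```python
import math

def hawkes_exp_loglik(events, horizon, mu, alpha, beta):
    """Log-likelihood of a univariate Hawkes process observed on [0, horizon].

    Assumed (not from the abstract): exponential kernel, so that
    lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
    `events` must be sorted event times inside [0, horizon].
    """
    loglik = -mu * horizon   # baseline part of the compensator
    decay_sum = 0.0          # sum of exp(-beta * (t - t_j)) over earlier events t_j
    prev = None
    for t in events:
        if prev is not None:
            # Recursive update avoids the quadratic double sum over event pairs.
            decay_sum = math.exp(-beta * (t - prev)) * (1.0 + decay_sum)
        loglik += math.log(mu + alpha * decay_sum)
        # Excitation part of the compensator contributed by this event.
        loglik -= (alpha / beta) * (1.0 - math.exp(-beta * (horizon - t)))
        prev = t
    return loglik
```

With `alpha = 0` the expression reduces to the log-likelihood of a homogeneous Poisson process, which is a convenient sanity check.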

Self-Paced Learning for Latent Variable Models

Neural Information Processing Systems

Latent variable models are a powerful tool for addressing several tasks in machine learning. However, the algorithms for learning the parameters of latent variable models are prone to getting stuck in a bad local optimum. To alleviate this problem, we build on the intuition that, rather than considering all samples simultaneously, the algorithm should be presented with the training data in a meaningful order that facilitates learning. The order of the samples is determined by how easy they are. The main challenge is that often we are not provided with a readily computable measure of the easiness of samples.
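
The easy-first idea can be illustrated with a toy loop. The excerpt stops before describing the authors' actual easiness measure, so this sketch assumes one common proxy: a sample counts as easy when its loss under the current model falls below a threshold, and the threshold grows each round so that harder samples are admitted later. The names `select_easy` and `self_paced_mean`, the scalar mean-estimation task, and the `growth` schedule are all hypothetical choices made for illustration.

```python
def select_easy(losses, threshold):
    """Binary self-paced weights: 1.0 for samples whose loss is below threshold."""
    return [1.0 if loss < threshold else 0.0 for loss in losses]

def self_paced_mean(samples, init, threshold, growth=2.0, n_rounds=4):
    """Estimate a scalar location by repeatedly refitting on easy samples only.

    Each round: score every sample by squared error under the current
    estimate, keep the easy ones, refit (here, a plain mean), then relax
    the threshold so more samples qualify in the next round.
    """
    estimate = init
    for _ in range(n_rounds):
        losses = [(x - estimate) ** 2 for x in samples]
        weights = select_easy(losses, threshold)
        if sum(weights) == 0:
            # Nothing qualifies yet: fall back to the single easiest sample.
            weights[losses.index(min(losses))] = 1.0
        estimate = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
        threshold *= growth
    return estimate
```

For example, `self_paced_mean([0.9, 1.0, 1.1, 10.0], init=1.0, threshold=0.05, n_rounds=2)` never admits the sample at 10.0, because its squared error stays far above the threshold in both rounds.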

Top 8 reasons to choose Azure HDInsight


Household names such as Adobe, Jet, ASOS, Schneider Electric, and Milliman are among the hundreds of enterprises that power their Big Data analytics with Azure HDInsight. Azure HDInsight launched nearly six years ago and has since become the best place to run Apache Hadoop and Spark analytics on Azure. We monitor the cluster and all its services, detect and repair common issues, and respond to issues 24/7. Your big data applications can run more reliably because the HDInsight service monitors cluster health and automatically recovers from failures. Isolate your HDInsight cluster within virtual networks (VNETs) and take advantage of transparent data encryption.