The objective of this course is to give you a wholistic understanding of machine learning, covering theory, application, and inner workings of supervised, unsupervised, and deep learning algorithms. In this series, we'll be covering linear regression, K Nearest Neighbors, Support Vector Machines (SVM), flat clustering, hierarchical clustering, and neural networks. For each major algorithm that we cover, we will discuss the high level intuitions of the algorithms and how they are logically meant to work. Next, we'll apply the algorithms in code using real world data sets along with a module, such as with Scikit-Learn. Finally, we'll be diving into the inner workings of each of the algorithms by recreating them in code, from scratch, ourselves, including all of the math involved.
We propose a logistic regression model taking into account two analytically different sets of factors–structure and action. The factors include individual, dyadic, and triadic properties between ego and alter whose tie breakup is under consideration. From the fitted model using a large-scale data, we discover 5 structural and 7 actional variables to have significant explanatory power for unfollow. One unique finding from our quantitative analysis is that people appreciate receiving acknowledgements from others even in virtually unilateral communication relationships and are less likely to unfollow them: people are more of a receiver than a giver.
Deep learning is well known to be very amenable to GPU acceleration. Accelerating "traditional" machine learning methods like logistic regression, linear regression, and support vector machines with GPUs at scale, has, however, been challenging. Today I am very proud to share a major breakthrough that IBM Research has made in this critical area. A team out of our Zurich IBM Research lab beat a previous performance benchmark set for a machine learning workload by Google by 46 times. The research team trained a logistic regression classifier to predict clicks on advertisements using a Terabyte-scale data set that consists of online advertising click-thru data, containing 4.2 billion training examples and 1 million features.
This short paper is describing a demonstrator that is complementing the paper "Towards Cross-Media Feature Extraction" in these proceedings. The demo is exemplifying the use of textual resources, out of which semantic information can be extracted, for supporting the semantic annotation and indexing of associated video material in the soccer domain. Entities and events extracted from textual data are marked-up with semantic classes derived from an ontology modeling the soccer domain. We show further how extracted Audio-Video features by video analysis can be taken into account for additional annotation of specific soccer event types, and how those different types of annotation can be combined.
Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training: (i) Downpour SGD, an asynchronous stochastic gradient descent procedure supporting a large number of model replicas, and (ii) Sandblaster, a framework that supports a variety of distributed batch optimization procedures, including a distributed implementation of L-BFGS. We have successfully used our system to train a deep network 30x larger than previously reported in the literature, and achieves state-of-the-art performance on ImageNet, a visual object recognition task with 16 million images and 21k categories.