Overview
Sparse Linear Isotonic Models
Chen, Sheng, Banerjee, Arindam
In machine learning and data mining, linear models have been widely used to model the response as parametric linear functions of the predictors. To relax such stringent assumptions made by parametric linear models, additive models consider the response to be a summation of unknown transformations applied to the predictors; in particular, additive isotonic models (AIMs) assume the unknown transformations to be monotone. In this paper, we introduce sparse linear isotonic models (SLIMs) for high-dimensional problems by hybridizing ideas in parametric sparse linear models and AIMs, which enjoy a few appealing advantages over both. In the high-dimensional setting, a two-step algorithm is proposed for estimating the sparse parameters as well as the monotone functions over predictors. Under mild statistical assumptions, we show that the algorithm can accurately estimate the parameters. Promising preliminary experiments are presented to support the theoretical results.
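The abstract does not spell out the two-step algorithm, so the following is not the paper's method. It is only a minimal sketch of what estimating a monotone function can look like for a single predictor, using the standard pool-adjacent-violators procedure for least-squares isotonic regression. The function name `isotonic_fit` and the sample data are assumptions made for illustration.

```python
def isotonic_fit(y):
    """Least-squares nondecreasing fit to y (pool-adjacent-violators).

    y is assumed to be ordered by the predictor value.
    """
    # Each block holds [sum of values, count]; its fitted value is the mean.
    blocks = []
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each block back to one fitted value per observation.
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical responses, already sorted by the predictor.
fit = isotonic_fit([1.0, 3.0, 2.0, 4.0, 3.5, 5.0])
print(fit)  # [1.0, 2.5, 2.5, 3.75, 3.75, 5.0]
```

Adjacent observations that violate monotonicity are pooled and replaced by their mean, which is why the pairs (3.0, 2.0) and (4.0, 3.5) collapse to 2.5 and 3.75.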
Debugging & Visualising training of Neural Network with TensorBoard
I started my deep learning journey a few years back, and I have learnt a lot in this period. But even after all these efforts, every neural network I train gives me a new experience. If you have tried to train a neural network, you must know my plight! Over this time, though, I have developed a workflow, which I will share with you today.
Shehroz Khan's answer to Do you know unsupervised image classification? - Quora
Any form of classification is supervised and not unsupervised[1][2]. You are probably interested in unsupervised image segmentation, where the algorithm attempts to determine which pixels are related and groups them into certain categories. This can be done with traditional partitional clustering algorithms, such as K-means/EM[3], or with more advanced methods such as convolutional autoencoders[4], Bayesian methods[5] and so on. You may read this survey paper on the evaluation of such techniques: Image segmentation evaluation: A survey of unsupervised methods.
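As a rough illustration of the K-means route mentioned in the answer, here is a minimal sketch that clusters the grayscale intensities of a tiny hypothetical image into two segments. The function `kmeans_1d`, the evenly spread initialisation and the 3x4 image are assumptions for the example; a real segmentation would typically cluster colour or texture features as well.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar values (e.g. grayscale intensities) with Lloyd's algorithm."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation: spread centroids evenly over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids


# Hypothetical 3x4 grayscale image: dark left half, bright right half.
image = [
    [10, 12, 200, 205],
    [11, 13, 198, 210],
    [9, 14, 202, 207],
]
width = len(image[0])
pixels = [p for row in image for p in row]
labels, centroids = kmeans_1d(pixels, k=2)
# Reshape the flat label list back into the image layout.
segmentation = [labels[r * width:(r + 1) * width] for r in range(len(image))]
print(segmentation)  # [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]
```

No labels are supplied anywhere: the two groups emerge purely from the pixel values, which is what makes this segmentation rather than classification in the answer's sense.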
Recent Advances in Zero-shot Recognition
Fu, Yanwei, Xiang, Tao, Jiang, Yu-Gang, Xue, Xiangyang, Sigal, Leonid, Gong, Shaogang
With the recent renaissance of deep convolutional neural networks, encouraging breakthroughs have been achieved on supervised recognition tasks, where each class has sufficient, fully annotated training data. However, scaling recognition to a large number of classes with few or no training samples for each class remains an unsolved problem. One approach to scaling up recognition is to develop models capable of recognizing unseen categories without any training instances, known as zero-shot recognition/learning. This article provides a comprehensive review of existing zero-shot recognition techniques, covering various aspects ranging from representations of models to datasets and evaluation settings. We also overview related recognition tasks, including one-shot and open-set recognition, which can be used as natural extensions of zero-shot recognition when a limited number of class samples become available or when zero-shot recognition is implemented in a real-world setting. Importantly, we highlight the limitations of existing approaches and point out future research directions in this exciting new research area.
Alibaba Launches Global Research Program for Cutting-Edge Technology Development
HANGZHOU, China--(BUSINESS WIRE)--Alibaba Group Holding Ltd. ("Alibaba Group") announced today the launch of an innovative global research program, "Alibaba DAMO Academy" ("Academy"), which is designed to increase technological collaboration worldwide, advance the development of cutting-edge technology and strive to make the world more inclusive by narrowing the technology gap. With the setup of the Academy, the company expects to invest more than US$15 billion in research and development over the next three years. The Academy, which stands for the "Academy for Discovery, Adventure, Momentum and Outlook," will oversee the opening of research and development labs worldwide and seek to recruit talented scientists and researchers to join the program. Alibaba Group's Chief Technology Officer, Jeff ZHANG, will be the head of the Academy. In the beginning, the Academy will focus on the opening of seven research labs in China (Beijing and Hangzhou), the United States (San Mateo and Bellevue), Russia (Moscow), Israel (Tel Aviv) and Singapore.
TensorFlow Lattice ensures your machine learning models follow global trends
Google's TensorFlow team released TensorFlow Lattice today to help developers ensure that their machine learning models adhere to global trends even when training data is noisy. Lattice draws from the concept of lookup tables to simplify the process of defining macro rules to restrict models. A lookup table is a representation of data that includes inputs (keys) and outputs (values). It's easiest to conceptualize with a single key linking to a single output, but there can be multiple keys in the case of more complex multi-dimensional functions. Roughly speaking, the TensorFlow team's approach is to train the lookup table values using training data to maximize accuracy given constraints.
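To make the lookup-table idea concrete, here is a minimal plain-Python sketch of a one-dimensional table evaluated by linear interpolation, plus a check that its values respect a monotonic (nondecreasing) constraint. It does not use the TensorFlow Lattice API; the names `interpolate` and `is_monotonic` and the table values are assumptions for illustration only.

```python
from bisect import bisect_right


def interpolate(keys, values, x):
    """Evaluate a 1-D lookup table at x by linear interpolation between keys."""
    # Clamp to the table's end values outside the key range.
    if x <= keys[0]:
        return values[0]
    if x >= keys[-1]:
        return values[-1]
    i = bisect_right(keys, x) - 1  # index of the key at or just below x
    t = (x - keys[i]) / (keys[i + 1] - keys[i])
    return values[i] + t * (values[i + 1] - values[i])


def is_monotonic(values):
    """True if the table values never decrease from one key to the next."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical table: keys are inputs, values are the (trainable) outputs.
keys = [0.0, 1.0, 2.0, 3.0]
values = [0.0, 0.5, 0.75, 1.0]

print(interpolate(keys, values, 1.5))    # 0.625
print(is_monotonic(values))              # True
print(is_monotonic([0.0, 0.8, 0.6, 1]))  # False: a noisy fit breaking the trend
```

In the approach described above, the table values would be learned from data, with the constraint that a condition like `is_monotonic` holds throughout training.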
An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists
Chazal, Frédéric, Michel, Bertrand
Topological Data Analysis (TDA) is a recent and fast-growing field providing a set of new topological and geometric tools to infer relevant features for possibly complex data. This paper is a brief introduction, through a few selected topics, to basic fundamental and practical aspects of TDA for non-experts.
1 Introduction and motivation
Topological Data Analysis (TDA) is a recent field that emerged from various works in applied (algebraic) topology and computational geometry during the first decade of the century. Although one can trace back geometric approaches for data analysis quite far in the past, TDA really started as a field with the pioneering works of Edelsbrunner et al. (2002) and Zomorodian and Carlsson (2005) in persistent homology and was popularized in a landmark paper in 2009 (Carlsson, 2009). TDA is mainly motivated by the idea that topology and geometry provide a powerful approach to infer robust qualitative, and sometimes quantitative, information about the structure of data (see, e.g., Chazal (2017)). TDA aims at providing well-founded mathematical, statistical and algorithmic methods to infer, analyze and exploit the complex topological and geometric structures underlying data that are often represented as point clouds in Euclidean or more general metric spaces. During the last few years, a considerable effort has been made to provide robust and efficient data structures and algorithms for TDA that are now implemented, available and easy to use through standard libraries such as the Gudhi library (C++ and Python) (Maria et al., 2014) and its R software interface (Fasy et al., 2014a). Although it is still rapidly evolving, TDA now provides a set of mature and efficient tools that can be used in combination with, or as a complement to, other data science tools.
The TDA pipeline. TDA has recently seen developments in various directions and application fields. There now exists a large variety of methods inspired by topological and geometric approaches.
Providing a complete overview of all these existing approaches is beyond the scope of this introductory survey. However, most of them rely on the following basic and standard pipeline that will serve as the backbone of this paper: 1. The input is assumed to be a finite set of points coming with a notion of distance, or similarity, between them. This distance can be induced by the metric in the ambient space (e.g. the Euclidean metric when the data are embedded in $\mathbb{R}^d$) or come as an intrinsic metric defined by a pairwise distance matrix. The definition of the metric on the data is usually given as an input or guided by the application. It is however important to notice that the choice of the metric may be critical to reveal interesting topological and geometric features of the data.
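The first pipeline step can be sketched in a few lines: starting from a finite point cloud embedded in Euclidean space, build the pairwise distance matrix that later stages would consume. This is a minimal standard-library illustration, not code from the Gudhi library; the function name `pairwise_distances` and the sample points are assumptions.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix of a finite point cloud (list of coordinate tuples)."""
    return [[math.dist(p, q) for q in points] for p in points]


# Hypothetical point cloud in the plane.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = pairwise_distances(points)
for row in D:
    print(row)
```

Swapping `math.dist` for another distance or dissimilarity function changes the matrix, which is the sense in which the choice of metric can decide which topological features become visible.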
Stochastic Runtime Analysis of a Cross Entropy Algorithm for Traveling Salesman Problems
Wu, Zijun, Moehring, Rolf, Lai, Jianhui
This article analyzes the stochastic runtime of a Cross-Entropy Algorithm on two classes of traveling salesman problems. The algorithm shares main features of the famous Max-Min Ant System with iteration-best reinforcement. For simple instances that have a $\{1,n\}$-valued distance function and a unique optimal solution, we prove a stochastic runtime of $O(n^{6+\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{3+\epsilon}\ln n)$ with the edge-based random solution generation for an arbitrary $\epsilon\in (0,1)$. These runtimes are very close to the known expected runtime for variants of Max-Min Ant System with best-so-far reinforcement. They are obtained for the stronger notion of stochastic runtime, which means that an optimal solution is obtained in that time with an overwhelming probability, i.e., a probability tending exponentially fast to one with growing problem size. We also inspect more complex instances with $n$ vertices positioned on an $m\times m$ grid. When the $n$ vertices span a convex polygon, we obtain a stochastic runtime of $O(n^{3}m^{5+\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{2}m^{5+\epsilon})$ for the edge-based random solution generation. When there are $k = O(1)$ many vertices inside a convex polygon spanned by the other $n-k$ vertices, we obtain a stochastic runtime of $O(n^{4}m^{5+\epsilon}+n^{6k-1}m^{\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{3}m^{5+\epsilon}+n^{3k}m^{\epsilon})$ with the edge-based random solution generation. These runtimes are better than the expected runtime for the so-called $(\mu\!+\!\lambda)$ EA reported in a recent article, and again obtained for the stronger notion of stochastic runtime.
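For readers unfamiliar with this family of algorithms, the following is a minimal sketch of a cross-entropy style search for TSP with edge-based solution generation and iteration-best reinforcement. It is not the exact algorithm analyzed in the article: the smoothing rule, the lower bound `p_min` (in the spirit of Max-Min bounds), the parameter values and all function names are assumptions chosen for illustration.

```python
import random


def tour_length(tour, dist):
    """Length of the closed tour that visits the vertices in the given order."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def sample_tour(prob, rng):
    """Edge-based generation: walk from vertex 0, picking each next vertex among
    the unvisited ones with probability proportional to prob[current][next]."""
    tour = [0]
    unvisited = list(range(1, len(prob)))
    while unvisited:
        weights = [prob[tour[-1]][v] for v in unvisited]
        nxt = rng.choices(unvisited, weights=weights)[0]
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def cross_entropy_tsp(dist, samples=30, iterations=50, rho=0.3, p_min=1e-3, seed=0):
    """Assumed parameters: rho is the smoothing rate, p_min a probability floor."""
    rng = random.Random(seed)
    n = len(dist)
    # Start from a uniform distribution over outgoing edges.
    prob = [[0.0 if i == j else 1.0 / (n - 1) for j in range(n)] for i in range(n)]
    best = None
    for _ in range(iterations):
        tours = [sample_tour(prob, rng) for _ in range(samples)]
        it_best = min(tours, key=lambda t: tour_length(t, dist))
        if best is None or tour_length(it_best, dist) < tour_length(best, dist):
            best = it_best
        # Iteration-best reinforcement: shift probability toward its edges.
        edges = set()
        for i in range(n):
            a, b = it_best[i], it_best[(i + 1) % n]
            edges.add((a, b))
            edges.add((b, a))
        for i in range(n):
            for j in range(n):
                if i != j:
                    target = 0.5 if (i, j) in edges else 0.0
                    prob[i][j] = max(p_min, (1 - rho) * prob[i][j] + rho * target)
    return best


# Hypothetical instance: five points in the plane with Euclidean distances.
pts = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0)]
dist = [[((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 for bx, by in pts] for ax, ay in pts]
best = cross_entropy_tsp(dist)
print(best, tour_length(best, dist))
```

The floor `p_min` keeps every edge selectable, so the sampler can never run out of positive-weight choices, at the cost of never committing fully to a single tour.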
Machine Learning: Understanding Decision Tree Learning
As the training data grows larger, the decision tree tends to grow deeper. In such cases, noisy or corrupt/incorrect data can have a detrimental impact on the decision tree. This results in the decision tree overfitting the dataset: the tree performs satisfactorily on the training data, but fails to produce an appropriate approximation of the target concept when it encounters actual data. Overfitting can also occur when insufficient data is provided to build the decision tree (as, perhaps, in our previous example with only 6 rows).