Clustering
Robust Metric Learning based on the Rescaled Hinge Loss
Al-Obaidi, Sumia Abdulhussien Razooqi, Zabihzadeh, Davood, Rasheed, Ali Salim, Monsefi, Reza
Distance/Similarity learning is a fundamental problem in machine learning. For example, kNN classifier or clustering methods are based on a distance/similarity measure. Metric learning algorithms enhance the efficiency of these methods by learning an optimal distance function from data. Most metric learning methods need training information in the form of pair or triplet sets. Nowadays, this training information often is obtained from the Internet via crowdsourcing methods. Therefore, this information may contain label noise or outliers leading to the poor performance of the learned metric. It is even possible that the learned metric functions perform worse than the general metrics such as Euclidean distance. To address this challenge, this paper presents a new robust metric learning method based on the Rescaled Hinge loss. This loss function is a general case of the popular Hinge loss and initially introduced in (Xu et al. 2017) to develop a new robust SVM algorithm. In this paper, we formulate the metric learning problem using the Rescaled Hinge loss function and then develop an efficient algorithm based on HQ (Half-Quadratic) to solve the problem. Experimental results on a variety of both real and synthetic datasets confirm that our new robust algorithm considerably outperforms state-of-the-art metric learning methods in the presence of label noise and outliers.
The Mutex Watershed and its Objective: Efficient, Parameter-Free Image Partitioning
Wolf, Steffen, Bailoni, Alberto, Pape, Constantin, Rahaman, Nasim, Kreshuk, Anna, Köthe, Ullrich, Hamprecht, Fred A.
Image partitioning, or segmentation without semantics, is the task of decomposing an image into distinct segments, or equivalently to detect closed contours. Most prior work either requires seeds, one per segment; or a threshold; or formulates the task as multicut / correlation clustering, an NP-hard problem. Here, we propose a greedy algorithm for signed graph partitioning, the "Mutex Watershed". Unlike seeded watershed, the algorithm can accommodate not only attractive but also repulsive cues, allowing it to find a previously unspecified number of segments without the need for explicit seeds or a tunable threshold. We also prove that this simple algorithm solves to global optimality an objective function that is intimately related to the multicut / correlation clustering integer linear programming formulation. The algorithm is deterministic, very simple to implement, and has empirically linearithmic complexity. When presented with short-range attractive and long-range repulsive cues from a deep neural network, the Mutex Watershed gives the best results currently known for the competitive ISBI 2012 EM segmentation benchmark.
Discrete Optimal Graph Clustering
Han, Yudong, Zhu, Lei, Cheng, Zhiyong, Li, Jingjing, Liu, Xiaobai
Graph based clustering is one of the major clustering methods. Most of it work in three separate steps: similarity graph construction, clustering label relaxing and label discretization with k-means. Such common practice has three disadvantages: 1) the predefined similarity graph is often fixed and may not be optimal for the subsequent clustering. 2) the relaxing process of cluster labels may cause significant information loss. 3) label discretization may deviate from the real clustering result since k-means is sensitive to the initialization of cluster centroids. To tackle these problems, in this paper, we propose an effective discrete optimal graph clustering (DOGC) framework. A structured similarity graph that is theoretically optimal for clustering performance is adaptively learned with a guidance of reasonable rank constraint. Besides, to avoid the information loss, we explicitly enforce a discrete transformation on the intermediate continuous label, which derives a tractable optimization problem with discrete solution. Further, to compensate the unreliability of the learned labels and enhance the clustering accuracy, we design an adaptive robust module that learns prediction function for the unseen data based on the learned discrete cluster labels. Finally, an iterative optimization strategy guaranteed with convergence is developed to directly solve the clustering results. Extensive experiments conducted on both real and synthetic datasets demonstrate the superiority of our proposed methods compared with several state-of-the-art clustering approaches.
Machine Learning Tips and Tricks for Power Line Communications
Tonello, Andrea M., Letizia, Nunzio A., Righini, Davide, Marcuzzi, Francesco
A great deal of attention has been recently given to Machine Learning (ML) techniques in many different application fields. This paper provides a vision of what ML can do in Power Line Communications (PLC). We firstly and briefly describe classical formulations of ML, and distinguish deterministic problems from statistical problems with relevance to communications. We then discuss ML applications in PLC for each layer, namely, for characterization and modeling, for physical layer algorithms, for media access control and networking algorithms. Finally, other applications of PLC that can benefit from the usage of ML, as grid diagnostics, are analyzed. Illustrative numerical examples are reported to serve the purpose of validating the ideas and motivate future research endeavors in this stimulating signal/data processing field.
A Unified Framework for Structured Graph Learning via Spectral Constraints
Kumar, Sandeep, Ying, Jiaxi, Cardoso, José Vinícius de M., Palomar, Daniel
Graph learning from data represents a canonical problem that has received substantial attention in the literature. However, insufficient work has been done in incorporating prior structural knowledge onto the learning of underlying graphical models from data. Learning a graph with a specific structure is essential for interpretability and identification of the relationships among data. Useful structured graphs include the multi-component graph, bipartite graph, connected graph, sparse graph, and regular graph. In general, structured graph learning is an NP-hard combinatorial problem, therefore, designing a general tractable optimization method is extremely challenging. In this paper, we introduce a unified graph learning framework lying at the integration of Gaussian graphical models and spectral graph theory. To impose a particular structure on a graph, we first show how to formulate the combinatorial constraints as an analytical property of the graph matrix. Then we develop an optimization framework that leverages graph learning with specific structures via spectral constraints on graph matrices. The proposed algorithms are provably convergent, computationally efficient, and practically amenable for numerous graph-based tasks. Extensive numerical experiments with both synthetic and real data sets illustrate the effectiveness of the proposed algorithms. The code for all the simulations is made available as an open source repository.
TiK-means: $K$-means clustering for skewed groups
Berry, Nicholas S., Maitra, Ranjan
The $K$-means algorithm is extended to allow for partitioning of skewed groups. Our algorithm is called TiK-Means and contributes a $K$-means type algorithm that assigns observations to groups while estimating their skewness-transformation parameters. The resulting groups and transformation reveal general-structured clusters that can be explained by inverting the estimated transformation. Further, a modification of the jump statistic chooses the number of groups. Our algorithm is evaluated on simulated and real-life datasets and then applied to a long-standing astronomical dispute regarding the distinct kinds of gamma ray bursts.
The Why's and how's of Machine Learning
The knowledge is the output of learning through the inseparable combination of theory and practice. It's what remains in one's experience from all the data which got shaped into what we call information. This process can be noticed throughout the different stages of our lives and it's never limited to the academic journey. What I'm aiming to express is that machine learning is nothing but a human logic tailored for more complex problems that surely require more computational capabilities. The last quote represents the nature knowledge acquiring process which, as you may notice, is similar to CRISP-DM Methodology which I detailed in a previous article and which is essential to succeed in your data mining project. To define Machine learning, its is a set of algorithms that are included in the many operations like the Data Mining process and which help you transform your raw data into knowledge, the layer that hides under the obvious information.
Top 10 Machine Learning Algorithms for Data Science
For the majority of newcomers, machine learning algorithms may seem too boring and complicated subject to be mastered. Well, to some extent, this is true. In most cases, you stumble upon a few-page description for each algorithm and yes, it's hard to find time and energy to deal with each and every detail. However, if you truly, madly, deeply want to be an ML-expert, you have to brush up your knowledge regarding it and there is no other way to be. But relax, today I will try to simplify this task and explain core principles of 10 most common algorithms in simple words (each includes a brief description, guides, and useful links).
Identifying Points of Interest and Similar Individuals from Raw GPS Data
Smartphones and portable devices have become ubiquitous and part of everyone's life. Due to the fact of its portability, these devices are perfect to record individuals' traces and life-logging generating vast amounts of data at low costs. These data is emerging as a new source for studies in human mobility patterns raising the number of research projects and techniques aiming to analyze and retrieve useful information from it. The aim of this paper is to explore GPS raw data from different individuals in a community and apply data mining algorithms to identify meaningful places in a region and describe user's profiles and its similarities. We evaluate the proposed method with a real-world dataset. The experimental results show that the steps performed to identify points of interest (POIs) and further the similarity between the users are quite satisfactory serving as a supplement for urban planning and social networks.
Optimal initialization of K-means using Particle Swarm Optimization
This paper proposes the use of an optimization algorithm, namely PSO to decide the initial centroids in K-means, to eventually get better accuracy. The vectorized notation of the optimal centroids can be thought of as entities in an optimization space, where the accuracy of K-means over a random subset of the data could act as a fitness measure. The resultant optimal vector can be used as the initial centroids for K-means.