A Graph-Partitioning Based Continuous Optimization Approach to Semi-supervised Clustering Problems
Liu, Wei, Liu, Xin, Ng, Michael K., Zhang, Zaikun
–arXiv.org Artificial Intelligence
Semi-supervised clustering is a basic problem in various applications. Most existing methods require knowledge of the ideal cluster number, which is often difficult to obtain in practice. Besides, satisfying the must-link constraints is another major challenge for these methods. In this work, we view the semi-supervised clustering task as a partitioning problem on a graph associated with the given dataset, where the similarity matrix includes a scaling parameter to reflect the must-link constraints. Utilizing a relaxation technique, we formulate the graph partitioning problem into a continuous optimization model that does not require the exact cluster number, but only an overestimate of it. We then propose a block coordinate descent algorithm to efficiently solve this model, and establish its convergence result. Based on the obtained solution, we can construct the clusters that theoretically meet the must-link constraints under mild assumptions. Furthermore, we verify the effectiveness and efficiency of our proposed method through comprehensive numerical experiments.
arXiv.org Artificial Intelligence
Mar-6-2025
- Country:
- North America > United States
- California (0.14)
- Wisconsin (0.14)
- North America > United States
- Genre:
- Research Report (0.64)
- Technology: