Goto

Collaborating Authors

 Lee, Dong-Gyu


Enhancing Clustering Representations with Positive Proximity and Cluster Dispersion Learning

arXiv.org Artificial Intelligence

Contemporary deep clustering approaches often rely on either contrastive or non-contrastive techniques to acquire effective representations for clustering tasks. Contrastive methods leverage negative pairs to achieve homogenous representations but can introduce class collision issues, potentially compromising clustering performance. On the contrary, non-contrastive techniques prevent class collisions but may produce non-uniform representations that lead to clustering collapse. In this work, we propose a novel end-to-end deep clustering approach named PIPCDR, designed to harness the strengths of both approaches while mitigating their limitations. PIPCDR incorporates a positive instance proximity loss and a cluster dispersion regularizer. The positive instance proximity loss ensures alignment between augmented views of instances and their sampled neighbors, enhancing within-cluster compactness by selecting genuinely positive pairs within the embedding space. Meanwhile, the cluster dispersion regularizer maximizes inter-cluster distances while minimizing within-cluster compactness, promoting uniformity in the learned representations. PIPCDR excels in producing well-separated clusters, generating uniform representations, avoiding class collision issues, and enhancing within-cluster compactness. We extensively validate the effectiveness of PIPCDR within an end-to-end Majorize-Minimization framework, demonstrating its competitive performance on moderate-scale clustering benchmark datasets and establishing new state-of-the-art results on large-scale datasets.


RemoteNet: Remote Sensing Image Segmentation Network based on Global-Local Information

arXiv.org Artificial Intelligence

Remotely captured images possess an immense scale and object appearance variability due to the complex scene. It becomes challenging to capture the underlying attributes in the global and local context for their segmentation. Existing networks struggle to capture the inherent features due to the cluttered background. To address these issues, we propose a remote sensing image segmentation network, RemoteNet, for semantic segmentation of remote sensing images. We capture the global and local features by leveraging the benefits of the transformer and convolution mechanisms. RemoteNet is an encoder-decoder design that uses multi-scale features. We construct an attention map module to generate channel-wise attention scores for fusing these features. We construct a global-local transformer block (GLTB) in the decoder network to support learning robust representations during a decoding phase. Further, we designed a feature refinement module to refine the fused output of the shallow stage encoder feature and the deepest GLTB feature of the decoder. Experimental findings on the two public datasets show the effectiveness of the proposed RemoteNet.