Clustering
Spectral clustering for dependent community Hawkes process models of temporal networks
Zhao, Lingfei, Soliman, Hadeel, Xu, Kevin S., Paul, Subhadeep
Temporal networks observed continuously over time through timestamped relational events data are commonly encountered in application settings including online social media communications, financial transactions, and international relations. Temporal networks often exhibit community structure and strong dependence patterns among node pairs. This dependence can be modeled through mutual excitations, where an interaction event from a sender to a receiver node increases the probability of future events among other node pairs. We provide statistical results for a class of models that we call dependent community Hawkes (DCH) models, which combine the stochastic block model with mutually exciting Hawkes processes to model community structure and dependence among node pairs, respectively. We derive a non-asymptotic upper bound on the misclustering error of spectral clustering on the event count matrix as a function of the number of nodes and communities, time duration, and the amount of dependence in the model. Our result leverages recent results on bounding an appropriate distance between a multivariate Hawkes process count vector and a Gaussian vector, along with results from random matrix theory. We also propose a DCH model that incorporates only self and reciprocal excitation, along with highly scalable parameter estimation using a Generalized Method of Moments (GMM) estimator that we demonstrate to be consistent for growing network size and time duration.
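As a rough illustration of spectral clustering on an event count matrix, the following sketch splits nodes into two communities using the sign of the second eigenvector of the symmetrized counts. It is a simplification under stated assumptions, not the procedure analyzed in the paper: it handles only two communities, replaces the usual k-means step with a sign split, and finds eigenvectors by plain power iteration. The function names and the toy count matrix are hypothetical.

```python
import math

def top_eigvec(mat, iters=500):
    """Power iteration: (eigenvalue, unit eigenvector) of largest magnitude."""
    n = len(mat)
    v = [float(i + 1) for i in range(n)]  # deterministic, generic start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the matching eigenvalue estimate.
    lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

def two_way_spectral(counts):
    """Assumed simplification: two communities, sign split on 2nd eigenvector."""
    n = len(counts)
    # Symmetrize directed event counts (sender -> receiver).
    sym = [[counts[i][j] + counts[j][i] for j in range(n)] for i in range(n)]
    lam1, v1 = top_eigvec(sym)
    # Deflate the leading eigenpair so power iteration finds the second one.
    deflated = [[sym[i][j] - lam1 * v1[i] * v1[j] for j in range(n)]
                for i in range(n)]
    _, v2 = top_eigvec(deflated)
    return [0 if x >= 0 else 1 for x in v2]

# Toy count matrix: 5 events within a block, 1 event across blocks.
block = [0, 0, 0, 1, 1, 1]
counts = [[0 if i == j else (5 if block[i] == block[j] else 1)
           for j in range(6)] for i in range(6)]
labels = two_way_spectral(counts)
```

Cluster labels are only defined up to a permutation, so the recovered partition should be compared to the planted one, not the label values themselves.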
Image Tokens Matter: Mitigating Hallucination in Discrete Tokenizer-based Large Vision-Language Models via Latent Editing
Wang, Weixing, Ding, Zifeng, Gu, Jindong, Cao, Rui, Meinel, Christoph, de Melo, Gerard, Yang, Haojin
Large Vision-Language Models (LVLMs) with discrete image tokenizers unify multimodal representations by encoding visual inputs into a finite set of tokens. Despite their effectiveness, we find that these models still hallucinate non-existent objects. We hypothesize that this may be due to visual priors induced during training: When certain image tokens frequently co-occur in the same spatial regions and represent shared objects, they become strongly associated with the verbalizations of those objects. As a result, the model may hallucinate by evoking visually absent tokens that often co-occur with present ones. To test this assumption, we construct a co-occurrence graph of image tokens using a segmentation dataset and employ a Graph Neural Network (GNN) with contrastive learning followed by a clustering method to group tokens that frequently co-occur in similar visual contexts. We find that hallucinations predominantly correspond to clusters whose tokens dominate the input, and more specifically, that the visually absent tokens in those clusters show much higher correlation with hallucinated objects compared to tokens present in the image. Based on this observation, we propose a hallucination mitigation method that suppresses the influence of visually absent tokens by modifying latent image embeddings during generation. Experiments show our method reduces hallucinations while preserving expressivity. Code is available at https://github.com/weixingW/CGC-VTD/tree/main
The Map Equation Goes Neural: Mapping Network Flows with Graph Neural Networks
Community detection is an essential tool for unsupervised data exploration and revealing the organisational structure of networked systems. With a long history in network science, community detection typically relies on objective functions, optimised with custom-tailored search algorithms, but often without leveraging recent advances in deep learning. Recently, the first works have started to incorporate such objectives into loss functions for deep graph clustering and pooling. We consider the map equation, a popular information-theoretic objective function for unsupervised community detection, and express it in differentiable tensor form for optimisation through gradient descent. Our formulation makes the map equation compatible with any neural network architecture, enables end-to-end learning, incorporates node features, and chooses the optimal number of clusters automatically, all without requiring explicit regularisation.
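For orientation, the standard two-level map equation can be evaluated directly for a hard partition. The sketch below does this for an undirected, unweighted graph, assuming node visit rates proportional to degree. It is the plain, non-differentiable objective, not the tensor formulation described in the abstract, and the function names and toy graph are hypothetical.

```python
import math

def plogp(p):
    """p * log2(p), with the convention 0 * log(0) = 0."""
    return p * math.log2(p) if p > 0 else 0.0

def map_equation(edges, modules):
    """Two-level map equation codelength in bits.

    edges:   list of (u, v) pairs of an undirected, unweighted graph
    modules: dict mapping node -> module index (0, 1, ...)
    """
    two_m = 2 * len(edges)
    n_mod = max(modules.values()) + 1
    visit = {node: 0.0 for node in modules}  # stationary visit rates
    exit_rate = [0.0] * n_mod                # module exit rates
    for u, v in edges:
        visit[u] += 1 / two_m
        visit[v] += 1 / two_m
        if modules[u] != modules[v]:         # a cut edge carries exit flow both ways
            exit_rate[modules[u]] += 1 / two_m
            exit_rate[modules[v]] += 1 / two_m
    q = sum(exit_rate)
    # Index codebook term: q * H(Q).
    index_term = -q * sum(plogp(qi / q) for qi in exit_rate) if q > 0 else 0.0
    # Module codebook terms: sum_i p_i * H(P_i).
    module_term = 0.0
    for i in range(n_mod):
        rates = [p for node, p in visit.items() if modules[node] == i]
        total = exit_rate[i] + sum(rates)
        if total == 0:
            continue
        entropy = -plogp(exit_rate[i] / total) - sum(plogp(p / total) for p in rates)
        module_term += total * entropy
    return index_term + module_term

# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
one_module = map_equation(edges, {n: 0 for n in range(6)})
two_modules = map_equation(edges, {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1})
```

On this toy graph the two-triangle partition yields a shorter codelength than the single-module partition, which is the behaviour a community detection objective should show.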
Identifying Heart Attack Risk in Vulnerable Population: A Machine Learning Approach
Chattopadhyay, Subhagata, Chattopadhyay, Amit K
The COVID-19 pandemic has significantly increased the incidence of post-infection cardiovascular events, particularly myocardial infarction, in individuals over 40. While the underlying mechanisms remain elusive, this study employs a hybrid machine learning approach to analyze epidemiological data and assess 13 key heart attack risk factors and their susceptibility. Based on a unique dataset that combines demographic, biochemical, ECG, and thallium stress-test data, this study categorizes distinct subpopulations against varying risk profiles and then divides the population into 'at-risk' (AR) and 'not-at-risk' (NAR) groups using clustering algorithms. The study reveals a strong association between the likelihood of experiencing a heart attack and the 13 risk factors studied. The aggravated risk for postmenopausal patients indicates compromised individual risk factors due to estrogen depletion that may be further compromised by extraneous stress impacts, such as anxiety and fear, aspects that have traditionally eluded data modeling predictions.
COM Adjustment Mechanism Control for Multi-Configuration Motion Stability of Unmanned Deformable Vehicle
Liu, Jun, Liu, Hongxun, Zhang, Cheng, Xing, Jiandang, Jiang, Shang, Jiang, Ping
An unmanned deformable vehicle is a wheel-legged robot that can transform between two configurations: a vehicular state and a humanoid state, which have different motion modes and stability characteristics. Aiming at the motion stability of an unmanned deformable vehicle in multiple configurations, a center-of-mass adjustment mechanism was designed in this study. Further, a motion stability hierarchical control algorithm was proposed based on this mechanism, and an electromechanical model based on a two-degree-of-freedom center-of-mass adjustment mechanism was established. An unmanned-deformable-vehicle vehicular-state steady-state steering dynamics model and a gait planning kinematic model of humanoid state walking were established. A stability hierarchical control strategy was designed based on the hybrid automata model, Fuzzy-PID control, K-means clustering algorithm, and variable universe fuzzy control - active disturbance rejection control (VUFC-ADRC) to realize the stability control of the unmanned deformable vehicle in multi-configuration motion. The simulation and test results showed that the steady-state steering stability in the vehicular state and the walking stability in the humanoid state could be significantly improved by controlling the slider motion in the center-of-mass adjustment mechanism.
Bi-Level Unsupervised Feature Selection
Liu, Jingjing, Ju, Xiansen, Xiu, Xianchao, Liu, Wanquan
Unsupervised feature selection (UFS) is an important task in data engineering. However, most UFS methods construct models from a single perspective and often fail to simultaneously evaluate feature importance and preserve their inherent data structure, thus limiting their performance. To address this challenge, we propose a novel bi-level unsupervised feature selection (BLUFS) method, including a clustering level and a feature level. Specifically, at the clustering level, spectral clustering is used to generate pseudo-labels for representing the data structure, while a continuous linear regression model is developed to learn the projection matrix. At the feature level, the $\ell_{2,0}$-norm constraint is imposed on the projection matrix for more effectively selecting features. To the best of our knowledge, this is the first work to combine a bi-level framework with the $\ell_{2,0}$-norm. To solve the proposed bi-level model, we design an efficient proximal alternating minimization (PAM) algorithm, whose subproblems either have explicit solutions or can be computed by fast solvers. Furthermore, we establish the convergence result and computational complexity. Finally, extensive experiments on two synthetic datasets and eight real datasets demonstrate the superiority of BLUFS in clustering and classification tasks.
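To make the $\ell_{2,0}$-norm constraint concrete: the $\ell_{2,0}$-norm of a matrix counts its nonzero rows, so bounding it by k keeps at most k features. The sketch below shows the usual hard-thresholding projection onto that constraint set (keep the k rows with the largest Euclidean norm, zero the rest). It only illustrates the constraint and is not claimed to be a step of the BLUFS algorithm; the function names and the toy matrix are hypothetical.

```python
import math

def row_norms(W):
    """Euclidean norm of each row of W (a list of lists)."""
    return [math.sqrt(sum(x * x for x in row)) for row in W]

def l20_norm(W):
    """Number of rows of W that are not identically zero."""
    return sum(1 for r in row_norms(W) if r > 0)

def project_row_sparse(W, k):
    """Keep the k rows with the largest norm and zero out all others."""
    norms = row_norms(W)
    keep = set(sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:k])
    return [list(row) if i in keep else [0.0] * len(row)
            for i, row in enumerate(W)]

# Toy projection matrix: 4 features (rows), 2 projected dimensions (columns).
W = [[3.0, 4.0], [0.1, 0.0], [1.0, 1.0], [0.0, 0.0]]
W_sparse = project_row_sparse(W, k=2)
selected = [i for i, r in enumerate(row_norms(W_sparse)) if r > 0]
```

Here `selected` lists the indices of the retained features, that is, the rows that survive the projection.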
EyeGraph: Modularity-aware Spatio Temporal Graph Clustering for Continuous Event-based Eye Tracking
Continuous tracking of eye movement dynamics plays a significant role in developing a broad spectrum of human-centered applications, such as cognitive skills (visual attention and working memory) modeling, human-machine interaction, biometric user authentication, and foveated rendering. Recently neuromorphic cameras have garnered significant interest in the eye-tracking research community, owing to their sub-microsecond latency in capturing intensity changes resulting from eye movements. Nevertheless, the existing approaches for event-based eye tracking suffer from several limitations: dependence on RGB frames, label sparsity, and training on datasets collected in controlled lab environments that do not adequately reflect real-world scenarios. To address these limitations, in this paper, we propose a dynamic graph-based approach that uses a neuromorphic event stream captured by Dynamic Vision Sensors (DVS) for high-fidelity tracking of pupillary movement. More specifically, first, we present EyeGraph, a large-scale multi-modal near-eye tracking dataset collected using a wearable event camera attached to a head-mounted device from 40 participants -- the dataset was curated while mimicking in-the-wild settings, accounting for varying mobility and ambient lighting conditions.
BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits
Clustering is a ubiquitous task in data science. Compared to the commonly used k-means clustering, k-medoids clustering requires the cluster centers to be actual data points and supports arbitrary distance metrics, which permits greater interpretability and the clustering of structured objects. Current state-of-the-art k-medoids clustering algorithms, such as Partitioning Around Medoids (PAM), are iterative and are quadratic in the dataset size n for each iteration, being prohibitively expensive for large datasets. We propose BanditPAM, a randomized algorithm inspired by techniques from multi-armed bandits, that reduces the complexity of each PAM iteration from $O(n^2)$ to $O(n \log n)$ and returns the same results with high probability, under assumptions on the data that often hold in practice. As such, BanditPAM matches state-of-the-art clustering loss while reaching solutions much faster.
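The quadratic cost comes from comparing every candidate medoid against every data point. The sketch below contrasts that exact computation for a single medoid with a naive estimate based on a fixed random sample of reference points. This is not BanditPAM itself: the fixed sample size, the function names, and the toy data are assumptions made for illustration, and the sampled answer carries no guarantee of matching the exact one.

```python
import random

def total_distance(i, points, dist):
    """Sum of distances from points[i] to every point: O(n) per candidate."""
    return sum(dist(points[i], p) for p in points)

def exact_medoid(points, dist):
    """Index of the exact 1-medoid: O(n^2) distance evaluations overall."""
    return min(range(len(points)),
               key=lambda i: total_distance(i, points, dist))

def sampled_medoid(points, dist, n_refs, rng):
    """Estimate each candidate's mean distance from n_refs random references.

    Costs O(n * n_refs) distance evaluations; the result is approximate.
    """
    refs = [rng.choice(points) for _ in range(n_refs)]

    def estimate(i):
        return sum(dist(points[i], r) for r in refs) / n_refs

    return min(range(len(points)), key=estimate)

def abs_dist(a, b):
    return abs(a - b)

points = [0.0, 1.0, 2.0, 3.0, 100.0]
best = exact_medoid(points, abs_dist)
approx = sampled_medoid(points, abs_dist, n_refs=3, rng=random.Random(0))
```

Because medoids are defined through an arbitrary `dist` function, the same code works for any dissimilarity, which is the flexibility the abstract attributes to k-medoids.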
On the cohesion and separability of average-link for hierarchical agglomerative clustering
Average-link is widely recognized as one of the most popular and effective methods for building hierarchical agglomerative clusterings. The available theoretical analyses show that this method has a much better approximation than other popular heuristics, such as single-linkage and complete-linkage, regarding variants of Dasgupta's cost function [STOC 2016]. However, these analyses do not separate average-link from a random hierarchy, and they are not appealing for metric spaces, since every hierarchical clustering has a 1/2 approximation with regard to the variant of Dasgupta's function that is employed for dissimilarity measures [Moseley and Yang 2020]. In this paper, we present a comprehensive study of the performance of average-link in metric spaces, regarding several natural criteria that capture separability and cohesion, and are more interpretable than Dasgupta's cost function and its variants. We also present experimental results with real datasets that, together with our theoretical analyses, suggest that average-link is a better choice than other related methods when both cohesion and separability are important goals.
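As a reminder of what average-link does, here is a naive sketch of the merge rule: at each step, merge the two clusters whose average pairwise distance is smallest. It is written for clarity rather than speed, breaks ties by taking the first pair found, and stops at a requested number of clusters instead of building the full hierarchy; the function names and the toy data are hypothetical.

```python
def average_distance(a, b, points, dist):
    """Mean distance over all pairs with one point in each cluster."""
    total = sum(dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))

def average_link(points, dist, k):
    """Merge clusters greedily by average linkage until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best_pair, best_value = None, float("inf")
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                value = average_distance(clusters[i], clusters[j], points, dist)
                if value < best_value:  # strict: first minimal pair wins ties
                    best_pair, best_value = (i, j), value
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def abs_dist(a, b):
    return abs(a - b)

points = [0.0, 1.0, 10.0, 11.0, 50.0]
three = average_link(points, abs_dist, k=3)
two = average_link(points, abs_dist, k=2)
```

On this toy data the two tight pairs are merged first, and the isolated point at 50 stays separate until the end.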
Interactive Deep Clustering via Value Mining
In the absence of class priors, recent deep clustering methods resort to data augmentation and pseudo-labeling strategies to generate supervision signals. Despite their remarkable success, existing works struggle to discriminate hard samples at cluster boundaries, which are particularly challenging to mine because their cluster assignments are unreliable. To break such a performance bottleneck, we propose incorporating user interaction to facilitate clustering instead of exhaustively mining semantics from the data itself. To be exact, we present Interactive Deep Clustering (IDC), a plug-and-play method designed to boost the performance of pre-trained clustering models with minimal interaction overhead. More specifically, IDC first quantitatively evaluates sample values based on hardness, representativeness, and diversity, where the representativeness avoids selecting outliers and the diversity prevents the selected samples from collapsing into a small number of clusters.