Goto

Collaborating Authors

 Nearest Neighbor Methods


Vicinity-Guided Discriminative Latent Diffusion for Privacy-Preserving Domain Adaptation

arXiv.org Artificial Intelligence

Recent work on latent diffusion models (LDMs) has focused almost exclusively on generative tasks, leaving their potential for discriminative transfer largely unexplored. We introduce Discriminative Vicinity Diffusion (DVD), a novel LDM-based framework for a more practical variant of source-free domain adaptation (SFDA): the source provider may share not only a pre-trained classifier but also an auxiliary latent diffusion module, trained once on the source data and never exposing raw source samples. DVD encodes each source feature's label information into its latent vicinity by fitting a Gaussian prior over its k-nearest neighbors and training the diffusion network to drift noisy samples back to label-consistent representations. During adaptation, we sample from each target feature's latent vicinity, apply the frozen diffusion module to generate source-like cues, and use a simple InfoNCE loss to align the target encoder to these cues, explicitly transferring decision boundaries without source access. Across standard SFDA benchmarks, DVD outperforms state-of-the-art methods. We further show that the same latent diffusion module enhances the source classifier's accuracy on in-domain data and boosts performance in supervised classification and domain generalization experiments. DVD thus reinterprets LDMs as practical, privacy-preserving bridges for explicit knowledge transfer, addressing a core challenge in source-free domain adaptation that prior methods have yet to solve.


I Detect What I Don't Know: Incremental Anomaly Learning with Stochastic Weight Averaging-Gaussian for Oracle-Free Medical Imaging

arXiv.org Artificial Intelligence

Unknown anomaly detection in medical imaging remains a fundamental challenge due to the scarcity of labeled anomalies and the high cost of expert supervision. We introduce an unsupervised, oracle-free framework that incrementally expands a trusted set of normal samples without any anomaly labels. Starting from a small, verified seed of normal images, our method alternates between lightweight adapter updates and uncertainty-gated sample admission. A frozen pretrained vision backbone is augmented with tiny convolutional adapters, ensuring rapid domain adaptation with negligible computational overhead. Extracted embeddings are stored in a compact coreset enabling efficient k-nearest neighbor anomaly (k-NN) scoring. Safety during incremental expansion is enforced by dual probabilistic gates, a sample is admitted into the normal memory only if its distance to the existing coreset lies within a calibrated z-score threshold, and its SWAG-based epistemic uncertainty remains below a seed-calibrated bound. This mechanism prevents drift and false inclusions without relying on generative reconstruction or replay buffers. Empirically, our system steadily refines the notion of normality as unlabeled data arrive, producing substantial gains over baselines. On COVID-CXR, ROC-AUC improves from 0.9489 to 0.9982 (F1: 0.8048 to 0.9746); on Pneumonia CXR, ROC-AUC rises from 0.6834 to 0.8968; and on Brain MRI ND-5, ROC-AUC increases from 0.6041 to 0.7269 and PR-AUC from 0.7539 to 0.8211. These results highlight the effectiveness and efficiency of the proposed framework for real-world, label-scarce medical imaging applications.


Fuzzy Label: From Concept to Its Application in Label Learning

arXiv.org Artificial Intelligence

Label learning is a fundamental task in machine learning that aims to construct intelligent models using labeled data, encompassing traditional single-label and multi-label classification models. Traditional methods typically rely on logical labels, such as binary indicators (e.g., "yes/no") that specify whether an instance belongs to a given category. However, in practical applications, label annotations often involve significant uncertainty due to factors such as data noise, inherent ambiguity in the observed entities, and the subjectivity of human annotators. Therefore, representing labels using simplistic binary logic can obscure valuable information and limit the expressiveness of label learning models. To overcome this limitation, this paper introduces the concept of fuzzy labels, grounded in fuzzy set theory, to better capture and represent label uncertainty. We further propose an efficient fuzzy labeling method that mines and generates fuzzy labels from the original data, thereby enriching the label space with more informative and nuanced representations. Based on this foundation, we present fuzzy-label-enhanced algorithms for both single-label and multi-label learning, using the classical K-Nearest Neighbors (KNN) and multi-label KNN algorithms as illustrative examples. Experimental results indicate that fuzzy labels can more effectively characterize the real-world labeling information and significantly enhance the performance of label learning models.


PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design

arXiv.org Artificial Intelligence

Designing protein-binding proteins with high affinity is critical in biomedical research and biotechnology. Despite recent advancements targeting specific proteins, the ability to create high-affinity binders for arbitrary protein targets on demand, without extensive rounds of wet-lab testing, remains a significant challenge. Here, we introduce PPDiff, a diffusion model to jointly design the sequence and structure of binders for arbitrary protein targets in a non-autoregressive manner. PPDiffbuilds upon our developed Sequence Structure Interleaving Network with Causal attention layers (SSINC), which integrates interleaved self-attention layers to capture global amino acid correlations, k-nearest neighbor (kNN) equivariant graph layers to model local interactions in three-dimensional (3D) space, and causal attention layers to simplify the intricate interdependencies within the protein sequence. To assess PPDiff, we curate PPBench, a general protein-protein complex dataset comprising 706,360 complexes from the Protein Data Bank (PDB). The model is pretrained on PPBenchand finetuned on two real-world applications: target-protein mini-binder complex design and antigen-antibody complex design. PPDiffconsistently surpasses baseline methods, achieving success rates of 50.00%, 23.16%, and 16.89% for the pretraining task and the two downstream applications, respectively. The code, data and models are available at https://github.com/JocelynSong/PPDiff.


Representing Classical Compositions through Implication-Realization Temporal-Gestalt Graphs

arXiv.org Artificial Intelligence

Understanding the structural and cognitive underpinnings of musical compositions remains a key challenge in music theory and computational musicology. While traditional methods focus on harmony and rhythm, cognitive models such as the Implication-Realization (I-R) model and Temporal Gestalt theory offer insight into how listeners perceive and anticipate musical structure. This study presents a graph-based computational approach that operationalizes these models by segmenting melodies into perceptual units and annotating them with I-R patterns. These segments are compared using Dynamic Time Warping and organized into k-nearest neighbors graphs to model intra- and inter-segment relationships. Each segment is represented as a node in the graph, and nodes are further labeled with melodic expectancy values derived from Schellenberg's two-factor I-R model-quantifying pitch proximity and pitch reversal at the segment level. This labeling enables the graphs to encode both structural and cognitive information, reflecting how listeners experience musical tension and resolution. To evaluate the expressiveness of these graphs, we apply the Weisfeiler-Lehman graph kernel to measure similarity between and within compositions. Results reveal statistically significant distinctions between intra- and inter-graph structures. Segment-level analysis via multidimensional scaling confirms that structural similarity at the graph level reflects perceptual similarity at the segment level. Graph2vec embeddings and clustering demonstrate that these representations capture stylistic and structural features that extend beyond composer identity. These findings highlight the potential of graph-based methods as a structured, cognitively informed framework for computational music analysis, enabling a more nuanced understanding of musical structure and style through the lens of listener perception.


A supervised discriminant data representation: application to pattern classification

arXiv.org Artificial Intelligence

The performance of machine learning and pattern recognition algorithms generally depends on data representation. That is why, much of the current effort in performing machine learning algorithms goes into the design of preprocessing frameworks and data transformations able to support effective machine learning. The method proposed in this work consists of a hybrid linear feature extraction scheme to be used in supervised multi-class classification problems. Inspired by two recent linear discriminant methods: robust sparse linear discriminant analysis (RSLDA) and inter-class sparsitybased discriminative least square regression (ICS_DLSR), we propose a unifying criterion that is able to retain the advantages of these two powerful methods. The resulting transformation relies on sparsity-promoting techniques both to select the features that most accurately represent the data and to preserve the row-sparsity consistency property of samples from the same class. The linear transformation and the orthogonal matrix are estimated using an iterative alternating minimization scheme based on steepest descent gradient method and different initialization schemes. The proposed framework is generic in the sense that it allows the combination and tuning of other linear discriminant embedding methods. According to the experiments conducted on several datasets including faces, objects, and digits, the proposed method was able to outperform competing methods in most cases.


LOC: A General Language-Guided Framework for Open-Set 3D Occupancy Prediction

arXiv.org Artificial Intelligence

Vision-Language Models (VLMs) have shown significant progress in open-set challenges. However, the limited availability of 3D datasets hinders their effective application in 3D scene understanding. We propose LOC, a general language-guided framework adaptable to various occupancy networks, supporting both supervised and self-supervised learning paradigms. For self-supervised tasks, we employ a strategy that fuses multi-frame LiDAR points for dynamic/static scenes, using Poisson reconstruction to fill voids, and assigning semantics to voxels via K-Nearest Neighbor (KNN) to obtain comprehensive voxel representations. To mitigate feature over-homogenization caused by direct high-dimensional feature distillation, we introduce Densely Contrastive Learning (DCL). DCL leverages dense voxel semantic information and predefined textual prompts. This efficiently enhances open-set recognition without dense pixel-level supervision, and our framework can also leverage existing ground truth to further improve performance. Our model predicts dense voxel features embedded in the CLIP feature space, integrating textual and image pixel information, and classifies based on text and semantic similarity. Experiments on the nuScenes dataset demonstrate the method's superior performance, achieving high-precision predictions for known classes and distinguishing unknown classes without additional training data.


Precise classification of low quality G-banded Chromosome Images by reliability metrics and data pruning classifier

arXiv.org Artificial Intelligence

In the last decade, due to high resolution cameras and accurate meta - phase analyzes, the accuracy of chromosome classification has improved substantially. However, current Karyotyping systems demand large number of high quality train data to have an adequa tely plausible Precision per each chromosome. Such provision of high quality train data with accurate devices are not yet accomplished in some out - reached pathological laboratories. To prevent false positive detections in low - cost systems and low - quality i mages settings, this paper improves the classification Precision of chromosomes using proposed reliability thresholding metrics and deliberately engineered features. The proposed method has been evaluated using a variation of deep Alex - Net neural network, SVM, K - Nearest - Neighbors, and their cascade pipelines to an automated filtering of semi - straight chromosome. The classification results have highly improved over 90% for the chromosomes with more common defections and translocations. Furthermore, a compara tive analysis over the proposed thresholding metrics has been conducted and the best metric is bolded with its salient characteristics. The high Precision results provided for a very low - quality G - banding database verifies suitability of the proposed metri cs and pruning method for Karyotyping facilities in poor countries and low - budget pathological laboratories. Keywords: G - banded Karyotyping, Precision, Reliability metrics, Pattern Recognition, Medical Imaging 1 Introduction One of the ways to study and dia gnose birth - defects and biological disorders is through using Cytogenetics. This branch of science endeavors to analyze chromosome shapes and patterns to find out common defects. The methods used for such analyzes includes G - Banding, Fluorescent In - Situ Hy bridization (FISH), Comparative Genomic Hybridization (CGH) and Chromosome - specific unique - sequence probes [27] . While Molecular Cytogenetics methods are effective in biological disorders, they do not necessarily manifest specific chromosome defects. FISH methods, though having higher accuracy results in stains, are costly and unable to identify all chromosome abnorm alities. Being temporary in sustaining fluorescence detector, they demand higher provision effort and substance supply that might not be affordable for some countries . Furthermore, detecting some abnormalities implies having G - banding technique involved an d not merely using stains.


A Theory of the Mechanics of Information: Generalization Through Measurement of Uncertainty (Learning is Measuring)

arXiv.org Machine Learning

Traditional machine learning relies on explicit models and domain assumptions, limiting flexibility and interpretability. We introduce a model-free framework using surprisal (information theoretic uncertainty) to directly analyze and perform inferences from raw data, eliminating distribution modeling, reducing bias, and enabling efficient updates including direct edits and deletion of training data. By quantifying relevance through uncertainty, the approach enables generalizable inference across tasks including generative inference, causal discovery, anomaly detection, and time series forecasting. It emphasizes traceability, interpretability, and data-driven decision making, offering a unified, human-understandable framework for machine learning, and achieves at or near state-of-the-art performance across most common machine learning tasks. The mathematical foundations create a ``physics'' of information, which enable these techniques to apply effectively to a wide variety of complex data types, including missing data. Empirical results indicate that this may be a viable alternative path to neural networks with regard to scalable machine learning and artificial intelligence that can maintain human understandability of the underlying mechanics.


On pattern classification with weighted dimensions

arXiv.org Artificial Intelligence

Studies on various facets of pattern classification is often imperative while working with multi - dimensional samples pertaining to diverse application scenarios. In this notion, w eighted dimension - based distance measure has been one of the vital considerat ions in pattern analysis as it reflects the degree of similarity between samples . Though it is often presumed to be settled with the pervasive use of Euclidean distance, plethora of issues often surface. In this paper, we present (a) a detail analysis on t he impact of distance measure norms and weights of dimensions along with visualization, (b) a novel weighting scheme for each dimension, (c) incorporation of this dimensional weighting schema in to a KNN classifier, and (d) pattern classification on a varie ty of synthetic as well as realistic datasets with the developed model . It has perform ed well across diverse experiments in comparison to the traditional KNN under the same experimental setups. Specifically, for gene expression datasets, it yields signific ant and consistent gain in classification accuracy (around 10%) in all cross - validation experiments with different values of k. As such datasets contain limited number of samples of high dimensions, meaningful selection of nearest neighbours is desirable, and this requirement is reasonably met by regulat ing the shape and size of the region enclos ing the k number of reference samples with the developed weighting schema and appropriate norm . I t, therefore, stands as an important generalization of K NN classifier powered by weighted Minkowski distance with the present weighting schema .