Goto

Collaborating Authors

 Singh, Gurpreet


YINYANG-ALIGN: Benchmarking Contradictory Objectives and Proposing Multi-Objective Optimization based DPO for Text-to-Image Alignment

arXiv.org Artificial Intelligence

Precise alignment in Text-to-Image (T2I) systems is crucial to ensure that generated visuals not only accurately encapsulate user intents but also conform to stringent ethical and aesthetic benchmarks. Incidents like the Google Gemini fiasco, where misaligned outputs triggered significant public backlash, underscore the critical need for robust alignment mechanisms. In contrast, Large Language Models (LLMs) have achieved notable success in alignment. Building on these advancements, researchers are eager to apply similar alignment techniques, such as Direct Preference Optimization (DPO), to T2I systems to enhance image generation fidelity and reliability. We present YinYangAlign, an advanced benchmarking framework that systematically quantifies the alignment fidelity of T2I systems, addressing six fundamental and inherently contradictory design objectives. Each pair represents fundamental tensions in image generation, such as balancing adherence to user prompts with creative modifications or maintaining diversity alongside visual coherence. YinYangAlign includes detailed axiom datasets featuring human prompts, aligned (chosen) responses, misaligned (rejected) AI-generated outputs, and explanations of the underlying contradictions.


DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization

arXiv.org Artificial Intelligence

The rapid rise of large language models (LLMs) has unlocked many applications but also underscores the challenge of aligning them with diverse values and preferences. Direct Preference Optimization (DPO) is central to alignment but constrained by fixed divergences and limited feature transformations. We propose DPO-Kernels, which integrates kernel methods to address these issues through four key contributions: (i) Kernelized Representations with polynomial, RBF, Mahalanobis, and spectral kernels for richer transformations, plus a hybrid loss combining embedding-based and probability-based objectives; (ii) Divergence Alternatives (Jensen-Shannon, Hellinger, Renyi, Bhattacharyya, Wasserstein, and f-divergences) for greater stability; (iii) Data-Driven Selection metrics that automatically choose the best kernel-divergence pair; and (iv) a Hierarchical Mixture of Kernels for both local precision and global modeling. Evaluations on 12 datasets demonstrate state-of-the-art performance in factuality, safety, reasoning, and instruction following. Grounded in Heavy-Tailed Self-Regularization, DPO-Kernels maintains robust generalization for LLMs, offering a comprehensive resource for further alignment research.


Iterative NLP Query Refinement for Enhancing Domain-Specific Information Retrieval: A Case Study in Career Services

arXiv.org Artificial Intelligence

Retrieving semantically relevant documents in niche domains poses significant challenges for traditional TF-IDF-based systems, often resulting in low similarity scores and suboptimal retrieval performance. This paper addresses these challenges by introducing an iterative and semi-automated query refinement methodology tailored to Humber College's career services webpages. Initially, generic queries related to interview preparation yield low top-document similarities (approximately 0.2--0.3). To enhance retrieval effectiveness, we implement a two-fold approach: first, domain-aware query refinement by incorporating specialized terms such as resources-online-learning, student-online-services, and career-advising; second, the integration of structured educational descriptors like "online resume and interview improvement tools." Additionally, we automate the extraction of domain-specific keywords from top-ranked documents to suggest relevant terms for query expansion. Through experiments conducted on five baseline queries, our semi-automated iterative refinement process elevates the average top similarity score from approximately 0.18 to 0.42, marking a substantial improvement in retrieval performance. The implementation details, including reproducible code and experimental setups, are made available in our GitHub repositories \url{https://github.com/Elipei88/HumberChatbotBackend} and \url{https://github.com/Nisarg851/HumberChatbot}. We also discuss the limitations of our approach and propose future directions, including the integration of advanced neural retrieval models.


Tail-Net: Extracting Lowest Singular Triplets for Big Data Applications

arXiv.org Artificial Intelligence

SVD serves as an exploratory tool in identifying the dominant features in the form of top rank-r singular factors corresponding to the largest singular values. For Big Data applications it is well known that Singular Value Decomposition (SVD) is restrictive due to main memory requirements. However, a number of applications such as community detection, clustering, or bottleneck identification in large scale graph data-sets rely upon identifying the lowest singular values and the singular corresponding vectors. For example, the lowest singular values of a graph Laplacian reveal the number of isolated clusters (zero singular values) or bottlenecks (lowest non-zero singular values) for undirected, acyclic graphs. A naive approach here would be to perform a full SVD however, this quickly becomes infeasible for practical big data applications due to the enormous memory requirements. Furthermore, for such applications only a few lowest singular factors are desired making a full decomposition computationally exorbitant. In this work, we trivially extend the previously proposed Range-Net to \textbf{Tail-Net} for a memory and compute efficient extraction of lowest singular factors of a given big dataset and a specified rank-r. We present a number of numerical experiments on both synthetic and practical data-sets for verification and bench-marking using conventional SVD as the baseline.


Streaming Singular Value Decomposition for Big Data Applications

arXiv.org Artificial Intelligence

Singular Value Decomposition (SVD) plays a pivotal role in exploratory data analysis. However, in a Big Data setting computing the dominant singular vectors is often restrictive due to the main memory requirements imposed by the dataset. Recently introduced randomized projection schemes attempt to mitigate this memory load by constructing approximate projections of the true dataset in a streaming setting. However, these projection methods come at the cost of approximation errors in both top singular values and vectors. Furthermore, in order to bound the approximation error, an over-sampled projection is required, often much larger in dimension than the desired rank. This latter consideration can still be memory intensive when the data dimension is large or extraneous when the desired rank approximation is close to the full rank. We present a two stage neural optimization approach as an alternative to conventional and randomized SVD techniques, where the memory requirement depends explicitly on the feature dimension and desired rank, independent of the sample size. The proposed scheme reads data samples in a streaming setting with the network minimization problem converging to a low rank approximation with high precision. Our architecture is fully interpretable where all the network outputs and weights have a specific meaning. We evaluate our results on various performance metrics against state of the art streaming methods. We also present numerical experiments for Singular and Eigen value decomposition on real data at various scales to show the memory efficiency of our proposed approach.


Prevention is Better than Cure: Handling Basis Collapse and Transparency in Dense Networks

arXiv.org Machine Learning

Dense nets are an integral part of any classification and regression problem. Recently, these networks have found a new application as solvers for known representations in various domains. However, one crucial issue with dense nets is it's feature interpretation and lack of reproducibility over multiple training runs. In this work, we identify a basis collapse issue as a primary cause and propose a modified loss function that circumvents this problem. We also provide a few general guidelines relating the choice of activations to loss surface roughness and appropriate scaling for designing low-weight dense nets. We demonstrate through carefully chosen numerical experiments that the basis collapse issue leads to the design of massively redundant networks. Our approach results in substantially concise nets, having $100 \times$ fewer parameters, while achieving a much lower $(10\times)$ MSE loss at scale than reported in prior works. Further, we show that the width of a dense net is acutely dependent on the feature complexity. This is in contrast to the dimension dependent width choice reported in prior theoretical works. To the best of our knowledge, this is the first time these issues and contradictions have been reported and experimentally verified. With our design guidelines we render transparency in terms of a low-weight network design. We share our codes for full reproducibility available at https://github.com/smjtgupta/Dense_Net_Regress.


TIME: A Transparent, Interpretable, Model-Adaptive and Explainable Neural Network for Dynamic Physical Processes

arXiv.org Machine Learning

Partial Differential Equations are infinite dimensional encoded representations of physical processes. However, imbibing multiple observation data towards a coupled representation presents significant challenges. We present a fully convolutional architecture that captures the invariant structure of the domain to reconstruct the observable system. The proposed architecture is significantly low-weight compared to other networks for such problems. Our intent is to learn coupled dynamic processes interpreted as deviations from true kernels representing isolated processes for model-adaptivity. Experimental analysis shows that our architecture is robust and transparent in capturing process kernels and system anomalies. We also show that high weights representation is not only redundant but also impacts network interpretability. Our design is guided by domain knowledge, with isolated process representations serving as ground truths for verification. These allow us to identify redundant kernels and their manifestations in activation maps to guide better designs that are both interpretable and explainable unlike traditional deep-nets.