Collaborating Authors

 dingo




DINGO: Constrained Inference for Diffusion LLMs

Suresh, Tarun, Banerjee, Debangshu, Ugare, Shubham, Misailovic, Sasa, Singh, Gagandeep

arXiv.org Artificial Intelligence

Diffusion LLMs have emerged as a promising alternative to conventional autoregressive LLMs, offering significant potential for improved runtime efficiency. However, existing diffusion models lack the ability to provably enforce user-specified formal constraints, such as regular expressions, which makes them unreliable for tasks that require structured outputs, such as fixed-schema JSON generation. Unlike autoregressive models, which generate tokens sequentially, diffusion LLMs predict a block of tokens in parallel. This parallelism makes traditional constrained decoding algorithms, which are designed for sequential token prediction, ineffective at preserving the true output distribution. To address this limitation, we propose DINGO, a dynamic programming-based constrained decoding strategy that is both efficient and provably distribution-preserving. DINGO enables sampling of output strings with the highest probability under the model's predicted distribution, while strictly satisfying any user-specified regular expression. On standard symbolic math and JSON generation benchmarks, DINGO achieves up to a 68 percentage point improvement over unconstrained inference.
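The idea of picking the highest-probability string that still satisfies a regular expression can be illustrated with a small dynamic program over the states of a finite automaton. The sketch below is not the authors' algorithm. It assumes single-character tokens, independent per-position distributions for one block, a hand-written automaton for the toy pattern `1+\.1+`, and acceptance only at the end of the block. The names `constrained_argmax`, `block` and `dfa` are hypothetical and introduced here for illustration.

```python
import math


def constrained_argmax(log_probs, transitions, start, accepting):
    """Viterbi-style search for the most likely string accepted by a DFA.

    log_probs:   list of dicts, one per position, mapping token -> log-probability
                 (positions are treated as independent, as in a parallel block).
    transitions: dict mapping (state, token) -> next state; a missing key rejects.
    Returns (string, log_prob), or (None, -inf) if no valid string exists.
    """
    # best[state] = (score, tokens) for the best prefix that ends in `state`
    best = {start: (0.0, [])}
    for dist in log_probs:
        nxt = {}
        for state, (score, toks) in best.items():
            for tok, lp in dist.items():
                target = transitions.get((state, tok))
                if target is None:
                    continue  # this token is not allowed from this state
                cand = score + lp
                if target not in nxt or cand > nxt[target][0]:
                    nxt[target] = (cand, toks + [tok])
        best = nxt
    finals = [entry for state, entry in best.items() if state in accepting]
    if not finals:
        return None, -math.inf
    score, toks = max(finals, key=lambda entry: entry[0])
    return "".join(toks), score


# Toy automaton for the pattern 1+\.1+ (assumed for illustration only).
dfa = {(0, "1"): 1, (1, "1"): 1, (1, "."): 2, (2, "1"): 3, (3, "1"): 3}

# Made-up per-position probabilities for a block of four positions.
probs = [
    {"1": 0.6, ".": 0.4},
    {"1": 0.7, ".": 0.3},
    {"1": 0.4, ".": 0.6},
    {"1": 0.2, ".": 0.8},
]
block = [{tok: math.log(p) for tok, p in dist.items()} for dist in probs]

best_string, best_score = constrained_argmax(block, dfa, start=0, accepting={3})
print(best_string, math.exp(best_score))
```

Taking the most likely token at each position independently would give `11..`, which does not match the pattern. The dynamic program instead returns `11.1`, the more probable of the two valid strings of length four. Its cost grows with the block length times the number of automaton states times the vocabulary size, rather than with the number of candidate strings.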


Reviews: DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization

Neural Information Processing Systems

In this paper, the authors propose a distributed Newton method for gradient-norm optimization. The method does not impose any specific form on the underlying objective function. The authors present convergence analysis for the method and illustrate the performance of the method on a convex problem (in the main paper). Originality: The topic of the paper, in my opinion, is very interesting. The paper presents an efficient Newton method that is motivated via the optimization of the norm of the gradient.
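For orientation, gradient-norm optimization is commonly read as replacing the minimization of the objective with the minimization of the squared norm of its gradient. The formulation below is a sketch of that reading and is not quoted from the review. The symbols f, w, g and H are notation introduced here, and f is assumed to be twice continuously differentiable.

```latex
\min_{w \in \mathbb{R}^d} \; \frac{1}{2}\,\lVert \nabla f(w) \rVert^2,
\qquad
\nabla\!\left( \tfrac{1}{2}\,\lVert \nabla f(w) \rVert^2 \right)
  = \nabla^2 f(w)\,\nabla f(w)
  = H(w)\,g(w).
```

Under this reading, every stationary point of f attains the minimum value of zero for the surrogate, which is consistent with the review's remark that the method does not impose a specific form on the objective.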


Diverse and Fine-Grained Instruction-Following Ability Exploration with Synthetic Data

Gu, Zihui, Sun, Xingwu, Lian, Fengzong, Kang, Zhanhui, Xu, Cheng-Zhong, Fan, Ju

arXiv.org Artificial Intelligence

Instruction-following is particularly crucial for large language models (LLMs) to support diverse user requests. While existing work has made progress in aligning LLMs with human preferences, evaluating their capabilities on instruction following remains a challenge due to the complexity and diversity of real-world user instructions. While existing evaluation methods focus on general skills, they suffer from two main shortcomings, i.e., lack of fine-grained task-level evaluation and reliance on singular instruction expression. To address these problems, this paper introduces DINGO, a fine-grained and diverse instruction-following evaluation dataset that has two main advantages: (1) DINGO is based on a manually annotated, fine-grained and multi-level category tree with 130 nodes derived from real-world user requests; (2) DINGO includes diverse instructions, generated by both GPT-4 and human experts. Through extensive experiments, we demonstrate that DINGO can not only provide more challenging and comprehensive evaluation for LLMs, but also provide task-level fine-grained directions to further improve LLMs.


Bridging the Gap between Chemical Reaction Pretraining and Conditional Molecule Generation with a Unified Model

Qiang, Bo, Zhou, Yiran, Ding, Yuheng, Liu, Ningfeng, Song, Song, Zhang, Liangren, Huang, Bo, Liu, Zhenming

arXiv.org Artificial Intelligence

Chemical reactions are the fundamental building blocks of drug design and organic chemistry research. In recent years, there has been a growing need for a large-scale deep-learning framework that can efficiently capture the basic rules of chemical reactions. In this paper, we propose a unified framework that addresses both the reaction representation learning and molecule generation tasks, which allows for a more holistic approach. Inspired by the mechanisms of organic chemistry, we develop a novel pretraining framework that enables us to incorporate inductive biases into the model. Our framework achieves state-of-the-art results on challenging downstream tasks. By possessing chemical knowledge, our generative framework overcomes the limitations of current molecule generation models that rely on a small number of reaction templates. In the extensive experiments, our model generates synthesizable drug-like structures of high quality. Overall, our work presents a significant step toward a large-scale deep-learning framework for a variety of reaction-based applications.

Deep learning models have found applications across a multitude of scientific research domains [1-3]. Pretraining frameworks [4, 5] facilitate the seamless integration of new tasks, thereby expediting the modeling process, especially for scenarios with limited labeled data. Chemical reactions are the foundation of drug design and organic chemistry studies. Currently, data-mining works [6, 7] have enabled deep learning models to be applied to chemical reactions. Based on these data, many data-driven works have investigated the representation learning of chemical reactions. Representation learning refers to automatically learning useful features from the data, which can then be used for various downstream tasks [8]. In earlier works, traditional molecular fingerprints were applied directly for reaction representations [9, 10].

Inspired by natural language processing (NLP) methods, researchers also applied attention-based networks [11, 12] or contrastive learning techniques [13, 14] in chemical reaction pretraining networks. These representations have been tested on classification tasks [15] or regression tasks [16]. However, these methods ignore the fundamental theories in organic chemistry, which limits their performance. For example, electronic effects and inductive effects will be ignored if bonds or atoms outside the reactive centers are masked [13]. Beyond reaction classification tasks, molecule generation based on chemical reactions is also an important application. This branch of models has been shown to be capable of generating synthetically accessible molecules [17-20]. Earlier works always applied a step-wise template-based molecule generation strategy.


Real-time gravitational-wave science with neural posterior estimation

Dax, Maximilian, Green, Stephen R., Gair, Jonathan, Macke, Jakob H., Buonanno, Alessandra, Schölkopf, Bernhard

arXiv.org Artificial Intelligence

We demonstrate unprecedented accuracy for rapid gravitational-wave parameter estimation with deep learning. Using neural networks as surrogates for Bayesian posterior distributions, we analyze eight gravitational-wave events from the first LIGO-Virgo Gravitational-Wave Transient Catalog and find very close quantitative agreement with standard inference codes, but with inference times reduced from O(day) to a minute per event. Our networks are trained using simulated data, including an estimate of the detector-noise characteristics near the event. This encodes the signal and noise models within millions of neural-network parameters, and enables inference for any observed data consistent with the training distribution, accounting for noise nonstationarity from event to event. Our algorithm -- called "DINGO" -- sets a new standard in fast-and-accurate inference of physical parameters of detected gravitational-wave events, which should enable real-time data analysis without sacrificing accuracy.


DINGO: an ontology for projects and grants linked data

Chialva, Diego, Mugabushaka, Alexis-Michel

arXiv.org Artificial Intelligence

Services and resources built around Semantic Web, semantically-enabled applications and linked (open) data technologies have been increasingly impacting research and research-related activities in recent years. Development has been intense along several directions, for instance in "semantic publishing" [36], but also in the aspects directed toward the reproducibility and attribution of research and scholarly outputs, leading also to the interest in having Open Science Graphs interconnected at the global level [21]. All this has become more and more essential to research practices, also in light of the so-called reproducibility crisis affecting a number of research fields (see, for instance, the huge list of latest studies at https://reproduciblescience.org/2019). In fact, the demand for easily and automatically parsable, interoperable and processable data goes beyond the purely academic sphere. The research landscape comprises a vast number and variety of activities, with multiple and diverse stakeholders and actors, and with impact on several aspects and sectors of society.