Goto

Collaborating Authors

 eader




scReader: Prompting Large Language Models to Interpret scRNA-seq Data

Li, Cong, Long, Qingqing, Zhou, Yuanchun, Xiao, Meng

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated remarkable advancements, primarily due to their capabilities in modeling the hidden relationships within text sequences. This innovation presents a unique opportunity in the field of life sciences, where vast collections of single-cell omics data from multiple species provide a foundation for training foundational models. However, the challenge lies in the disparity of data scales across different species, hindering the development of a comprehensive model for interpreting genetic data across diverse organisms. In this study, we propose an innovative hybrid approach that integrates the general knowledge capabilities of LLMs with domain-specific representation models for single-cell omics data interpretation. We begin by focusing on genes as the fundamental unit of representation. Gene representations are initialized using functional descriptions, leveraging the strengths of mature language models such as LLaMA-2. By inputting single-cell gene-level expression data with prompts, we effectively model cellular representations based on the differential expression levels of genes across various species and cell types. In the experiments, we constructed developmental cells from humans and mice, specifically targeting cells that are challenging to annotate. We evaluated our methodology through basic tasks such as cell annotation and visualization analysis. The results demonstrate the efficacy of our approach compared to other methods using LLMs, highlighting significant improvements in accuracy and interoperability. Our hybrid approach enhances the representation of single-cell data and offers a robust framework for future research in cross-species genetic analysis.


Anchor Prediction: Automatic Refinement of Internet Links

Liu, Nelson F., Lee, Kenton, Toutanova, Kristina

arXiv.org Artificial Intelligence

Internet links enable users to deepen their understanding of a topic by providing convenient access to related information. However, the majority of links are unanchored -- they link to a target webpage as a whole, and readers may expend considerable effort localizing the specific parts of the target webpage that enrich their understanding of the link's source context. To help readers effectively find information in linked webpages, we introduce the task of anchor prediction, where the goal is to identify the specific part of the linked target webpage that is most related to the source linking context. We release the AuthorAnchors dataset, a collection of 34K naturally-occurring anchored links, which reflect relevance judgments by the authors of the source article. To model reader relevance judgments, we annotate and release ReaderAnchors, an evaluation set of anchors that readers find useful. Our analysis shows that effective anchor prediction often requires jointly reasoning over lengthy source and target webpages to determine their implicit relations and identify parts of the target webpage that are related but not redundant. We benchmark a performant T5-based ranking approach to establish baseline performance on the task, finding ample room for improvement.


Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding

Bai, Haoli, Liu, Zhiguang, Meng, Xiaojun, Li, Wentao, Liu, Shuang, Xie, Nian, Zheng, Rongfu, Wang, Liangwei, Hou, Lu, Wei, Jiansheng, Jiang, Xin, Liu, Qun

arXiv.org Artificial Intelligence

Unsupervised pre-training on millions of digital-born or scanned documents has shown promising advances in visual document understanding~(VDU). While various vision-language pre-training objectives are studied in existing solutions, the document textline, as an intrinsic granularity in VDU, has seldom been explored so far. A document textline usually contains words that are spatially and semantically correlated, which can be easily obtained from OCR engines. In this paper, we propose Wukong-Reader, trained with new pre-training objectives to leverage the structural knowledge nested in document textlines. We introduce textline-region contrastive learning to achieve fine-grained alignment between the visual regions and texts of document textlines. Furthermore, masked region modeling and textline-grid matching are also designed to enhance the visual and layout representations of textlines. Experiments show that our Wukong-Reader has superior performance on various VDU tasks such as information extraction. The fine-grained alignment over textlines also empowers Wukong-Reader with promising localization ability.


Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering

Min, Sewon, Chen, Danqi, Zettlemoyer, Luke, Hajishirzi, Hannaneh

arXiv.org Artificial Intelligence

This paper presents a general approach for open-domain question answering (QA) that models interactions between paragraphs using structural information from a knowledge base. We first describe how to construct a graph of passages from a large corpus, where the relations are either from the knowledge base or the internal structure of Wikipedia. We then introduce a reading comprehension model which takes this graph as an input, to better model relationships across pairs of paragraphs. This approach consistently outperforms competitive baselines in three open-domain QA datasets, WebQuestions, Natural Questions and TriviaQA, improving the pipeline-based state-of-the-art by 3--13%.


Acceleration through Optimistic No-Regret Dynamics

Wang, Jun-Kun, Abernethy, Jacob D.

Neural Information Processing Systems

We consider the problem of minimizing a smooth convex function by reducing the optimization to computing the Nash equilibrium of a particular zero-sum convex-concave game. Zero-sum games can be solved using online learning dynamics, where a classical technique involves simulating two no-regret algorithms that play against each other and, after $T$ rounds, the average iterate is guaranteed to solve the original optimization problem with error decaying as $O(\log T/T)$. In this paper we show that the technique can be enhanced to a rate of $O(1/T^2)$ by extending recent work \cite{RS13,SALS15} that leverages \textit{optimistic learning} to speed up equilibrium computation. The resulting optimization algorithm derived from this analysis coincides \textit{exactly} with the well-known \NA \cite{N83a} method, and indeed the same story allows us to recover several variants of the Nesterov's algorithm via small tweaks. We are also able to establish the accelerated linear rate for a function which is both strongly-convex and smooth. This methodology unifies a number of different iterative optimization methods: we show that the \HB algorithm is precisely the non-optimistic variant of \NA, and recent prior work already established a similar perspective on \FW \cite{AW17,ALLW18}.