Collaborating Authors

 Kong, Fanshuang


Lost-in-the-Middle in Long-Text Generation: Synthetic Dataset, Evaluation Framework, and Mitigation

arXiv.org Artificial Intelligence

Existing long-text generation methods primarily concentrate on producing lengthy texts from short inputs, neglecting tasks with both long inputs and long outputs. Such tasks have numerous practical applications but lack available benchmarks. Moreover, as the input grows in length, existing methods inevitably encounter the "lost-in-the-middle" phenomenon. In this paper, we first introduce a Long Input and Output Benchmark (LongInOutBench), including a synthetic dataset and a comprehensive evaluation framework, to address the missing-benchmark challenge. We then develop the Retrieval-Augmented Long-Text Writer (RAL-Writer), which retrieves and restates important yet overlooked content, mitigating the "lost-in-the-middle" issue by constructing explicit prompts. We finally employ the proposed LongInOutBench to evaluate our RAL-Writer against comparable baselines, and the results demonstrate the effectiveness of our approach. Our code has been released at https://github.com/OnlyAR/RAL-Writer.
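
The retrieve-and-restate idea described above can be illustrated with a short sketch: chunk the long input, score the chunks against the current writing task, and restate the most relevant (and most easily "lost") chunks explicitly in the prompt. The chunking scheme, the hypothetical embed function, and the prompt template below are illustrative assumptions, not the released RAL-Writer implementation.

import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def build_prompt(long_input, query, embed, chunk_size=500, top_k=3):
    # Split the long input into fixed-size character chunks (assumed scheme).
    chunks = [long_input[i:i + chunk_size] for i in range(0, len(long_input), chunk_size)]
    q_vec = embed(query)
    # Rank chunks by relevance to the current writing task.
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q_vec), reverse=True)
    # Restate the most relevant chunks explicitly near the end of the prompt,
    # where long-context models tend to attend to them more reliably.
    restated = "\n".join("- " + c.strip() for c in ranked[:top_k])
    return (long_input
            + "\n\nKey passages to keep in mind while writing:\n" + restated
            + "\n\nTask: " + query + "\n")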


LH-Mix: Local Hierarchy Correlation Guided Mixup over Hierarchical Prompt Tuning

arXiv.org Artificial Intelligence

Hierarchical text classification (HTC) aims to assign each text one or more labels from a label hierarchy. Many methods represent this structure as a global hierarchy, leading to redundant graph structures. To address this, incorporating a text-specific local hierarchy is essential. However, existing approaches often model this local hierarchy as a sequence, focusing on explicit parent-child relationships while ignoring implicit correlations among sibling/peer relationships. In this paper, we first integrate local hierarchies into a manual depth-level prompt to capture parent-child relationships. We then apply Mixup to this hierarchical prompt tuning scheme to better capture latent correlations among sibling/peer relationships. Notably, we propose a novel Mixup ratio guided by local hierarchy correlation to effectively capture these intrinsic correlations. The resulting Local Hierarchy Mixup (LH-Mix) model demonstrates remarkable performance across three widely used datasets.
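
As a rough illustration of correlation-guided Mixup, the sketch below interpolates two prompt embeddings and their label vectors, pulling the mixing ratio toward 0.5 when the two samples' local hierarchies are strongly correlated. The correlation input and the Beta-based ratio heuristic are assumptions for illustration, not the LH-Mix formulation from the paper.

import torch

def lh_mix(prompt_a, prompt_b, label_a, label_b, correlation):
    """Mix two hierarchical prompt embeddings and their label vectors.

    `correlation` in [0, 1] measures how similar the two samples' local
    hierarchies are; more correlated pairs are mixed more evenly.
    """
    # Sample a base ratio, then pull it toward 0.5 in proportion to the
    # local-hierarchy correlation (illustrative heuristic only).
    base = torch.distributions.Beta(1.0, 1.0).sample().item()
    lam = correlation * 0.5 + (1.0 - correlation) * base
    mixed_prompt = lam * prompt_a + (1.0 - lam) * prompt_b
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed_prompt, mixed_label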


Rethink the Evaluation Protocol of Model Merging on Classification Task

arXiv.org Artificial Intelligence

Model merging combines multiple fine-tuned models into a single one via parameter fusion, achieving improvements across many tasks. However, on classification tasks we find a misalignment between the merged model's outputs and the fine-tuned classifier, which limits the merged model's effectiveness. In this paper, we demonstrate the following: (1) The embedding quality of the merged model's outputs is already very high, and the primary reason for the differences in classification performance lies in this misalignment. (2) We propose FT-Classifier, a new evaluation protocol that fine-tunes an aligned classifier with few-shot samples to alleviate the misalignment, enabling better evaluation of merged outputs and improved classification performance. (3) The misalignment is relatively simple and can be formulated as an orthogonal transformation. Experiments demonstrate the existence of the misalignment and the effectiveness of the FT-Classifier evaluation protocol.
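
The observation that the misalignment behaves like an orthogonal transformation suggests a simple Procrustes-style sketch: estimate an orthogonal map from the merged model's embeddings to the space the fine-tuned classifier head expects, using a handful of labelled samples, then reuse the frozen head. This is an illustration of the stated observation under that assumption, not the paper's FT-Classifier code.

import numpy as np

def fit_orthogonal_alignment(merged_emb, finetuned_emb):
    """Solve min_R ||merged_emb @ R - finetuned_emb||_F over orthogonal R (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(merged_emb.T @ finetuned_emb)
    return u @ vt

def classify(merged_emb, R, W, b):
    # Map merged-model embeddings into the classifier's space, then apply the
    # frozen fine-tuned head with weights W and bias b.
    logits = (merged_emb @ R) @ W.T + b
    return logits.argmax(axis=-1)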


Embedding of Hierarchically Typed Knowledge Bases

AAAI Conferences

Embedding has emerged as an important approach to prediction, inference, data mining, and information retrieval over knowledge bases, and various embedding models have been presented. Most of these models are "typeless," namely, treating a knowledge base solely as a collection of instances without considering the types of the entities therein. In this paper, we investigate the use of entity type information for knowledge base embedding. We present a framework that augments a generic "typeless" embedding model into a typed one. The framework interprets an entity type as a constraint on the set of all entities and lets these type constraints induce, isomorphically, a set of subsets in the embedding space. Additional cost functions are then introduced to model the fitness between these constraints and the embedding of entities and relations. A concrete example scheme of the framework is proposed. We demonstrate experimentally that this framework offers improved embedding performance over typeless models and other typed models.
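
A minimal sketch of the augmentation idea, assuming a TransE-style typeless score and representing each type's induced subset as a ball (center plus radius) in the embedding space; the paper's concrete scheme may differ.

import torch

def transe_score(h, r, t):
    # Typeless triple score (TransE-style): lower is better.
    return torch.norm(h + r - t, p=2, dim=-1)

def type_cost(entity, type_center, type_radius):
    # Penalize an entity embedding that falls outside the region induced by its type.
    dist = torch.norm(entity - type_center, p=2, dim=-1)
    return torch.clamp(dist - type_radius, min=0.0)

def typed_loss(h, r, t, h_center, h_radius, t_center, t_radius, alpha=0.1):
    # Joint objective: typeless fitness plus the additional type-constraint terms.
    return (transe_score(h, r, t)
            + alpha * (type_cost(h, h_center, h_radius)
                       + type_cost(t, t_center, t_radius)))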