Goto

Collaborating Authors

 Ma, Zeyu


Revisiting Benchmark and Assessment: An Agent-based Exploratory Dynamic Evaluation Framework for LLMs

arXiv.org Artificial Intelligence

While various vertical domain large language models (LLMs) have been developed, the challenge of automatically evaluating their performance across different domains remains significant. Current benchmark-based evaluation methods exhibit rigid, aimless interactions and rely on pre-collected static datasets that are costly to build, inflexible across domains, and misaligned with practical user needs. To address this issue, we revisit the evaluation components and introduce two concepts: Benchmark+, which extends traditional question-answer benchmark into a more flexible "strategy-criterion" format; and Assessment+, which enhances the interaction process, enabling deeper exploration and supporting both quantitative metrics and qualitative insights. These concepts capture the nuanced behaviors of LLMs through richer, multi-turn interactions. We propose an agent-based evaluation framework called TestAgent, which implements these concepts through retrieval augmented generation and reinforcement learning. Experiments on tasks ranging from constructing vertical domain evaluation to activating existing benchmarks demonstrate the effectiveness of TestAgent across various scenarios. We believe this work offers an interesting perspective on automatic evaluation for LLMs.


CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification

arXiv.org Artificial Intelligence

Although graph neural networks (GNNs) have achieved impressive achievements in graph classification, they often need abundant task-specific labels, which could be extensively costly to acquire. A credible solution is to explore additional labeled graphs to enhance unsupervised learning on the target domain. However, how to apply GNNs to domain adaptation remains unsolved owing to the insufficient exploration of graph topology and the significant domain discrepancy. In this paper, we propose Coupled Contrastive Graph Representation Learning (CoCo), which extracts the topological information from coupled learning branches and reduces the domain discrepancy with coupled contrastive learning. CoCo contains a graph convolutional network branch and a hierarchical graph kernel network branch, which explore graph topology in implicit and explicit manners. Besides, we incorporate coupled branches into a holistic multi-view contrastive learning framework, which not only incorporates graph representations learned from complementary views for enhanced understanding, but also encourages the similarity between cross-domain example pairs with the same semantics for domain alignment. Extensive experiments on popular datasets show that our CoCo outperforms these competing baselines in different settings generally.


Multi-Slice Dense-Sparse Learning for Efficient Liver and Tumor Segmentation

arXiv.org Artificial Intelligence

Accurate automatic liver and tumor segmentation plays a vital role in treatment planning and disease monitoring. Recently, deep convolutional neural network (DCNNs) has obtained tremendous success in 2D and 3D medical image segmentation. However, 2D DCNNs cannot fully leverage the inter-slice information, while 3D DCNNs are computationally expensive and memory intensive. To address these issues, we first propose a novel dense-sparse training flow from a data perspective, in which, densely adjacent slices and sparsely adjacent slices are extracted as inputs for regularizing DCNNs, thereby improving the model performance. Moreover, we design a 2.5D light-weight nnU-Net from a network perspective, in which, depthwise separable convolutions are adopted to improve the efficiency. Extensive experiments on the LiTS dataset have demonstrated the superiority of the proposed method.