Tao, Yicheng
Assisting Mathematical Formalization with A Learning-based Premise Retriever
Tao, Yicheng, Liu, Haotian, Wang, Shanwen, Xu, Hongteng
Premise selection is a crucial yet challenging step in mathematical formalization, especially for users with limited experience. Due to the lack of available formalization projects, existing approaches that leverage language models often suffer from data scarcity. In this work, we introduce an innovative method for training a premise retriever to support the formalization of mathematics. Our approach employs a BERT model to embed proof states and premises into a shared latent space. The retrieval model is trained within a contrastive learning framework and incorporates a domain-specific tokenizer along with a fine-grained similarity computation method. Experimental results show that our model is highly competitive compared to existing baselines, achieving strong performance while requiring fewer computational resources. Performance is further enhanced through the integration of a re-ranking module. To streamline the formalization process, we will release a search engine that enables users to query Mathlib theorems directly using proof states, significantly improving accessibility and efficiency. Codes are available at https://github.com/ruc-ai4math/Premise-Retrieval.
Graphical Reasoning: LLM-based Semi-Open Relation Extraction
Tao, Yicheng, Wang, Yiqun, Bai, Longju
This paper presents a comprehensive exploration of relation extraction utilizing advanced language models, specifically Chain of Thought (CoT) and Graphical Reasoning (GRE) techniques. We demonstrate how leveraging in-context learning with GPT-3.5 can significantly enhance the extraction process, particularly through detailed example-based reasoning. Additionally, we introduce a novel graphical reasoning approach that dissects relation extraction into sequential sub-tasks, improving precision and adaptability in processing complex relational data. Our experiments, conducted on multiple datasets, including manually annotated data, show considerable improvements in performance metrics, underscoring the effectiveness of our methodologies.
PlanGPT: Enhancing Urban Planning with Tailored Language Model and Efficient Retrieval
Zhu, He, Zhang, Wenjia, Huang, Nuoxian, Li, Boyang, Niu, Luyao, Fan, Zipei, Lun, Tianle, Tao, Yicheng, Su, Junyou, Gong, Zhaoya, Fang, Chenyu, Liu, Xing
In the field of urban planning, general-purpose large language models often struggle to meet the specific needs of planners. Tasks like generating urban planning texts, retrieving related information, and evaluating planning documents pose unique challenges. To enhance the efficiency of urban professionals and overcome these obstacles, we introduce PlanGPT, the first specialized Large Language Model tailored for urban and spatial planning. Developed through collaborative efforts with institutions like the Chinese Academy of Urban Planning, PlanGPT leverages a customized local database retrieval framework, domain-specific fine-tuning of base models, and advanced tooling capabilities. Empirical tests demonstrate that PlanGPT has achieved advanced performance, delivering responses of superior quality precisely tailored to the intricacies of urban planning.
Learning Sparsity and Randomness for Data-driven Low Rank Approximation
Chen, Tiejin, Tao, Yicheng
Learning-based low rank approximation algorithms can significantly improve the performance of randomized low rank approximation with sketch matrix. With the learned value and fixed non-zero positions for sketch matrices from learning-based algorithms, these matrices can reduce the test error of low rank approximation significantly. However, there is still no good method to learn non-zero positions as well as overcome the out-of-distribution performance loss. In this work, we introduce two new methods Learning Sparsity and Learning Randomness which try to learn a better sparsity patterns and add randomness to the value of sketch matrix. These two methods can be applied with any learning-based algorithms which use sketch matrix directly. Our experiments show that these two methods can improve the performance of previous learning-based algorithm for both test error and out-of-distribution test error without adding too much complexity.