Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
Kuo, Hsun-Yu, Liao, Yin-Hsiang, Chao, Yu-Chieh, Ma, Wei-Yun, Cheng, Pu-Jen
Synthetic data augmentation via large language models (LLMs) allows researchers to leverage additional training data, thus enhancing the performance of downstream tasks, especially when real-world data are scarce. However, the generated data can deviate from the real-world data, and this misalignment can lead to degraded outcomes when the trained model is applied to real applications. We therefore propose efficient weighted-loss approaches that align synthetic data with the real-world distribution by emphasizing high-quality and diversified data generated by LLMs, using only a small amount of real-world data. We empirically assessed the effectiveness of our method on multiple text classification tasks; the results show that applying our approaches to a BERT-level model robustly outperforms standard cross-entropy and other data-weighting approaches, providing a potential solution for effectively leveraging synthetic data from any suitable data generator for model training.

The quantity and quality of data play a significant role in many Natural Language Processing (NLP) tasks. However, data in a particular domain for a specific task can be scarce, and collecting it may require domain expertise, leading to budget constraints. Fortunately, large language models (LLMs) provide a practical solution to this problem. LLMs such as the GPT series (Brown et al., 2020; OpenAI, 2022; OpenAI et al., 2024) can be leveraged to generate synthetic data that mimics real-world examples, thereby enriching the training set (Wang et al., 2023). However, training models with LLM-generated data can lead to drawbacks such as model collapse (Shumailov et al., 2023; Dohmatob et al., 2024), tail phenomena, and the reinforcement of LM biases (Wang et al., 2023). Moreover, our empirical study shows that the performance of models trained on synthetic data without proper processing can be lower than that of models trained on much smaller real-world datasets (Sec. …). Previous works adopted data-filtering strategies to obtain high-quality or varied data (Dubey et al., 2024; MetaAI, 2024; Chiang et al., 2023; West et al., 2022).
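The weighting idea can be made concrete with a per-example weighted cross-entropy. The sketch below is ours rather than the paper's exact formulation: the weight values and the sum-normalization are illustrative assumptions, with weights imagined to come from how closely each synthetic example matches a small real-world reference set.

import torch
import torch.nn.functional as F

def weighted_cross_entropy(logits, labels, weights):
    # logits:  (batch, num_classes) model outputs
    # labels:  (batch,) gold class indices
    # weights: (batch,) non-negative importance weights, e.g. derived
    #          from how well each synthetic example matches a small
    #          real-world reference set (an assumed heuristic, not the
    #          paper's exact scheme)
    per_example = F.cross_entropy(logits, labels, reduction="none")
    # Normalize by the weight sum so the loss scale matches the
    # unweighted mean regardless of the weight magnitudes.
    return (weights * per_example).sum() / weights.sum().clamp_min(1e-8)

# Toy usage: down-weight two presumably low-quality synthetic examples.
logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 1])
weights = torch.tensor([1.0, 1.0, 0.2, 0.2])
loss = weighted_cross_entropy(logits, labels, weights)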
Legal Documents Drafting with Fine-Tuned Pre-Trained Large Language Model
Lin, Chun-Hsien, Cheng, Pu-Jen
With the development of large language models (LLMs), fine-tuning pre-trained LLMs has become a mainstream paradigm for solving downstream natural language processing tasks. However, training a language model in the legal field requires a large number of legal documents so that the model can learn legal terminology and the particular format of legal documents. Typical NLP approaches usually rely on large manually annotated datasets for training, but in legal applications such datasets are difficult to obtain, which restricts the application of typical methods to the task of drafting legal documents. The experimental results of this paper show that not only can we leverage a large number of annotation-free legal documents, without Chinese word segmentation, to fine-tune a large language model, but more importantly, a pre-trained LLM can be fine-tuned on a local computer to accomplish the task of generating legal document drafts, while at the same time protecting information privacy and improving information security.

In recent years, researchers have applied neural networks to natural language processing, achieving state-of-the-art performance in processing legal documents, such as tasks related to textual entailment [1] and legal question answering [2]. With the development of LLMs, fine-tuning pre-trained LLMs to address natural language processing tasks, as mentioned above, has become a mainstream paradigm [3]. However, challenges persist when employing natural language text generation techniques as a solution for highly specialized tasks like legal document drafting.
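As an illustration of the annotation-free fine-tuning setup described above, here is a minimal sketch using the Hugging Face transformers Trainer on a local machine. The checkpoint name, the file path legal_docs.txt, and all hyperparameters are placeholder assumptions, not the paper's configuration; the subword tokenizer is what removes the need for explicit Chinese word segmentation.

# pip install transformers datasets
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "uer/gpt2-chinese-cluecorpussmall"  # placeholder Chinese GPT-2
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Raw, annotation-free legal documents: one document per line.
dataset = load_dataset("text", data_files={"train": "legal_docs.txt"})

def tokenize(batch):
    # Subword tokenization, so no Chinese word segmentation is needed.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="legal-lm", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()  # runs entirely on the local machine; no data leaves it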
Improving One-class Recommendation with Multi-tasking on Various Preference Intensities
Shao, Chu-Jen, Fu, Hao-Ming, Cheng, Pu-Jen
In general, implicit feedback is easier to obtain than explicit feedback, so making recommendations with only implicit feedback is indispensable. This type of problem is referred to as one-class recommendation [6]. Several approaches have been proposed to solve one-class recommendation problems. For example, model-based methods [2, 7] aim to learn a vector representation for each user and item and apply a kernel, such as the inner product for matrix factorization (MF) [5], to measure similarity. On the other hand, graph-based methods [11] construct a user-item bipartite graph from historical interactions and perform random walks on it to explore user interests and make recommendations. In recent years, hybrid approaches [4, 10] combining model-based and graph-based methods have been developed. They explore high-order relationships on the bipartite graph and encode this information into learned entity representations, resulting in remarkable improvements on one-class recommendation tasks.
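To make the model-based branch concrete, the following is a minimal sketch of matrix factorization with an inner-product kernel, trained with a BPR-style pairwise update, a standard technique for implicit (one-class) feedback; the sizes, learning rate, and regularization are illustrative assumptions rather than settings from the cited works.

import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 100, 500, 32

# Latent factors for matrix factorization; similarity is the inner product.
U = rng.normal(scale=0.1, size=(n_users, dim))
V = rng.normal(scale=0.1, size=(n_items, dim))

def bpr_step(u, pos, neg, lr=0.05, reg=0.01):
    # One pairwise update: the observed item `pos` should outscore
    # an unobserved (sampled) item `neg` for user `u`.
    uu, vp, vn = U[u].copy(), V[pos].copy(), V[neg].copy()
    x = uu @ (vp - vn)                        # score difference
    g = 1.0 / (1.0 + np.exp(x))               # gradient of -log sigmoid(x)
    U[u]   += lr * (g * (vp - vn) - reg * uu)
    V[pos] += lr * (g * uu - reg * vp)
    V[neg] += lr * (-g * uu - reg * vn)

# Top-N recommendation for one user by inner-product score.
scores = U[3] @ V.T
top_n = np.argsort(-scores)[:10]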
Learning Unsupervised Semantic Document Representation for Fine-grained Aspect-based Sentiment Analysis
Fu, Hao-Ming, Cheng, Pu-Jen
Document representation is at the core of many NLP tasks on machine understanding. A general representation learned in an unsupervised manner preserves generality and can be used for various applications. In practice, sentiment analysis (SA) is a challenging task regarded as deeply semantics-related and is often used to assess general representations. Existing methods for unsupervised document representation learning fall into two families: sequential ones, which explicitly take the ordering of words into consideration, and non-sequential ones, which do not. However, both suffer from their own weaknesses. In this paper, we propose a model that overcomes the difficulties encountered by both families of methods. Experiments show that our model outperforms state-of-the-art methods on popular SA datasets and on a fine-grained aspect-based SA task by a large margin.
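The sequential versus non-sequential distinction can be seen in a toy contrast: averaging word embeddings is order-invariant, while a recurrent encoder is order-sensitive. The snippet below is purely illustrative and is not the paper's model.

import torch
import torch.nn as nn

vocab, dim = 10000, 128
emb = nn.Embedding(vocab, dim)
tokens = torch.randint(0, vocab, (1, 40))  # one toy document

# Non-sequential: order-invariant mean of word embeddings.
doc_nonseq = emb(tokens).mean(dim=1)

# Sequential: a GRU whose final hidden state depends on word order.
gru = nn.GRU(dim, dim, batch_first=True)
_, h = gru(emb(tokens))
doc_seq = h[-1]

# Shuffling the tokens changes doc_seq but leaves doc_nonseq untouched.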
Addressing Long-Horizon Tasks by Integrating Program Synthesis and State Machines
Lin, Yu-An, Lee, Chen-Tao, Liu, Guan-Ting, Cheng, Pu-Jen, Sun, Shao-Hua
Deep reinforcement learning (deep RL) excels in various domains but lacks generalizability and interpretability. Programmatic RL methods (Trivedi et al., 2021; Liu et al., 2023) reformulate solving RL tasks as synthesizing interpretable programs that can be executed in the environments. Despite encouraging results, these methods are limited to short-horizon tasks. On the other hand, representing RL policies with state machines (Inala et al., 2020) can inductively generalize to long-horizon tasks, but struggles to scale up to acquire diverse and complex behaviors. This work proposes Program Machine Policies (POMPs), which bridge the advantages of programmatic RL and state-machine policies, allowing for representing complex behaviors and addressing long-horizon tasks. Specifically, we introduce a method that retrieves a set of effective, diverse, and compatible programs. We then use these programs as the modes of a state machine and learn a transition function that transitions among mode programs, capturing long-horizon repetitive behaviors. Our proposed framework outperforms programmatic RL and deep RL baselines on various tasks and demonstrates the ability to inductively generalize to even longer horizons without any fine-tuning. Ablation studies justify the effectiveness of our proposed search algorithm for retrieving a set of programs as modes.
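A minimal sketch of the state-machine-with-program-modes idea follows. The class and method names are ours, the mode programs are stand-in callables rather than synthesized programs, and the environment interface is the classic gym one, so this illustrates only the control flow, not the paper's retrieval or learning procedure.

from typing import Callable, List

class ProgramMachinePolicy:
    # Modes are executable programs (here: callables mapping an
    # observation to a sequence of actions); `transition` picks the
    # next mode index from the current observation and mode.
    def __init__(self, modes: List[Callable], transition: Callable):
        self.modes = modes
        self.transition = transition

    def run(self, env, max_modes: int = 50):
        obs, mode = env.reset(), 0
        for _ in range(max_modes):
            for action in self.modes[mode](obs):  # execute current program
                obs, _, done, _ = env.step(action)
                if done:
                    return obs
            mode = self.transition(obs, mode)     # learned transition
        return obs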
Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs
Liu, Guan-Ting, Hu, En-Pei, Cheng, Pu-Jen, Lee, Hung-yi, Sun, Shao-Hua
Aiming to produce reinforcement learning (RL) policies that are human-interpretable and can generalize better to novel scenarios, Trivedi et al. (2021) present a method (LEAPS) that first learns a program embedding space to continuously parameterize diverse programs from a pre-generated program dataset, and then searches for a task-solving program in the learned program embedding space when given a task. Despite the encouraging results, the program policies that LEAPS can produce are limited by the distribution of the program dataset. Furthermore, during the search, LEAPS evaluates each candidate program solely on its return, failing to precisely reward the correct parts of a program and penalize the incorrect parts. To address these issues, we propose learning a meta-policy that composes a series of programs sampled from the learned program embedding space. By learning to compose programs, our proposed hierarchical programmatic reinforcement learning (HPRL) framework can produce program policies that describe out-of-distributionally complex behaviors and directly assign credit to programs that induce desired behaviors. Experimental results in the Karel domain show that our proposed framework outperforms baselines. Ablation studies confirm the limitations of LEAPS and justify our design choices.
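To illustrate the composition idea, the toy meta-policy below emits a sequence of latent program embeddings; an assumed pretrained decoder (not shown) would map each latent to an executable program, and each program would then be credited with the reward collected during its own execution. The architecture and names are illustrative, not HPRL's actual design.

import torch
import torch.nn as nn

class MetaPolicy(nn.Module):
    def __init__(self, obs_dim, latent_dim=64, horizon=5):
        super().__init__()
        self.horizon = horizon
        self.cell = nn.GRUCell(obs_dim, latent_dim)
        self.head = nn.Linear(latent_dim, latent_dim)

    def forward(self, obs_emb):
        # In practice the observation would be re-encoded after each
        # decoded program finishes executing; it is kept fixed here.
        h = obs_emb.new_zeros(obs_emb.size(0), self.head.in_features)
        latents = []
        for _ in range(self.horizon):
            h = self.cell(obs_emb, h)
            latents.append(self.head(h))  # one latent per composed program
        return latents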
Attentive Graph-based Text-aware Preference Modeling for Top-N Recommendation
Juan, Ming-Hao, Cheng, Pu-Jen, Hsu, Hui-Neng, Hsiao, Pin-Hsin
Textual data are commonly used as auxiliary information for modeling user preference nowadays. While many prior works utilize user reviews for rating prediction, few focus on top-N recommendation, and even fewer try to incorporate item textual content such as titles and descriptions. Although review-based models deliver promising performance for rating prediction, we empirically find that many of them do not perform comparably well on top-N recommendation. Moreover, user reviews are not available in some recommendation scenarios, while item textual content is more prevalent. On the other hand, recent graph convolutional network (GCN) based models demonstrate state-of-the-art performance for top-N recommendation. Thus, in this work, we aim to further improve top-N recommendation by effectively modeling both item textual content and high-order connectivity in the user-item graph. We propose a new model named Attentive Graph-based Text-aware Recommendation Model (AGTM). Extensive experiments are provided to justify the rationality and effectiveness of our model design.
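As a rough illustration of combining graph propagation with item text, the toy layer below propagates embeddings over a normalized user-item adjacency (LightGCN-style) and fuses text features through a learned attention gate. This is our sketch under those assumptions, not AGTM's actual architecture; rows without text (e.g., users) can carry zero text features.

import torch
import torch.nn as nn

class TextAwareGCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, 1)

    def forward(self, emb, norm_adj, text_feat):
        # emb:       (n_users + n_items, dim) node embeddings
        # norm_adj:  sparse normalized user-item adjacency matrix
        # text_feat: (n_users + n_items, dim) text features (zeros for users)
        prop = torch.sparse.mm(norm_adj, emb)  # high-order propagation
        alpha = torch.sigmoid(
            self.gate(torch.cat([prop, text_feat], dim=-1)))
        return alpha * prop + (1 - alpha) * text_feat  # attentive fusion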