target topic
Design, Implementation and Evaluation of a Novel Programming Language Topic Classification Workflow
Zhang, Michael, Tian, Yuan, Guizani, Mariam
As software systems grow in scale and complexity, understanding the distribution of programming language topics within source code becomes increasingly important for guiding technical decisions, improving onboarding, and informing tooling and education. This paper presents the design, implementation, and evaluation of a novel programming language topic classification workflow. Our approach combines a multi-label Support Vector Machine (SVM) with a sliding window and voting strategy to enable fine-grained localization of core language concepts such as operator overloading, virtual functions, inheritance, and templates. Trained on the IBM Project CodeNet dataset, our model achieves an average F1 score of 0.90 across topics and 0.75 in code-topic highlight. Our findings contribute empirical insights and a reusable pipeline for researchers and practitioners interested in code analysis and data-driven software engineering.
- North America > Canada > Ontario > Kingston (0.05)
- North America > United States (0.04)
- Information Technology > Software > Programming Languages (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Adversarial Topic-aware Prompt-tuning for Cross-topic Automated Essay Scoring
Zhang, Chunyun, Zhao, Hongyan, Cui, Chaoran, Song, Qilong, Lu, Zhiqing, Gong, Shuai, Liu, Kailin
Cross-topic automated essay scoring (AES) aims to develop a transferable model capable of effectively evaluating essays on a target topic. A significant challenge in this domain arises from the inherent discrepancies between topics. While existing methods predominantly focus on extracting topic-shared features through distribution alignment of source and target topics, they often neglect topic-specific features, limiting their ability to assess critical traits such as topic adherence. To address this limitation, we propose an Adversarial TOpic-aware Prompt-tuning (ATOP), a novel method that jointly learns topic-shared and topic-specific features to improve cross-topic AES. ATOP achieves this by optimizing a learnable topic-aware prompt--comprising both shared and specific components--to elicit relevant knowledge from pre-trained language models (PLMs). To enhance the robustness of topic-shared prompt learning and mitigate feature scale sensitivity introduced by topic alignment, we incorporate adversarial training within a unified regression and classification framework. In addition, we employ a neighbor-based classifier to model the local structure of essay representations and generate pseudo-labels for target-topic essays. These pseudo-labels are then used to guide the supervised learning of topic-specific prompts tailored to the target topic. Extensive experiments on the publicly available ASAP++ dataset demonstrate that ATOP significantly outperforms existing state-of-the-art methods in both holistic and multi-trait essay scoring. The implementation of our method is publicly available at: https://anonymous.4open.science/r/ATOP-A271.
- North America > Canada > Ontario > Toronto (0.14)
- Asia > China > Shandong Province (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Enterprise Applications > Human Resources > Learning Management (0.62)
Writing Like the Best: Exemplar-Based Expository Text Generation
Liu, Yuxiang, Chang, Kevin Chen-Chuan
We introduce the Exemplar-Based Expository Text Generation task, aiming to generate an expository text on a new topic using an exemplar on a similar topic. Current methods fall short due to their reliance on extensive exemplar data, difficulty in adapting topic-specific content, and issues with long-text coherence. To address these challenges, we propose the concept of Adaptive Imitation and present a novel Recurrent Plan-then-Adapt (RePA) framework. RePA leverages large language models (LLMs) for effective adaptive imitation through a fine-grained plan-then-adapt process. RePA also enables recurrent segment-by-segment imitation, supported by two memory structures that enhance input clarity and output coherence. We also develop task-specific evaluation metrics--imitativeness, adaptiveness, and adaptive-imitativeness--using LLMs as evaluators. Experimental results across our collected three diverse datasets demonstrate that RePA surpasses existing baselines in producing factual, consistent, and relevant texts for this task.
- Europe > Russia > Volga Federal District > Republic of Bashkortostan (0.14)
- North America > United States > Texas > Harris County > Houston (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- (15 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Education (0.67)
Supporting Automated Fact-checking across Topics: Similarity-driven Gradual Topic Learning for Claim Detection
Abumansour, Amani S., Zubiaga, Arkaitz
Selecting check-worthy claims for fact-checking is considered a crucial part of expediting the fact-checking process by filtering out and ranking the check-worthy claims for being validated among the impressive amount of claims could be found online. The check-worthy claim detection task, however, becomes more challenging when the model needs to deal with new topics that differ from those seen earlier. In this study, we propose a domain-adaptation framework for check-worthy claims detection across topics for the Arabic language to adopt a new topic, mimicking a real-life scenario of the daily emergence of events worldwide. We propose the Gradual Topic Learning (GTL) model, which builds an ability to learning gradually and emphasizes the check-worthy claims for the target topic during several stages of the learning process. In addition, we introduce the Similarity-driven Gradual Topic Learning (SGTL) model that synthesizes gradual learning with a similarity-based strategy for the target topic. Our experiments demonstrate the effectiveness of our proposed model, showing an overall tendency for improving performance over the state-of-the-art baseline across 11 out of the 14 topics under study.
- Europe > United Kingdom > England > Greater London > London (0.14)
- Asia > Middle East > Lebanon (0.05)
- Asia > Middle East > Kuwait (0.05)
- (13 more...)
- Information Technology > Data Science (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Assisting humans in complex comparisons: automated information comparison at scale
Yuen, Truman, Watt, Graham A., Lawryshyn, Yuri
Generative Large Language Models enable efficient analytics across knowledge domains, rivalling human experts in information comparisons. However, the applications of LLMs for information comparisons face scalability challenges due to the difficulties in maintaining information across large contexts and overcoming model token limitations. To address these challenges, we developed the novel Abstractive Summarization \& Criteria-driven Comparison Endpoint (ASC$^2$End) system to automate information comparison at scale. Our system employs Semantic Text Similarity comparisons for generating evidence-supported analyses. We utilize proven data-handling strategies such as abstractive summarization and retrieval augmented generation to overcome token limitations and retain relevant information during model inference. Prompts were designed using zero-shot strategies to contextualize information for improved model reasoning. We evaluated abstractive summarization using ROUGE scoring and assessed the generated comparison quality using survey responses. Models evaluated on the ASC$^2$End system show desirable results providing insights on the expected performance of the system. ASC$^2$End is a novel system and tool that enables accurate, automated information comparison at scale across knowledge domains, overcoming limitations in context length and retrieval.
- North America > Canada > Ontario > Toronto (0.28)
- North America > United States (0.28)
- Europe > Belgium (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Target-constrained Bidirectional Planning for Generation of Target-oriented Proactive Dialogue
Wang, Jian, Lin, Dongding, Li, Wenjie
Target-oriented proactive dialogue systems aim to lead conversations from a dialogue context toward a pre-determined target, such as making recommendations on designated items or introducing new specific topics. To this end, it is critical for such dialogue systems to plan reasonable actions to drive the conversation proactively, and meanwhile, to plan appropriate topics to move the conversation forward to the target topic smoothly. In this work, we mainly focus on effective dialogue planning for target-oriented dialogue generation. Inspired by decision-making theories in cognitive science, we propose a novel target-constrained bidirectional planning (TRIP) approach, which plans an appropriate dialogue path by looking ahead and looking back. By formulating the planning as a generation task, our TRIP bidirectionally generates a dialogue path consisting of a sequence of pairs using two Transformer decoders. They are expected to supervise each other and converge on consistent actions and topics by minimizing the decision gap and contrastive generation of targets. Moreover, we propose a target-constrained decoding algorithm with a bidirectional agreement to better control the planning process. Subsequently, we adopt the planned dialogue paths to guide dialogue generation in a pipeline manner, where we explore two variants: prompt-based generation and plan-controlled generation. Extensive experiments are conducted on two challenging dialogue datasets, which are re-purposed for exploring target-oriented dialogue. Our automatic and human evaluations demonstrate that the proposed methods significantly outperform various baseline models.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Hong Kong (0.05)
- North America > United States > New York > Bronx County > New York City (0.04)
- (8 more...)
- Research Report > Experimental Study (0.46)
- Research Report > New Finding (0.45)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.68)
Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration
Deng, Yang, Liao, Lizi, Chen, Liang, Wang, Hongru, Lei, Wenqiang, Chua, Tat-Seng
Conversational systems based on Large Language Models (LLMs), such as ChatGPT, show exceptional proficiency in context understanding and response generation. However, despite their impressive capabilities, they still possess limitations, such as providing randomly-guessed answers to ambiguous queries or failing to refuse users' requests, both of which are considered aspects of a conversational agent's proactivity. This raises the question of whether LLM-based conversational systems are equipped to handle proactive dialogue problems. In this work, we conduct a comprehensive analysis of LLM-based conversational systems, specifically focusing on three aspects of proactive dialogue systems: clarification, target-guided, and non-collaborative dialogues. To trigger the proactivity of LLMs, we propose the Proactive Chain-of-Thought prompting scheme, which augments LLMs with the goal planning capability over descriptive reasoning chains. Empirical findings are discussed to promote future studies on LLM-based proactive dialogue systems.
Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation
Wang, Jian, Cheng, Yi, Lin, Dongding, Leong, Chak Tou, Li, Wenjie
Target-oriented dialogue systems, designed to proactively steer conversations toward predefined targets or accomplish specific system-side goals, are an exciting area in conversational AI. In this work, by formulating a
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Asia > China > Hong Kong (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (6 more...)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- North America > United States > New York (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Singapore (0.04)
- Law > Intellectual Property & Technology Law (0.99)
- Leisure & Entertainment > Games > Computer Games (0.46)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.94)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Check-worthy Claim Detection across Topics for Automated Fact-checking
Abumansour, Amani S., Zubiaga, Arkaitz
An important component of an automated fact-checking system is the claim check-worthiness detection system, which ranks sentences by prioritising them based on their need to be checked. Despite a body of research tackling the task, previous research has overlooked the challenging nature of identifying check-worthy claims across different topics. In this paper, we assess and quantify the challenge of detecting check-worthy claims for new, unseen topics. After highlighting the problem, we propose the AraCWA model to mitigate the performance deterioration when detecting check-worthy claims across topics. The AraCWA model enables boosting the performance for new topics by incorporating two components for few-shot learning and data augmentation. Using a publicly available dataset of Arabic tweets consisting of 14 different topics, we demonstrate that our proposed data augmentation strategy achieves substantial improvements across topics overall, where the extent of the improvement varies across topics. Further, we analyse the semantic similarities between topics, suggesting that the similarity metric could be used as a proxy to determine the difficulty level of an unseen topic prior to undertaking the task of labelling the underlying sentences.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > Qatar (0.05)
- Asia > Middle East > Kuwait (0.05)
- (13 more...)