Professionalism
A Dynamic Fusion Model for Consistent Crisis Response
Song, Xiaoying, Anik, Anirban Saha, Blanco, Eduardo, Frias-Martinez, Vanessa, Hong, Lingzi
In response to the urgent need for effective communication with crisis-affected populations, automated responses driven by language models have been proposed to assist in crisis communications. A critical yet often overlooked factor is the consistency of response style, which could affect the trust of affected individuals in responders. Despite its importance, few studies have explored methods for maintaining stylistic consistency across generated responses. To address this gap, we propose a novel metric for evaluating style consistency and introduce a fusion-based generation approach grounded in this metric. Our method employs a two-stage process: it first assesses the style of candidate responses and then optimizes and integrates them at the instance level through a fusion process. This enables the generation of high-quality responses while significantly reducing stylistic variation between instances. Experimental results across multiple datasets demonstrate that our approach consistently outperforms baselines in both response quality and stylistic uniformity.
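The abstract does not give the style-consistency metric itself, but the idea of scoring candidate responses and preferring those closest to a shared style can be sketched. The following is a toy illustration, not the paper's method: each candidate is mapped to an invented two-feature style vector, and the candidate nearest the group's style centroid is selected.

```python
# Hypothetical sketch of instance-level style scoring. The features
# (average word length, sentence count) are illustrative stand-ins,
# not the paper's actual style representation.

def style_features(text: str) -> list[float]:
    """Toy style features: average word length and sentence count."""
    words = text.split()
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    sentence_count = max(text.count(".") + text.count("!") + text.count("?"), 1)
    return [avg_word_len, float(sentence_count)]

def centroid(vectors: list[list[float]]) -> list[float]:
    """Component-wise mean of a list of feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def most_consistent(candidates: list[str]) -> str:
    """Return the candidate whose style is closest to the group centroid."""
    feats = [style_features(c) for c in candidates]
    c = centroid(feats)
    dist = lambda v: sum((a - b) ** 2 for a, b in zip(v, c))
    return min(candidates, key=lambda s: dist(style_features(s)))
```

Under this sketch, a stylistic outlier (e.g. an all-caps exclamation among calm advisories) sits far from the centroid and is never selected, which is the intuition behind reducing stylistic variation between instances.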
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Zhuang, Xinlin, Peng, Jiahui, Ma, Ren, Wang, Yinfan, Bai, Tianyi, Wei, Xingjian, Qiu, Jiantao, Zhang, Chi, Qian, Ying, He, Conghui
The composition of pre-training datasets for large language models (LLMs) remains largely undisclosed, hindering transparency and efforts to optimize data quality, a critical driver of model performance. Current data selection methods, such as natural language quality assessments, diversity-based filters, and classifier-based approaches, are limited by single-dimensional evaluation or redundancy-focused strategies. To address these gaps, we propose four dimensions to evaluate data quality: professionalism, readability, reasoning, and cleanliness. We further introduce Meta-rater, a multi-dimensional data selection method that integrates these dimensions with existing quality metrics through learned optimal weightings. Meta-rater employs proxy models to train a regression model that predicts validation loss, enabling the identification of optimal combinations of quality scores. Experiments demonstrate that Meta-rater doubles convergence speed for 1.3B parameter models and improves downstream task performance by 3.23, with advantages that scale to models as large as 7.2B parameters. Our work establishes that holistic, multi-dimensional quality integration significantly outperforms conventional single-dimension approaches, offering a scalable paradigm for enhancing pre-training efficiency and model capability. To advance future research, we release scripts, data, and models at https://github.com/opendatalab/Meta-rater.
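The core mechanism described above, fitting a regression from quality scores to proxy-model validation loss and then ranking candidate mixtures by predicted loss, can be sketched as follows. All numbers and mixture names here are invented for illustration; only the four dimension names come from the abstract.

```python
# Hypothetical sketch of the Meta-rater idea: fit a least-squares
# regression from per-dimension quality scores of a data mixture to the
# validation loss observed with a proxy model, then rank candidate
# mixtures by predicted loss. Data values are illustrative, not real.
import numpy as np

# Rows: proxy-model runs; columns: mean mixture scores for
# [professionalism, readability, reasoning, cleanliness].
quality = np.array([
    [0.2, 0.5, 0.1, 0.6],
    [0.7, 0.6, 0.5, 0.8],
    [0.9, 0.4, 0.8, 0.9],
    [0.4, 0.9, 0.3, 0.5],
])
val_loss = np.array([3.1, 2.6, 2.3, 2.9])  # observed proxy validation losses

# Least-squares fit: predicted_loss = quality @ w[:4] + w[4]
X = np.hstack([quality, np.ones((len(quality), 1))])
w, *_ = np.linalg.lstsq(X, val_loss, rcond=None)

def predicted_loss(scores) -> float:
    """Predict validation loss for a candidate mixture's quality scores."""
    return float(np.append(scores, 1.0) @ w)

# Rank candidate mixtures by predicted validation loss (lower is better).
candidates = {"mix_a": [0.8, 0.7, 0.7, 0.9], "mix_b": [0.3, 0.5, 0.2, 0.4]}
best = min(candidates, key=lambda k: predicted_loss(candidates[k]))
```

The design choice worth noting is that the regression target is validation loss rather than any single quality score, which is what lets the method learn how to weight the dimensions against one another.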
ElectriQ: A Benchmark for Assessing the Response Capability of Large Language Models in Power Marketing
Wang, Jinzhi, Peng, Qingke, Li, Haozhou, Zeng, Zeyuan, Song, Qinfeng, Yang, Kaixuan, Zhang, Jiangbo, Wang, Yaoying, Li, Ruimeng, Zhou, Biyi
Electric power marketing telephone customer service primarily communicates with customers via phone calls to understand their electricity usage needs, provide consultations, process service applications, and handle complaints [1]. Ensuring timely and effective responses is essential throughout the service process. However, current systems (e.g., 95598, the customer service hotline of State Grid Corporation of China) often suffer from poor user experience, delayed responses, and inaccurate information [2][3]. These traditional systems rely heavily on fixed procedures and templates, lacking the flexibility to address complex and diverse customer demands. This limitation is particularly pronounced in the highly specialized field of electric power marketing, where slow response times and insufficiently tailored solutions negatively impact service quality. Although human agents can complement these systems by managing more complex issues, they also face significant challenges, such as high workloads during peak periods, delayed response times, and inconsistent levels of professional knowledge and expertise. As a result, it is difficult to guarantee consistent and high-quality service for all customers.
Modeling Professionalism in Expert Questioning through Linguistic Differentiation
D'Agostino, Giulia, Chen, Chung-Chi
Professionalism is a crucial yet underexplored dimension of expert communication, particularly in high-stakes domains like finance. This paper investigates how linguistic features can be leveraged to model and evaluate professionalism in expert questioning. We introduce a novel annotation framework to quantify structural and pragmatic elements in financial analyst questions, such as discourse regulators, prefaces, and request types. Using both human-authored and large language model (LLM)-generated questions, we construct two datasets: one annotated for perceived professionalism and one labeled by question origin. We show that the same linguistic features correlate strongly with both human judgments and authorship origin, suggesting a shared stylistic foundation. Furthermore, a classifier trained solely on these interpretable features outperforms gemini-2.0 and SVM baselines in distinguishing expert-authored questions. Our findings demonstrate that professionalism is a learnable, domain-general construct that can be captured through linguistically grounded modeling.
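The abstract's claim that interpretable linguistic features alone can separate expert-authored from generated questions can be illustrated with a minimal classifier. This is an assumed sketch, not the paper's model: a nearest-centroid classifier over hand-coded feature vectors whose dimensions (discourse regulators, prefaces, request types) are named in the abstract, with all counts invented.

```python
# Toy nearest-centroid classifier over interpretable linguistic
# feature vectors [discourse_regulators, prefaces, request_types].
# Feature dimensions follow the abstract; the counts are illustrative.

def nearest_centroid_fit(samples: dict[str, list[list[float]]]) -> dict[str, list[float]]:
    """Return one mean feature vector (centroid) per class label."""
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in samples.items()
    }

def classify(centroids: dict[str, list[float]], vec: list[float]) -> str:
    """Assign vec to the class with the nearest centroid."""
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(vec, c))
    return min(centroids, key=lambda label: dist(centroids[label]))
```

Even this crude model shows why a feature-based classifier is attractive here: every decision can be traced back to named, countable properties of the question.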
A Fuzzy Supervisor Agent Design for Clinical Reasoning Assistance in a Multi-Agent Educational Clinical Scenario Simulation
Zheng, Weibing, Turner, Laurah, Kropczynski, Jess, Ozer, Murat, Overla, Seth, Halse, Shane
Assisting medical students with clinical reasoning (CR) during clinical scenario training remains a persistent challenge in medical education. This paper presents the design and architecture of the Fuzzy Supervisor Agent (FSA), a novel component for the Multi-Agent Educational Clinical Scenario Simulation (MAECSS) platform. The FSA leverages a Fuzzy Inference System (FIS) to continuously interpret student interactions with specialized clinical agents (e.g., patient, physical exam, diagnostic, intervention) using pre-defined fuzzy rule bases for professionalism, medical relevance, ethical behavior, and contextual distraction. By analyzing student decision-making processes in real-time, the FSA is designed to deliver adaptive, context-aware feedback and to provide assistance precisely when students encounter difficulties. This work focuses on the technical framework and rationale of the FSA, highlighting its potential to provide scalable, flexible, and human-like supervision in simulation-based medical education. Future work will include empirical evaluation and integration into broader educational settings. The detailed design and implementation are open-sourced here.
LLM-as-a-Fuzzy-Judge: Fine-Tuning Large Language Models as a Clinical Evaluation Judge with Fuzzy Logic
Zheng, Weibing, Turner, Laurah, Kropczynski, Jess, Ozer, Murat, Nguyen, Tri, Halse, Shane
Clinical communication skills are critical in medical education, yet practicing and assessing them at scale is challenging. Although LLM-powered clinical scenario simulations have shown promise in enhancing medical students' clinical practice, providing automated, scalable clinical evaluation that follows nuanced physician judgment remains difficult. This paper combines fuzzy logic with large language models (LLMs) and proposes LLM-as-a-Fuzzy-Judge to address the challenge of aligning the automated evaluation of medical students' clinical skills with physicians' subjective preferences. LLM-as-a-Fuzzy-Judge is an approach in which an LLM is fine-tuned to evaluate medical students' utterances within student-AI patient conversation scripts, based on human annotations over four fuzzy sets: Professionalism, Medical Relevance, Ethical Behavior, and Contextual Distraction. The methodology proceeds from data collection in the LLM-powered medical education system to data annotation based on these multidimensional fuzzy sets, followed by prompt engineering and supervised fine-tuning (SFT) of pre-trained LLMs using the human annotations. The results show that LLM-as-a-Fuzzy-Judge achieves over 80% accuracy, with over 90% on major criteria items, effectively leveraging fuzzy logic and LLMs to deliver interpretable, human-aligned assessment. This work suggests the viability of combining fuzzy logic and LLMs to align with human preferences, advances automated evaluation in medical education, and supports more robust assessment and judgment practices. The GitHub repository of this work is available at https://github.com/2sigmaEdTech/LLMAsAJudge
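The fuzzy-set aggregation step described above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: membership values for the four named fuzzy sets are supplied directly (in the paper they come from the fine-tuned LLM), and a standard fuzzy AND (minimum) combines them, with Contextual Distraction complemented first since it counts against the utterance.

```python
# Assumed aggregation of fuzzy memberships into one acceptability score.
# The four criteria names come from the abstract; the min-based
# aggregation is a standard fuzzy-logic choice, not confirmed by the paper.

def judge(memberships: dict[str, float]) -> float:
    """Combine per-criterion fuzzy memberships via fuzzy AND (min).

    Contextual Distraction is a negative criterion, so its membership
    is complemented (1 - v) before aggregation.
    """
    adjusted = {
        name: (1.0 - value if name == "Contextual Distraction" else value)
        for name, value in memberships.items()
    }
    return min(adjusted.values())
```

Using min means a single weak criterion (e.g. an ethically problematic but otherwise fluent utterance) caps the overall score, which mirrors how a human rater would treat a disqualifying flaw.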
ScoreRAG: A Retrieval-Augmented Generation Framework with Consistency-Relevance Scoring and Structured Summarization for News Generation
This research introduces ScoreRAG, an approach to enhance the quality of automated news generation. Despite advancements in Natural Language Processing and large language models, current news generation methods often struggle with hallucinations, factual inconsistencies, and lack of domain-specific expertise when producing news articles. ScoreRAG addresses these challenges through a multi-stage framework combining retrieval-augmented generation, consistency relevance evaluation, and structured summarization. The system first retrieves relevant news documents from a vector database, maps them to complete news items, and assigns consistency relevance scores based on large language model evaluations. These documents are then reranked according to relevance, with low-quality items filtered out. The framework proceeds to generate graded summaries based on relevance scores, which guide the large language model in producing complete news articles following professional journalistic standards. Through this methodical approach, ScoreRAG aims to significantly improve the accuracy, coherence, informativeness, and professionalism of generated news articles while maintaining stability and consistency throughout the generation process. The code and demo are available at: https://github.com/peiyun2260/ScoreRAG.
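The rerank-and-filter stage described above can be sketched in a few lines. This is an assumed outline, not ScoreRAG's actual code: each retrieved document arrives with a consistency-relevance score (in ScoreRAG these come from LLM evaluations), documents are reranked by score, and low-scoring items are dropped before summarization.

```python
# Hypothetical sketch of ScoreRAG's rerank-and-filter stage. The
# threshold value and the (text, score) representation are assumptions.

def rerank_and_filter(docs: list[tuple[str, float]],
                      threshold: float = 0.5) -> list[str]:
    """Sort documents by consistency-relevance score (descending) and
    drop those scoring below the threshold."""
    ranked = sorted(docs, key=lambda d: d[1], reverse=True)
    return [text for text, score in ranked if score >= threshold]
```

The surviving, ordered documents would then feed the graded-summarization step, so the quality gate happens before any text generation.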
Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas
Chiu, Yu Ying, Wang, Zhilin, Maiya, Sharan, Choi, Yejin, Fish, Kyle, Levine, Sydney, Hubinger, Evan
Detecting AI risks becomes more challenging as stronger models emerge and find novel methods such as Alignment Faking to circumvent these detection attempts. Inspired by how risky behaviors in humans (i.e., illegal activities that may hurt others) are sometimes guided by strongly held values, we believe that identifying values within AI models can be an early warning system for AI's risky behaviors. We create LitmusValues, an evaluation pipeline to reveal AI models' priorities on a range of AI value classes. Then, we collect AIRiskDilemmas, a diverse collection of dilemmas that pit values against one another in scenarios relevant to AI safety risks such as Power Seeking. By measuring an AI model's value prioritization using its aggregate choices, we obtain a self-consistent set of predicted value priorities that uncover potential risks. We show that values in LitmusValues (including seemingly innocuous ones like Care) can predict both seen risky behaviors in AIRiskDilemmas and unseen risky behaviors in HarmBench.
In Pursuit of Professionalism
Robin K. Hill
Is Computer Science a Profession? We computer scientists--many of us--like to think of ourselves as professionals, as do doctors, lawyers, police officers, and accountants. But there are definitions of "profession," with criteria and expectations, that we fail to meet. Are we ready, collectively, to confront the criteria? Do we want to be card-carrying members of a learned institution of service?
Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions
Huang, Saffron, Durmus, Esin, McCain, Miles, Handa, Kunal, Tamkin, Alex, Hong, Jerry, Stern, Michael, Somani, Arushi, Zhang, Xiuruo, Ganguli, Deep
AI assistants can impart value judgments that shape people's decisions and worldviews, yet little is known empirically about what values these systems rely on in practice. To address this, we develop a bottom-up, privacy-preserving method to extract the values (normative considerations stated or demonstrated in model responses) that Claude 3 and 3.5 models exhibit in hundreds of thousands of real-world interactions. We empirically discover and taxonomize 3,307 AI values and study how they vary by context. We find that Claude expresses many practical and epistemic values, and typically supports prosocial human values while resisting values like "moral nihilism". While some values appear consistently across contexts (e.g. "transparency"), many are more specialized and context-dependent, reflecting the diversity of human interlocutors and their varied contexts. For example, "harm prevention" emerges when Claude resists users, "historical accuracy" when responding to queries about controversial events, "healthy boundaries" when asked for relationship advice, and "human agency" in technology ethics discussions. By providing the first large-scale empirical mapping of AI values in deployment, our work creates a foundation for more grounded evaluation and design of values in AI systems.