AITopics | content analysis

content analysis

Generative Large Language Models (gLLMs) in Content Analysis: A Practical Guide for Communication Research

Kravets-Meinke, Daria, Schmid-Petri, Hannah, Niemann, Sonja, Schmid, Ute

arXiv.org Artificial Intelligence, Oct-29-2025

Generative Large Language Models (gLLMs), such as ChatGPT, are increasingly being used in communication research for content analysis. Studies show that gLLMs can outperform both crowd workers and trained coders, such as research assistants, on various coding tasks relevant to communication science, often at a fraction of the time and cost. Additionally, gLLMs can decode implicit meanings and contextual information, can be instructed in natural language and deployed with only basic programming skills, and require little to no annotated data beyond a validation dataset, which constitutes a paradigm shift in automated content analysis. Despite their potential, the integration of gLLMs into the methodological toolkit of communication research remains underdeveloped. In gLLM-assisted quantitative content analysis, researchers must address at least seven critical challenges that impact result quality: (1) codebook development, (2) prompt engineering, (3) model selection, (4) parameter tuning, (5) iterative refinement, (6) validation of the model's reliability, and optionally, (7) performance enhancement. This paper synthesizes emerging research on gLLM-assisted quantitative content analysis and proposes a comprehensive best-practice guide to navigate these challenges. Our goal is to make gLLM-based content analysis more accessible to a broader range of communication researchers and ensure adherence to established disciplinary quality standards of validity, reliability, reproducibility, and research ethics.
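As a rough illustration of challenge (6), validating the model's reliability, the sketch below compares model-assigned codes with human codes on a validation set using Cohen's kappa. The choice of kappa is an assumption made here for illustration; the abstract does not name a specific agreement statistic, and the labels and the `cohens_kappa` helper are hypothetical.

```python
from collections import Counter


def cohens_kappa(human, model):
    """Chance-corrected agreement between two lists of category labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(human) != len(model) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(human)
    # Share of items on which both coders assigned the same category.
    observed = sum(h == m for h, m in zip(human, model)) / n
    # Agreement expected by chance, from each coder's marginal frequencies.
    human_counts = Counter(human)
    model_counts = Counter(model)
    expected = sum(human_counts[c] * model_counts[c] for c in human_counts) / n**2
    if expected == 1.0:
        # Both coders used one identical category throughout; kappa is
        # undefined here, and this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up validation labels: human codes vs. model codes for ten items.
human_codes = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
model_codes = ["neg", "pos", "pos", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
kappa = cohens_kappa(human_codes, model_codes)
print(f"Cohen's kappa: {kappa:.2f}")
```

In practice the validation set would be coded by trained human coders following the codebook, and the acceptable agreement threshold depends on the discipline's conventions.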

data mining, large language model, machine learning, (20 more...)

arXiv: 2510.24337

Country: Europe > Austria (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Hierarchical Error Framework for Reliable Automated Coding in Communication Research: Applications to Health and Political Communication

Zhao, Zhilong, Liu, Yindi

arXiv.org Artificial Intelligence, Oct-27-2025

Automated content analysis increasingly supports communication research, yet scaling manual coding into computational pipelines raises concerns about measurement reliability and validity. We introduce a Hierarchical Error Correction (HEC) framework that treats model failures as layered measurement errors (knowledge gaps, reasoning limitations, and complexity constraints) and targets the layers that most affect inference. The framework implements a three-phase methodology: systematic error profiling across hierarchical layers, targeted intervention design matched to dominant error sources, and rigorous validation with statistical testing. Evaluating HEC across health communication (medical specialty classification), political communication (bias detection), and legal tasks, we validate the approach with five diverse large language models. Results show average accuracy gains of 11.2 percentage points (p < .001, McNemar's test) and stable conclusions via reduced systematic misclassification. Cross-model validation demonstrates consistent improvements (range: +6.8 to +14.6pp), with effectiveness concentrated in moderate-to-high baseline tasks (50-85% accuracy). A boundary study reveals diminished returns in very high-baseline (>85%) or precision-matching tasks, establishing applicability limits. We map layered errors to threats to construct and criterion validity and provide a transparent, measurement-first blueprint for diagnosing error profiles, selecting targeted interventions, and reporting reliability/validity evidence alongside accuracy. This applies to automated coding across communication research and the broader social sciences.
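The abstract reports its accuracy gains with McNemar's test, which compares two classifiers on the same items by looking only at the items where they disagree. The following is a minimal sketch of the exact (binomial) form of that test, assuming paired correct/incorrect flags per item; whether the authors used the exact or the chi-square variant is not stated, and the function name and toy data are hypothetical.

```python
from math import comb


def mcnemar_exact(baseline_correct, improved_correct):
    """Two-sided exact McNemar test on paired correctness flags.

    Returns (b, c, p) where b counts items only the baseline got right
    and c counts items only the improved pipeline got right.
    """
    if len(baseline_correct) != len(improved_correct):
        raise ValueError("both runs must cover the same items")
    pairs = list(zip(baseline_correct, improved_correct))
    b = sum(1 for base, imp in pairs if base and not imp)
    c = sum(1 for base, imp in pairs if imp and not base)
    n = b + c
    if n == 0:
        # No discordant pairs: the two runs are indistinguishable.
        return b, c, 1.0
    # Under the null hypothesis each discordant pair is a fair coin flip,
    # so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2**n
    return b, c, min(1.0, 2 * tail)


# Made-up toy data: 20 items coded before and after an intervention.
baseline = [True] + [False] * 9 + [True] * 10
improved = [False] + [True] * 9 + [True] * 10
b, c, p = mcnemar_exact(baseline, improved)
print(f"baseline-only correct: {b}, improved-only correct: {c}, p = {p:.4f}")
```

Because the test conditions on discordant pairs, items that both runs classify correctly (or both incorrectly) do not affect the p-value.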

large language model, machine learning, natural language, (20 more...)

arXiv: 2509.24841

Country: Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Law (0.95)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Echoes of Humanity: Exploring the Perceived Humanness of AI Music

Figueiredo, Flavio, Martinelli, Giovanni, Sousa, Henrique, Rodrigues, Pedro, Pedrosa, Frederico, Ferreira, Lucas N.

arXiv.org Artificial Intelligence, Oct-1-2025

Recent advances in AI music (AIM) generation services are currently transforming the music industry. Given these advances, understanding how humans perceive AIM is crucial both to educate users on identifying AIM songs, and, conversely, to improve current models. We present results from a listener-focused experiment aimed at understanding how humans perceive AIM. In a blind, Turing-like test, participants were asked to distinguish, from a pair, the AIM and human-made song. We contrast with other studies by utilizing a randomized controlled crossover trial that controls for pairwise similarity and allows for a causal interpretation. We are also the first study to employ a novel, author-uncontrolled dataset of AIM songs from real-world usage of commercial models (i.e., Suno). We establish that listeners' reliability in distinguishing AIM causally increases when pairs are similar. Lastly, we conduct a mixed-methods content analysis of listeners' free-form feedback, revealing a focus on vocal and technical cues in their judgments.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 2509.25601

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Empowering Computing Education Researchers Through LLM-Assisted Content Analysis

Gale, Laurie, Nicolajsen, Sebastian Mateos

arXiv.org Artificial Intelligence, Aug-27-2025

Computing education research (CER) is often instigated by practitioners wanting to improve both their own and the wider discipline's teaching practice. However, the latter is often difficult as many researchers lack the colleagues, resources, or capacity to conduct research that is generalisable or rigorous enough to advance the discipline. As a result, research methods that enable sense-making with larger volumes of qualitative data, while not increasing the burden on the researcher, have significant potential within CER. In this discussion paper, we propose such a method for conducting rigorous analysis on large volumes of textual data, namely a variation of LLM-assisted content analysis (LACA). This method combines content analysis with the use of large language models, empowering researchers to conduct larger-scale research which they would otherwise not be able to perform. Using a computing education dataset, we illustrate how LACA could be applied in a reproducible and rigorous manner. We believe this method has potential in CER, enabling more generalisable findings from a wider range of research. This, together with the development of similar methods, can help to advance both the practice and research quality of the CER discipline.

artificial intelligence, large language model, natural language, (14 more...)

arXiv: 2508.18872

Country:

North America > United States (1.00)
Europe (0.93)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

ProxAnn: Use-Oriented Evaluations of Topic Models and Document Clustering

Hoyle, Alexander, Calvo-Bartolomé, Lorena, Boyd-Graber, Jordan, Resnik, Philip

arXiv.org Artificial Intelligence, Jul-2-2025

Topic model and document-clustering evaluations either use automated metrics that align poorly with human preferences or require expert labels that are intractable to scale. We design a scalable human evaluation protocol and a corresponding automated approximation that reflect practitioners' real-world usage of models. Annotators -- or an LLM-based proxy -- review text items assigned to a topic or cluster, infer a category for the group, then apply that category to other documents. Using this protocol, we collect extensive crowdworker annotations of outputs from a diverse set of topic models on two datasets. We then use these annotations to validate automated proxies, finding that the best LLM proxies are statistically indistinguishable from a human annotator and can therefore serve as a reasonable substitute in automated evaluations. Package, web interface, and data are at https://github.com/ahoho/proxann

annotator, large language model, machine learning, (20 more...)

arXiv: 2507.00828

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Ohio (0.04)
North America > United States > Maryland (0.04)
(25 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Baseball (1.00)
Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Recalibrating the Compass: Integrating Large Language Models into Classical Research Methods

Peng, Tai-Quan, Yang, Xuzhen

arXiv.org Artificial Intelligence, May-27-2025

This paper examines how large language models (LLMs) are transforming core quantitative methods in communication research in particular, and in the social sciences more broadly, namely content analysis, survey research, and experimental studies. Rather than replacing classical approaches, LLMs introduce new possibilities for coding and interpreting text, simulating dynamic respondents, and generating personalized and interactive stimuli. Drawing on recent interdisciplinary work, the paper highlights both the potential and limitations of LLMs as research tools, including issues of validity, bias, and interpretability. To situate these developments theoretically, the paper revisits Lasswell's foundational framework -- "Who says what, in which channel, to whom, with what effect?" -- and demonstrates how LLMs reconfigure message studies, audience analysis, and effects research by enabling interpretive variation, audience trajectory modeling, and counterfactual experimentation. Revisiting the metaphor of the methodological compass, the paper argues that classical research logics remain essential as the field integrates LLMs and generative AI. By treating LLMs not only as technical instruments but also as epistemic and cultural tools, the paper calls for thoughtful, rigorous, and imaginative use of LLMs in future communication and social science research.

large language model, machine learning, natural language, (18 more...)

arXiv: 2505.19402

Country:

North America > United States > Michigan (0.28)
North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (0.88)
Research Report > New Finding (0.88)

Industry:

Government (1.00)
Media > News (0.67)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.66)

Data to Decisions: A Computational Framework to Identify skill requirements from Advertorial Data

Singh, Aakash, Kanaujia, Anurag, Singh, Vivek Kumar

arXiv.org Artificial Intelligence, Mar-21-2025

Among the factors of production, human capital or skilled manpower is the one that keeps evolving and adapts to changing conditions and resources. This adaptability makes human capital the most crucial factor in ensuring sustainable growth of an industry or sector. As new technologies are developed and adopted, new generations are required to acquire skills in newer technologies in order to be employable. At the same time, professionals are required to upskill and reskill themselves to remain relevant in the industry. There is, however, no straightforward method to identify the skill needs of the industry at a given point in time. Therefore, this paper proposes a data-to-decision framework that can successfully identify the desired skill set in a given area by analysing the advertorial data collected from popular online job portals and supplied as input to the framework. The proposed framework uses techniques of statistical analysis, data mining and natural language processing for this purpose. The applicability of the framework is demonstrated on CS&IT job advertisement data from India. The analytical results not only provide useful insights about the current state of skill needs in the CS&IT industry but also provide practical implications to prospective job applicants, training agencies, and institutions of higher education & professional training.

data mining, machine learning, natural language, (21 more...)

doi: 10.1007/978-3-031-83793-7_28

arXiv: 2503.17424

Country:

North America > United States (0.68)
Asia > India > Karnataka > Bengaluru (0.05)
Asia > India > Maharashtra > Mumbai (0.04)
(9 more...)

Genre:

Instructional Material (0.88)
Research Report > New Finding (0.67)

Industry:

Banking & Finance (1.00)
Information Technology > Software (0.94)
Education > Educational Setting > Higher Education (0.68)
Government > Regional Government (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

MediaMind: Revolutionizing Media Monitoring using Agentification

Gunduz, Ahmet, Yuksel, Kamer Ali, Sawaf, Hassan

arXiv.org Artificial Intelligence, Feb-18-2025

In an era of rapid technological advancements, agentification of software tools has emerged as a critical innovation, enabling systems to function autonomously and adaptively. This paper introduces MediaMind as a case study to demonstrate the agentification process, highlighting how existing software can be transformed into intelligent agents capable of independent decision-making and dynamic interaction. Developed by aiXplain, MediaMind leverages agent-based architecture to autonomously monitor, analyze, and provide insights from multilingual media content in real time. The focus of this paper is on the technical methodologies and design principles behind agentifying MediaMind, showcasing how agentification enhances adaptability, efficiency, and responsiveness. Through detailed case studies and practical examples, we illustrate how the agentification of MediaMind empowers organizations to streamline workflows, optimize decision-making, and respond to evolving trends. This work underscores the broader potential of agentification to revolutionize software tools across various domains.

artificial intelligence, mediamind, natural language, (17 more...)

arXiv: 2502.12745

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (0.40)

Industry: Media (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

SCALE: Towards Collaborative Content Analysis in Social Science with Large Language Model Agents and Human Intervention

Zhao, Chengshuai, Tan, Zhen, Wong, Chau-Wai, Zhao, Xinyan, Chen, Tianlong, Liu, Huan

arXiv.org Artificial Intelligence, Feb-15-2025

Content analysis breaks down complex and unstructured texts into theory-informed numerical categories. Particularly, in social science, this process usually relies on multiple rounds of manual annotation, domain expert discussion, and rule-based refinement. In this paper, we introduce SCALE, a novel multi-agent framework that effectively Simulates Content Analysis via Large language model (LLM) agEnts. SCALE imitates key phases of content analysis, including text coding, collaborative discussion, and dynamic codebook evolution, capturing the reflective depth and adaptive discussions of human researchers. Furthermore, by integrating diverse modes of human intervention, SCALE is augmented with expert input to further enhance its performance. Extensive evaluations on real-world datasets demonstrate that SCALE achieves human-approximated performance across various complex content analysis tasks, offering an innovative potential for future social science research.

large language model, machine learning, natural language, (18 more...)

arXiv: 2502.10937

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.98)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

DreamLLM-3D: Affective Dream Reliving using Large Language Model and 3D Generative AI

Liu, Pinyao, Lee, Keon Ju, Steinmaurer, Alexander, Picard-Deland, Claudia, Carr, Michelle, Kitson, Alexandra

arXiv.org Artificial Intelligence, Feb-13-2025

We present DreamLLM-3D, a composite multimodal AI system behind an immersive art installation for dream re-experiencing. It enables automated dream content analysis for immersive dream-reliving, by integrating a Large Language Model (LLM) with text-to-3D Generative AI. The LLM processes voiced dream reports to identify key dream entities (characters and objects), social interaction, and dream sentiment. The extracted entities are visualized as dynamic 3D point clouds, with emotional data influencing the color and soundscapes of the virtual dream environment. Additionally, we propose an experiential AI-Dreamworker Hybrid paradigm. Our system and paradigm could potentially facilitate a more emotionally engaging dream-reliving experience, enhancing personal insights and creativity.

content analysis, dreamwork, interaction, (15 more...)

arXiv: 2503.16439

Country:

South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > Connecticut > Fairfield County > Norwalk (0.04)
(4 more...)

Genre: Research Report (0.41)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.62)
