InvisibleBench: A Deployment Gate for Caregiving Relationship AI
InvisibleBench is a deployment gate for caregiving-relationship AI, evaluating 3-20+ turn interactions across five dimensions: Safety, Compliance, Trauma-Informed Design, Belonging/Cultural Fitness, and Memory. The benchmark includes autofail conditions for missed crises, medical advice (WOPR Act), harmful information, and attachment engineering. We evaluate four frontier models across 17 scenarios (N=68) spanning three complexity tiers. All models show significant safety gaps (11.8-44.8 percent crisis detection), indicating the necessity of deterministic crisis routing in production systems. DeepSeek Chat v3 achieves the highest overall score (75.9 percent), while strengths differ by dimension: GPT-4o Mini leads Compliance (88.2 percent), Gemini leads Trauma-Informed Design (85.0 percent), and Claude Sonnet 4.5 ranks highest in crisis detection (44.8 percent). We release all scenarios, judge prompts, and scoring configurations with code. InvisibleBench extends single-turn safety tests by evaluating longitudinal risk, where real harms emerge. No clinical claims; this is a deployment-readiness evaluation.
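Since the abstract argues that deterministic crisis routing must back up model-level detection, a minimal sketch of such a gate may help. The keyword patterns, the `CRISIS_PATTERNS` list, and the `route_message` function are illustrative assumptions only, not part of the released benchmark code; a production gate would use a clinically reviewed trigger list and likely a dedicated classifier.

```python
import re

# Hypothetical trigger patterns; placeholders for illustration only.
CRISIS_PATTERNS = [
    r"\bkill myself\b",
    r"\bend my life\b",
    r"\bdon'?t want to (be here|live)\b",
    r"\bhurt (myself|my (mom|dad|patient))\b",
]

def route_message(user_message: str) -> str:
    """Deterministically route crisis language before any model call.

    Returns "crisis" when a pattern matches, otherwise "model" so the
    conversational system handles the turn normally.
    """
    text = user_message.lower()
    if any(re.search(p, text) for p in CRISIS_PATTERNS):
        return "crisis"   # hand off to hotline info / human escalation
    return "model"

if __name__ == "__main__":
    print(route_message("Some days I just don't want to be here anymore"))  # crisis
    print(route_message("Grandma refused her medication again today"))      # model
```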
- North America > United States > Illinois (0.05)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Law (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Government (1.00)
Remarks on a recent preprint of Chernikov and Towsner
In this brief note, we first give a counterexample to a theorem of Chernikov and Towsner, arXiv:2510.02420(1). In arXiv:2510.02420(2) the statement of the theorem has changed, but, as we explain, its proof contains a mistake. The change in the statement, which stems from a change in the underlying definition, affects the paper's claims. Since that theorem had provided the connection between their paper and Coregliano-Malliaris high-arity PAC learning, a connection which now disappears, we also explain why their definitions miss crucial aspects that our work was designed to grapple with.
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
- Asia > Middle East > Israel (0.04)
A modular framework for automated evaluation of procedural content generation in serious games with deep reinforcement learning agents
Kalafatis, Eleftherios, Mitsis, Konstantinos, Zarkogianni, Konstantia, Athanasiou, Maria, Nikita, Konstantina
Serious Games (SGs) are nowadays shifting focus to include procedural content generation (PCG) in the development process as a means of offering a personalized and enhanced player experience. However, developing a framework to assess the impact of PCG techniques when integrated into SGs remains particularly challenging. This study proposes a methodology for the automated evaluation of PCG integration in SGs, incorporating deep reinforcement learning (DRL) game testing agents. To validate the proposed framework, a previously introduced SG featuring card game mechanics and incorporating three different versions of PCG for non-player character (NPC) creation has been deployed. Version 1 features random NPC creation, while Versions 2 and 3 utilize a genetic algorithm approach. These versions are used to test the impact of different dynamic SG environments on the proposed framework's agents. The obtained results highlight the superiority of the DRL game testing agents trained on Versions 2 and 3 over those trained on Version 1 in terms of win rate (i.e., the number of wins per games played) and training time. More specifically, in a test emulating regular gameplay, both Versions 2 and 3 peaked at a 97% win rate and achieved statistically significantly higher (p = 0.009) win rates than Version 1, which peaked at 94%. Overall, the results support the proposed framework's capability to produce meaningful data for the evaluation of procedurally generated content in SGs.
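As a rough illustration of the kind of win-rate comparison the abstract reports, here is a two-proportion z-test sketch; the 1000-game counts are placeholders and the 97% vs. 94% rates are taken from the abstract, so this is not the study's actual evaluation harness or test choice.

```python
from math import sqrt, erfc

def two_proportion_z_test(wins_a: int, games_a: int, wins_b: int, games_b: int) -> tuple[float, float]:
    """Compare two win rates (wins per games played) with a pooled z-test."""
    p_a, p_b = wins_a / games_a, wins_b / games_b
    p_pool = (wins_a + wins_b) / (games_a + games_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / games_a + 1 / games_b))
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value
    return z, p_value

# Placeholder numbers echoing the reported peaks (97% vs 94%) over a
# hypothetical 1000-game test run per version.
z, p = two_proportion_z_test(970, 1000, 940, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```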
- Europe > Greece (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Netherlands > Limburg > Maastricht (0.04)
- Research Report > Experimental Study (0.88)
- Research Report > New Finding (0.66)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Health & Medicine (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Lecture Video Visual Objects (LVVO) Dataset: A Benchmark for Visual Object Detection in Educational Videos
Biswas, Dipayan, Shah, Shishir, Subhlok, Jaspal
We introduce the Lecture Video Visual Objects (LVVO) dataset, a new benchmark for visual object detection in educational video content. The dataset consists of 4,000 frames extracted from 245 lecture videos spanning biology, computer science, and geosciences. A subset of 1,000 frames, referred to as LVVO_1k, has been manually annotated with bounding boxes for four visual categories: Table, Chart-Graph, Photographic-image, and Visual-illustration. Each frame was labeled independently by two annotators, resulting in an inter-annotator F1 score of 83.41%, indicating strong agreement. To ensure high-quality consensus annotations, a third expert reviewed and resolved all cases of disagreement through a conflict resolution process. To expand the dataset, a semi-supervised approach was employed to automatically annotate the remaining 3,000 frames, forming LVVO_3k. The complete dataset offers a valuable resource for developing and evaluating both supervised and semi-supervised methods for visual content detection in educational videos. The LVVO dataset is publicly available to support further research in this domain.
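To make the inter-annotator F1 figure concrete, a minimal sketch of how such agreement can be computed from two annotators' bounding boxes follows. The IoU threshold of 0.5 and the greedy matching are assumptions, since the paper's exact matching protocol is not spelled out in the abstract.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def pairwise_f1(boxes_a, boxes_b, thr=0.5):
    """Treat annotator A as 'prediction' and B as 'reference'; greedily match by IoU."""
    unmatched_b = list(boxes_b)
    tp = 0
    for box in boxes_a:
        best = max(unmatched_b, key=lambda b: iou(box, b), default=None)
        if best is not None and iou(box, best) >= thr:
            tp += 1
            unmatched_b.remove(best)
    precision = tp / len(boxes_a) if boxes_a else 1.0
    recall = tp / len(boxes_b) if boxes_b else 1.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Toy example: two annotators marked nearly the same chart region on a frame.
print(pairwise_f1([(10, 10, 100, 80)], [(12, 12, 98, 82)]))
```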
- Instructional Material (0.47)
- Research Report (0.40)
Assessing and Refining ChatGPT's Performance in Identifying Targeting and Inappropriate Language: A Comparative Study
Barbarestani, Baran, Maks, Isa, Vossen, Piek
This study evaluates the effectiveness of ChatGPT, an advanced AI model for natural language processing, in identifying targeting and inappropriate language in online comments. With the increasing challenge of moderating vast volumes of user-generated content on social network sites, the role of AI in content moderation has gained prominence. We compared ChatGPT's performance against crowd-sourced annotations and expert evaluations to assess its accuracy, scope of detection, and consistency. Our findings highlight that ChatGPT performs well in detecting inappropriate content, showing notable improvements in accuracy through iterative refinements, particularly in Version 6. However, its performance in targeting language detection showed variability, with higher false positive rates compared to expert judgments. This study contributes to the field by demonstrating the potential of AI models like ChatGPT to enhance automated content moderation systems while also identifying areas for further improvement. The results underscore the importance of continuous model refinement and contextual understanding to better support automated moderation and mitigate harmful online behavior.
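The comparison against expert judgments reduces to standard confusion-matrix statistics. The sketch below shows one way to compute accuracy and the false positive rate from paired label lists; the example labels are invented placeholders, not data from the study.

```python
def confusion_stats(model_labels, expert_labels, positive="targeting"):
    """Accuracy and false positive rate of model labels against expert labels."""
    tp = fp = tn = fn = 0
    for m, e in zip(model_labels, expert_labels):
        if m == positive and e == positive:
            tp += 1
        elif m == positive and e != positive:
            fp += 1
        elif m != positive and e != positive:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / len(model_labels)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr

# Invented placeholder labels for illustration only.
model  = ["targeting", "ok", "targeting", "ok", "targeting"]
expert = ["targeting", "ok", "ok",        "ok", "ok"]
print(confusion_stats(model, expert))  # a higher FPR flags over-detection relative to experts
```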
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Middle East > Israel (0.04)
- Europe > Finland (0.04)
- Asia > India (0.04)
- Health & Medicine (1.00)
- Media (0.68)
MoESD: Mixture of Experts Stable Diffusion to Mitigate Gender Bias
Text-to-image models are known to propagate social biases. For example, when prompted to generate images of people in certain professions, these models tend to systematically generate images of specific genders or ethnicities. In this paper, we show that this bias is already present in the model's text encoder, and we introduce a Mixture-of-Experts approach that identifies text-encoded bias in the latent space and then applies a bias-identification gate. More specifically, we propose MoESD (Mixture of Experts Stable Diffusion) with BiAs (Bias Adapters) to mitigate gender bias. We also demonstrate that a special token is essential during the mitigation process. In experiments focusing on gender bias, we demonstrate that our approach successfully mitigates it while maintaining image quality.
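The gating idea can be illustrated with a small PyTorch sketch: a learned gate scores the text embedding for bias and blends in a low-rank adapter correction. The module names, dimensions, and the sigmoid gate are assumptions made for illustration; the paper's actual MoESD/BiAs architecture may differ.

```python
import torch
from torch import nn

class BiasAdapterGate(nn.Module):
    """Toy bias-identification gate over a text-encoder embedding.

    A scalar gate estimates how 'biased' the prompt embedding looks and
    scales a low-rank adapter correction that is added to the embedding.
    """

    def __init__(self, dim: int = 768, rank: int = 16):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, 1))
        self.down = nn.Linear(dim, rank, bias=False)   # low-rank adapter, BiAs-style
        self.up = nn.Linear(rank, dim, bias=False)

    def forward(self, text_emb: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(text_emb))          # (batch, 1) bias score
        correction = self.up(self.down(text_emb))       # (batch, dim)
        return text_emb + g * correction                # adjusted embedding

emb = torch.randn(2, 768)                               # stand-in for CLIP text features
debiased = BiasAdapterGate()(emb)
print(debiased.shape)                                    # torch.Size([2, 768])
```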
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine (1.00)
- Consumer Products & Services > Restaurants (0.46)
The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG)
Cai, Yucheng, Chen, Si, Huang, Yi, Feng, Junlan, Ou, Zhijian
Developing intelligent dialog systems has been one of the longest-running goals in AI. In recent years, significant progress has been made in building dialog systems with the breakthrough of deep learning methods and the large amount of conversational data being made available for system development (Budzianowski et al., 2018; Ou et al., 2022a; Ouyang et al., 2022; Achiam et al., 2023). Still, many challenges remain toward building future dialog systems. The first FutureDial challenge focused on building semi-supervised and reinforced task-oriented dialog systems (FutureDial-SereTOD) (Ou et al., 2022a;b) and was successfully held at the EMNLP 2022 SereTOD workshop. However, problems like hallucination and fabrication (Alkaissi & McFarlane, 2023) still hinder the use of such systems in real-life applications like customer service systems, which require pinpoint accuracy. Retrieval augmented generation (RAG) (Lewis et al., 2020; Guu et al., 2020) has been introduced to enhance dialog systems with information retrieved from external knowledge bases and has attracted increasing interest.
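A minimal sketch of the retrieval-augmented generation pattern the challenge centers on is given below. The hash-based embedding, the knowledge-base entries, and the prompt template are placeholders, not the challenge's actual baseline, datasets, or retriever.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Placeholder embedding: hash words into a fixed-size bag-of-words vector."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Return the k knowledge-base entries most similar to the query."""
    q = embed(query)
    scores = [float(q @ embed(doc)) for doc in knowledge_base]
    top = np.argsort(scores)[::-1][:k]
    return [knowledge_base[i] for i in top]

def build_prompt(dialog_history: str, user_turn: str, knowledge_base: list[str]) -> str:
    """Prepend retrieved passages so the generator can ground its response."""
    passages = retrieve(user_turn, knowledge_base)
    context = "\n".join(f"- {p}" for p in passages)
    return f"Knowledge:\n{context}\n\nDialog:\n{dialog_history}\nUser: {user_turn}\nAssistant:"

kb = ["Refunds are processed within 5 business days.",
      "Data plans can be changed once per billing cycle.",
      "The hotline is open 9:00-18:00 on weekdays."]
print(build_prompt("", "How long does a refund take?", kb))
```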
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Deep Privacy Funnel Model: From a Discriminative to a Generative Approach with an Application to Face Recognition
Razeghi, Behrooz, Rahimi, Parsa, Marcel, Sébastien
In this study, we apply the information-theoretic Privacy Funnel (PF) model to the domain of face recognition, developing a novel method for privacy-preserving representation learning within an end-to-end training framework. Our approach addresses the trade-off between obfuscation and utility in data protection, quantified through logarithmic loss, also known as self-information loss. This research provides a foundational exploration into the integration of information-theoretic privacy principles with representation learning, focusing specifically on face recognition systems. We particularly highlight the adaptability of our framework to recent advancements in face recognition networks, such as AdaFace and ArcFace. In addition, we introduce the Generative Privacy Funnel ($\mathsf{GenPF}$) model, a paradigm that extends beyond the traditional scope of the PF model, referred to as the Discriminative Privacy Funnel ($\mathsf{DisPF}$). This $\mathsf{GenPF}$ model brings new perspectives on data generation methods with estimation-theoretic and information-theoretic privacy guarantees. Complementing these developments, we also present the deep variational PF (DVPF) model. This model proposes a tractable variational bound for measuring information leakage, enhancing the understanding of privacy preservation challenges in deep representation learning. The DVPF model, associated with both the $\mathsf{DisPF}$ and $\mathsf{GenPF}$ models, sheds light on connections with various generative models such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and Diffusion models. Complementing our theoretical contributions, we release a reproducible PyTorch package, facilitating further exploration and application of these privacy-preserving methodologies in face recognition systems.
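For readers unfamiliar with the Privacy Funnel, the underlying optimization can be summarized as follows. This is the standard PF formulation under logarithmic (self-information) loss, stated here as background rather than as the paper's exact variational objective.

```latex
% Privacy Funnel (PF): given useful data X and a correlated sensitive attribute S,
% choose a stochastic release mapping p(z|x) that minimizes leakage about S while
% retaining at least R bits of utility about X (Markov chain S - X - Z).
\[
\mathrm{PF}(R) \;=\; \min_{p(z \mid x)\,:\; S \to X \to Z} \; I(S; Z)
\quad \text{subject to} \quad I(X; Z) \,\ge\, R .
\]
% Under logarithmic (self-information) loss, both utility and leakage reduce to the
% mutual-information terms above, which is the sense used in the abstract.
```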
- Asia > Middle East > Jordan (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Africa > Sudan (0.04)
- Research Report > Promising Solution (0.87)
- Research Report > New Finding (0.87)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Government > Military (0.92)
Summarizing Strategy Card Game AI Competition
Kowalski, Jakub, Miernik, Radosław
This paper concludes five years of AI competitions based on Legends of Code and Magic (LOCM), a small Collectible Card Game (CCG), designed with the goal of supporting research and algorithm development. The game was used in a number of events, including Community Contests on the CodinGame platform, and the Strategy Card Game AI Competition at the IEEE Congress on Evolutionary Computation and the IEEE Conference on Games. LOCM has been used in a number of publications related to areas such as game tree search algorithms, neural networks, evaluation functions, and CCG deckbuilding. We present the rules of the game, the history of organized competitions, and a listing of the participants and their approaches, as well as some general advice on organizing AI competitions for the research community. Although the COG 2022 edition was announced to be the last one, the game remains available and can be played using an online leaderboard arena.
Davinci the Dualist: the mind-body divide in large language models and in human learners
Berent, Iris, Sansiveri, Alexzander
A large literature suggests that people are intuitive Dualists--they consider the mind ethereal, distinct from the body. Past research also shows that Dualism emerges, in part, via learning (e.g., Barlev & Shtulman, 2021). But whether learning is sufficient to give rise to Dualism is unknown. The evidence from human learners does not address this question, because humans are endowed not only with general learning capacities but also with core knowledge capacities, and recent results suggest that core knowledge begets Dualism (Berent, Theodore & Valencia, 2021; Berent, 2023). To evaluate the role of learning, here we probe for a mind-body divide in Davinci--a large language model (LLM) that is devoid of any innate core knowledge. We show that Davinci still leans towards Dualism, and that this bias increases systematically with the learner's inductive potential. Thus, davinci (a GPT-3 model) exhibits mild Dualist tendencies, whereas its descendant, text-davinci-003 (a GPT-3.5 model), shows a full-blown bias. It selectively considers thoughts (epistemic states) as disembodied--as unlikely to show up in the body (in the brain), but not in its absence (after death). While Davinci's performance is constrained by its syntactic limitations and differs from humans, its Dualist bias is robust. These results demonstrate that the mind-body divide is partly learnable from experience. They also show how, as LLMs are exposed to human narratives, they induce not only human knowledge but also human biases.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > New York (0.04)