AITopics

Developers are widely using AI code-generation models, aiming to increase productivity and efficiency. However, there are also quality concerns regarding the AI-generated code. The generated code is produced by models trained on publicly available code, which is known to contain bugs and quality issues. Those issues can cause trust and maintenance challenges during the development process. Several quality issues associated with AI-generated code have been reported, including bugs and defects. However, these findings are often scattered and lack a systematic summary. A comprehensive review is currently lacking to reveal the types and distribution of these errors, possible remediation strategies, and their correlation with specific models. In this paper, we systematically analyze the existing AI-generated code literature to establish an overall understanding of bugs and defects in generated code, providing a reference for future model improvement and quality assessment. We aim to understand the nature and extent of bugs in AI-generated code, and provide a classification of bug types and patterns present in code generated by different models. We also discuss possible fixes and mitigation strategies adopted to eliminate bugs from the generated code.

large language model, machine learning, programming language, (21 more...)

2512.05239

Country: Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
(4 more...)

Rethinking Tokenization for Clinical Time Series: When Less is More

Attrach, Rafi Al, Fani, Rajna, Restrepo, David, Jia, Yugang, Schüffler, Peter

Tokenization strategies shape how models process electronic health records, yet fair comparisons of their effectiveness remain limited. We present a systematic evaluation of tokenization approaches for clinical time series modeling using transformer-based architectures, revealing task-dependent and sometimes counterintuitive findings about temporal and value feature importance. Through controlled ablations across four clinical prediction tasks on MIMIC-IV, we demonstrate that explicit time encodings provide no consistent statistically significant benefit for the evaluated downstream tasks. Value features show task-dependent importance, affecting mortality prediction but not readmission, suggesting code sequences alone can carry sufficient predictive signal. We further show that frozen pretrained code encoders dramatically outperform their trainable counterparts while requiring dramatically fewer parameters. Larger clinical encoders provide consistent improvements across tasks, benefiting from frozen embeddings that eliminate computational overhead. Our controlled evaluation enables fairer tokenization comparisons and demonstrates that simpler, parameter-efficient approaches can, in many cases, achieve strong performance, though the optimal tokenization strategy remains task-dependent.

large language model, machine learning, natural language, (17 more...)

2512.05217

Country:

Europe (0.30)
North America > United States > Massachusetts (0.15)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Health & Medicine > Health Care Technology > Medical Record (0.90)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Coefficient of Variation Masking: A Volatility-Aware Strategy for EHR Foundation Models

Fani, Rajna, Attrach, Rafi Al, Restrepo, David, Jia, Yugang, Celi, Leo Anthony, Schüffler, Peter

Masked autoencoders (MAEs) are increasingly applied to electronic health records (EHR) for learning general-purpose representations that support diverse clinical tasks. However, existing approaches typically rely on uniform random masking, implicitly assuming all features are equally predictable. In reality, laboratory tests exhibit substantial heterogeneity in volatility: some biomarkers (e.g., sodium) remain stable, while others (e.g., lactate) fluctuate considerably and are more difficult to model. Clinically, volatile biomarkers often signal acute pathophysiology and require more sophisticated modeling to capture their complex temporal patterns. We propose a volatility-aware pretraining strategy, Coefficient of Variation Masking (CV-Masking), that adaptively adjusts masking probabilities according to the intrinsic variability of each feature. Combined with a value-only masking objective aligned with clinical workflows, CV-Masking yields systematic improvements over random and variance-based strategies. Experiments on a large panel of laboratory tests show that CV-Masking enhances reconstruction, improves downstream predictive performance, and accelerates convergence, producing more robust and clinically meaningful EHR representations.
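The idea of tying masking probability to feature volatility can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact mapping from coefficient of variation to masking probability, so the linear scaling, the `base_rate` parameter, and the function names `cv_masking_probs` and `sample_value_mask` are all assumptions, and the sodium and lactate numbers are made up.

```python
import random
import statistics


def cv_masking_probs(columns, base_rate=0.15):
    """Map each feature to a masking probability proportional to its
    coefficient of variation (CV = std / |mean|).

    columns: dict of feature name -> list of observed numeric values.
    base_rate: assumed average masking rate across features (hypothetical).
    """
    cvs = {}
    for name, values in columns.items():
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        cvs[name] = std / abs(mean) if mean != 0 else 0.0
    avg_cv = statistics.fmean(list(cvs.values()))
    if avg_cv == 0:
        # No variability anywhere: fall back to uniform random masking.
        return {name: base_rate for name in cvs}
    # Linear scaling (an assumption): volatile features are masked more often.
    return {name: min(1.0, base_rate * cv / avg_cv) for name, cv in cvs.items()}


def sample_value_mask(record, probs, rng):
    """Pick which observed values to hide in one record.

    Only values are masked; missing entries (None) are never selected.
    """
    return {
        name
        for name, value in record.items()
        if value is not None and rng.random() < probs[name]
    }


# Illustrative data only: a stable and a volatile laboratory test.
columns = {
    "sodium": [140.0, 141.0, 139.0, 140.0],
    "lactate": [1.0, 4.0, 2.0, 8.0],
}
probs = cv_masking_probs(columns)
masked = sample_value_mask({"sodium": 140.0, "lactate": 2.0}, probs, random.Random(0))
```

With this scaling the average masking probability stays at `base_rate` as long as no feature is clipped at 1.0, which keeps the comparison with uniform random masking roughly like for like.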

laboratory test, large language model, machine learning, (16 more...)

2512.05216

Country:

Europe (0.29)
North America > United States > Massachusetts (0.15)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.94)

Industry:

Health & Medicine > Diagnostic Medicine (0.91)
Health & Medicine > Health Care Technology > Medical Record (0.70)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

On the Computability of Artificial General Intelligence

Mappouras, Georgios, Rossides, Charalambos

In recent years we have observed rapid and significant advancements in artificial intelligence (A.I.), so much so that many wonder how close humanity is to developing an A.I. model that can achieve human-level intelligence, also known as artificial general intelligence (A.G.I.). In this work we look at this question and attempt to define the upper bounds, not just of A.I., but of any machine-computable process (a.k.a. an algorithm). To answer this question, however, one must first precisely define A.G.I. We borrow prior work's definition of A.G.I. [1] that best describes the sentiment of the term, as used by the leading developers of A.I.: the ability to be creative and innovate in some field of study in a way that unlocks new and previously unknown functional capabilities in that field. Based on this definition we draw new bounds on the limits of computation. We formally prove that no algorithm can demonstrate new functional capabilities that were not already present in the initial algorithm itself. Therefore, no algorithm (and thus no A.I. model) can be truly creative in any field of study, whether that is science, engineering, art, sports, etc. In contrast, A.I. models can demonstrate existing functional capabilities, as well as combinations and permutations of existing functional capabilities. We conclude this work by discussing the implications of this proof, both for the future of A.I. development and for what it means for the origins of human intelligence.

large language model, machine learning, natural language, (19 more...)

2512.05212

Genre: Research Report (0.82)

Industry:

Information Technology (1.00)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Creativity & Intelligence (0.68)

Fine-Tuning BERT for Domain-Specific Question Answering: Toward Educational NLP Resources at University Scale

Montfrond, Aurélie

Prior work on scientific question answering has largely emphasized chatbot-style systems, with limited exploration of fine-tuning foundation models for domain-specific reasoning. In this study, we developed a chatbot for the University of Limerick's Department of Electronic and Computer Engineering to provide course information to students. A custom dataset of 1,203 question-answer pairs in SQuAD format was constructed using the university book of modules, supplemented with manually and synthetically generated entries. We fine-tuned BERT (Devlin et al., 2019) using PyTorch and evaluated performance with Exact Match and F1 scores. Results show that even modest fine-tuning improves hypothesis framing and knowledge extraction, demonstrating the feasibility of adapting foundation models to educational domains. While domain-specific BERT variants such as BioBERT and SciBERT exist for biomedical and scientific literature, no foundation model has yet been tailored to university course materials. Our work addresses this gap by showing that fine-tuning BERT with academic QA pairs yields effective results, highlighting the potential to scale towards the first domain-specific QA model for universities and enabling autonomous educational knowledge systems.
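For readers unfamiliar with the two metrics named above, the following is a minimal sketch of Exact Match and token-level F1 in the style commonly used for SQuAD-format data. It is not taken from the paper; the function names and the example strings are illustrative assumptions, and the normalization (lowercasing, stripping punctuation and English articles) follows the usual SQuAD convention rather than anything the abstract specifies.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up answer strings, for illustration only.
em = exact_match("The Digital Systems module.", "digital systems module")
f1 = f1_score("five ECTS credits", "5 ECTS credits")
```

Exact Match rewards only fully correct spans, while F1 gives partial credit for overlapping tokens, which is why the two are usually reported together.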

large language model, machine learning, question answering, (20 more...)

2512.05179

Country: Europe > Ireland (0.14)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Education > Curriculum (0.50)
Education > Educational Setting > Higher Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Towards A Cultural Intelligence and Values Inferences Quality Benchmark for Community Values and Common Knowledge

Johnson, Brittany, Reddick, Erin, Smith, Angela D. R.

Large language models (LLMs) have emerged as a powerful technology, and thus we have seen widespread adoption and use on software engineering teams. Most often, LLMs are designed as "general purpose" technologies meant to represent the general population. Unfortunately, this often means alignment with predominantly Western Caucasian narratives and misalignment with other cultures and populations that engage in collaborative innovation. In response to this misalignment, there have been recent efforts centered on the development of "culturally-informed" LLMs, such as ChatBlackGPT, that are capable of better aligning with historically marginalized experiences and perspectives. Despite this progress, there has been little effort aimed at supporting our ability to develop and evaluate culturally-informed LLMs. A recent effort proposed an approach for developing a national alignment benchmark that emphasizes alignment with national social values and common knowledge. However, given the range of cultural identities present in the United States (U.S.), a national alignment benchmark is an ineffective goal for broader representation. To help fill this gap in the U.S. context, we propose a replication study that translates the process used to develop KorNAT, a Korean National LLM alignment benchmark, to develop CIVIQ, a Cultural Intelligence and Values Inference Quality benchmark centered on alignment with community social values and common knowledge. Our work provides a critical foundation for research and development aimed at cultural alignment of AI technologies in practice.

artificial intelligence, large language model, natural language, (15 more...)

2512.05176

Country: North America > United States > District of Columbia > Washington (0.15)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report (0.82)

Industry:

Law (0.68)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)
Education (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning

Wang, Wentao, Liu, Chunyang, Sheng, Kehua, Zhang, Bo, Wang, Yan

The growing exploration of Large Language Models (LLMs) and Vision-Language Models (VLMs) has opened avenues for enhancing the effectiveness of reinforcement learning (RL). However, existing LLM-based RL methods often focus on the guidance of the control policy and encounter the challenge of limited representations of the backbone networks. To tackle this problem, we introduce Enhanced Semantic Motion Representations (Semore), a new VLM-based framework for visual RL, which can simultaneously extract semantic and motion representations through a dual-path backbone from the RGB flows. Semore utilizes a VLM with common-sense knowledge to retrieve key information from observations, while using the pre-trained CLIP to achieve text-image alignment, thereby embedding the ground-truth representations into the backbone. To efficiently fuse semantic and motion representations for decision-making, our method adopts a separately supervised approach to simultaneously guide the extraction of semantics and motion, while allowing them to interact spontaneously. Extensive experiments demonstrate that, under the guidance of the VLM at the feature level, our method exhibits efficient and adaptive ability compared to state-of-the-art methods. All code is released.

large language model, machine learning, reinforcement learning, (14 more...)

2512.05172

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Bridging Traditional Machine Learning and Large Language Models: A Two-Part Course Design for Modern AI Education

Li, Fang

This paper presents an innovative pedagogical approach for teaching artificial intelligence and data science that systematically bridges traditional machine learning techniques with modern Large Language Models (LLMs). We describe a course structured in two sequential and complementary parts: foundational machine learning concepts and contemporary LLM applications. This design enables students to develop a comprehensive understanding of AI evolution while building practical skills with both established and cutting-edge technologies. We detail the course architecture, implementation strategies, assessment methods, and learning outcomes from our summer course delivery spanning two seven-week terms. Our findings demonstrate that this integrated approach enhances student comprehension of the AI landscape and better prepares them for industry demands in the rapidly evolving field of artificial intelligence.

large language model, machine learning, natural language, (16 more...)

2512.05167

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance

Xu, Tianling, Gan, Shengzhe, Gu, Leslie, Li, Yuelei, Zhan, Fangneng, Pfister, Hanspeter

Active 3D reconstruction enables an agent to autonomously select viewpoints to build accurate and complete scene geometry efficiently, rather than passively reconstructing scenes from pre-collected images. Existing active reconstruction methods often rely on geometric heuristics, which may result in redundant observations without improving reconstruction quality. To address this, we propose AREA3D, an active reconstruction agent for 3D reconstruction by leveraging feed-forward 3D models and vision-language guidance. The framework decouples view uncertainty modeling from feed-forward reconstruction, enabling precise uncertainty estimation without online optimization. Moreover, the integrated Vision-Language Model provides high-level semantic guidance that guides exploration beyond purely geometric cues. Extensive experiments on both scene-level and object-level benchmarks demonstrate that AREA3D achieves state-of-the-art reconstruction accuracy, especially in sparse views.

large language model, machine learning, natural language, (18 more...)

2512.05131

Genre: Research Report (0.64)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Documenting SME Processes with Conversational AI: From Tacit Knowledge to BPMN

Radhakrishnan, Unnikrishnan

Small and medium-sized enterprises (SMEs) still depend heavily on tacit, experience-based know-how that rarely makes its way into formal documentation. This paper introduces a large-language-model (LLM)-driven conversational assistant that captures such knowledge on the shop floor and converts it incrementally and interactively into standards-compliant Business Process Model and Notation (BPMN) 2.0 diagrams. Powered by Gemini 2.5 Pro and delivered through a lightweight Gradio front-end with client-side bpmn-js visualisation, the assistant conducts an interview-style dialogue: it elicits process details, supports clarifying dialogue and on-demand analysis, and renders live diagrams that users can refine in real time. A proof-of-concept evaluation in an equipment-maintenance scenario shows that the chatbot produced an accurate "AS-IS" model, flagged issues via on-diagram annotations, and generated an improved "TO-BE" variant, all within about 12 minutes, while keeping API costs within an SME-friendly budget. The study analyses latency sources, model-selection trade-offs, and the challenges of enforcing strict XML schemas, then outlines a roadmap toward agentic and multimodal deployments. The results demonstrate that conversational LLMs can potentially be used to lower the skill and cost barriers to rigorous process documentation, helping SMEs preserve institutional knowledge, enhance operational transparency, and accelerate continuous-improvement efforts.
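To make the target format concrete, here is a rough sketch of turning an ordered list of elicited steps into the semantic part of a BPMN 2.0 XML document. It is not the paper's pipeline: the function `steps_to_bpmn`, the element ids, the `targetNamespace` value, and the example steps are assumptions for illustration. It also omits the diagram-interchange (layout) section that a viewer generally needs to draw shapes, and it does not validate the output against the BPMN schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the BPMN 2.0 semantic model.
BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"


def steps_to_bpmn(process_id, steps):
    """Build a linear start -> task ... -> end process as a BPMN XML string."""
    ET.register_namespace("bpmn", BPMN_NS)

    def q(tag):
        # Qualify a tag name with the BPMN namespace.
        return f"{{{BPMN_NS}}}{tag}"

    definitions = ET.Element(
        q("definitions"),
        {"id": "Definitions_1", "targetNamespace": "http://example.com/bpmn"},
    )
    process = ET.SubElement(
        definitions, q("process"), {"id": process_id, "isExecutable": "false"}
    )

    # (element tag, id, optional display name) for each node, in flow order.
    nodes = [("startEvent", "Start_1", None)]
    nodes += [("task", f"Task_{i}", name) for i, name in enumerate(steps, 1)]
    nodes.append(("endEvent", "End_1", None))

    for tag, node_id, name in nodes:
        attrs = {"id": node_id}
        if name:
            attrs["name"] = name
        ET.SubElement(process, q(tag), attrs)

    # Connect each node to its successor with a sequence flow.
    for i, (src, dst) in enumerate(zip(nodes, nodes[1:]), 1):
        ET.SubElement(
            process,
            q("sequenceFlow"),
            {"id": f"Flow_{i}", "sourceRef": src[1], "targetRef": dst[1]},
        )

    return ET.tostring(definitions, encoding="unicode")


# Made-up maintenance steps, for illustration only.
xml_text = steps_to_bpmn("Maintenance", ["Inspect machine", "Replace part"])
```

Building the XML through an API rather than asking a model to emit raw markup is one plausible way to reduce schema violations, though whether the paper takes this route is not stated in the abstract.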

large language model, machine learning, natural language, (19 more...)

doi: 10.1049/icp.2025.3640

2512.05122

Genre:

Workflow (0.91)
Research Report > New Finding (0.48)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)