Goto

Collaborating Authors

 Yang, Wei


HDLCoRe: A Training-Free Framework for Mitigating Hallucinations in LLM-Generated HDL

arXiv.org Artificial Intelligence

Recent advances in large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, when applied to hardware description languages (HDL), these models exhibit significant limitations due to data scarcity, resulting in hallucinations and incorrect code generation. To address these challenges, we propose HDLCoRe, a training-free framework that enhances LLMs' HDL generation capabilities through prompt engineering techniques and retrieval-augmented generation (RAG). Our approach consists of two main components: (1) an HDL-aware Chain-of-Thought (CoT) prompting technique with self-verification that classifies tasks by complexity and type, incorporates domainspecific knowledge, and guides LLMs through step-by-step self-simulation for error correction; and (2) a two-stage heterogeneous RAG system that addresses formatting inconsistencies through key component extraction and efficiently retrieves relevant HDL examples through sequential filtering and re-ranking. HDLCoRe eliminates the need for model fine-tuning while substantially improving LLMs' HDL generation capabilities. Experimental results demonstrate that our framework achieves superior performance on the RTLLM2.0 With the rapid advancement of semiconductor technology, the design of very large-scale integration (VLSI) has become increasingly vital across industries Huang et al. (2021). Hardware description language (HDL) code, as the foundation of VLSI design, plays a critical role in defining the circuit architecture and functionality Palnitkar (2003). In recent years, large language models (LLMs) have experienced explosive growth and demonstrated extraordinary capabilities in many aspects Kanakaris et al. (2025); Li et al. (2025), especially in automated code generation Brown et al. (2020); Chen et al. (2021).


Agents Play Thousands of 3D Video Games

arXiv.org Artificial Intelligence

We present PORTAL, a novel framework for developing artificial intelligence agents capable of playing thousands of 3D video games through language-guided policy generation. By transforming decision-making problems into language modeling tasks, our approach leverages large language models (LLMs) to generate behavior trees represented in domain-specific language (DSL). This method eliminates the computational burden associated with traditional reinforcement learning approaches while preserving strategic depth and rapid adaptability. Our framework introduces a hybrid policy structure that combines rule-based nodes with neural network components, enabling both high-level strategic reasoning and precise low-level control. A dual-feedback mechanism incorporating quantitative game metrics and vision-language model analysis facilitates iterative policy improvement at both tactical and strategic levels. The resulting policies are instantaneously deployable, human-interpretable, and capable of generalizing across diverse gaming environments. Experimental results demonstrate PORTAL's effectiveness across thousands of first-person shooter (FPS) games, showcasing significant improvements in development efficiency, policy generalization, and behavior diversity compared to traditional approaches. PORTAL represents a significant advancement in game AI development, offering a practical solution for creating sophisticated agents that can operate across thousands of commercial video games with minimal development overhead. Experiment results on the 3D video games are best viewed on https://zhongwen.one/projects/portal .


RURANET++: An Unsupervised Learning Method for Diabetic Macular Edema Based on SCSE Attention Mechanisms and Dynamic Multi-Projection Head Clustering

arXiv.org Artificial Intelligence

Diabetic Macular Edema (DME), a prevalent complication among diabetic patients, constitutes a major cause of visual impairment and blindness. Although deep learning has achieved remarkable progress in medical image analysis, traditional DME diagnosis still relies on extensive annotated data and subjective ophthalmologist assessments, limiting practical applications. To address this, we present RURANET++, an unsupervised learning-based automated DME diagnostic system. This framework incorporates an optimized U-Net architecture with embedded Spatial and Channel Squeeze & Excitation (SCSE) attention mechanisms to enhance lesion feature extraction. During feature processing, a pre-trained GoogLeNet model extracts deep features from retinal images, followed by PCA-based dimensionality reduction to 50 dimensions for computational efficiency. Notably, we introduce a novel clustering algorithm employing multi-projection heads to explicitly control cluster diversity while dynamically adjusting similarity thresholds, thereby optimizing intra-class consistency and inter-class discrimination. Experimental results demonstrate superior performance across multiple metrics, achieving maximum accuracy (0.8411), precision (0.8593), recall (0.8411), and F1-score (0.8390), with exceptional clustering quality. This work provides an efficient unsupervised solution for DME diagnosis with significant clinical implications.


ClimateLLM: Efficient Weather Forecasting via Frequency-Aware Large Language Models

arXiv.org Artificial Intelligence

Weather forecasting is crucial for public safety, disaster prevention and mitigation, agricultural production, and energy management, with global relevance. Although deep learning has significantly advanced weather prediction, current methods face critical limitations: (i) they often struggle to capture both dynamic temporal dependencies and short-term abrupt changes, making extreme weather modeling difficult; (ii) they incur high computational costs due to extensive training and resource requirements; (iii) they have limited adaptability to multi-scale frequencies, leading to challenges when separating global trends from local fluctuations. To address these issues, we propose ClimateLLM, a foundation model for weather forecasting. It captures spatiotemporal dependencies via a cross-temporal and cross-spatial collaborative modeling framework that integrates Fourier-based frequency decomposition with Large Language Models (LLMs) to strengthen spatial and temporal modeling. Our framework uses a Mixture-of-Experts (MoE) mechanism that adaptively processes different frequency components, enabling efficient handling of both global signals and localized extreme events. In addition, we introduce a cross-temporal and cross-spatial dynamic prompting mechanism, allowing LLMs to incorporate meteorological patterns across multiple scales effectively. Extensive experiments on real-world datasets show that ClimateLLM outperforms state-of-the-art approaches in accuracy and efficiency, as a scalable solution for global weather forecasting. For almost half a century, numerical weather prediction (NWP) methods that rely on solving atmospheric partial differential equations have formed the backbone of operational forecasting Kalnay (2002); Lynch (2008); Bauer et al. (2015); Nguyen et al. (2024).


Boundary-enhanced time series data imputation with long-term dependency diffusion models

arXiv.org Artificial Intelligence

Data imputation is crucial for addressing challenges posed by missing values in multivariate time series data across various fields, such as healthcare, traffic, and economics, and has garnered significant attention. Among various methods, diffusion model-based approaches show notable performance improvements. However, existing methods often cause disharmonious boundaries between missing and known regions and overlook long-range dependencies in missing data estimation, leading to suboptimal results. To address these issues, we propose a Diffusion-based time Series Data Imputation (DSDI) framework. We develop a weight-reducing injection strategy that incorporates the predicted values of missing points with reducing weights into the reverse diffusion process to mitigate boundary inconsistencies. Further, we introduce a multi-scale S4-based U-Net, which combines hierarchical information from different levels via multi-resolution integration to capture long-term dependencies. Experimental results demonstrate that our model outperforms existing imputation methods.


LLM4SR: A Survey on Large Language Models for Scientific Research

arXiv.org Artificial Intelligence

In recent years, the rapid advancement of Large Language Models (LLMs) has transformed the landscape of scientific research, offering unprecedented support across various stages of the research cycle. This paper presents the first systematic survey dedicated to exploring how LLMs are revolutionizing the scientific research process. We analyze the unique roles LLMs play across four critical stages of research: hypothesis discovery, experiment planning and implementation, scientific writing, and peer reviewing. Our review comprehensively showcases the task-specific methodologies and evaluation benchmarks. By identifying current challenges and proposing future research directions, this survey not only highlights the transformative potential of LLMs, but also aims to inspire and guide researchers and practitioners in leveraging LLMs to advance scientific inquiry. Resources are available at the following repository: https://github.com/du-nlp-lab/LLM4SR


Cosmos World Foundation Model Platform for Physical AI

arXiv.org Artificial Intelligence

Physical AI needs to be trained digitally first. It needs a digital twin of itself, the policy model, and a digital twin of the world, the world model. In this paper, we present the Cosmos World Foundation Model Platform to help developers build customized world models for their Physical AI setups. We position a world foundation model as a general-purpose world model that can be fine-tuned into customized world models for downstream applications. Our platform covers a video curation pipeline, pre-trained world foundation models, examples of post-training of pre-trained world foundation models, and video tokenizers. To help Physical AI builders solve the most critical problems of our society, we make our platform open-source and our models open-weight with permissive licenses available via https://github.com/NVIDIA/Cosmos.


Playable Game Generation

arXiv.org Artificial Intelligence

In recent years, Artificial Intelligence Generated Content (AIGC) has advanced from textto-image generation to text-to-video and multimodal video synthesis. However, generating playable games presents significant challenges due to the stringent requirements for realtime interaction, high visual quality, and accurate simulation of game mechanics. Existing approaches often fall short, either lacking real-time capabilities or failing to accurately simulate interactive mechanics. To tackle the playability issue, we propose a novel method called PlayGen, which encompasses game data generation, an autoregressive DiT-based diffusion model, and a comprehensive playability-based evaluation framework. Validated on well-known 2D and 3D games, PlayGen achieves real-time interaction, ensures sufficient visual quality, and provides accurate interactive mechanics simulation. Notably, these results are sustained even after over 1000 frames of gameplay on an NVIDIA RTX 2060 GPU. Our code is publicly available: here. Our playable demo generated by AI is: here.


Beyond Forecasting: Compositional Time Series Reasoning for End-to-End Task Execution

arXiv.org Artificial Intelligence

In recent decades, there has been substantial advances in time series models and benchmarks across various individual tasks, such as time series forecasting, classification, and anomaly detection. Meanwhile, compositional reasoning in time series is prevalent in real-world applications (e.g., decision-making and compositional question answering) and is in great demand. Unlike simple tasks that primarily focus on predictive accuracy, compositional reasoning emphasizes the synthesis of diverse information from both time series data and various domain knowledge, making it distinct and extremely more challenging. In this paper, we introduce Compositional Time Series Reasoning, a new task of handling intricate multistep reasoning tasks from time series data. Specifically, this new task focuses on various question instances requiring structural and compositional reasoning abilities on time series data, such as decision-making and compositional question answering. As an initial attempt to tackle this novel task, we developed TS-Reasoner, a program-aided approach that utilizes large language model (LLM) to decompose a complex task into steps of programs that leverage existing time series models and numerical subroutines. Unlike existing reasoning work which only calls off-the-shelf modules, TS-Reasoner allows for the creation of custom modules and provides greater flexibility to incorporate domain knowledge as well as user-specified constraints. We demonstrate the effectiveness of our method through a comprehensive set of experiments. These promising results indicate potential opportunities in the new task of time series reasoning and highlight the need for further research.


Foundation Model Engineering: Engineering Foundation Models Just as Engineering Software

arXiv.org Artificial Intelligence

By treating data and models as the source code, Foundation Models (FMs) become a new type of software. Mirroring the concept of software crisis, the increasing complexity of FMs making FM crisis a tangible concern in the coming decade, appealing for new theories and methodologies from the field of software engineering. In this paper, we outline our vision of introducing Foundation Model (FM) engineering, a strategic response to the anticipated FM crisis with principled engineering methodologies. FM engineering aims to mitigate potential issues in FM development and application through the introduction of declarative, automated, and unified programming interfaces for both data and model management, reducing the complexities involved in working with FMs by providing a more structured and intuitive process for developers. Through the establishment of FM engineering, we aim to provide a robust, automated, and extensible framework that addresses the imminent challenges, and discovering new research opportunities for the software engineering field.