Cui, Ming
Dial-insight: Fine-tuning Large Language Models with High-Quality Domain-Specific Data Preventing Capability Collapse
Sun, Jianwei, Mei, Chaoyang, Wei, Linlin, Zheng, Kaiyu, Liu, Na, Cui, Ming, Li, Tianyi
The efficacy of large language models (LLMs) is heavily dependent on the quality of the underlying data, particularly within specialized domains. A common challenge when fine-tuning LLMs for domain-specific applications is the potential degradation of the model's generalization capabilities. To address these issues, we propose a two-stage approach for the construction of production prompts designed to yield high-quality data. This method involves the generation of a diverse array of prompts that encompass a broad spectrum of tasks and exhibit a rich variety of expressions. Furthermore, we introduce a cost-effective, multi-dimensional quality assessment framework to ensure the integrity of the generated labeling data. Utilizing a dataset comprising service provider and customer interactions from the real estate sector, we demonstrate a positive correlation between data quality and model performance. Notably, our findings indicate that the domain-specific proficiency of general LLMs can be enhanced through fine-tuning with data produced via our proposed method, without compromising their overall generalization abilities, even when exclusively domain-specific data is employed for fine-tuning.
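To make the two-stage idea concrete, here is a minimal sketch of how such a pipeline might be wired up. It is illustrative only, not the authors' implementation: the task types, style labels, the three quality dimensions (relevance, fluency, consistency), the 0.7 threshold, and the `scorer` callable are all hypothetical placeholders standing in for whatever inexpensive judge or heuristic realizes the paper's multi-dimensional assessment.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable

# Stage 1: enumerate candidate prompts as combinations of task types and
# expression styles, so the pool spans a broad range of tasks and phrasings.
TASK_TYPES = ["summarize_dialogue", "extract_customer_intent", "draft_agent_reply"]
STYLES = ["concise", "detailed", "colloquial"]

def build_prompt_pool(dialogue: str) -> list[str]:
    pool = []
    for task, style in product(TASK_TYPES, STYLES):
        pool.append(f"Task: {task}\nStyle: {style}\nDialogue:\n{dialogue}\nResponse:")
    return pool

# Stage 2: multi-dimensional quality assessment. Each dimension is scored
# independently; a sample is kept only if every dimension clears the threshold.
@dataclass
class QualityScore:
    relevance: float
    fluency: float
    consistency: float

    def passes(self, threshold: float = 0.7) -> bool:
        return min(self.relevance, self.fluency, self.consistency) >= threshold

def assess(sample: str, scorer: Callable[[str, str], float]) -> QualityScore:
    return QualityScore(
        relevance=scorer(sample, "relevance"),
        fluency=scorer(sample, "fluency"),
        consistency=scorer(sample, "consistency"),
    )

def filter_high_quality(samples: list[str], scorer: Callable[[str, str], float]) -> list[str]:
    return [s for s in samples if assess(s, scorer).passes()]
```

Under this reading, only samples that clear every dimension would be retained for fine-tuning, which is the sense in which quality filtering precedes training.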
From LLM to Conversational Agent: A Memory Enhanced Architecture with Fine-Tuning of Large Language Models
Liu, Na, Chen, Liangyu, Tian, Xiaoyu, Zou, Wei, Chen, Kaijiang, Cui, Ming
While LLMs exhibit high levels of performance in isolated tasks, creating an agent that can sustain coherent, context-aware, and purpose-driven conversations remains an intricate endeavor. The need for a more sophisticated framework that leverages the strengths of LLMs while addressing their limitations in conversational settings has become increasingly apparent. In response to this need, we introduce the RAISE (Reasoning and Acting through Scratchpad and Examples) architecture, a refined enhancement of the existing ReAct (Yao et al., 2023) framework, specifically designed to augment the capabilities of conversational agents. RAISE incorporates a dual-component memory system, mirroring human short-term and long-term memory, to maintain context and continuity in conversations. It entails a comprehensive agent construction scenario, including phases like Conversation Selection, Scene Extraction, CoT Completion, and Scene Augmentation, leading to the LLMs Training phase. This approach appears to enhance agent controllability and adaptability in complex, multi-turn dialogues. This paper presents a detailed exploration of RAISE, highlighting its unique components. Our preliminary evaluations in a real estate sales context suggest that RAISE has some advantages over traditional agents, indicating its potential for broader applications.
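As a rough illustration of the dual-component memory described above, the sketch below pairs a per-conversation scratchpad (short-term memory) with a store of retrieved exemplars (long-term memory) inside a ReAct-style turn loop. It is an assumption-laden toy, not the RAISE implementation: the class and method names are invented, retrieval is a keyword placeholder where the paper would use something far stronger, and the construction phases (Conversation Selection, Scene Extraction, CoT Completion, Scene Augmentation) are not modeled here.

```python
from dataclasses import dataclass, field

@dataclass
class RaiseStyleAgent:
    """Toy ReAct-style loop with dual memory: a per-conversation scratchpad
    (short-term) and a retrieved example store (long-term)."""
    llm: callable                    # text -> text; any chat-model wrapper
    example_store: list[str]         # long-term memory of curated exemplars
    scratchpad: list[str] = field(default_factory=list)  # short-term memory

    def retrieve_examples(self, query: str, k: int = 2) -> list[str]:
        # Placeholder retrieval: a real system would use embedding similarity.
        hits = [e for e in self.example_store if any(w in e for w in query.split())]
        return hits[:k]

    def step(self, user_utterance: str) -> str:
        examples = self.retrieve_examples(user_utterance)
        prompt = "\n".join(
            ["Examples:", *examples, "Scratchpad:", *self.scratchpad,
             f"User: {user_utterance}", "Agent:"]
        )
        reply = self.llm(prompt)
        # Keep the turn in short-term memory so later turns stay on context.
        self.scratchpad.append(f"User: {user_utterance}\nAgent: {reply}")
        return reply
```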
Exploration and Improvement of Nerf-based 3D Scene Editing Techniques
Fang, Shun, Cui, Ming, Feng, Xing, Zhang, Yanan
NeRF's high-quality scene synthesis capability was quickly adopted by researchers in the years after it was proposed, and significant progress has been made in 3D scene representation and synthesis. However, its high computational cost limits intuitive and efficient scene editing, so NeRF's development in the scene-editing field still faces many challenges. This paper reviews recent preliminary explorations of NeRF-based scene and object editing, which mainly change the shape and texture of scenes or objects in newly synthesized views. By combining models such as GANs and Transformers with NeRF, the generalization ability of NeRF scene editing has been further extended, including real-time editing feedback from new viewpoints, multimodal text-driven editing of synthesized 3D scenes, 4D synthesis, and deeper exploration of light and shadow editing, achieving initial improvements in indirect editing and detail representation for complex scenes. Most current NeRF editing methods focus on indirectly edited regions and materials, but when dealing with larger or more complex 3D scenes it is difficult to balance accuracy, breadth, efficiency, and quality. Overcoming these challenges is likely to be the direction of future NeRF 3D scene editing techniques.
Methods and strategies for improving the novel view synthesis quality of neural radiation field
Fang, Shun, Cui, Ming, Feng, Xing, Lv, Yanna
In recent years, researchers have increasingly focused on NeRF[1] in this area. NeRF provides an accurate and simple way to represent 3D scenes using an implicit function based on an MLP, and has achieved satisfactory rendering quality in 3D reconstruction tasks. Current efforts aim to extend the original NeRF to different settings, such as scene synthesis[2, 3], dynamic scenes[4, 5], large-scale scene reconstruction[6, 7], and rapid convergence[8, 9], among others. Since NeRF was published in 2020, the paper has been cited thousands of times. Researchers have also made numerous improvements to the technique: some works focus on optimizing NeRF's rendering speed[10, 11], while others explore different application scenarios[12, 13]. Furthermore, there have been efforts to extend NeRF to scene inpainting[14, 15], texture synthesis[16], handling complex scenes[17], and other more challenging problems.
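For readers unfamiliar with the implicit representation mentioned above, the following toy module sketches the core idea of NeRF: a 3D coordinate is positionally encoded and fed to an MLP that predicts density and colour. This is a didactic sketch, not code from any of the surveyed methods; a real NeRF also conditions colour on the viewing direction, uses a much deeper network, and renders images by integrating these predictions along camera rays.

```python
import torch
import torch.nn as nn

def positional_encoding(x: torch.Tensor, num_freqs: int = 6) -> torch.Tensor:
    """Map coordinates to sin/cos features so the MLP can fit high frequencies."""
    freqs = 2.0 ** torch.arange(num_freqs, dtype=x.dtype, device=x.device)
    angles = x[..., None] * freqs                      # (..., 3, num_freqs)
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
    return enc.flatten(start_dim=-2)                   # (..., 3 * 2 * num_freqs)

class TinyNeRF(nn.Module):
    """Toy radiance field: 3D point -> (density, RGB)."""
    def __init__(self, num_freqs: int = 6, hidden: int = 128):
        super().__init__()
        in_dim = 3 * 2 * num_freqs
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                      # sigma + RGB
        )

    def forward(self, points: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        out = self.mlp(positional_encoding(points))
        sigma = torch.relu(out[..., :1])               # non-negative density
        rgb = torch.sigmoid(out[..., 1:])              # colours in [0, 1]
        return sigma, rgb
```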
DUMA: a Dual-Mind Conversational Agent with Fast and Slow Thinking
Tian, Xiaoyu, Chen, Liangyu, Liu, Na, Liu, Yaxuan, Zou, Wei, Chen, Kaijiang, Cui, Ming
Inspired by the dual-process theory of human cognition, we introduce DUMA, a novel conversational agent framework that embodies a dual-mind mechanism through two generative Large Language Models (LLMs) dedicated to fast and slow thinking, respectively. The fast thinking model serves as the primary interface for external interactions and initial response generation, and it evaluates whether the slow thinking model needs to be engaged based on the complexity of the complete response. When invoked, the slow thinking model takes over the conversation, engaging in meticulous planning, reasoning, and tool use to produce a well-analyzed response. This dual-mind configuration allows a seamless, situation-dependent transition between intuitive responses and deliberate problem solving. We have built a conversational agent to handle online inquiries in the real estate industry. Experiments show that our method balances effectiveness and efficiency and improves significantly over the baseline.
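A minimal routing sketch of the dual-mind idea follows, under the assumption that the gate is simply a boolean predicate over the query and the fast draft. The function names are hypothetical; the paper's actual complexity assessment and its tool-using slow agent are abstracted away behind callables.

```python
from typing import Callable

def dual_mind_reply(
    query: str,
    fast_llm: Callable[[str], str],
    slow_agent: Callable[[str], str],
    needs_slow: Callable[[str, str], bool],
) -> str:
    """Route a query through a fast model first, escalating to a slow,
    tool-using agent only when the draft looks too complex to trust."""
    draft = fast_llm(query)           # cheap, intuitive first pass
    if needs_slow(query, draft):      # e.g. a classifier or self-judgment
        return slow_agent(query)      # planning, reasoning, tool calls
    return draft
```

The design intent this sketch tries to capture is that most turns are answered by the cheap path, so the expensive deliberate path only pays its cost when the gate deems it necessary.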