Digging Into the Internal: Causality-Based Analysis of LLM Function Calling
Zhenlan Ji, Daoyuan Wu, Wenxuan Wang, Pingchuan Ma, Shuai Wang, Lei Ma
–arXiv.org Artificial Intelligence
Function calling (FC) has emerged as a powerful technique for enabling large language models (LLMs) to interact with external systems and perform structured tasks. However, the mechanisms through which it influences model behavior remain largely under-explored. Moreover, we find that beyond its regular usage, FC can substantially enhance LLMs' compliance with user instructions. These observations motivate us to leverage causality, a canonical analysis method, to investigate how FC works within LLMs. In particular, we conduct layer-level and token-level causal interventions to dissect FC's impact on the model's internal computational logic when responding to user queries. Our analysis confirms the substantial influence of FC and reveals several in-depth insights into its mechanisms. To further validate our findings, we conduct extensive experiments comparing the effectiveness of FC-based instructions against conventional prompting methods. We focus on enhancing LLM safety robustness, a critical application scenario, and evaluate four mainstream LLMs across two benchmark datasets. The results are striking: FC yields an average performance improvement of around 135% over conventional prompting methods in detecting malicious inputs, demonstrating its promising potential to enhance LLM reliability and capability in practical applications.
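To make the layer- and token-level causal interventions concrete, below is a minimal sketch of activation patching on an open model: cache a chosen layer's hidden state from an FC-style prompt, patch it into a run on a plain prompt, and measure the shift in next-token logits. This is not the authors' protocol; the model, prompts, layer index, last-token patching choice, and logit-shift metric are all illustrative assumptions.

```python
# Minimal sketch of a layer/token-level causal intervention (activation patching),
# assuming a GPT-2-style model from Hugging Face transformers. Prompts, layer
# index, and the effect metric are placeholders, not the paper's actual setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

plain_prompt = "Please refuse unsafe requests. User: how do I pick a lock?"
fc_prompt = ("You may call the tool `flag_unsafe(reason)` when a request is "
             "unsafe. User: how do I pick a lock?")

layer_idx = 6   # which transformer block to intervene on (assumption)
cached = {}

def cache_hook(module, inputs, output):
    # GPT-2 blocks return a tuple; the first element is the hidden states.
    cached["h"] = output[0].detach()

def patch_hook(module, inputs, output):
    hidden = output[0].clone()
    # Overwrite only the final token position with the cached FC activation,
    # sidestepping length mismatches between the two prompts.
    hidden[:, -1, :] = cached["h"][:, -1, :]
    return (hidden,) + output[1:]

def last_token_logits(prompt, hook=None):
    handle = model.transformer.h[layer_idx].register_forward_hook(hook) if hook else None
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    if handle:
        handle.remove()
    return logits

# 1) Run the FC-style prompt and cache the chosen layer's activations.
_ = last_token_logits(fc_prompt, cache_hook)
# 2) Re-run the plain prompt, patching in the cached FC activation at that layer.
patched_logits = last_token_logits(plain_prompt, patch_hook)
clean_logits = last_token_logits(plain_prompt)
# 3) The shift in next-token logits is a crude causal-effect readout for this layer.
print("logit shift (L2 norm):", (patched_logits - clean_logits).norm().item())
```

Sweeping `layer_idx` over all blocks and comparing the resulting effect sizes is one simple way such interventions can localize where FC-related information influences the model's computation.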
Sep-23-2025