Document Intelligence in the Era of Large Language Models: A Survey
Wang, Weishi, Hu, Hengchang, Zhang, Zhijie, Li, Zhaochen, Shao, Hongxin, Dahlmeier, Daniel
arXiv.org Artificial Intelligence
Document AI (DAI) has emerged as a vital application area and has been significantly transformed by the advent of large language models (LLMs). While earlier approaches relied on encoder-decoder architectures, decoder-only LLMs have revolutionized DAI, bringing remarkable advancements in understanding and generation. This survey provides a comprehensive overview of DAI's evolution, highlighting current research efforts and the future prospects of LLMs in this field. We explore key advancements and challenges in multimodal, multilingual, and retrieval-augmented DAI, and suggest future research directions, including agent-based approaches and document-specific foundation models. This paper aims to provide a structured analysis of the state of the art in DAI and its implications for both academic research and practical applications.
Oct-16-2025
- Country:
- Africa > Rwanda
- Asia
- Indonesia > Bali (0.04)
- Japan
- Honshū > Kansai
- Kyoto Prefecture > Kyoto (0.04)
- Kyūshū & Okinawa > Kyūshū
- Miyazaki Prefecture > Miyazaki (0.04)
- Middle East
- Israel (0.04)
- Jordan (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- China
- Sri Lanka (0.04)
- Thailand > Bangkok
- Bangkok (0.05)
- Vietnam > Hanoi
- Hanoi (0.04)
- Singapore (0.04)
- Myanmar > Tanintharyi Region
- Dawei (0.04)
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- Netherlands > South Holland
- The Hague (0.04)
- Middle East > Cyprus
- Italy
- France
- Grand Est > Meurthe-et-Moselle
- Nancy (0.04)
- Nouvelle-Aquitaine > Gironde
- Bordeaux (0.04)
- Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
- Marseille (0.04)
- Île-de-France > Paris
- Paris (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Greece > Attica
- Athens (0.04)
- Sweden > Uppsala County
- Uppsala (0.04)
- Spain
- Germany > Berlin (0.04)
- Switzerland > Vaud
- Lausanne (0.04)
- Austria > Vienna (0.14)
- North America
- Canada
- British Columbia > Vancouver (0.04)
- Quebec > Montreal (0.04)
- Dominican Republic (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- California
- Los Angeles County > Long Beach (0.04)
- San Francisco County > San Francisco (0.14)
- Santa Clara County > San Jose (0.04)
- Washington > King County
- Seattle (0.14)
- New Jersey (0.04)
- Oregon > Multnomah County
- Portland (0.04)
- New York > Niagara County
- Niagara Falls (0.04)
- Tennessee > Davidson County
- Nashville (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Maryland > Baltimore (0.04)
- Ohio > Franklin County
- Columbus (0.04)
- Florida
- Miami-Dade County > Miami (0.14)
- Palm Beach County > Boca Raton (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Oceania > Australia
- New South Wales > Sydney (0.04)
- Queensland (0.04)
- South America > Chile
- Genre:
- Overview (1.00)
- Research Report (1.00)
- Industry:
- Information Technology (0.45)
- Technology: