Instruct and Extract: Instruction Tuning for On-Demand Information Extraction
Yizhu Jiao, Ming Zhong, Sha Li, Ruining Zhao, Siru Ouyang, Heng Ji, Jiawei Han
arXiv.org Artificial Intelligence
Large language models with instruction-following capabilities open the door to a wider group of users. However, when it comes to information extraction, a classic task in natural language processing, most task-specific systems cannot align well with the long-tail, ad hoc extraction use cases of non-expert users. To address this, we propose a novel paradigm, termed On-Demand Information Extraction, to fulfill the personalized demands of real-world users. The task is to follow a user's instruction, extract the desired content from the associated text, and present it in a structured tabular format. The table headers can either be user-specified or inferred contextually by the model. To facilitate research in this emerging area, we present a benchmark named InstructIE, which includes both automatically generated training data and a human-annotated test set. Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE. Comprehensive evaluations on our benchmark reveal that ODIE substantially outperforms existing open-source models of similar size. Our code and dataset are released at https://github.com/yzjiao/On-Demand-IE.
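The task format described in the abstract (an instruction plus a source text in, a table out, with headers either given by the user or inferred) can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `ExtractionRequest` class, the `resolve_headers` and `render_table` helpers, and the sample text and rows are hypothetical and are not taken from the InstructIE data format or the ODIE code. Taking the headers from the first row merely stands in for the model inferring them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ExtractionRequest:
    """Hypothetical container for one on-demand extraction query."""
    instruction: str                     # what the user wants extracted
    text: str                            # the source passage
    headers: Optional[List[str]] = None  # None means "let the model infer them"


def resolve_headers(request: ExtractionRequest, rows: List[Dict[str, str]]) -> List[str]:
    """Use the user's headers if given; otherwise fall back to the keys of the
    first extracted row (a stand-in for model-inferred headers)."""
    if request.headers is not None:
        return request.headers
    return list(rows[0].keys()) if rows else []


def render_table(headers: List[str], rows: List[Dict[str, str]]) -> str:
    """Render extracted rows as a pipe-delimited table; missing cells become N/A."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(row.get(h, "N/A") for h in headers) + " |")
    return "\n".join(lines)


# Invented example: the rows are written by hand here, where a real system
# would obtain them from an extraction model.
request = ExtractionRequest(
    instruction="List each drug and the side effect it may cause.",
    text="Drug A may cause nausea. Drug B may cause headaches.",
    headers=["Drug", "Side effect"],
)
rows = [
    {"Drug": "Drug A", "Side effect": "nausea"},
    {"Drug": "Drug B", "Side effect": "headaches"},
]
table = render_table(resolve_headers(request, rows), rows)
print(table)
```

Leaving `headers` as `None` in the request exercises the second case from the abstract, where the column names are not supplied by the user.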
Oct-24-2023
- Country:
  - Oceania > Australia
    - New South Wales > Sydney (0.04)
  - North America
    - Dominican Republic (0.04)
    - United States
      - Texas
        - Travis County > Austin (0.04)
        - Harris County > Houston (0.04)
      - Pennsylvania > Philadelphia County
        - Philadelphia (0.04)
      - Ohio > Franklin County
        - Columbus (0.04)
      - Illinois > Champaign County
        - Urbana (0.04)
      - California > San Diego County
        - San Diego (0.04)
    - Canada > Alberta
  - Europe
    - Italy > Tuscany
      - Florence (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
    - Croatia > Dubrovnik-Neretva County
      - Dubrovnik (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
  - Asia
    - China > Hong Kong (0.04)
    - Taiwan > Taiwan Province
      - Taipei (0.04)
    - Middle East
      - Republic of Türkiye (0.05)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.04)
- Genre:
  - Research Report (0.64)
- Industry:
  - Information Technology > Security & Privacy (1.00)
  - Education (1.00)
  - Health & Medicine > Consumer Health (0.67)