PIVOINE: Instruction Tuning for Open-world Information Extraction
Lu, Keming; Pan, Xiaoman; Song, Kaiqiang; Zhang, Hongming; Yu, Dong; Chen, Jianshu
arXiv.org Artificial Intelligence
We consider the problem of Open-world Information Extraction (Open-world IE), which extracts comprehensive entity profiles from unstructured text. Unlike the conventional closed-world setting of Information Extraction (IE), Open-world IE considers a more general situation in which entities and relations may fall outside a predefined ontology. More importantly, we seek to develop a large language model (LLM) that can perform Open-world IE and extract the desired entity profiles as specified by (possibly fine-grained) natural language instructions. We achieve this by finetuning LLMs with instruction tuning. In particular, we construct INSTRUCTOPENWIKI, a substantial instruction tuning dataset for Open-world IE enriched with a comprehensive corpus, extensive annotations, and diverse instructions. We finetune the pretrained BLOOM models on INSTRUCTOPENWIKI and obtain PIVOINE, an LLM for Open-world IE with strong instruction-following capabilities. Our experiments demonstrate that PIVOINE significantly outperforms traditional closed-world methods and other LLM baselines, and that it generalizes well to both unseen instructions and out-of-ontology cases. PIVOINE is therefore a promising solution to the open-world challenge in IE.
May-24-2023