AITopics | ipbench

IPBench: Benchmarking the Knowledge of Large Language Models in Intellectual Property

Wang, Qiyao, Chen, Guhong, Wang, Hongbo, Liu, Huaren, Zhu, Minghui, Qin, Zhifei, Li, Linwei, Yue, Yilin, Wang, Shiqiang, Li, Jiayan, Wu, Yihang, Liu, Ziqiang, Chen, Longze, Luo, Run, Fan, Liyang, Li, Jiaming, Zhang, Lei, Xu, Kan, Li, Chengming, Alinejad-Rokny, Hamid, Ni, Shiwen, Lin, Yuan, Yang, Min

arXiv.org Artificial Intelligence, Sep-30-2025

Intellectual Property (IP) is a highly specialized domain that integrates technical and legal knowledge, making it inherently complex and knowledge-intensive. Recent advances in large language models (LLMs) have demonstrated their potential to handle IP-related tasks, enabling more efficient analysis, understanding, and generation of IP-related content. However, existing datasets and benchmarks either focus narrowly on patents or cover only limited aspects of the IP field, and so do not align with real-world scenarios. To bridge this gap, we introduce IPBench, the first comprehensive IP task taxonomy and a large-scale bilingual benchmark encompassing 8 IP mechanisms and 20 distinct tasks, designed to evaluate LLMs in real-world IP scenarios. We benchmark 17 main LLMs, ranging from general-purpose to domain-specific models and including both chat-oriented and reasoning-focused models, under zero-shot, few-shot, and chain-of-thought settings. Our results show that even the top-performing model, DeepSeek-V3, achieves only 75.8% accuracy, indicating significant room for improvement. Notably, open-source IP-oriented and law-oriented models lag behind closed-source general-purpose models. To foster future research, we publicly release IPBench and will expand it with additional tasks to better reflect real-world complexities and support model advancements in the IP domain. We provide the data and code in the supplementary URLs.

ipbench, large language model, machine learning

arXiv:2504.15524

Country:

Asia (1.00)
North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
