AITopics | patent text

Collaborating Authors

patent text

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Research on feature fusion and multimodal patent text based on graph attention network

Song, Zhenzhen, Liu, Ziwei, Li, Hongji

arXiv.org Artificial IntelligenceMay-27-2025

Aiming at the problems of cross-modal feature fusion, low efficiency of long text modeling and lack of hierarchical semantic coherence in patent text semantic mining, this study proposes HGM-Net, a deep learning framework that integrates Hierarchical Comparative Learning (HCL), Multi-modal Graph Attention Network (M-GAT) and Multi-Granularity Sparse Attention (MSA), which builds a dynamic mask, contrast and cross-structural similarity constraints on the word, sentence and paragraph hierarchies through HCL. Contrast and cross-structural similarity constraints are constructed at the word and paragraph levels by HCL to strengthen the local semantic and global thematic consistency of patent text; M-GAT models patent classification codes, citation relations and text semantics as heterogeneous graph structures, and achieves dynamic fusion of multi-source features by cross-modal gated attention; MSA adopts a hierarchical sparsity strategy to optimize the computational efficiency of long text modeling at word, phrase, sentence and paragraph granularity. Experiments show that the framework demonstrates significant advantages over existing deep learning methods in tasks such as patent classification and similarity matching, and provides a solution with both theoretical innovation and practical value for solving the problems of patent examination efficiency improvement and technology relevance mining.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2505.20188

Country: North America > United States > Illinois (0.15)

Genre: Research Report (0.40)

Industry: Law > Intellectual Property & Technology Law (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AutoPatent: A Multi-Agent Framework for Automatic Patent Generation

Wang, Qiyao, Ni, Shiwen, Liu, Huaren, Lu, Shule, Chen, Guhong, Feng, Xi, Wei, Chi, Qu, Qiang, Alinejad-Rokny, Hamid, Lin, Yuan, Yang, Min

arXiv.org Artificial IntelligenceDec-12-2024

As the capabilities of Large Language Models (LLMs) continue to advance, the field of patent processing has garnered increased attention within the natural language processing community. However, the majority of research has been concentrated on classification tasks, such as patent categorization and examination, or on short text generation tasks like patent summarization and patent quizzes. In this paper, we introduce a novel and practical task known as Draft2Patent, along with its corresponding D2P benchmark, which challenges LLMs to generate full-length patents averaging 17K tokens based on initial drafts. Patents present a significant challenge to LLMs due to their specialized nature, standardized terminology, and extensive length. We propose a multi-agent framework called AutoPatent which leverages the LLM-based planner agent, writer agents, and examiner agent with PGTree and RRAG to generate lengthy, intricate, and high-quality complete patent documents. The experimental results demonstrate that our AutoPatent framework significantly enhances the ability to generate comprehensive patents across various LLMs. Furthermore, we have discovered that patents generated solely with the AutoPatent framework based on the Qwen2.5-7B model outperform those produced by larger and more powerful LLMs, such as GPT-4o, Qwen2.5-72B, and LLAMA3.1-70B, in both objective metrics and human evaluations. We will make the data and code available upon acceptance at \url{https://github.com/QiYao-Wang/AutoPatent}.

large language model, machine learning, patent, (21 more...)

arXiv.org Artificial Intelligence

2412.09796

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
Oceania > Australia > New South Wales (0.04)
(9 more...)

Genre: Research Report (0.84)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims

Zhou, Shu, Wang, Xin, Zhou, Zhengda, Yi, Haohan, Zheng, Xuhui, Wan, Hao

arXiv.org Artificial IntelligenceNov-21-2024

In order to solve the problem of insufficient generation quality caused by traditional patent text abstract generation models only originating from patent specifications, the problem of new terminology OOV caused by rapid patent updates, and the problem of information redundancy caused by insufficient consideration of the high professionalism, accuracy, and uniqueness of patent texts, we proposes a patent text abstract generation model (MSEA) based on a master-slave encoder architecture; Firstly, the MSEA model designs a master-slave encoder, which combines the instructions in the patent text with the claims as input, and fully explores the characteristics and details between the two through the master-slave encoder; Then, the model enhances the consideration of new technical terms in the input sequence based on the pointer network, and further enhances the correlation with the input text by re weighing the "remembered" and "for-gotten" parts of the input sequence from the encoder; Finally, an enhanced repetition suppression mechanism for patent text was introduced to ensure accurate and non redundant abstracts generated. On a publicly available patent text dataset, compared to the state-of-the-art model, Improved Multi-Head Attention Mechanism (IMHAM), the MSEA model achieves an improvement of 0.006, 0.005, and 0.005 in Rouge-1, Rouge-2, and Rouge-L scores, respectively. MSEA leverages the characteristics of patent texts to effectively enhance the quality of patent text generation, demonstrating its advancement and effectiveness in the experiments.

computational linguistic, encoder, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2411.14072

Country:

Oceania > Australia > Victoria > Melbourne (0.05)
Asia > China > Hong Kong (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(22 more...)

Genre: Research Report (1.00)

Industry:

Education (0.93)
Government > Voting & Elections (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.67)

Add feedback

Early screening of potential breakthrough technologies with enhanced interpretability: A patent-specific hierarchical attention network model

Choi, Jaewoong, Yoon, Janghyeok, Lee, Changyong

arXiv.org Artificial IntelligenceJul-23-2024

Despite the usefulness of machine learning approaches for the early screening of potential breakthrough technologies, their practicality is often hindered by opaque models. To address this, we propose an interpretable machine learning approach to predicting future citation counts from patent texts using a patent-specific hierarchical attention network (PatentHAN) model. Central to this approach are (1) a patent-specific pre-trained language model, capturing the meanings of technical words in patent claims, (2) a hierarchical network structure, enabling detailed analysis at the claim level, and (3) a claim-wise self-attention mechanism, revealing pivotal claims during the screening process. A case study of 35,376 pharmaceutical patents demonstrates the effectiveness of our approach in early screening of potential breakthrough technologies while ensuring interpretability. Furthermore, we conduct additional analyses using different language models and claim types to examine the robustness of the approach. It is expected that the proposed approach will enhance expert-machine collaboration in identifying breakthrough technologies, providing new insight derived from text mining into technological value.

breakthrough technology, patent, potential breakthrough technology, (16 more...)

arXiv.org Artificial Intelligence

2407.16939

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > India > NCT > New Delhi (0.04)

Genre: Research Report > Promising Solution (1.00)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

A comparative analysis of embedding models for patent similarity

Ascione, Grazia Sveva, Sterzi, Valerio

arXiv.org Artificial IntelligenceMar-25-2024

This paper makes two contributions to the field of text-based patent similarity. First, it compares the performance of different kinds of patent-specific pretrained embedding models, namely static word embeddings (such as word2vec and doc2vec models) and contextual word embeddings (such as transformers based models), on the task of patent similarity calculation. Second, it compares specifically the performance of Sentence Transformers (SBERT) architectures with different training phases on the patent similarity task. To assess the models' performance, we use information about patent interferences, a phenomenon in which two or more patent claims belonging to different patent applications are proven to be overlapping by patent examiners. Therefore, we use these interferences cases as a proxy for maximum similarity between two patents, treating them as ground-truth to evaluate the performance of the different embedding models. Our results point out that, first, Patent SBERT-adapt-ub, the domain adaptation of the pretrained Sentence Transformer architecture proposed in this research, outperforms the current state-of-the-art in patent similarity. Second, they show that, in some cases, large static models performances are still comparable to contextual ones when trained on extensive data; thus, we believe that the superiority in the performance of contextual embeddings may not be related to the actual architecture but rather to the way the training phase is performed.

dataset, patent, similarity, (13 more...)

arXiv.org Artificial Intelligence

2403.1663

Country: Asia > India > Telangana > Hyderabad (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

Artificial Intelligence Exploring the Patent Field

Jiang, Lekang, Goetz, Stephan

arXiv.org Artificial IntelligenceMar-6-2024

Advanced language-processing and machine-learning techniques promise massive efficiency improvements in the previously widely manual field of patent and technical knowledge management. This field presents large-scale and complex data with very precise contents and language representation of those contents. Particularly, patent texts can differ from mundane texts in various aspects, which entails significant opportunities and challenges. This paper presents a systematic overview of patent-related tasks and popular methodologies with a special focus on evolving and promising techniques. Language processing and particularly large language models as well as the recent boost of general generative methods promise to become game changers in the patent field. The patent literature and the fact-based argumentative procedures around patents appear almost as an ideal use case. However, patents entail a number of difficulties with which existing models struggle. The paper introduces fundamental aspects of patents and patent-related data that affect technology that wants to explore or manage them. It further reviews existing methods and approaches and points out how important reliable and unbiased evaluation metrics become. Although research has made substantial progress on certain tasks, the performance across many others remains suboptimal, sometimes because of either the special nature of patents and their language or inconsistencies between legal terms and the everyday meaning of terms. Moreover, yet few methods have demonstrated the ability to produce satisfactory text for specific sections of patents. By pointing out key developments, opportunities, and gaps, we aim to encourage further research and accelerate the advancement of this field.

information, language model, patent, (17 more...)

arXiv.org Artificial Intelligence

2403.04105

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia (0.14)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(6 more...)

Add feedback

PaECTER: Patent-level Representation Learning using Citation-informed Transformers

Ghosh, Mainak, Erhardt, Sebastian, Rose, Michael E., Buunk, Erik, Harhoff, Dietmar

arXiv.org Artificial IntelligenceFeb-29-2024

PaECTER is a publicly available, open-source document-level encoder specific for patents. We fine-tune BERT for Patents with examiner-added citation information to generate numerical representations for patent documents. PaECTER performs better in similarity tasks than current state-of-the-art models used in the patent domain. More specifically, our model outperforms the next-best patent specific pre-trained language model (BERT for Patents) on our patent citation prediction test dataset on two different rank evaluation metrics. PaECTER predicts at least one most similar patent at a rank of 1.32 on average when compared against 25 irrelevant patents. Numerical representations generated by PaECTER from patent text can be used for downstream tasks such as classification, tracing knowledge flows, or semantic similarity search. Semantic similarity search is especially relevant in the context of prior art search for both inventors and patent examiners. PaECTER is available on Hugging Face.

focal patent, paecter, patent, (13 more...)

arXiv.org Artificial Intelligence

2402.19411

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Law > Intellectual Property & Technology Law (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Unveiling Black-boxes: Explainable Deep Learning Models for Patent Classification

Shajalal, Md, Denef, Sebastian, Karim, Md. Rezaul, Boden, Alexander, Stevens, Gunnar

arXiv.org Artificial IntelligenceOct-31-2023

Recent technological advancements have led to a large number of patents in a diverse range of domains, making it challenging for human experts to analyze and manage. State-of-the-art methods for multi-label patent classification rely on deep neural networks (DNNs), which are complex and often considered black-boxes due to their opaque decision-making processes. In this paper, we propose a novel deep explainable patent classification framework by introducing layer-wise relevance propagation (LRP) to provide human-understandable explanations for predictions. We train several DNN models, including Bi-LSTM, CNN, and CNN-BiLSTM, and propagate the predictions backward from the output layer up to the input layer of the model to identify the relevance of words for individual predictions. Considering the relevance score, we then generate explanations by visualizing relevant words for the predicted patent class. Experimental results on two datasets comprising two-million patent texts demonstrate high performance in terms of various evaluation measures. The explanations generated for each prediction highlight important relevant words that align with the predicted class, making the prediction more understandable. Explainable systems have the potential to facilitate the adoption of complex AI-enabled methods for patent classification in real-world applications.

classification, patent, prediction, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-44067-0_24

2310.20478

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Law > Intellectual Property & Technology Law (0.69)
Transportation > Air (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evaluating Generative Patent Language Models

Lee, Jieh-Sheng

arXiv.org Artificial IntelligenceJun-5-2023

Generative language models are promising for assisting human writing in various domains. This manuscript aims to build generative language models in the patent domain and evaluate model performance from a human-centric perspective. The perspective is to measure the ratio of keystrokes that can be saved by autocompletion based on generative patent language models. A higher ratio means a more effective model which can save more keystrokes. This metric can be used to benchmark model performance. The metric is different from conventional machine-centric metrics that are token-based instead of keystroke-based. In terms of model size, the largest model built in this manuscript is 6B, which is state-of-the-art in the patent domain. Based on the metric, it is found that the largest model is not necessarily the best for the human-centric metric. The finding means that keeping increasing model sizes in the patent domain might be unnecessary if the purpose is to assist human writing with autocompletion. Several patent language models are pre-trained from scratch in this research. The pre-trained models are released for future researchers. Several visualization tools are also provided. The importance of building a generative language model in the patent domain is the potential to facilitate creativity and innovations in the future.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.wpi.2023.102173

2206.14578

Country:

Asia > Taiwan (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Add feedback

Technological taxonomies for hypernym and hyponym retrieval in patent texts

Zuo, You, Li, Yixuan, García, Alma Parias, Gerdes, Kim

arXiv.org Artificial IntelligenceDec-13-2022

This paper presents an automatic approach to creating taxonomies of technical terms based on the Cooperative Patent Classification (CPC). The resulting taxonomy contains about 170k nodes in 9 separate technological branches and is freely available. We also show that a Text-to-Text Transfer Transformer (T5) model can be fine-tuned to generate hypernyms and hyponyms with relatively high precision, confirming the manually assessed quality of the resource. The T5 model opens the taxonomy to any new technological terms for which a hypernym can be generated, thus making the resource updateable with new terms, an essential feature for the constantly evolving field of technological terminology.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2212.06039

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report (0.50)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)

Add feedback