AITopics | proflingo

Collaborating Authors

proflingo

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor

Xu, Zhenhua, Zhao, Xixiang, Yue, Xubin, Tian, Shengwei, Lin, Changting, Han, Meng

arXiv.org Artificial IntelligenceSep-15-2025

The widespread deployment of large language models (LLMs) has intensified concerns around intellectual property (IP) protection, as model theft and unauthorized redistribution become increasingly feasible. To address this, model fingerprinting aims to embed verifiable ownership traces into LLMs. However, existing methods face inherent trade-offs between stealthness, robustness, and generalizability, being either detectable via distributional shifts, vulnerable to adversarial modifications, or easily invalidated once the fingerprint is revealed. In this work, we introduce CTCC, a novel rule-driven fingerprinting framework that encodes contextual correlations across multiple dialogue turns, such as counterfactual, rather than relying on token-level or single-turn triggers. CTCC enables fingerprint verification under black-box access while mitigating false positives and fingerprint leakage, supporting continuous construction under a shared semantic rule even if partial triggers are exposed. Extensive experiments across multiple LLM architectures demonstrate that CTCC consistently achieves stronger stealth and robustness than prior work. Our findings position CTCC as a reliable and practical solution for ownership verification in real-world LLM deployment scenarios. Our code and data are publicly available at .

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.09703

Country: Europe > Austria (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Challenge of Identifying the Origin of Black-Box Large Language Models

Yang, Ziqing, Wu, Yixin, Shen, Yun, Dai, Wei, Backes, Michael, Zhang, Yang

arXiv.org Artificial IntelligenceMar-6-2025

The tremendous commercial potential of large language models (LLMs) has heightened concerns about their unauthorized use. Third parties can customize LLMs through fine-tuning and offer only black-box API access, effectively concealing unauthorized usage and complicating external auditing processes. This practice not only exacerbates unfair competition, but also violates licensing agreements. In response, identifying the origin of black-box LLMs is an intrinsic solution to this issue. In this paper, we first reveal the limitations of state-of-the-art passive and proactive identification methods with experiments on 30 LLMs and two real-world black-box APIs. Then, we propose the proactive technique, PlugAE, which optimizes adversarial token embeddings in a continuous space and proactively plugs them into the LLM for tracing and identification. The experiments show that PlugAE can achieve substantial improvement in identifying fine-tuned derivatives. We further advocate for legal frameworks and regulations to better address the challenges posed by the unauthorized use of LLMs.

base model, llm, plugae, (16 more...)

arXiv.org Artificial Intelligence

2503.04332

Genre: Research Report > New Finding (0.67)

Industry:

Transportation > Air (1.00)
Information Technology > Security & Privacy (0.93)
Law > Intellectual Property & Technology Law (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ProFLingo: A Fingerprinting-based Intellectual Property Protection Scheme for Large Language Models

Jin, Heng, Zhang, Chaoyu, Shi, Shanghao, Lou, Wenjing, Hou, Y. Thomas

arXiv.org Artificial IntelligenceJun-26-2024

Large language models (LLMs) have attracted significant attention in recent years. Due to their "Large" nature, training LLMs from scratch consumes immense computational resources. Since several major players in the artificial intelligence (AI) field have open-sourced their original LLMs, an increasing number of individual researchers and smaller companies are able to build derivative LLMs based on these open-sourced models at much lower costs. However, this practice opens up possibilities for unauthorized use or reproduction that may not comply with licensing agreements, and fine-tuning can change the model's behavior, thus complicating the determination of model ownership. Current intellectual property (IP) protection schemes for LLMs are either designed for white-box settings or require additional modifications to the original model, which restricts their use in real-world settings. In this paper, we propose ProFLingo, a black-box fingerprinting-based IP protection scheme for LLMs. ProFLingo generates queries that elicit specific responses from an original model, thereby establishing unique fingerprints. Our scheme assesses the effectiveness of these queries on a suspect model to determine whether it has been derived from the original model. ProFLingo offers a non-invasive approach, which neither requires knowledge of the suspect model nor modifications to the base model or its training process. To the best of our knowledge, our method represents the first black-box fingerprinting technique for IP protection for LLMs. Our source code and generated queries are available at: https://github.com/hengvt/ProFLingo.

original model, proflingo, query, (13 more...)

arXiv.org Artificial Intelligence

2405.02466

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia (0.04)
(10 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Information Technology > Security & Privacy (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback