Petrov, Aleksandar
On the Coexistence and Ensembling of Watermarks
Petrov, Aleksandar, Agarwal, Shruti, Torr, Philip H. S., Bibi, Adel, Collomosse, John
Watermarking, the practice of embedding imperceptible information into media such as images, videos, audio, and text, is essential for intellectual property protection, content provenance, and attribution. The growing complexity of digital ecosystems increasingly requires watermarks for different uses to be embedded in the same media. However, for all watermarks to be detected and decoded, they must coexist well with one another. We perform the first study of the coexistence of deep image watermarking methods and, contrary to intuition, find that various open-source watermarks can coexist with only minor impacts on image quality and decoding robustness. The coexistence of watermarks also opens the avenue for ensembling watermarking methods. We show how ensembling can increase the overall message capacity and enable new trade-offs between capacity, accuracy, robustness, and image quality, without needing to retrain the base models.
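Below is a minimal sketch of the ensembling idea described above: a longer message is split across two watermarking models that are applied sequentially, and decoding each part recovers the full message. The DummyWatermarker class and its encode/decode interface are hypothetical stand-ins, not the actual watermarking methods or APIs studied in the paper.

```python
# A minimal sketch of sequentially embedding two watermarks and decoding both.
# DummyWatermarker is a hypothetical stand-in for a deep watermarking model.
import numpy as np

class DummyWatermarker:
    """Stand-in for a deep watermarking model with a fixed bit capacity."""
    def __init__(self, capacity: int, seed: int):
        self.capacity = capacity
        self.rng = np.random.default_rng(seed)

    def encode(self, image: np.ndarray, bits: np.ndarray) -> np.ndarray:
        # A real model would add an imperceptible, message-dependent perturbation.
        self._stored = bits  # placeholder so decode() has something to return
        return np.clip(image + self.rng.normal(0, 1e-3, image.shape), 0.0, 1.0)

    def decode(self, image: np.ndarray) -> np.ndarray:
        return self._stored  # a real model would extract the bits from the image

def ensemble_embed(image, watermarkers, message_bits):
    """Split a long message into chunks and embed the watermarks sequentially."""
    out, start = image, 0
    for wm in watermarkers:
        out = wm.encode(out, message_bits[start:start + wm.capacity])
        start += wm.capacity
    return out

def ensemble_decode(image, watermarkers):
    """Decode every watermark and concatenate the recovered chunks."""
    return np.concatenate([wm.decode(image) for wm in watermarkers])

wms = [DummyWatermarker(32, seed=0), DummyWatermarker(48, seed=1)]
msg = np.random.default_rng(2).integers(0, 2, size=32 + 48)
stego = ensemble_embed(np.random.default_rng(3).random((64, 64, 3)), wms, msg)
assert (ensemble_decode(stego, wms) == msg).all()  # combined capacity: 80 bits
```

In this toy setup the combined capacity is simply the sum of the constituents' capacities; whether real watermarks interfere with each other's image quality and decoding robustness is exactly what the paper measures.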
Mimicking User Data: On Mitigating Fine-Tuning Risks in Closed Large Language Models
Eiras, Francisco, Petrov, Aleksandar, Torr, Philip H. S., Kumar, M. Pawan, Bibi, Adel
Fine-tuning large language models on small, high-quality datasets can enhance their performance on specific downstream tasks. Recent research shows that fine-tuning on benign, instruction-following data can inadvertently undo the safety alignment process and increase a model's propensity to comply with harmful queries. Although critical, understanding and mitigating safety risks in well-defined tasks remains distinct from doing so in the instruction-following context because of structural differences in the data. Our work addresses the gap in our understanding of these risks across diverse types of data in closed models, where providers control how user data is utilized in the fine-tuning process. We demonstrate how malicious actors can subtly manipulate the structure of almost any task-specific dataset to foster significantly more dangerous model behaviors, while maintaining an appearance of innocuity and reasonable downstream task performance. To address this issue, we propose a novel mitigation strategy that mixes in safety data which mimics the task format and prompting style of the user data, and show that it is more effective than existing baselines at re-establishing safety alignment while maintaining similar task performance.
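The sketch below illustrates the proposed mitigation under illustrative assumptions: generic safety examples (harmful prompt, safe refusal) are cast into the same prompt template as the user's task data and mixed into the fine-tuning set. The template, field names, and mixing ratio are hypothetical and not the paper's exact setup.

```python
# A minimal sketch of mixing in safety data that mimics the task format.
# TASK_TEMPLATE, the field names, and the mixing ratio are illustrative assumptions.
import random

TASK_TEMPLATE = "### Instruction:\n{instruction}\n### Input:\n{input}\n### Response:\n{output}"

def format_example(instruction: str, inp: str, output: str) -> str:
    return TASK_TEMPLATE.format(instruction=instruction, input=inp, output=output)

def mimic_task_format(safety_examples: list) -> list:
    """Cast (harmful prompt, safe refusal) pairs into the task's prompting style."""
    return [format_example(ex["prompt"], "", ex["safe_response"]) for ex in safety_examples]

def build_finetuning_mix(task_data: list, safety_data: list,
                         safety_fraction: float = 0.1, seed: int = 0) -> list:
    """Mix task-formatted safety examples into the user's fine-tuning dataset."""
    rng = random.Random(seed)
    n_safety = min(int(safety_fraction * len(task_data)), len(safety_data))
    mixed = task_data + rng.sample(safety_data, n_safety)
    rng.shuffle(mixed)
    return mixed

task = [format_example("Summarise the review.", "Great phone, poor battery.", "Mixed review.")]
safety = mimic_task_format([{"prompt": "How do I pick a lock?",
                             "safe_response": "I can't help with that."}])
print(len(build_finetuning_mix(task, safety, safety_fraction=1.0)))  # 2 examples
```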
Universal In-Context Approximation By Prompting Fully Recurrent Models
Petrov, Aleksandar, Lamb, Tom A., Paren, Alasdair, Torr, Philip H. S., Bibi, Adel
Zero-shot and in-context learning enable solving tasks without model fine-tuning, making them essential for developing generative model solutions. Therefore, it is crucial to understand whether a pretrained model can be prompted to approximate any function, i.e., whether it is a universal in-context approximator. While it was recently shown that transformer models do possess this property, these results rely on their attention mechanism. Hence, these findings do not apply to fully recurrent architectures like RNNs, LSTMs, and the increasingly popular state-space models (SSMs). We demonstrate that RNNs, LSTMs, GRUs, Linear RNNs, and linear gated architectures such as Mamba and Hawk/Griffin can also serve as universal in-context approximators. To streamline our argument, we introduce a programming language called LSRL that compiles to these fully recurrent architectures. LSRL may be of independent interest for further studies of fully recurrent models, such as constructing interpretability benchmarks. We also study the role of multiplicative gating and observe that architectures incorporating such gating (e.g., LSTMs, GRUs, Hawk/Griffin) can implement certain operations more stably, making them more viable candidates for practical in-context universal approximation.
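To make concrete what a linear recurrence with multiplicative gating looks like, the following is a minimal sketch of a diagonal linear RNN step combined with an input-dependent gate, in the spirit of the gated architectures mentioned above. The parameterization is illustrative and is neither the LSRL language nor any specific published architecture.

```python
# A minimal sketch of a diagonal linear recurrence with multiplicative gating.
# The parameterization is illustrative, not a specific published architecture.
import numpy as np

def gated_linear_rnn(xs: np.ndarray, A: np.ndarray, B: np.ndarray,
                     W_gate: np.ndarray) -> np.ndarray:
    """xs: (T, d_in); A: (d_h,) diagonal decay; B, W_gate: (d_h, d_in)."""
    h = np.zeros(A.shape[0])
    outs = []
    for x in xs:
        gate = 1.0 / (1.0 + np.exp(-W_gate @ x))  # input-dependent sigmoid gate
        h = A * h + B @ x                          # linear (diagonal) recurrence
        outs.append(gate * h)                      # multiplicative interaction
    return np.stack(outs)

rng = np.random.default_rng(0)
y = gated_linear_rnn(rng.normal(size=(5, 3)),
                     A=rng.uniform(0.5, 0.99, size=8),
                     B=rng.normal(size=(8, 3)),
                     W_gate=rng.normal(size=(8, 3)))
print(y.shape)  # (5, 8)
```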
Risks and Opportunities of Open-Source Generative AI
Eiras, Francisco, Petrov, Aleksandar, Vidgen, Bertie, Schroeder, Christian, Pizzati, Fabio, Elkins, Katherine, Mukhopadhyay, Supratik, Bibi, Adel, Purewal, Aaron, Botos, Csaba, Steibel, Fabro, Keshtkar, Fazel, Barez, Fazl, Smith, Genevieve, Guadagni, Gianluca, Chun, Jon, Cabot, Jordi, Imperial, Joseph, Nolazco, Juan Arturo, Landay, Lori, Jackson, Matthew, Torr, Philip H. S., Darrell, Trevor, Lee, Yong, Foerster, Jakob
Applications of Generative AI (Gen AI) are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about the potential risks of the technology, and resulted in calls for tighter regulation, in particular from some of the major tech companies who are leading in AI development. This regulation is likely to put at risk the budding field of open-source generative AI. Using a three-stage framework for Gen AI development (near, mid and long-term), we analyze the risks and opportunities of open-source generative AI models with similar capabilities to the ones currently available (near to mid-term) and with greater capabilities (long-term). We argue that, overall, the benefits of open-source Gen AI outweigh its risks. As such, we encourage the open sourcing of models, training and evaluation data, and provide a set of recommendations and best practices for managing risks associated with open-source generative AI.
Near to Mid-term Risks and Opportunities of Open-Source Generative AI
Eiras, Francisco, Petrov, Aleksandar, Vidgen, Bertie, de Witt, Christian Schroeder, Pizzati, Fabio, Elkins, Katherine, Mukhopadhyay, Supratik, Bibi, Adel, Botos, Csaba, Steibel, Fabro, Barez, Fazl, Smith, Genevieve, Guadagni, Gianluca, Chun, Jon, Cabot, Jordi, Imperial, Joseph Marvin, Nolazco-Flores, Juan A., Landay, Lori, Jackson, Matthew, Röttger, Paul, Torr, Philip H. S., Darrell, Trevor, Lee, Yong Suk, Foerster, Jakob
In the next few years, applications of Generative AI are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about potential risks and resulted in calls for tighter regulation, in particular from some of the major tech companies who are leading in AI development. This regulation is likely to put at risk the budding field of open-source Generative AI. We argue for the responsible open sourcing of generative AI models in the near and medium term. To set the stage, we first introduce an AI openness taxonomy system and apply it to 40 current large language models. We then outline differential benefits and risks of open- versus closed-source AI and present potential risk mitigation strategies, ranging from best practices to calls for technical and scientific contributions. We hope that this report will add a much-needed voice to the current public discourse on near to mid-term AI safety and other societal impacts.
Prompting a Pretrained Transformer Can Be a Universal Approximator
Petrov, Aleksandar, Torr, Philip H. S., Bibi, Adel
Despite the widespread adoption of prompting, prompt tuning, and prefix-tuning of transformer models, our theoretical understanding of these fine-tuning methods remains limited. A key question is whether one can arbitrarily modify the behavior of a pretrained model by prompting or prefix-tuning it. Formally, the question is whether prompting and prefix-tuning a pretrained model can universally approximate sequence-to-sequence functions. This paper answers in the affirmative and demonstrates that much smaller pretrained models than previously thought can be universal approximators when prefixed. In fact, the attention mechanism is uniquely suited for universal approximation with prefix-tuning: a single attention head is sufficient to approximate any continuous function. Moreover, any sequence-to-sequence function can be approximated by prefixing a transformer with depth linear in the sequence length. Beyond these density-type results, we also offer Jackson-type bounds on the length of the prefix needed to approximate a function to a desired precision.
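For readers unfamiliar with the mechanism, the sketch below shows how prefix-tuning enters a single attention head: trainable prefix key and value vectors are prepended to the keys and values computed from the input, and only the prefix is optimized. This is the standard prefix-tuning computation with illustrative shapes and random weights, not the specific construction used in the paper.

```python
# A toy single attention head with a learned prefix prepended to keys and values.
# Shapes and weights are illustrative; only prefix_k and prefix_v would be trained.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_with_prefix(X, Wq, Wk, Wv, prefix_k, prefix_v):
    """X: (T, d); Wq, Wk, Wv: (d, d); prefix_k, prefix_v: (m, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    K = np.concatenate([prefix_k, K], axis=0)  # (m + T, d)
    V = np.concatenate([prefix_v, V], axis=0)
    return softmax(Q @ K.T / np.sqrt(X.shape[1])) @ V  # (T, d)

rng = np.random.default_rng(0)
d, T, m = 4, 3, 2
X = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = attention_with_prefix(X, Wq, Wk, Wv,
                            prefix_k=rng.normal(size=(m, d)),
                            prefix_v=rng.normal(size=(m, d)))
print(out.shape)  # (3, 4)
```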
Language Models as a Service: Overview of a New Paradigm and its Challenges
La Malfa, Emanuele, Petrov, Aleksandar, Frieder, Simon, Weinhuber, Christoph, Burnell, Ryan, Nazar, Raza, Cohn, Anthony G., Shadbolt, Nigel, Wooldridge, Michael
Some of the most powerful language models available today are proprietary systems, accessible only via (typically restrictive) web or software programming interfaces. This is the Language-Models-as-a-Service (LMaaS) paradigm. In contrast with scenarios where full model access is available, as in the case of open-source models, such closed-off language models present specific challenges for evaluating, benchmarking, and testing them. This paper has two goals. On the one hand, we delineate how these challenges act as impediments to the accessibility, replicability, reliability, and trustworthiness of LMaaS: we systematically examine the issues that arise from a lack of information about language models for each of these four aspects, conduct a detailed analysis of existing solutions, put forth a number of recommendations, and highlight directions for future advancements. On the other hand, the paper serves as a comprehensive resource on current, major LMaaS, offering a synthesized overview of the licences and capabilities their interfaces provide.
When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations
Petrov, Aleksandar, Torr, Philip H. S., Bibi, Adel
Context-based fine-tuning methods, including prompting, in-context learning, soft prompting (also known as prompt tuning), and prefix-tuning, have gained popularity due to their ability to often match the performance of full fine-tuning with a fraction of the parameters. Despite their empirical successes, there is little theoretical understanding of how these techniques influence the internal computation of the model and of their expressiveness limitations. We show that, despite the continuous embedding space being more expressive than the discrete token space, soft prompting and prefix-tuning are strictly less expressive than full fine-tuning, even with the same number of learnable parameters. Concretely, context-based fine-tuning cannot change the relative attention pattern over the content and can only bias the outputs of an attention layer in a fixed direction. This suggests that while techniques like prompting, in-context learning, soft prompting, and prefix-tuning can effectively elicit skills present in the pretrained model, they cannot learn novel tasks that require new attention patterns.
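The following numerical sketch illustrates the structural claim above: with a prefix, the attention output at any position is a convex combination of (i) the output the head would produce over the content alone and (ii) a vector determined by the prefix, so the prefix can rescale and bias the output but cannot reorder the attention pattern over the content. The toy single-head attention and random values are illustrative.

```python
# Numerical check: attention over [prefix; content] equals a convex combination of
# the content-only attention output and a prefix-determined vector.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, T, m = 4, 5, 3
q = rng.normal(size=d)                                     # query at one content position
Kc, Vc = rng.normal(size=(T, d)), rng.normal(size=(T, d))  # content keys/values
Kp, Vp = rng.normal(size=(m, d)), rng.normal(size=(m, d))  # prefix keys/values

w = softmax(np.concatenate([Kp, Kc]) @ q / np.sqrt(d))     # attention over [prefix; content]
full_out = w @ np.concatenate([Vp, Vc])

content_out = softmax(Kc @ q / np.sqrt(d)) @ Vc            # attention over content alone
alpha = w[:m].sum()                                        # attention mass on the prefix
prefix_out = (w[:m] / alpha) @ Vp                          # prefix-determined direction

assert np.allclose(full_out, alpha * prefix_out + (1 - alpha) * content_out)
```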
Language Model Tokenizers Introduce Unfairness Between Languages
Petrov, Aleksandar, La Malfa, Emanuele, Torr, Philip H. S., Bibi, Adel
Recent language models have shown impressive multilingual performance, even when not explicitly trained for it. Despite this, there are concerns about the quality of their outputs across different languages. In this paper, we show how disparity in the treatment of different languages arises at the tokenization stage, well before a model is even invoked. The same text translated into different languages can have drastically different tokenization lengths, with differences of up to 15 times in some cases. These disparities persist even for tokenizers that are intentionally trained for multilingual support. Character-level and byte-level models also exhibit encoding-length differences of more than 4 times for some language pairs. This induces unfair treatment of some language communities with regard to the cost of accessing commercial language services, the processing time and latency, as well as the amount of content that can be provided as context to the models. Therefore, we make the case that future language models should be trained using multilingually fair subword tokenizers.
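As an illustration of how such tokenization-length disparities can be measured, the sketch below counts the tokens assigned to short parallel sentences by an off-the-shelf byte-level BPE tokenizer (GPT-2, loaded via Hugging Face transformers). The sentences and the choice of tokenizer are illustrative; the paper's measurement protocol may differ.

```python
# Compare tokenization lengths of parallel sentences with an off-the-shelf tokenizer.
# Sentences and tokenizer choice are illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # English-centric byte-level BPE

parallel = {
    "English":   "Hello, how are you today?",
    "German":    "Hallo, wie geht es dir heute?",
    "Bulgarian": "Здравей, как си днес?",
}

lengths = {lang: len(tokenizer.encode(text)) for lang, text in parallel.items()}
baseline = lengths["English"]
for lang, n in lengths.items():
    print(f"{lang:10s} {n:3d} tokens ({n / baseline:.1f}x English)")
```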
Certifying Ensembles: A General Certification Theory with S-Lipschitzness
Petrov, Aleksandar, Eiras, Francisco, Sanyal, Amartya, Torr, Philip H. S., Bibi, Adel
Improving and guaranteeing the robustness of deep learning models has been a topic of intense research. Ensembling, which combines several classifiers into a better model, has been shown to be beneficial for generalisation, uncertainty estimation, calibration, and mitigating the effects of concept drift. However, the impact of ensembling on certified robustness is less well understood. In this work, we generalise Lipschitz continuity by introducing S-Lipschitz classifiers, which we use to analyse the theoretical robustness of ensembles. Our results give precise conditions under which ensembles of robust classifiers are more robust than any constituent classifier, as well as conditions under which they are less robust.
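For context, the sketch below shows the standard Lipschitz-based certification that this line of work builds on: if every logit of a classifier is L-Lipschitz and the top-class margin is m, the prediction cannot change within an l2 radius of m / (2L), and an averaged ensemble admits a naive bound using the mean of the constituents' Lipschitz constants. The S-Lipschitz framework introduced in the paper refines this analysis and is not reproduced here; all numbers are illustrative.

```python
# Naive Lipschitz-based certified radii for two classifiers and their average.
# Logits and Lipschitz constants are illustrative numbers.
import numpy as np

def certified_radius(logits: np.ndarray, lipschitz: float) -> float:
    """l2 radius within which the top class cannot change, given an L-Lipschitz bound per logit."""
    top2 = np.sort(logits)[-2:]
    return (top2[1] - top2[0]) / (2.0 * lipschitz)

logits = [np.array([2.0, 0.5, -1.0]), np.array([1.5, 1.2, -0.3])]  # same input, two models
lips = [1.0, 0.5]                                                   # per-logit Lipschitz constants

ensemble_logits = np.mean(logits, axis=0)   # averaging ensemble
ensemble_lipschitz = float(np.mean(lips))   # naive Lipschitz bound for the average

for i, (z, L) in enumerate(zip(logits, lips)):
    print(f"classifier {i}: radius = {certified_radius(z, L):.3f}")
print(f"ensemble    : radius = {certified_radius(ensemble_logits, ensemble_lipschitz):.3f}")
```

In this toy example the ensemble's certified radius (0.600) falls between those of its constituents (0.750 and 0.300), illustrating that, under such bounds, ensembling is not automatically more robust than the best constituent.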