AITopics | Giulivi, Loris

Collaborating Authors

Giulivi, Loris

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Explaining Multi-modal Large Language Models by Analyzing their Vision Perception

Giulivi, Loris, Boracchi, Giacomo

arXiv.org Artificial IntelligenceMay-28-2024

Multi-modal Large Language Models (MLLMs) have demonstrated remarkable capabilities in understanding and generating content across various modalities, such as images and text. However, their interpretability remains a challenge, hindering their adoption in critical applications. This research proposes a novel approach to enhance the interpretability of MLLMs by focusing on the image embedding component. We combine an open-world localization model with a MLLM, thus creating a new architecture able to simultaneously produce text and object localization outputs from the same vision embedding. The proposed architecture greatly promotes interpretability, enabling us to design a novel saliency map to explain any output token, to identify model hallucinations, and to assess model biases through semantic adversarial perturbations.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2405.14612

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment > Sports > Skiing (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Concept Visualization: Explaining the CLIP Multi-modal Embedding Using WordNet

Giulivi, Loris, Boracchi, Giacomo

arXiv.org Artificial IntelligenceMay-23-2024

Advances in multi-modal embeddings, and in particular CLIP, have recently driven several breakthroughs in Computer Vision (CV). CLIP has shown impressive performance on a variety of tasks, yet, its inherently opaque architecture may hinder the application of models employing CLIP as backbone, especially in fields where trust and model explainability are imperative, such as in the medical domain. Current explanation methodologies for CV models rely on Saliency Maps computed through gradient analysis or input perturbation. However, these Saliency Maps can only be computed to explain classes relevant to the end task, often smaller in scope than the backbone training classes. In the context of models implementing CLIP as their vision backbone, a substantial portion of the information embedded within the learned representations is thus left unexplained. In this work, we propose Concept Visualization (ConVis), a novel saliency methodology that explains the CLIP embedding of an image by exploiting the multi-modal nature of the embeddings. ConVis makes use of lexical information from WordNet to compute task-agnostic Saliency Maps for any concept, not limited to concepts the end model was trained on. We validate our use of WordNet via an out of distribution detection experiment, and test ConVis on an object localization benchmark, showing that Concept Visualizations correctly identify and localize the image's semantic content. Additionally, we perform a user study demonstrating that our methodology can give users insight on the model's functioning.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.14563

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Adversarial Scratches: Deployable Attacks to CNN Classifiers

Giulivi, Loris, Jere, Malhar, Rossi, Loris, Koushanfar, Farinaz, Ciocarlie, Gabriela, Hitaj, Briland, Boracchi, Giacomo

arXiv.org Artificial IntelligenceMay-18-2023

A growing body of work has shown that deep neural networks are susceptible to adversarial examples. These take the form of small perturbations applied to the model's input which lead to incorrect predictions. Unfortunately, most literature focuses on visually imperceivable perturbations to be applied to digital images that often are, by design, impossible to be deployed to physical targets. We present Adversarial Scratches: a novel L0 black-box attack, which takes the form of scratches in images, and which possesses much greater deployability than other state-of-the-art attacks. Adversarial Scratches leverage B\'ezier Curves to reduce the dimension of the search space and possibly constrain the attack to a specific location. We test Adversarial Scratches in several scenarios, including a publicly available API and images of traffic signs. Results show that, often, our attack achieves higher fooling rate than other deployable state-of-the-art methods, while requiring significantly fewer queries and modifying very few pixels.

adversarial scratch, artificial intelligence, machine learning, (2 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.patcog.2022.108985

2204.09397

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)

Add feedback