AITopics | Graziani, Mara

Graziani, Mara

Making Sense of Data in the Wild: Data Analysis Automation at Scale

Graziani, Mara, Molnar, Malina, Morales, Irina Espejo, Cadow-Gossweiler, Joris, Laino, Teodoro

arXiv.org Artificial Intelligence, Jan-27-2025

As the volume of publicly available data continues to grow, researchers face the challenge of limited diversity in benchmarking machine learning tasks. Although thousands of datasets are available in public repositories, the sheer abundance often complicates the search for suitable data, leaving many valuable datasets underexplored. The problem is compounded by the fact that, despite longstanding advocacy for improving data curation quality, current solutions remain prohibitively time-consuming and resource-intensive. In this paper, we propose a novel approach that combines intelligent agents with retrieval-augmented generation to automate data analysis, dataset curation and indexing at scale. Our system leverages multiple agents to analyze raw, unstructured data across public repositories, generating dataset reports and interactive visual indexes that can be easily explored. We demonstrate that our approach results in more detailed dataset descriptions, higher hit rates and greater diversity in dataset retrieval tasks. Additionally, we show that the dataset reports generated by our method can be leveraged by other machine learning models to improve performance on specific tasks, such as the accuracy and realism of synthetic data generation. By streamlining the process of transforming raw data into machine-learning-ready datasets, our approach enables researchers to better utilize existing data resources.

large language model, machine learning, natural language

arXiv:2502.15718

Country:

Europe (1.00)
North America > United States > Massachusetts > Middlesex County (0.14)
North America > United States > Texas > Travis County > Austin (0.14)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals (1.00)
Health & Medicine (1.00)
Transportation (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Uncovering Unique Concept Vectors through Latent Space Decomposition

Graziani, Mara, O'Mahony, Laura, Nguyen, An-Phi, Müller, Henning, Andrearczyk, Vincent

arXiv.org Artificial Intelligence, Jul-14-2023

Interpreting the inner workings of deep learning models is crucial for establishing trust and ensuring model safety. Concept-based explanations have emerged as an approach that is more interpretable than feature attribution estimates such as pixel saliency. However, defining the concepts for the interpretability analysis biases the explanations toward the user's expectations of the concepts. To address this, we propose a novel post-hoc unsupervised method that automatically uncovers the concepts learned by deep models during training. By decomposing the latent space of a layer into singular vectors and refining them by unsupervised clustering, we uncover concept vectors aligned with directions of high variance that are relevant to the model prediction, and that point to semantically distinct concepts. Our extensive experiments reveal that the majority of our concepts are readily understandable to humans, exhibit coherency, and bear relevance to the task at hand. Moreover, we showcase the practical utility of our method in dataset exploration, where our concept vectors successfully identify outlier training samples affected by various confounding factors. This novel exploration technique is versatile across data types and model architectures, and it can facilitate the identification of biases and the discovery of sources of error within training data.
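As a rough illustration of the decomposition step this abstract describes, the NumPy sketch below extracts the top right-singular vectors of a layer's activation matrix, clusters the activations inside that subspace, and maps the cluster centroids back to activation space as candidate concept directions. It is a simplified sketch under stated assumptions, not the authors' implementation: the synthetic activations, the function names (`top_singular_directions`, `kmeans`, `candidate_concept_vectors`), the choice of k-means, and the component and cluster counts are all hypothetical, and the filtering by relevance to the model prediction mentioned in the abstract is omitted.

```python
import numpy as np

def top_singular_directions(activations, n_components):
    """Top right-singular vectors of the mean-centered activation matrix."""
    centered = activations - activations.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal directions ordered by decreasing singular
    # value, so the first rows are the directions of highest variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components]

def kmeans(points, n_clusters, n_iters=50):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centers = [points[np.linalg.norm(points, axis=1).argmax()]]
    while len(centers) < n_clusters:
        nearest = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[nearest.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iters):
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels, centers

def candidate_concept_vectors(activations, n_components, n_clusters):
    """Cluster activations in the singular subspace; return unit-norm directions."""
    directions = top_singular_directions(activations, n_components)
    centered = activations - activations.mean(axis=0, keepdims=True)
    projected = centered @ directions.T            # (n_samples, n_components)
    labels, centers = kmeans(projected, n_clusters)
    vectors = centers @ directions                 # back to activation space
    vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors, labels

# Synthetic stand-in for layer activations: three groups of samples, each
# activating mainly one of the first three units (an assumption for the demo).
rng = np.random.default_rng(0)
n_per_group, n_features = 40, 6
activations = np.vstack([
    rng.normal(loc=5.0 * np.eye(n_features)[i], scale=0.1, size=(n_per_group, n_features))
    for i in range(3)
])
vectors, labels = candidate_concept_vectors(activations, n_components=2, n_clusters=3)
```

In this toy setting each recovered vector points toward the unit that dominates one group of samples. With real activations the number of components and clusters would need to be chosen from the data, which the sketch does not attempt.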

artificial intelligence, machine learning, vector

arXiv:2307.06913

Country:

Europe > Switzerland (0.14)
North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Disentangling Neuron Representations with Concept Vectors

O'Mahony, Laura, Andrearczyk, Vincent, Müller, Henning, Graziani, Mara

arXiv.org Artificial Intelligence, Apr-19-2023

Mechanistic interpretability aims to understand how models store representations by breaking down neural networks into interpretable units. However, the occurrence of polysemantic neurons, or neurons that respond to multiple unrelated features, makes interpreting individual neurons challenging. This has led to the search for meaningful vectors, known as concept vectors, in activation space instead of individual neurons. The main contribution of this paper is a method to disentangle polysemantic neurons into concept vectors encapsulating distinct features. Our method can search for fine-grained concepts according to the user's desired level of concept separation. The analysis shows that polysemantic neurons can be disentangled into directions consisting of linear combinations of neurons. Our evaluations show that the concept vectors found encode coherent, human-understandable features.

artificial intelligence, machine learning, neuron

arXiv:2304.09707

Country: Europe > Switzerland (0.47)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Regression-based Deep-Learning predicts molecular biomarkers from pathology slides

El Nahhas, Omar S. M., Loeffler, Chiara M. L., Carrero, Zunamys I., van Treeck, Marko, Kolbinger, Fiona R., Hewitt, Katherine J., Muti, Hannah S., Graziani, Mara, Zeng, Qinghe, Calderaro, Julien, Ortiz-Brüchle, Nadina, Yuan, Tanwei, Hoffmeister, Michael, Brenner, Hermann, Brobeil, Alexander, Reis-Filho, Jorge S., Kather, Jakob Nikolas

arXiv.org Artificial Intelligence, Apr-11-2023

Deep Learning (DL) can predict biomarkers from cancer histopathology. Several clinically approved applications use this technology. Most approaches, however, predict categorical labels, whereas biomarkers are often continuous measurements. We hypothesized that regression-based DL outperforms classification-based DL. Therefore, we developed and evaluated a new self-supervised attention-based weakly supervised regression method that predicts continuous biomarkers directly from images in 11,671 patients across nine cancer types. We tested our method for multiple clinically and biologically relevant biomarkers: homologous recombination deficiency (HRD) score, a clinically used pan-cancer biomarker, as well as markers of key biological processes in the tumor microenvironment. Using regression significantly enhances the accuracy of biomarker prediction, while also improving the interpretability of the results over classification. In a large cohort of colorectal cancer patients, regression-based prediction scores provide a higher prognostic value than classification-based scores. Our open-source regression approach offers a promising alternative for continuous biomarker analysis in computational pathology.
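To make the contrast between regression and classification in this abstract concrete, the sketch below shows one common way to build an attention-based, weakly supervised predictor: a bag of tile features from one slide is collapsed into a single embedding by attention pooling, and a linear head outputs a continuous score. This is a forward pass only, with random weights and synthetic features; the parameter names (`V`, `w`, `u`, `b`), the feature dimensions, the target value and the threshold are all hypothetical, and the paper's actual architecture and training procedure are not described in the abstract.

```python
import numpy as np

def attention_pool(tile_features, V, w):
    """Collapse a bag of tile features into one slide-level embedding."""
    scores = np.tanh(tile_features @ V) @ w          # one attention score per tile
    scores = scores - scores.max()                   # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over tiles
    return weights @ tile_features, weights

def predict_score(tile_features, params):
    """Continuous slide-level prediction from a bag of tiles."""
    embedding, weights = attention_pool(tile_features, params["V"], params["w"])
    return float(embedding @ params["u"] + params["b"]), weights

def regression_loss(prediction, target):
    """Squared error against the continuous biomarker value."""
    return (prediction - target) ** 2

def classification_loss(prediction, target, threshold):
    """Cross-entropy against the thresholded target; the magnitude is discarded."""
    label = float(target > threshold)
    prob = np.clip(1.0 / (1.0 + np.exp(-prediction)), 1e-12, 1.0 - 1e-12)
    return float(-(label * np.log(prob) + (1.0 - label) * np.log(1.0 - prob)))

# Synthetic bag: 200 tiles with 16 features each (made-up values for the demo).
rng = np.random.default_rng(0)
tile_features = rng.normal(size=(200, 16))
params = {
    "V": rng.normal(scale=0.1, size=(16, 8)),
    "w": rng.normal(scale=0.1, size=8),
    "u": rng.normal(scale=0.1, size=16),
    "b": 0.0,
}
score, weights = predict_score(tile_features, params)

# Only one slide-level label supervises the whole bag (weak supervision),
# so the prediction must not depend on the order of the tiles.
shuffled_score, _ = predict_score(tile_features[rng.permutation(200)], params)

target, threshold = 0.7, 0.5  # hypothetical biomarker value and cut-off
loss_reg = regression_loss(score, target)
loss_cls = classification_loss(score, target, threshold)
```

The two loss functions differ only in what they do with the target: the regression loss uses the measured value directly, while the classification loss first reduces it to a binary label. The attention weights give a per-tile contribution that can be inspected afterwards.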

artificial intelligence, deep learning, machine learning

arXiv:2304.05153

Country: Europe > Germany (0.68)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Carcinoma (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.73)

Regression Concept Vectors for Bidirectional Explanations in Histopathology

Graziani, Mara, Andrearczyk, Vincent, Müller, Henning

arXiv.org Machine Learning, Apr-9-2019

Explanations for deep neural network predictions in terms of domain-related concepts can be valuable in medical applications, where justifications are important for confidence in the decision-making. In this work, we propose a methodology to exploit continuous concept measures as Regression Concept Vectors (RCVs) in the activation space of a layer. The directional derivative of the decision function along the RCVs represents the network sensitivity to increasing values of a given concept measure. When applied to breast cancer grading, nuclei texture emerges as a relevant concept in the detection of tumor tissue in breast lymph node samples. We evaluate score robustness and consistency by statistical analysis.
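The two steps named in this abstract can be sketched in a few lines: fit a linear regression from layer activations to a continuous concept measure, take the normalized coefficient vector as the concept direction, and evaluate the directional derivative of the decision function along it. The example below is a minimal sketch under stated assumptions: the activations and concept values are synthetic, a toy logistic head (`head_w`, `head_b`) stands in for the part of the network after the chosen layer, and the helper names are hypothetical.

```python
import numpy as np

def fit_rcv(activations, concept_values):
    """Least-squares regression of the concept measure on layer activations.

    Returns the unit-norm coefficient vector, used here as the concept direction.
    """
    design = np.hstack([activations, np.ones((len(activations), 1))])  # add intercept
    coef, *_ = np.linalg.lstsq(design, concept_values, rcond=None)
    direction = coef[:-1]                                              # drop intercept
    return direction / np.linalg.norm(direction)

def decision(activation, head_w, head_b):
    """Toy decision function: a logistic head on top of the layer activation."""
    return 1.0 / (1.0 + np.exp(-(activation @ head_w + head_b)))

def sensitivity(activation, rcv, head_w, head_b):
    """Directional derivative of the decision function along the concept direction."""
    p = decision(activation, head_w, head_b)
    gradient = p * (1.0 - p) * head_w      # analytic gradient of the logistic head
    return float(gradient @ rcv)

# Synthetic data: the concept measure grows linearly with the first unit.
rng = np.random.default_rng(0)
activations = rng.normal(size=(200, 5))
concept_values = 2.0 * activations[:, 0] + 0.5 + rng.normal(scale=0.01, size=200)

rcv = fit_rcv(activations, concept_values)

# Hypothetical head weights; the decision rises with the first unit.
head_w = np.array([1.5, -0.5, 0.0, 0.0, 0.0])
head_b = 0.0
a = activations[0]
s = sensitivity(a, rcv, head_w, head_b)
```

A positive value of `s` means the decision score increases when the activation moves toward higher values of the concept measure, and a negative value means it decreases. For a real network the gradient would come from automatic differentiation rather than the closed form used here.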

deep learning, explanation, neural network

arXiv.org Machine Learning

1904.0452

Country:

Europe > Switzerland (0.15)
Europe > Spain (0.14)

Genre: Research Report > Experimental Study (0.97)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)
