AITopics | Mamalakis, Michail

Collaborating Authors

Mamalakis, Michail

A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding

Shen, Yiqing, Chen, Zan, Mamalakis, Michail, He, Luhan, Xia, Haiyang, Li, Tianbin, Su, Yanzhou, He, Junjun, Wang, Yu Guang

arXiv.org Artificial Intelligence, Jul-8-2024

The parallels between protein sequences and natural language in their sequential structures have inspired the application of large language models (LLMs) to protein understanding. Despite the success of LLMs in NLP, their effectiveness in comprehending protein sequences remains an open question, largely due to the absence of datasets linking protein sequences to descriptive text. Researchers have since attempted to adapt LLMs for protein understanding by integrating a protein sequence encoder with a pre-trained LLM. However, this adaptation raises a fundamental question: "Can LLMs, originally designed for NLP, effectively comprehend protein sequences as a form of language?" Current datasets fall short in addressing this question due to the lack of a direct correlation between protein sequences and corresponding text descriptions, limiting the ability to train and evaluate LLMs for protein understanding effectively. To bridge this gap, we introduce ProteinLMDataset, a dataset specifically designed for further self-supervised pretraining and supervised fine-tuning (SFT) of LLMs to enhance their capability for protein sequence comprehension. Specifically, ProteinLMDataset includes 17.46 billion tokens for pretraining and 893,000 instructions for SFT. Additionally, we present ProteinLMBench, the first benchmark dataset consisting of 944 manually verified multiple-choice questions for assessing the protein understanding capabilities of LLMs. ProteinLMBench incorporates protein-related details and sequences in multiple languages, establishing a new standard for evaluating LLMs' abilities in protein comprehension. The large language model InternLM2-7B, pretrained and fine-tuned on the ProteinLMDataset, outperforms GPT-4 on ProteinLMBench, achieving the highest accuracy score.
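As a rough illustration of how accuracy on a multiple-choice benchmark of this kind could be computed, the sketch below scores a predictor against an answer key. The item fields (`question`, `options`, `answer`), the toy items, and the `always_first` baseline are hypothetical and are not taken from ProteinLMBench, whose actual schema and evaluation protocol may differ.

```python
from typing import Callable, Dict, List

# Hypothetical item layout; the real benchmark schema may differ.
Item = Dict[str, object]


def accuracy(items: List[Item], predict: Callable[[Item], str]) -> float:
    """Return the fraction of items whose predicted option matches the key."""
    if not items:
        raise ValueError("no items to score")
    correct = sum(1 for item in items if predict(item) == item["answer"])
    return correct / len(items)


# Toy items invented purely for illustration.
toy_items: List[Item] = [
    {"question": "Q1", "options": ["A", "B", "C", "D"], "answer": "A"},
    {"question": "Q2", "options": ["A", "B", "C", "D"], "answer": "B"},
    {"question": "Q3", "options": ["A", "B", "C", "D"], "answer": "A"},
    {"question": "Q4", "options": ["A", "B", "C", "D"], "answer": "C"},
]


def always_first(item: Item) -> str:
    """Trivial baseline that always picks the first listed option."""
    return item["options"][0]


if __name__ == "__main__":
    print(accuracy(toy_items, always_first))
```

In practice `predict` would wrap a call to the model under evaluation and parse the chosen option label from its output; the trivial baseline here only exercises the scoring logic.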

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2406.0554

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The Explanation Necessity for Healthcare AI

Mamalakis, Michail, de Vareilles, Héloïse, Murray, Graham, Lio, Pietro, Suckling, John

arXiv.org Artificial Intelligence, May-31-2024

Explainability is often critical to the acceptable implementation of artificial intelligence (AI). Nowhere is this more important than healthcare where decision-making directly impacts patients and trust in AI systems is essential. This trust is often built on the explanations and interpretations the AI provides. Despite significant advancements in AI interpretability, there remains the need for clear guidelines on when and to what extent explanations are necessary in the medical context. We propose a novel categorization system with four distinct classes of explanation necessity, guiding the level of explanation required: patient or sample (local) level, cohort or dataset (global) level, or both levels. We introduce a mathematical formulation that distinguishes these categories and offers a practical framework for researchers to determine the necessity and depth of explanations required in medical AI applications. Three key factors are considered: the robustness of the evaluation protocol, the variability of expert observations, and the representation dimensionality of the application. In this perspective, we address the question: When does an AI medical application need to be explained, and at what level of detail?

application, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2406.00216

Country:

Europe > United Kingdom > England > Cambridgeshire (0.15)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.94)
Health & Medicine > Therapeutic Area > Neurology (0.93)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.69)
(2 more...)

Hunting imaging biomarkers in pulmonary fibrosis: Benchmarks of the AIIB23 challenge

Nan, Yang, Xing, Xiaodan, Wang, Shiyi, Tang, Zeyu, Felder, Federico N, Zhang, Sheng, Ledda, Roberta Eufrasia, Ding, Xiaoliu, Yu, Ruiqi, Liu, Weiping, Shi, Feng, Sun, Tianyang, Cao, Zehong, Zhang, Minghui, Gu, Yun, Zhang, Hanxiao, Gao, Jian, Tang, Wen, Yu, Pengxin, Kang, Han, Chen, Junqiang, Lu, Xing, Zhang, Boyu, Mamalakis, Michail, Prinzi, Francesco, Carlini, Gianluca, Cuneo, Lisa, Banerjee, Abhirup, Xing, Zhaohu, Zhu, Lei, Mesbah, Zacharia, Jain, Dhruv, Mayet, Tsiry, Yuan, Hongyu, Lyu, Qing, Wells, Athol, Walsh, Simon LF, Yang, Guang

arXiv.org Artificial Intelligence, Dec-21-2023

Airway-related quantitative imaging biomarkers (QIB) are crucial for examination, diagnosis, and prognosis in pulmonary diseases. However, the manual delineation of airway trees remains prohibitively time-consuming. While significant efforts have been made towards enhancing airway modelling, current publicly available datasets concentrate on lung diseases with moderate morphological variations. The intricate honeycombing patterns present in the lung tissues of fibrotic lung disease patients exacerbate the challenges, often leading to various prediction errors. To address this issue, the 'Airway-Informed Quantitative CT Imaging Biomarker for Fibrotic Lung Disease 2023' (AIIB23) competition was organized in conjunction with the official 2023 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). The airway structures were meticulously annotated by three experienced radiologists. Competitors were encouraged to develop automatic airway segmentation models with high robustness and generalization abilities, followed by exploring the QIB most correlated with mortality. A training set of 120 high-resolution computerised tomography (HRCT) scans was publicly released with expert annotations and mortality status. The online validation set incorporated 52 HRCT scans from patients with fibrotic lung disease and the offline test set included 140 cases from fibrosis and COVID-19 patients. The results showed that the capacity to extract airway trees from patients with fibrotic lung disease could be enhanced by introducing voxel-wise weighted general union loss and continuity loss. In addition to the competitive image biomarkers for prognosis, a strong airway-derived biomarker (Hazard ratio>1.5, p<0.0001) was revealed for survival prognostication compared with existing clinical measurements, clinician assessment and AI-based biomarkers.

artificial intelligence, machine learning, segmentation, (17 more...)

arXiv.org Artificial Intelligence

2312.13752

Country:

Asia > China (0.68)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry: Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

A novel framework employing deep multi-attention channels network for the autonomous detection of metastasizing cells through fluorescence microscopy

Mamalakis, Michail, Macfarlane, Sarah C., Notley, Scott V., Gad, Annica K. B, Panoutsos, George

arXiv.org Artificial Intelligence, Sep-2-2023

We developed a transparent computational large-scale imaging-based framework that can distinguish between normal and metastasizing human cells. The method relies on fluorescence microscopy images showing the spatial organization of actin and vimentin filaments in normal and metastasizing single cells, using a combination of multi-attention channels network and global explainable techniques. We test a classification between normal cells (Bj primary fibroblast), and their isogenically matched, transformed and invasive counterpart (BjTertSV40TRasV12). Manual annotation is not trivial to automate due to the intricacy of the biologically relevant features. In this research, we utilized established deep learning networks and our new multi-attention channel architecture. To increase the interpretability of the network - crucial for this application area - we developed an interpretable global explainable approach correlating the weighted geometric mean of the total cell images and their local GradCam scores. The significant results from our analysis allowed an unprecedentedly detailed and biologically relevant understanding of the cytoskeletal changes that accompany oncogenic transformation of normal to invasive and metastasizing cells. We also paved the way for a possible spatial micrometre-level biomarker for future development of diagnostic tools against metastasis (spatial distribution of vimentin).
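The abstract does not say how the weighted geometric mean or the correlation are computed, so the following is only a minimal sketch using the textbook definitions: a weighted geometric mean exp(sum(w_i ln x_i) / sum(w_i)) and a Pearson correlation coefficient. The function names and the choice of Pearson correlation are assumptions made for illustration, not the authors' method.

```python
import math
from typing import Sequence


def weighted_geometric_mean(values: Sequence[float],
                            weights: Sequence[float]) -> float:
    """Textbook weighted geometric mean of strictly positive values."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    if any(v <= 0 for v in values):
        raise ValueError("values must be strictly positive")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)
```

Under these assumptions, one would compute a weighted geometric mean per image (or per region) and then correlate those values against the corresponding saliency scores; how the weights are chosen is left open here because the abstract does not state it.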

classification, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2309.00911

Country:

North America > United States (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

A 3D explainability framework to uncover learning patterns and crucial sub-regions in variable sulci recognition

Mamalakis, Michail, de Vareilles, Heloise, AI-Manea, Atheer, Mitchell, Samantha C., Arartz, Ingrid, Morch-Johnsen, Lynn Egeland, Garrison, Jane, Simons, Jon, Lio, Pietro, Suckling, John, Murray, Graham

arXiv.org Artificial Intelligence, Sep-2-2023

Abstract

Precisely identifying sulcal features in brain MRI is made challenging by the variability of brain folding. This research introduces an innovative 3D explainability framework that validates outputs from deep learning networks in their ability to detect the paracingulate sulcus, an anatomical feature that may or may not be present on the frontal medial surface of the human brain. This study trained and tested two networks, amalgamating local explainability techniques GradCam and SHAP with a dimensionality reduction method. The explainability framework provided both localized and global explanations, along with accuracy of classification results, revealing pertinent sub-regions contributing to the decision process through a post-fusion transformation of explanatory and statistical features. Leveraging the TOP-OSLO dataset of MRI acquired from patients with schizophrenia, greater accuracies of paracingulate sulcus detection (presence or absence) were found in the left compared to right hemispheres with distinct, but extensive sub-regions contributing to each classification outcome. The study also inadvertently highlighted the critical role of an unbiased annotation protocol in maintaining network performance fairness. Our proposed method not only offers automated, impartial annotations of a variable sulcus but also provides insights into the broader anatomical variations associated with its presence throughout the brain. The adoption of this methodology holds promise for instigating further explorations and inquiries in the field of neuroscience.

1. Introduction

While the folding of the primary sulci of the human brain, formed during gestation, is broadly stable across individuals, the secondary sulci which continue to develop post-natally are unique to each individual. Inter-individual variability poses a significant challenge for the detection and accurate annotation of sulcal features from MRI of the brain. Undertaking this task manually is time-consuming with outcomes that depend on the rater. This prevents the efficient leveraging of the large, open-access MRI databases that are available. While primary sulci can be very accurately detected with automated methods, secondary sulci pose a more difficult computational problem due to their higher variability in shape and indeed presence or absence [3]. A successful automated method would facilitate investigations of brain folding variation, representative of events occurring during a critical developmental period. Furthermore, generalized and unbiased annotations would make tractable large-scale studies of cognitive and behavioral development, and the emergence of mental and neurological disorders with high levels of statistical power. The folding of the brain has been linked to brain function, and some specific folding patterns have been related to susceptibility to neurological adversities [20].

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2309.00903

Country:

Europe > Norway > Eastern Norway > Oslo (0.25)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)
Research Report > Strength High (0.93)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
