AITopics | multimodal machine

Predicting Targeted Therapy Resistance in Non-Small Cell Lung Cancer Using Multimodal Machine Learning

Hua, Peiying, Olofson, Andrea, Farhadi, Faraz, Hondelink, Liesbeth, Tsongalis, Gregory, Dragnev, Konstantin, Savellano, Dagmar Hoegemann, Suriawinata, Arief, Tafe, Laura, Hassanpour, Saeed

arXiv.org Artificial Intelligence, Mar-31-2025

Lung cancer is the primary cause of cancer death globally, with non-small cell lung cancer (NSCLC) emerging as its most prevalent subtype. Among NSCLC patients, approximately 32.3% have mutations in the epidermal growth factor receptor (EGFR) gene. Osimertinib, a third-generation EGFR-tyrosine kinase inhibitor (TKI), has demonstrated remarkable efficacy in the treatment of NSCLC patients with activating and T790M resistance EGFR mutations. Despite its established efficacy, drug resistance poses a significant challenge for patients to fully benefit from osimertinib. The absence of a standard tool to accurately predict TKI resistance, including that of osimertinib, remains a critical obstacle. To bridge this gap, in this study, we developed an interpretable multimodal machine learning model designed to predict patient resistance to osimertinib among late-stage NSCLC patients with activating EGFR mutations, achieving a c-index of 0.82 on a multi-institutional dataset. This machine learning model harnesses readily available data routinely collected during patient visits and medical assessments to facilitate precision lung cancer management and informed treatment decisions. By integrating various data types such as histology images, next generation sequencing (NGS) data, demographics data, and clinical records, our multimodal model can generate well-informed recommendations. Our experiment results also demonstrated the superior performance of the multimodal model over single modality models (c-index 0.82 compared with 0.75 and 0.77), thus underscoring the benefit of combining multiple modalities in patient outcome prediction.
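
The c-index quoted in this abstract is the concordance index, a ranking metric for time-to-event predictions. The sketch below shows one common way to compute it (Harrell's formulation) and is not the authors' evaluation code. The function name, the toy data, and the simplified handling of tied times are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index: share of comparable pairs that the model ranks correctly.

    times       -- observed time to event or to censoring
    events      -- 1 if the event (e.g. resistance) was observed, 0 if censored
    risk_scores -- model output; a higher score means an earlier expected event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times, and skip pairs whose earlier
        # subject was censored, because their true order is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event received the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, the third one censored.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.7, 0.1]
print(concordance_index(times, events, risk_scores))  # prints 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is the scale on which the reported 0.82, 0.75, and 0.77 should be read.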

artificial intelligence, machine learning, resistance, (14 more...)

arXiv.org Artificial Intelligence

2503.24165

Country:

North America > United States > New Hampshire > Grafton County > Hanover (0.05)
Asia > Middle East > Lebanon (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


Leveraging OpenFlamingo for Multimodal Embedding Analysis of C2C Car Parts Data

Rashid, Maisha Binte, Rivas, Pablo

arXiv.org Artificial Intelligence, Mar-20-2025

In this paper, we aim to investigate the capabilities of multimodal machine learning models, particularly the OpenFlamingo model, in processing a large-scale dataset of consumer-to-consumer (C2C) online posts related to car parts. We have collected data from two platforms, OfferUp and Craigslist, resulting in a dataset of over 1.2 million posts with their corresponding images. The OpenFlamingo model was used to extract embeddings for the text and image of each post. We used k-means clustering on the joint embeddings to identify underlying patterns and commonalities among the posts. We have found that most clusters contain a pattern, but some clusters showed no internal patterns. The results provide insight into the fact that OpenFlamingo can be used for finding patterns in large datasets but needs some modification in the architecture according to the dataset.
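
The clustering step in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline. The toy vectors stand in for OpenFlamingo embeddings, forming the joint embedding by concatenation is an assumption (the abstract does not say how text and image embeddings were combined), and `kmeans` is a plain Lloyd's algorithm with farthest-first initialisation, chosen instead of random seeding so that the example is deterministic.

```python
def joint_embedding(text_vec, image_vec):
    # Assumption: the joint embedding is the concatenation of both vectors.
    return list(text_vec) + list(image_vec)


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    # Farthest-first initialisation: start from the first point, then keep
    # adding the point that is farthest from the centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are final
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


# Hypothetical posts: (text embedding, image embedding) pairs in two groups.
posts = [
    ([0.0, 0.1], [0.0, 0.0]),
    ([0.1, 0.0], [0.1, 0.1]),
    ([0.0, 0.0], [0.1, 0.0]),
    ([5.0, 5.1], [5.0, 5.0]),
    ([5.1, 5.0], [4.9, 5.0]),
    ([5.0, 4.9], [5.1, 5.1]),
]
points = [joint_embedding(t, i) for t, i in posts]
centroids, labels = kmeans(points, k=2)
print(labels)
```

At the scale reported in the abstract (over 1.2 million posts), a library implementation with mini-batching would be the practical choice. The pure-Python version is only meant to show the assignment and update steps.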

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2503.17408

Country:

North America > United States > Texas (0.04)
North America > United States > California > Alameda County > Oakland (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.51)


Multimodal Machine Learning in Mental Health: A Survey of Data, Algorithms, and Challenges

Sahili, Zahraa Al, Patras, Ioannis, Purver, Matthew

arXiv.org Artificial Intelligence, Jul-23-2024

The application of machine learning (ML) in detecting, diagnosing, and treating mental health disorders is garnering increasing attention. Traditionally, research has focused on single modalities, such as text from clinical notes, audio from speech samples, or video of interaction patterns. Recently, multimodal ML, which combines information from multiple modalities, has demonstrated significant promise in offering novel insights into human behavior patterns and recognizing mental health symptoms and risk factors. Despite its potential, multimodal ML in mental health remains an emerging field, facing several complex challenges before practical applications can be effectively developed. This survey provides a comprehensive overview of the data availability and current state-of-the-art multimodal ML applications for mental health. It discusses key challenges that must be addressed to advance the field.

dataset, disorder, information, (10 more...)

arXiv.org Artificial Intelligence

2407.16804

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Asia > China (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(8 more...)

Genre: Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.66)


Review of multimodal machine learning approaches in healthcare

Krones, Felix, Marikkar, Umar, Parsons, Guy, Szmul, Adam, Mahdi, Adam

arXiv.org Artificial Intelligence, Feb-11-2024

Machine learning methods in healthcare have traditionally focused on using data from a single modality, limiting their ability to effectively replicate the clinical practice of integrating multiple sources of information for improved decision making. Clinicians typically rely on a variety of data sources including patients' demographic information, laboratory data, vital signs and various imaging data modalities to make informed decisions and contextualise their findings. Recent advances in machine learning have facilitated the more efficient incorporation of multimodal data, resulting in applications that better represent the clinician's approach. Here, we provide a review of multimodal machine learning approaches in healthcare, offering a comprehensive overview of recent literature. We discuss the various data modalities used in clinical diagnosis, with a particular emphasis on imaging data. We evaluate fusion techniques, explore existing multimodal datasets and examine common training strategies.

fusion, healthcare, learning, (16 more...)

arXiv.org Artificial Intelligence

2402.0246

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > Middle East > Israel (0.04)
Oceania > Australia > Queensland (0.04)
(10 more...)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
(9 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Multimodal machine learning for materials science: composition-structure bimodal learning for experimentally measured properties

Gong, Sheng, Wang, Shuo, Zhu, Taishan, Shao-Horn, Yang, Grossman, Jeffrey C.

arXiv.org Artificial Intelligence, Aug-3-2023

The widespread application of multimodal machine learning models like GPT-4 has revolutionized various research fields including computer vision and natural language processing. However, its implementation in materials informatics remains underexplored, despite the presence of materials data across diverse modalities, such as composition and structure. The effectiveness of machine learning models trained on large calculated datasets depends on the accuracy of calculations, while experimental datasets often have limited data availability and incomplete information. This paper introduces a novel approach to multimodal machine learning in materials science via composition-structure bimodal learning. The proposed COmposition-Structure Bimodal Network (COSNet) is designed to enhance learning and predictions of experimentally measured materials properties that have incomplete structure information. Bimodal learning significantly reduces prediction errors across distinct materials properties including Li conductivity in solid electrolyte, band gap, refractive index, dielectric constant, energy, and magnetic moment, surpassing composition-only learning methods. Furthermore, we identified that data augmentation based on modal availability plays a pivotal role in the success of bimodal learning.
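
The abstract does not describe how "data augmentation based on modal availability" is carried out. One plausible reading, sketched here purely as an assumption, is that every sample with both modalities is also presented to the model as a composition-only sample, so the model sees the missing-structure case during training. The `Sample` type, the function name, and the toy records are hypothetical and are not taken from COSNet.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Sample:
    composition: str            # e.g. a chemical formula
    structure: Optional[str]    # structure descriptor, or None when unavailable
    target: float               # experimentally measured property


def augment_by_modal_availability(samples: List[Sample]) -> List[Sample]:
    """Add a composition-only copy of every sample that has a structure.

    Assumed interpretation only: the copy keeps the same target value
    but drops the structure modality.
    """
    augmented = list(samples)
    for sample in samples:
        if sample.structure is not None:
            augmented.append(Sample(sample.composition, None, sample.target))
    return augmented


# Hypothetical records with made-up target values.
data = [
    Sample("Li7La3Zr2O12", "structure-A", 1.0),
    Sample("LiCoO2", None, 2.0),
]
for sample in augment_by_modal_availability(data):
    print(sample)
```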

composition-structure bimodal, materials science, multimodal machine

arXiv.org Artificial Intelligence

2309.04478

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Multimodal machine learning in precision health: A scoping review - npj Digital Medicine

#artificialintelligence, Nov-9-2022, 06:05:06 GMT

Machine learning is frequently leveraged to tackle problems in the health sector, including clinical decision support. Its use has historically been focused on single-modal data. Attempts to improve prediction and mimic the multimodal nature of clinical expert decision-making have been met in the biomedical field of machine learning by fusing disparate data. This review was conducted to summarize the current studies in this field and identify topics ripe for future research. We conducted this review in accordance with the PRISMA extension for Scoping Reviews to characterize multi-modal data fusion in health. Search strings were established and used in three databases (PubMed, Google Scholar, and IEEE Xplore) covering 2011 to 2021. A final set of 128 articles was included in the analysis. The most common health areas utilizing multi-modal methods were neurology and oncology. Early fusion was the most common data merging strategy. Notably, there was an improvement in predictive performance when using data fusion. Lacking from the papers were clear clinical deployment strategies, FDA approval, and analysis of how using multimodal approaches from diverse sub-populations may mitigate biases and healthcare disparities. These findings provide a summary of multimodal data fusion as applied to health diagnosis/prognosis problems. Few papers compared the outputs of a multimodal approach with a unimodal prediction. However, those that did achieved an average increase of 6.4% in predictive accuracy. Multi-modal machine learning, while more robust in its estimations than unimodal methods, has drawbacks in its scalability and the time-consuming nature of information concatenation.
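
To make the fusion terminology in this abstract concrete, the sketch below contrasts early fusion (concatenating per-modality features into one input vector for a single model) with late fusion (combining the predictions of separate per-modality models). It is illustrative only. The feature values, the weights, and the helper names are hypothetical and are not drawn from any of the reviewed studies.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(features, weights, bias=0.0):
    """Toy logistic model: weighted sum of the features passed through a sigmoid."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)


def early_fusion(modalities):
    """Concatenate the feature vectors of all modalities into one vector."""
    fused = []
    for features in modalities:
        fused.extend(features)
    return fused


def late_fusion(predictions):
    """Average the probabilities produced by separate per-modality models."""
    return sum(predictions) / len(predictions)


# Hypothetical per-patient features from two modalities.
imaging = [0.2, 0.7]
labs = [1.5]

# Early fusion: one model sees the concatenated vector.
fused = early_fusion([imaging, labs])
p_early = linear_score(fused, weights=[0.5, -0.3, 0.8])

# Late fusion: one model per modality, predictions combined afterwards.
p_late = late_fusion([
    linear_score(imaging, weights=[0.5, -0.3]),
    linear_score(labs, weights=[0.8]),
])
print(fused, p_early, p_late)
```

The concatenation in `early_fusion` is the "information concatenation" step whose cost the abstract flags as a scalability drawback: every modality has to be aligned per patient before a single model can be trained on the fused vector.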

multimodal machine, npj digital medicine, precision health, (3 more...)

#artificialintelligence

Country: North America > United States (0.99)

Genre: Overview (1.00)

Industry:

Health & Medicine > Public Health (0.99)
Health & Medicine > Government Relations & Public Policy (0.99)
Government > Regional Government > North America Government > United States Government > FDA (0.99)
Health & Medicine > Therapeutic Area (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Applications of Multimodal Learning part 2 (Artificial Intelligence)

#artificialintelligence, Oct-22-2022, 05:20:06 GMT

Abstract: We are perceiving and communicating with the world in a multisensory manner, where different information sources are sophisticatedly processed and interpreted by separate parts of the human brain to constitute a complex, yet harmonious and unified sensing system. To endow the machines with true intelligence, the multimodal machine learning that incorporates data from various modalities has become an increasingly popular research area with emerging technical advances in recent years. In this paper, we present a survey on multimodal machine learning from a novel perspective considering not only the purely technical aspects but also the nature of different data modalities. We analyze the commonness and uniqueness of each data format ranging from vision, audio, text and others, and then present the technical development categorized by the combination of Vision X, where the vision data play a fundamental role in most multimodal learning works. We investigate the existing literature on multimodal learning from both the representation learning and downstream application levels, and provide an additional comparison in the light of their technical connections with the data nature, e.g., the semantic consistency between image objects and textual descriptions, or the rhythm correspondence between video dance moves and musical beats.

modality, multimodal learning part2, representation, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)


TikTok goes on AI music making and machine learning specialist hiring spree - Music Business Worldwide

#artificialintelligence, Aug-5-2022, 21:37:41 GMT

The AI-powered music-making app business is hotting up. In May, music-making platform Splice, reported to be valued at nearly $500 million, launched an artificial intelligence-powered music app called CoSo, which uses what Splice calls its 'Complementary Sounds' AI technology to create music "in split-seconds". Bandlab, meanwhile, the social music-making platform that recently raised $65 million, has an AI-powered app called SongStarter that, it claims, can "generate royalty-free music in seconds". Could TikTok and parent ByteDance be one of the sector's next major players? Back in July 2019, ByteDance acquired Jukedeck, a UK-based AI Music startup that specialized in creating royalty-free music for user-generated online videos.

cutting-edge research and development, research and development, specialist, (12 more...)

#artificialintelligence

Country:

Europe > United Kingdom (0.25)
North America > United States > California > Santa Clara County > Mountain View (0.05)
Asia > China > Beijing > Beijing (0.05)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
