Patil, Kaustubh R.
Region-wise stacking ensembles for estimating brain-age using MRI
Antonopoulos, Georgios, More, Shammi, Eickhoff, Simon B., Raimondo, Federico, Patil, Kaustubh R.
Predictive modeling using structural magnetic resonance imaging (MRI) data is a prominent approach to study brain aging. Machine learning algorithms and feature extraction methods have been employed to improve predictions and to explore healthy and accelerated aging, e.g., in neurodegenerative and psychiatric disorders. The high-dimensional MRI data pose challenges for building generalizable and interpretable models as well as for data privacy. Common practices are resampling or averaging voxels within predefined parcels, which reduces anatomical specificity and biological interpretability, as voxels within a region may relate differently to aging. Effectively, naive fusion by averaging can result in information loss and reduced accuracy. We present a conceptually novel two-level stacking ensemble (SE) approach. The first level comprises regional models predicting individuals' age from voxel-wise information; their outputs are fused by a second-level model that yields the final predictions. Eight data fusion scenarios were explored using as input gray matter volume (GMV) estimates from four datasets covering the adult lifespan. Performance, measured using mean absolute error (MAE), R2, correlation, and prediction bias, showed that the SE outperformed the region-wise averages. The best performance was obtained when first-level regional predictions were made out-of-sample on the application site, with second-level models trained on independent and site-specific data (MAE=4.75 vs. baseline regional mean GMV MAE=5.68). Performance improved as more datasets were used for training. First-level predictions showed an improved and more robust aging signal, providing new biological insights and enhanced data privacy. Overall, the SE improves accuracy compared to the baseline while preserving or enhancing data privacy.
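The two-level stacking idea can be sketched in a few lines of scikit-learn. The parcellation, the kernel-ridge first-level models, the linear second-level fuser, and the toy data below are illustrative assumptions, not the authors' exact configuration; the point is that out-of-sample first-level predictions serve as features for the second-level model.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import cross_val_predict, train_test_split

rng = np.random.default_rng(0)

# Toy data: 300 subjects, 5 "regions" of 50 voxels each (sizes are illustrative).
n_sub, n_regions, n_vox = 300, 5, 50
age = rng.uniform(20, 80, n_sub)
regions = [age[:, None] * rng.normal(1.0, 0.2, n_vox) + rng.normal(0, 5, (n_sub, n_vox))
           for _ in range(n_regions)]

train, test = train_test_split(np.arange(n_sub), test_size=0.3, random_state=0)

# Level 1: one model per region; stack out-of-sample predictions on the
# training set so the level-2 model never sees in-sample level-1 outputs.
level1_train = np.column_stack([
    cross_val_predict(KernelRidge(alpha=1.0), X[train], age[train], cv=5)
    for X in regions
])
level1_test = np.column_stack([
    KernelRidge(alpha=1.0).fit(X[train], age[train]).predict(X[test])
    for X in regions
])

# Level 2: fuse the regional age predictions into a final estimate.
fuser = LinearRegression().fit(level1_train, age[train])
print("stacked MAE:", mean_absolute_error(age[test], fuser.predict(level1_test)))
```

Generating the first-level training features with cross_val_predict keeps them out-of-sample, mirroring the out-of-sample scenario that performed best in the study.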
Impact of Leakage on Data Harmonization in Machine Learning Pipelines in Class Imbalance Across Sites
Nieto, Nicolás, Eickhoff, Simon B., Jung, Christian, Reuter, Martin, Diers, Kersten, Kelm, Malte, Lichtenberg, Artur, Raimondo, Federico, Patil, Kaustubh R.
Machine learning (ML) models benefit from large datasets. Collecting data in biomedical domains is costly and challenging, hence combining datasets has become a common practice. However, datasets obtained under different conditions can exhibit undesired site-specific variability. Data harmonization methods aim to remove site-specific variance while retaining biologically relevant information. This study evaluates the effectiveness of popularly used ComBat-based methods for harmonizing data in scenarios where the class balance is not equal across sites. We find that these methods struggle with data leakage issues. To overcome this problem, we propose a novel approach, "PrettYharmonize", designed to harmonize data by pretending the target labels. We validate our approach using controlled datasets designed to benchmark the utility of harmonization. Finally, using real-world MRI and clinical data, we compare leakage-prone methods with "PrettYharmonize" and show that it achieves comparable performance while avoiding data leakage, particularly in site-target-dependence scenarios.
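A minimal sketch of the leakage concern, under stated assumptions: per-site mean removal stands in for ComBat purely for illustration, and the synthetic data, class imbalance across sites, and classifier are placeholders. The key point is that harmonization parameters must be estimated on training folds only, not on the pooled data or with knowledge of the test labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(1)

# Toy multi-site data: site 0 mostly controls, site 1 mostly patients
# (class imbalance across sites, the scenario discussed above).
n = 400
site = rng.integers(0, 2, n)
y = (rng.random(n) < np.where(site == 1, 0.8, 0.2)).astype(int)
X = rng.normal(0, 1, (n, 20)) + y[:, None] * 0.3 + site[:, None] * 1.0

def remove_site_means(X_tr, site_tr, X_te, site_te):
    """Simplified site correction: subtract per-site means learned on train only."""
    X_tr, X_te = X_tr.copy(), X_te.copy()
    for s in np.unique(site_tr):
        mu = X_tr[site_tr == s].mean(axis=0)
        X_tr[site_tr == s] -= mu
        X_te[site_te == s] -= mu
    return X_tr, X_te

aucs = []
for tr, te in StratifiedKFold(5, shuffle=True, random_state=0).split(X, y):
    X_tr, X_te = remove_site_means(X[tr], site[tr], X[te], site[te])
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y[tr])
    aucs.append(roc_auc_score(y[te], clf.predict_proba(X_te)[:, 1]))
print("leakage-free AUC:", np.mean(aucs))
```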
The impact of MRI image quality on statistical and predictive analysis of voxel-based morphometry
Hoffstaedter, Felix, Nieto, Nicolás, Eickhoff, Simon B., Patil, Kaustubh R.
The image quality of MRI brain scans is strongly influenced by within-scanner head movement, and the resulting image artifacts alter derived measures such as brain volume and cortical thickness. Automated image quality assessment is key to controlling for the confounding effects of poor image quality. In this study, we systematically test the influence of image quality on univariate statistics and machine learning classification. We analyzed group effects of sex/gender on local brain volume and predicted sex/gender using logistic regression, in both cases correcting for brain size. From three large publicly available datasets, two age- and sex-balanced samples were derived to test the generalizability of the effect for pooled sample sizes of n=760 and n=1094. Results of the Bonferroni-corrected t-tests over 3747 gray matter features showed a strong influence of low-quality data on the ability to find significant sex/gender differences in the smaller sample. Increasing the sample size and, even more so, the image quality led to a stark increase in detected significant effects in univariate group comparisons. For the classification of sex/gender using logistic regression, increasing sample size and image quality both had only a marginal effect on the area under the receiver operating characteristic curve for most datasets and subsamples. Our results suggest that univariate approaches require more stringent quality control than multivariate classification: classical group statistics benefit more from higher image quality, whereas machine learning applications in neuroimaging benefit more from larger sample sizes.
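The two analysis levels can be sketched as follows, using synthetic data with the abstract's feature count (3747) and smaller sample size (n=760) as placeholders; quality stratification and brain-size correction are omitted, so this only illustrates the contrast between Bonferroni-corrected mass-univariate testing and cross-validated classification.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)

# Toy stand-in for parcel-wise gray matter volume: n subjects x p features,
# with a weak sex/gender effect in a subset of features (illustrative only).
n, p = 760, 3747
sex = rng.integers(0, 2, n)
gmv = rng.normal(0, 1, (n, p))
gmv[:, :200] += sex[:, None] * 0.2  # small group effect in 200 features

# Univariate: two-sample t-tests with Bonferroni correction over all features.
_, pvals = ttest_ind(gmv[sex == 0], gmv[sex == 1], axis=0)
print("significant features (Bonferroni):", np.sum(pvals < 0.05 / p))

# Multivariate: cross-validated logistic regression, scored with ROC AUC.
auc = cross_val_score(LogisticRegression(max_iter=1000), gmv, sex,
                      cv=5, scoring="roc_auc").mean()
print("classification AUC:", round(auc, 3))
```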
Large language models surpass human experts in predicting neuroscience results
Luo, Xiaoliang, Rechardt, Akilles, Sun, Guangzhi, Nejad, Kevin K., Yáñez, Felipe, Yilmaz, Bati, Lee, Kangjoo, Cohen, Alexandra O., Borghesani, Valentina, Pashkov, Anton, Marinazzo, Daniele, Nicholas, Jonathan, Salatiello, Alessandro, Sucholutsky, Ilia, Minervini, Pasquale, Razavi, Sepehr, Rocca, Roberta, Yusifov, Elkhan, Okalova, Tereza, Gu, Nianlong, Ferianc, Martin, Khona, Mikail, Patil, Kaustubh R., Lee, Pui-Shee, Mata, Rui, Myers, Nicholas E., Bizley, Jennifer K, Musslick, Sebastian, Bilgin, Isil Poyraz, Niso, Guiomar, Ales, Justin M., Gaebler, Michael, Murty, N Apurva Ratan, Loued-Khenissi, Leyla, Behler, Anna, Hall, Chloe M., Dafflon, Jessica, Bao, Sherry Dongqi, Love, Bradley C.
Scientific discoveries often hinge on synthesizing decades of research, a task that potentially outstrips human information processing capacities. Large language models (LLMs) offer a solution. LLMs trained on the vast scientific literature could potentially integrate noisy yet interrelated findings to forecast novel results better than human experts. To evaluate this possibility, we created BrainBench, a forward-looking benchmark for predicting neuroscience results. We find that LLMs surpass experts in predicting experimental outcomes. BrainGPT, an LLM we tuned on the neuroscience literature, performed better yet. Like human experts, when LLMs were confident in their predictions, they were more likely to be correct, which presages a future where humans and LLMs team together to make discoveries. Our approach is not neuroscience-specific and is transferable to other knowledge-intensive endeavors.
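One common way to score a two-alternative benchmark of this kind is to compare the perplexity a language model assigns to the original versus an altered version of a reported result. The sketch below assumes a Hugging Face causal LM, with gpt2 as a stand-in (the study tuned its own model), and fabricated example sentences; it is not necessarily the benchmark's exact protocol.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; the study tuned its own model (BrainGPT) on neuroscience text.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    """Perplexity of a passage under the language model."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token negative log-likelihood
    return float(torch.exp(loss))

# Toy item: two versions of a "result", differing only in the outcome.
original = "Stimulation of the region improved memory recall in the task."
altered = "Stimulation of the region impaired memory recall in the task."

# The item counts as correct if the model assigns lower perplexity
# (higher likelihood) to the version reporting the actual result.
print("prefers original:", perplexity(original) < perplexity(altered))
```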
Empirical Comparison between Cross-Validation and Mutation-Validation in Model Selection
Yu, Jinyang, Hamdan, Sami, Sasse, Leonard, Morrison, Abigail, Patil, Kaustubh R.
Mutation validation (MV) is a recently proposed approach for model selection, garnering significant interest due to its unique characteristics and potential benefits compared to the widely used cross-validation (CV) method. In this study, we empirically compared MV and k-fold CV using benchmark and real-world datasets. By employing Bayesian tests, we compared generalization estimates, yielding three posterior probabilities: practical equivalence, CV superiority, and MV superiority. We also evaluated differences in the capacity of the selected models and in computational efficiency. We found that both MV and CV select models with practically equivalent generalization performance across various machine learning algorithms and the majority of benchmark datasets. MV exhibited advantages in terms of selecting simpler models and lower computational cost. However, in some cases MV selected overly simplistic models, leading to underfitting, and showed instability in hyperparameter selection. These limitations of MV became more evident in the evaluation of a real-world neuroscientific task of predicting sex at birth using brain functional connectivity.
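For reference, the CV side of the comparison is standard k-fold model selection; the sketch below uses an assumed SVC candidate grid on synthetic data (not the study's algorithms or datasets), and MV would substitute a different selection criterion at the marked step.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic benchmark-style task; the study's real evaluation also covered
# a neuroscientific sex-classification task from functional connectivity.
X, y = make_classification(n_samples=500, n_features=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Model selection via k-fold CV: the candidate with the best mean held-out
# score is chosen.  MV would replace this selection criterion.
grid = {"C": [0.01, 0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(SVC(), grid, cv=5).fit(X_tr, y_tr)
print("selected:", search.best_params_)
print("test accuracy:", round(search.score(X_te, y_te), 3))
```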
On Leakage in Machine Learning Pipelines
Sasse, Leonard, Nicolaisen-Sobesky, Eliana, Dukart, Juergen, Eickhoff, Simon B., Götz, Michael, Hamdan, Sami, Komeyer, Vera, Kulkarni, Abhijit, Lahnakoski, Juha, Love, Bradley C., Raimondo, Federico, Patil, Kaustubh R.
Machine learning (ML) provides powerful tools for predictive modeling. ML's popularity stems from the promise of sample-level prediction, with applications across a variety of fields, from physics and marketing to healthcare. However, if not properly implemented and evaluated, ML pipelines may contain leakage, typically resulting in overoptimistic performance estimates and failure to generalize to new data. This can have severe negative financial and societal implications. Our aim is to expand understanding of the causes of leakage when designing, implementing, and evaluating ML pipelines. Illustrated by concrete examples, we provide a comprehensive overview and discussion of the various types of leakage that may arise in ML pipelines.
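A compact illustration of one such leakage type, feature selection fitted outside the cross-validation loop, using pure-noise data so any apparent skill can only come from leakage; the data and pipeline are illustrative and not drawn from the paper.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
# Pure-noise data: features carry no information about the labels.
X, y = rng.normal(size=(100, 5000)), rng.integers(0, 2, 100)

# Leaky: feature selection sees the test folds before cross-validation.
X_sel = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky = cross_val_score(LogisticRegression(max_iter=1000), X_sel, y, cv=5).mean()

# Correct: selection is refit inside each training fold via a pipeline.
pipe = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000))
clean = cross_val_score(pipe, X, y, cv=5).mean()

print(f"leaky accuracy:  {leaky:.2f}")   # typically well above chance
print(f"proper accuracy: {clean:.2f}")   # near 0.5, as it should be
```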
Julearn: an easy-to-use library for leakage-free evaluation and inspection of ML models
Hamdan, Sami, More, Shammi, Sasse, Leonard, Komeyer, Vera, Patil, Kaustubh R., Raimondo, Federico
The fast-paced development of machine learning (ML) methods, coupled with its increasing adoption in research, poses challenges for researchers without extensive training in ML. In neuroscience, for example, ML can help understand brain-behavior relationships, diagnose diseases, and develop biomarkers using various data sources such as magnetic resonance imaging and electroencephalography. The primary objective of ML is to build models that can make accurate predictions on unseen data. Researchers aim to prove the existence of such generalizable models by evaluating performance using techniques such as cross-validation (CV), which uses systematic subsampling to estimate generalization performance. Choosing a CV scheme and evaluating an ML pipeline can be challenging and, if done improperly, can lead to overestimated results and incorrect interpretations. We created julearn, an open-source Python library, that allows researchers to design and evaluate complex ML pipelines without falling into common pitfalls. In this manuscript, we present the rationale behind julearn's design and its core features, and we showcase three examples of previously published research projects that can be easily implemented using this novel library. Julearn aims to simplify entry into the ML world by providing an easy-to-use environment with built-in guards against some of the most common ML pitfalls. With its design, unique features, and simple interface, it is a useful Python-based library for research projects.
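A minimal usage sketch, assuming julearn's documented run_cross_validation interface (parameter names and accepted values can differ between releases) and a small synthetic DataFrame with made-up column names.

```python
import numpy as np
import pandas as pd
from julearn import run_cross_validation

rng = np.random.default_rng(4)
# Small synthetic tabular dataset; julearn expects features and target as
# columns of one DataFrame.
df = pd.DataFrame(rng.normal(size=(200, 10)),
                  columns=[f"feat_{i}" for i in range(10)])
df["target"] = (df["feat_0"] + rng.normal(0, 0.5, 200) > 0).astype(int)

# run_cross_validation assembles the CV scheme and model pipeline internally,
# guarding against common leakage (call shown as documented; details may vary
# across julearn versions).
scores = run_cross_validation(
    X=[f"feat_{i}" for i in range(10)], y="target", data=df,
    model="rf", problem_type="classification",
)
print(scores["test_score"].mean())
```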
Optimal Teaching for Limited-Capacity Human Learners
Patil, Kaustubh R., Zhu, Jerry, Kopeć, Łukasz, Love, Bradley C.
Basic decisions, such as judging a person as a friend or foe, involve categorizing novel stimuli. Recent work finds that people’s category judgments are guided by a small set of examples that are retrieved from memory at decision time. This limited and stochastic retrieval places limits on human performance for probabilistic classification decisions. In light of this capacity limitation, recent work finds that idealizing training items, such that the saliency of ambiguous cases is reduced, improves human performance on novel test items. One shortcoming of previous work in idealization is that category distributions were idealized in an ad hoc or heuristic fashion. In this contribution, we take a first principles approach to constructing idealized training sets. We apply a machine teaching procedure to a cognitive model that is either limited capacity (as humans are) or unlimited capacity (as most machine learning systems are). As predicted, we find that the machine teacher recommends idealized training sets. We also find that human learners perform best when training recommendations from the machine teacher are based on a limited-capacity model. As predicted, to the extent that the learning model used by the machine teacher conforms to the true nature of human learners, the recommendations of the machine teacher prove effective. Our results provide a normative basis (given capacity constraints) for idealization procedures and offer a novel selection procedure for models of human learning.
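A crude sketch of why idealization helps a limited-capacity learner: a toy exemplar model that retrieves only a few training items at decision time is scored after training on actual (probabilistic) versus idealized labels. The retrieval rule, stimulus distribution, and capacity limit are assumptions, and the paper's machine-teaching optimization over training sets is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy 1D categorization task: category membership is probabilistic, with
# maximal ambiguity near the boundary at x = 0.5.
def probabilistic_label(x):
    return (rng.random(x.shape) < np.clip(x, 0.05, 0.95)).astype(int)

def limited_capacity_learner(train_x, train_y, test_x, k=3):
    """Classify each test item from only k exemplars sampled from memory,
    a crude stand-in for limited, stochastic retrieval at decision time."""
    preds = []
    for x in test_x:
        idx = rng.choice(len(train_x), size=k, replace=False)
        w = 1.0 / (np.abs(train_x[idx] - x) + 1e-3)  # proximity-weighted vote
        preds.append(int(np.average(train_y[idx], weights=w) > 0.5))
    return np.array(preds)

train_x = rng.random(40)
actual_y = probabilistic_label(train_x)      # includes ambiguous cases
idealized_y = (train_x > 0.5).astype(int)    # idealized: ambiguity removed

test_x = rng.random(2000)
test_y = (test_x > 0.5).astype(int)

for name, labels in [("actual", actual_y), ("idealized", idealized_y)]:
    acc = np.mean(limited_capacity_learner(train_x, labels, test_x) == test_y)
    print(f"{name} training labels: accuracy {acc:.2f}")
```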