AITopics | Berisha, Visar

Collaborating Authors

Berisha, Visar

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Potential Applications of Artificial Intelligence for Cross-language Intelligibility Assessment of Dysarthric Speech

Yeo, Eunjung, Liss, Julie, Berisha, Visar, Mortensen, David

arXiv.org Artificial IntelligenceFeb-4-2025

Purpose: This commentary introduces how artificial intelligence (AI) can be leveraged to advance cross-language intelligibility assessment of dysarthric speech. Method: We propose a conceptual framework consisting of a universal model that captures language-universal speech impairments and a language-specific intelligibility model that incorporates linguistic nuances. Additionally, we identify key barriers to cross-language intelligibility assessment, including data scarcity, annotation complexity, and limited linguistic insights, and present AI-driven solutions to overcome these challenges. Conclusion: Advances in AI offer transformative opportunities to enhance cross-language intelligibility assessment for dysarthric speech by balancing scalability across languages and adaptability by languages.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2501.15858

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Add feedback

Automated Extraction of Spatio-Semantic Graphs for Identifying Cognitive Impairment

Ng, Si-Ioi, Ambadi, Pranav S., Mueller, Kimberly D., Liss, Julie, Berisha, Visar

arXiv.org Artificial IntelligenceFeb-2-2025

Existing methods for analyzing linguistic content from picture descriptions for assessment of cognitive-linguistic impairment often overlook the participant's visual narrative path, which typically requires eye tracking to assess. Spatio-semantic graphs are a useful tool for analyzing this narrative path from transcripts alone, however they are limited by the need for manual tagging of content information units (CIUs). In this paper, we propose an automated approach for estimation of spatio-semantic graphs (via automated extraction of CIUs) from the Cookie Theft picture commonly used in cognitive-linguistic analyses. The method enables the automatic characterization of the visual semantic path during picture description. Experiments demonstrate that the automatic spatio-semantic graphs effectively differentiate between cognitively impaired and unimpaired speakers. Statistical analyses reveal that the features derived by the automated method produce comparable results to the manual method, with even greater group differences between clinical groups of interest. These results highlight the potential of the automated approach for extracting spatio-semantic features in developing clinical speech models for cognitive impairment assessment.

artificial intelligence, natural language, text processing, (16 more...)

arXiv.org Artificial Intelligence

2502.01685

Country: North America > United States > Wisconsin (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.78)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

Advanced Tutorial: Label-Efficient Two-Sample Tests

Li, Weizhi, Berisha, Visar, Dasarathy, Gautam

arXiv.org Artificial IntelligenceJan-7-2025

Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical distributions. This test is widely used, such as by clinical researchers comparing treatment effectiveness. This tutorial explores two-sample testing in a context where an analyst has many features from two samples, but determining the sample membership (or labels) of these features is costly. In machine learning, a similar scenario is studied in active learning. This tutorial extends active learning concepts to two-sample testing within this \textit{label-costly} setting while maintaining statistical validity and high testing power. Additionally, the tutorial discusses practical applications of these label-efficient two-sample tests.

artificial intelligence, machine learning, two-sample test, (18 more...)

arXiv.org Artificial Intelligence

2501.03568

Country: North America > United States > Arizona (0.14)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Tutorial on Clinical Speech AI Development: From Data Collection to Model Validation

Ng, Si-Ioi, Xu, Lingfeng, Siegert, Ingo, Cummins, Nicholas, Benway, Nina R., Liss, Julie, Berisha, Visar

arXiv.org Artificial IntelligenceOct-28-2024

There has been a surge of interest in leveraging speech as a marker of health for a wide spectrum of conditions. The underlying premise is that any neurological, mental, or physical deficits that impact speech production can be objectively assessed via automated analysis of speech. Recent advances in speech-based Artificial Intelligence (AI) models for diagnosing and tracking mental health, cognitive, and motor disorders often use supervised learning, similar to mainstream speech technologies like recognition and verification. However, clinical speech AI has distinct challenges, including the need for specific elicitation tasks, small available datasets, diverse speech representations, and uncertain diagnostic labels. As a result, application of the standard supervised learning paradigm may lead to models that perform well in controlled settings but fail to generalize in real-world clinical deployments. With translation into real-world clinical scenarios in mind, this tutorial paper provides an overview of the key components required for robust development of clinical speech AI. Specifically, this paper will cover the design of speech elicitation tasks and protocols most appropriate for different clinical conditions, collection of data and verification of hardware, development and validation of speech representations designed to measure clinical constructs of interest, development of reliable and robust clinical prediction models, and ethical and participant considerations for clinical speech AI. The goal is to provide comprehensive guidance on building models whose inputs and outputs link to the more interpretable and clinically meaningful aspects of speech, that can be interrogated and clinically validated on clinical datasets, and that adhere to ethical, privacy, and security considerations by design.

data mining, large language model, machine learning, (25 more...)

arXiv.org Artificial Intelligence

2410.2164

Country:

Europe (1.00)
North America > United States > Maryland (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (0.99)
Overview (0.86)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
(9 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(6 more...)

Add feedback

Unraveling overoptimism and publication bias in ML-driven science

Saidi, Pouria, Dasarathy, Gautam, Berisha, Visar

arXiv.org Artificial IntelligenceJul-11-2024

Machine Learning (ML) is increasingly used across many disciplines with impressive reported results. However, recent studies suggest published performance of ML models are often overoptimistic. Validity concerns are underscored by findings of an inverse relationship between sample size and reported accuracy in published ML models, contrasting with the theory of learning curves where accuracy should improve or remain stable with increasing sample size. This paper investigates factors contributing to overoptimism in ML-driven science, focusing on overfitting and publication bias. We introduce a novel stochastic model for observed accuracy, integrating parametric learning curves and the aforementioned biases. We construct an estimator that corrects for these biases in observed data. Theoretical and empirical results show that our framework can estimate the underlying learning curve, providing realistic performance assessments from published results. Applying the model to meta-analyses of classifications of neurological conditions, we estimate the inherent limits of ML-based prediction in each domain.

accuracy, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2405.14422

Country:

North America > United States (0.67)
Europe (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Learning Repeatable Speech Embeddings Using An Intra-class Correlation Regularizer

Zhang, Jianwei, Jayasuriya, Suren, Berisha, Visar

arXiv.org Artificial IntelligenceOct-25-2023

A good supervised embedding for a specific machine learning task is only sensitive to changes in the label of interest and is invariant to other confounding factors. We leverage the concept of repeatability from measurement theory to describe this property and propose to use the intra-class correlation coefficient (ICC) to evaluate the repeatability of embeddings. We then propose a novel regularizer, the ICC regularizer, as a complementary component for contrastive losses to guide deep neural networks to produce embeddings with higher repeatability. We use simulated data to explain why the ICC regularizer works better on minimizing the intra-class variance than the contrastive loss alone. We implement the ICC regularizer and apply it to three speech tasks: speaker verification, voice style conversion, and a clinical application for detecting dysphonic voice. The experimental results demonstrate that adding an ICC regularizer can improve the repeatability of learned embeddings compared to only using the contrastive loss; further, these embeddings lead to improved performance in these downstream tasks.

artificial intelligence, learning repeatable speech embedding, machine learning, (1 more...)

arXiv.org Artificial Intelligence

2310.17049

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)

Add feedback

Smoothly Giving up: Robustness for Simple Models

Sypherd, Tyler, Stromberg, Nathan, Nock, Richard, Berisha, Visar, Sankar, Lalitha

arXiv.org Artificial IntelligenceFeb-17-2023

There is a growing need for models that are interpretable and have reduced energy and computational cost (e.g., in health care analytics and federated learning). Examples of algorithms to train such models include logistic regression and boosting. However, one challenge facing these algorithms is that they provably suffer from label noise; this has been attributed to the joint interaction between oft-used convex loss functions and simpler hypothesis classes, resulting in too much emphasis being placed on outliers. In this work, we use the margin-based $\alpha$-loss, which continuously tunes between canonical convex and quasi-convex losses, to robustly train simple models. We show that the $\alpha$ hyperparameter smoothly introduces non-convexity and offers the benefit of "giving up" on noisy training examples. We also provide results on the Long-Servedio dataset for boosting and a COVID-19 survey dataset for logistic regression, highlighting the efficacy of our approach across multiple relevant domains.

artificial intelligence, dataset, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2302.09114

Country: North America > United States (1.00)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.54)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
Health & Medicine > Therapeutic Area > Immunology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Active Sequential Two-Sample Testing

Li, Weizhi, Ramamurthy, Karthikeyan Natesan, Kadambi, Prad, Saidi, Pouria, Dasarathy, Gautam, Berisha, Visar

arXiv.org Artificial IntelligenceFeb-1-2023

Two-sample testing tests whether the distributions generating two samples are identical. We pose the two-sample testing problem in a new scenario where the sample measurements (or sample features) are inexpensive to access, but their group memberships (or labels) are costly. We devise the first \emph{active sequential two-sample testing framework} that not only sequentially but also \emph{actively queries} sample labels to address the problem. Our test statistic is a likelihood ratio where one likelihood is found by maximization over all class priors, and the other is given by a classification model. The classification model is adaptively updated and then used to guide an active query scheme called bimodal query to label sample features in the regions with high dependency between the feature variables and the label variables. The theoretical contributions in the paper include proof that our framework produces an \emph{anytime-valid} $p$-value; and, under reachable conditions and a mild assumption, the framework asymptotically generates a minimum normalized log-likelihood ratio statistic that a passive query scheme can only achieve when the feature variable and the label variable have the highest dependence. Lastly, we provide a \emph{query-switching (QS)} algorithm to decide when to switch from passive query to active query and adapt bimodal query to increase the testing power of our test. Extensive experiments justify our theoretical contributions and the effectiveness of QS.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2301.12616

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection

Zhang, Jianwei, Liss, Julie, Jayasuriya, Suren, Berisha, Visar

arXiv.org Artificial IntelligenceJan-26-2023

Approximately 1.2% of the world's population has impaired voice production. As a result, automatic dysphonic voice detection has attracted considerable academic and clinical interest. However, existing methods for automated voice assessment often fail to generalize outside the training conditions or to other related applications. In this paper, we propose a deep learning framework for generating acoustic feature embeddings sensitive to vocal quality and robust across different corpora. A contrastive loss is combined with a classification loss to train our deep learning model jointly. Data warping methods are used on input voice samples to improve the robustness of our method. Empirical results demonstrate that our method not only achieves high in-corpus and cross-corpus classification accuracy but also generates good embeddings sensitive to voice quality and robust across different corpora. We also compare our results against three baseline methods on clean and three variations of deteriorated in-corpus and cross-corpus datasets and demonstrate that the proposed model consistently outperforms the baseline methods.

accuracy, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2211.09858

Country:

North America > United States (0.46)
Europe > Germany > Saarland (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Does human speech follow Benford's Law?

Hsu, Leo, Berisha, Visar

arXiv.org Artificial IntelligenceDec-21-2022

Researchers have observed that the frequencies of leading digits in many man-made and naturally occurring datasets follow a logarithmic curve, with digits that start with the number 1 accounting for $\sim 30\%$ of all numbers in the dataset and digits that start with the number 9 accounting for $\sim 5\%$ of all numbers in the dataset. This phenomenon, known as Benford's Law, is highly repeatable and appears in lists of numbers from electricity bills, stock prices, tax returns, house prices, death rates, lengths of rivers, and naturally occurring images. In this paper we demonstrate that human speech spectra also follow Benford's Law on average. That is, when averaged over many speakers, the frequencies of leading digits in speech magnitude spectra follow this distribution, although with some variability at the individual sample level. We use this observation to motivate a new set of features that can be efficiently extracted from speech and demonstrate that these features can be used to classify between human speech and synthetic speech.

artificial intelligence, benford, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2203.13352

Genre: Research Report (0.83)

Industry: Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback