AITopics

2410.14738

Country:

North America > United States > Georgia > Columbia County > Evans (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre:

Research Report > Experimental Study (0.91)
Research Report > New Finding (0.89)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Manivannan, Mithun, Nethrapalli, Vignesh, Cartwright, Mark

EmotionCaps: Enhancing Audio Captioning Through Emotion-Augmented Data Generation

arXiv.org Artificial Intelligence, Oct-15-2024

Recent progress in audio-language modeling, such as automated audio captioning, has benefited from training on synthetic data generated with the aid of large-language models. However, such approaches for environmental sound captioning have primarily focused on audio event tags and have not explored leveraging emotional information that may be present in recordings. In this work, we explore the benefit of generating emotion-augmented synthetic audio caption data by instructing ChatGPT with additional acoustic information in the form of estimated soundscape emotion. To do so, we introduce EmotionCaps, an audio captioning dataset comprised of approximately 120,000 audio clips with paired synthetic descriptions enriched with soundscape emotion recognition (SER) information. We hypothesize that this additional information will result in higher-quality captions that match the emotional tone of the audio recording, which will, in turn, improve the performance of captioning models trained with this data. We test this hypothesis through both objective and subjective evaluation, comparing models trained with the EmotionCaps dataset to multiple baseline models. Our findings challenge current approaches to captioning and suggest new directions for developing and assessing captioning models.

caption, large language model, machine learning, (22 more...)

2410.12028

Country:

North America > United States > New Jersey (0.05)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment (0.48)
Media > Music (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

arXiv.org Artificial Intelligence, Oct-14-2024

Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature

Vrba, Jan, Steinbach, Jakub, Jirsa, Tomáš, Verde, Laura, De Fazio, Roberta, Homma, Noriyasu, Zeng, Yuwen, Ichiji, Key, Hájek, Lukáš, Sedláková, Zuzana, Mareš, Jan

In this study, we propose a robust set of features derived from a thorough review of contemporary practices in voice pathology detection. The feature set is based on a combination of handcrafted acoustic features. Additionally, we introduce pitch difference as a novel feature. We combine this feature set, containing data from the publicly available Saarbrücken Voice Database (SVD), with preprocessing using the K-Means Synthetic Minority Over-Sampling Technique algorithm to address class imbalance. Moreover, we applied multiple ML models as binary classifiers: support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost. To determine the best classification approach, we performed a grid search over feasible hyperparameters of the respective classifiers and over subsets of features. Our approach achieved state-of-the-art performance, measured by unweighted average recall, in voice pathology detection on the SVD database. We intentionally omit accuracy, as it is a highly biased metric in the case of unbalanced data compared to the aforementioned metrics. The results are further strengthened by eliminating potential overestimation through repeated stratified cross-validation. This advancement demonstrates significant potential for the clinical deployment of ML methods, offering a valuable tool for an objective examination of voice pathologies. To support our claims, we provide a publicly available GitHub repository with DOI 10.5281/zenodo.13771573. Finally, we provide a REFORMS checklist.
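The abstract's preference for unweighted average recall (UAR) over accuracy can be illustrated with a minimal sketch. The label counts and the always-majority classifier below are hypothetical and are not taken from the paper or from SVD; the function names are likewise illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls, with every class weighted equally."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / support[c] for c in support) / len(support)


# Hypothetical imbalanced labels: 90 samples of class 1, 10 of class 0.
y_true = [1] * 90 + [0] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = [1] * 100

acc = accuracy(y_true, y_pred)
uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy={acc:.2f}, UAR={uar:.2f}")
```

In this toy case accuracy rewards the classifier for ignoring the minority class entirely, while UAR falls to chance level for two classes, which is the kind of bias the abstract refers to.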

artificial intelligence, deep learning, machine learning, (17 more...)

2410.10537

Country:

Europe > Germany > Saarland > Saarbrücken (0.14)
Europe > Czechia > Prague (0.04)
North America > United States > Massachusetts (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

arXiv.org Artificial Intelligence, Oct-14-2024

Intramuscular High-Density Micro-Electrode Arrays Enable High-Precision Decoding and Mapping of Spinal Motor Neurons to Reveal Hand Control

Grison, Agnese, Pereda, Jaime Ibanez, Muceli, Silvia, Kundu, Aritra, Baracat, Farah, Indiveri, Giacomo, Donati, Elisa, Farina, Dario

Decoding nervous system activity is a key challenge in neuroscience and neural interfacing. In this study, we propose a novel neural decoding system that enables unprecedented large-scale sampling of muscle activity. Using micro-electrode arrays with more than 100 channels embedded within the forearm muscles, we recorded high-density signals that captured multi-unit motor neuron activity. This extensive sampling was complemented by advanced methods for neural decomposition, analysis, and classification, allowing us to accurately detect and interpret the spiking activity of spinal motor neurons that innervate hand muscles. We evaluated this system in two healthy participants, each implanted with three electromyogram (EMG) micro-electrode arrays (comprising 40 electrodes each) in the forearm. These arrays recorded muscle activity during both single- and multi-digit isometric contractions. For the first time under controlled conditions, we demonstrate that multi-digit tasks elicit unique patterns of motor neuron recruitment specific to each task, rather than employing combinations of recruitment patterns from single-digit tasks. This observation led us to hypothesize that hand tasks could be classified with high precision based on the decoded neural activity. We achieved perfect classification accuracy (100%) across 12 distinct single- and multi-digit tasks, and consistently high accuracy (>96%) across all conditions and subjects, for up to 16 task classes. These results significantly outperformed conventional EMG classification methods. The exceptional performance of this system paves the way for developing advanced neural interfaces based on invasive high-density EMG technology. This innovation could greatly enhance human-computer interaction and lead to substantial improvements in assistive technologies, offering new possibilities for restoring motor function in clinical applications.

artificial intelligence, machine learning, motor unit, (19 more...)

2410.11016

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

arXiv.org Artificial Intelligence, Oct-13-2024

Improving Academic Skills Assessment with NLP and Ensemble Learning

Huang, Xinyi, Wu, Yingyi, Zhang, Danyang, Hu, Jiacheng, Long, Yujian

This study addresses the critical challenges of assessing foundational academic skills by leveraging advancements in natural language processing (NLP). Traditional assessment methods often struggle to provide timely and comprehensive feedback on key cognitive and linguistic aspects, such as coherence, syntax, and analytical reasoning. Our approach integrates multiple state-of-the-art NLP models, including BERT, RoBERTa, BART, DeBERTa, and T5, within an ensemble learning framework. These models are combined through stacking techniques using LightGBM and Ridge regression to enhance predictive accuracy. The methodology involves detailed data preprocessing, feature extraction, and pseudo-label learning to optimize model performance. By incorporating sophisticated NLP techniques and ensemble learning, this study significantly improves the accuracy and efficiency of assessments, offering a robust solution that surpasses traditional methods and opens new avenues for educational technology research focused on enhancing core academic competencies.

arxiv preprint arxiv, assessment, classification, (13 more...)

2409.19013

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > Texas > Collin County > Frisco (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > China > Yunnan Province > Kunming (0.04)

Genre: Research Report (0.50)

Industry: Education (0.35)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
(2 more...)

Neural Information Processing Systems, Oct-11-2024, 15:45:56 GMT

Weston-Watkins Hinge Loss and Ordered Partitions

Multiclass extensions of the support vector machine (SVM) have been formulated in a variety of ways. A recent empirical comparison of nine such formulations [Doǧan et al. 2016] recommends the variant proposed by Weston and Watkins (WW), despite the fact that the WW-hinge loss is not calibrated with respect to the 0-1 loss. In this work we introduce a novel discrete loss function for multiclass classification, the ordered partition loss, and prove that the WW-hinge loss is calibrated with respect to this loss. We also argue that the ordered partition loss is maximally informative among discrete losses satisfying this property. Finally, we apply our theory to justify the empirical observation made by Doǧan et al. that the WW-SVM can work well even under massive label noise, a challenging setting for multiclass SVMs.
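For readers unfamiliar with the WW-hinge loss mentioned above, the following is a minimal sketch of its standard form: for a score vector and true class y, it sums max(0, 1 - (s_y - s_j)) over every other class j. The function name and the example scores are illustrative assumptions, not taken from the paper, and the ordered partition loss itself is not reproduced here.

```python
def ww_hinge_loss(scores, label):
    """Weston-Watkins hinge loss for one sample.

    Sums, over every wrong class j, the amount by which the margin
    between the true-class score and score j falls short of 1.
    """
    s_y = scores[label]
    return sum(
        max(0.0, 1.0 - (s_y - s_j))
        for j, s_j in enumerate(scores)
        if j != label
    )


# Illustrative score vectors for a 3-class problem, true class 0.
loss_confident = ww_hinge_loss([3.0, 0.5, -1.0], label=0)  # all margins >= 1
loss_marginal = ww_hinge_loss([1.0, 0.5, -1.0], label=0)   # one margin is 0.5
loss_wrong = ww_hinge_loss([0.0, 2.0, 1.0], label=0)       # true class ranked last
print(loss_confident, loss_marginal, loss_wrong)
```

Unlike formulations that penalize only the single largest violation, this loss accumulates a penalty from each competing class, so it depends on how the true class ranks against all the others.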

hinge loss and ordered partition, partition loss, weston-watkin hinge loss, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.67)

Neural Information Processing Systems, Oct-11-2024, 15:20:34 GMT

On kernel-based statistical learning theory in the mean field limit

In many applications of machine learning, a large number of variables are considered. Motivated by machine learning of interacting particle systems, we consider the situation when the number of input variables goes to infinity. First, we continue the recent investigation of the mean field limit of kernels and their reproducing kernel Hilbert spaces, completing the existing theory. Next, we provide results relevant for approximation with such kernels in the mean field limit, including a representer theorem. Finally, we use these kernels in the context of statistical learning in the mean field limit, focusing on Support Vector Machines.

kernel-based statistical learning theory, machine learning, mean field limit, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.64)

Neural Information Processing Systems, Oct-11-2024, 14:53:27 GMT

Statistical Topological Data Analysis - A Kernel Perspective

We consider the problem of statistical computations with persistence diagrams, a summary representation of topological features in data. These diagrams encode persistent homology, a widely used invariant in topological data analysis. While several avenues towards a statistical treatment of the diagrams have been explored recently, we follow an alternative route that is motivated by the success of methods based on the embedding of probability measures into reproducing kernel Hilbert spaces. In fact, a positive definite kernel on persistence diagrams has recently been proposed, connecting persistent homology to popular kernel-based learning techniques such as support vector machines. However, important properties of that kernel which would enable a principled use in the context of probability measure embeddings remain to be explored.

kernel perspective, persistent homology, statistical topological data analysis, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.65)

arXiv.org Artificial Intelligence, Oct-11-2024

Fragile Giants: Understanding the Susceptibility of Models to Subpopulation Attacks

Gupta, Isha, Lycklama, Hidde, Opel, Emanuel, Rose, Evan, Hithnawi, Anwar

As machine learning models become increasingly complex, concerns about their robustness and trustworthiness have become more pressing. A critical vulnerability of these models is data poisoning attacks, where adversaries deliberately alter training data to degrade model performance. One particularly stealthy form of these attacks is subpopulation poisoning, which targets distinct subgroups within a dataset while leaving overall performance largely intact. The ability of these attacks to generalize within subpopulations poses a significant risk in real-world settings, as they can be exploited to harm marginalized or underrepresented groups within the dataset. In this work, we investigate how model complexity influences susceptibility to subpopulation poisoning attacks. We introduce a theoretical framework that explains how overparameterized models, due to their large capacity, can inadvertently memorize and misclassify targeted subpopulations. To validate our theory, we conduct extensive experiments on large-scale image and text datasets using popular model architectures. Our results show a clear trend: models with more parameters are significantly more vulnerable to subpopulation poisoning. Moreover, we find that attacks on smaller, human-interpretable subgroups often go undetected by these models. These results highlight the need to develop defenses that specifically address subpopulation vulnerabilities.

artificial intelligence, machine learning, subpopulation, (16 more...)

2410.08872

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Neural Information Processing Systems, Oct-10-2024, 17:28:34 GMT

Learning from Few Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales

Motivated by the problem of learning with small sample sizes, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful. Particularly important is the ability to incorporate domain knowledge of invariances, e.g., translational invariance of images. Kernels based on the maximum similarity over a group of transformations are not generally positive definite. Perhaps it is for this reason that they have not been studied theoretically. We address this lacuna and show that positive definiteness indeed holds with high probability for kernels based on the maximum similarity in the small training sample set regime of interest, and that they do yield the best results in that regime.
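A kernel based on the maximum similarity over a group of transformations can be sketched as follows. This is a toy illustration only: it assumes 1-D signals, cyclic shifts as the transformation group, and a plain dot product as the base similarity, none of which is claimed to match the paper's construction. All names are hypothetical.

```python
def dot(x, y):
    """Plain inner product used as the base similarity."""
    return sum(a * b for a, b in zip(x, y))


def cyclic_shifts(x):
    """All cyclic shifts of a 1-D signal (the assumed transformation group)."""
    return [x[i:] + x[:i] for i in range(len(x))]


def max_shift_kernel(x, y):
    """Maximum base similarity between x and any cyclic shift of y."""
    return max(dot(x, shifted) for shifted in cyclic_shifts(y))


# Toy signals: the second is a shifted copy of the first.
signals = [[1, 0, 0, 0], [0, 0, 1, 0], [1, 1, 0, 0]]
gram = [[max_shift_kernel(a, b) for b in signals] for a in signals]
for row in gram:
    print(row)
```

The first two signals have a plain dot product of 0 but a shift-maximized similarity of 1, which is the invariance being encoded. As the abstract notes, a Gram matrix built this way is not guaranteed to be positive semidefinite in general, so that property would need to be checked separately before using it in an SVM solver.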

composition and locality, multiple scale, transformation-invariant svm, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.43)