Goto

Collaborating Authors

 Accuracy


Text Injection for Capitalization and Turn-Taking Prediction in Speech Models

arXiv.org Artificial Intelligence

Text injection for automatic speech recognition (ASR), wherein unpaired text-only data is used to supplement paired audio-text data, has shown promising improvements for word error rate. This study examines the use of text injection for auxiliary tasks, which are the non-ASR tasks often performed by an E2E model. In this work, we use joint end-to-end and internal language model training (JEIT) as our text injection algorithm to train an ASR model which performs two auxiliary tasks. The first is capitalization, which is a de-normalization task. The second is turn-taking prediction, which attempts to identify whether a user has completed their conversation turn in a digital assistant interaction. We show results demonstrating that our text injection method boosts capitalization performance for long-tail data, and improves turn-taking detection recall.


Age-Stratified Differences in Morphological Connectivity Patterns in ASD: An sMRI and Machine Learning Approach

arXiv.org Artificial Intelligence

Purpose: Age biases have been identified as an essential factor in the diagnosis of ASD. The objective of this study was to compare the effect of different age groups in classifying ASD using morphological features (MF) and morphological connectivity features (MCF). Methods: The structural magnetic resonance imaging (sMRI) data for the study was obtained from the two publicly available databases, ABIDE-I and ABIDE-II. We considered three age groups, 6 to 11, 11 to 18, and 6 to 18, for our analysis. The sMRI data was pre-processed using a standard pipeline and was then parcellated into 148 different regions according to the Destrieux atlas. The area, thickness, volume, and mean curvature information was then extracted for each region which was used to create a total of 592 MF and 10,878 MCF for each subject. Significant features were identified using a statistical t-test (p<0.05) which was then used to train a random forest (RF) classifier. Results: The results of our study suggested that the performance of the 6 to 11 age group was the highest, followed by the 6 to 18 and 11 to 18 ages in both MF and MCF. Overall, the MCF with RF in the 6 to 11 age group performed better in the classification than the other groups and produced an accuracy, F1 score, recall, and precision of 75.8%, 83.1%, 86%, and 80.4%, respectively. Conclusion: Our study thus demonstrates that morphological connectivity and age-related diagnostic model could be an effective approach to discriminating ASD.


Can we Agree? On the Rash\=omon Effect and the Reliability of Post-Hoc Explainable AI

arXiv.org Artificial Intelligence

The Rash\=omon effect poses challenges for deriving reliable knowledge from machine learning models. This study examined the influence of sample size on explanations from models in a Rash\=omon set using SHAP. Experiments on 5 public datasets showed that explanations gradually converged as the sample size increased. Explanations from <128 samples exhibited high variability, limiting reliable knowledge extraction. However, agreement between models improved with more data, allowing for consensus. Bagging ensembles often had higher agreement. The results provide guidance on sufficient data to trust explanations. Variability at low samples suggests that conclusions may be unreliable without validation. Further work is needed with more model types, data domains, and explanation methods. Testing convergence in neural networks and with model-specific explanation methods would be impactful. The approaches explored here point towards principled techniques for eliciting knowledge from ambiguous models.


Greedy online change point detection

arXiv.org Artificial Intelligence

Standard online change point detection (CPD) methods tend to have large false discovery rates as their detections are sensitive to outliers. To overcome this drawback, we propose Greedy Online Change Point Detection (GOCPD), a computationally appealing method which finds change points by maximizing the probability of the data coming from the (temporal) concatenation of two independent models. We show that, for time series with a single change point, this objective is unimodal and thus CPD can be accelerated via ternary search with logarithmic complexity. We demonstrate the effectiveness of GOCPD on synthetic data and validate our findings on real-world univariate and multivariate settings.


Anomaly Detection in Automated Fibre Placement: Learning with Data Limitations

arXiv.org Artificial Intelligence

Conventional defect detection systems in Automated Fibre Placement (AFP) typically rely on end-to-end supervised learning, necessitating a substantial number of labelled defective samples for effective training. However, the scarcity of such labelled data poses a challenge. To overcome this limitation, we present a comprehensive framework for defect detection and localization in Automated Fibre Placement. Our approach combines unsupervised deep learning and classical computer vision algorithms, eliminating the need for labelled data or manufacturing defect samples. It efficiently detects various surface issues while requiring fewer images of composite parts for training. Our framework employs an innovative sample extraction method leveraging AFP's inherent symmetry to expand the dataset. By inputting a depth map of the fibre layup surface, we extract local samples aligned with each composite strip (tow). These samples are processed through an autoencoder, trained on normal samples for precise reconstructions, highlighting anomalies through reconstruction errors. Aggregated values form an anomaly map for insightful visualization. The framework employs blob detection on this map to locate manufacturing defects. The experimental findings reveal that despite training the autoencoder with a limited number of images, our proposed method exhibits satisfactory detection accuracy and accurately identifies defect locations. Our framework demonstrates comparable performance to existing methods, while also offering the advantage of detecting all types of anomalies without relying on an extensive labelled dataset of defects.


Evaluating the Impact of Social Determinants on Health Prediction in the Intensive Care Unit

arXiv.org Artificial Intelligence

Social determinants of health (SDOH) -- the conditions in which people live, grow, and age -- play a crucial role in a person's health and well-being. There is a large, compelling body of evidence in population health studies showing that a wide range of SDOH is strongly correlated with health outcomes. Yet, a majority of the risk prediction models based on electronic health records (EHR) do not incorporate a comprehensive set of SDOH features as they are often noisy or simply unavailable. Our work links a publicly available EHR database, MIMIC-IV, to well-documented SDOH features. We investigate the impact of such features on common EHR prediction tasks across different patient populations. We find that community-level SDOH features do not improve model performance for a general patient population, but can improve data-limited model fairness for specific subpopulations. We also demonstrate that SDOH features are vital for conducting thorough audits of algorithmic biases beyond protective attributes. We hope the new integrated EHR-SDOH database will enable studies on the relationship between community health and individual outcomes and provide new benchmarks to study algorithmic biases beyond race, gender, and age.


3D Scene Graph Prediction on Point Clouds Using Knowledge Graphs

arXiv.org Artificial Intelligence

3D scene graph prediction is a task that aims to concurrently predict object classes and their relationships within a 3D environment. As these environments are primarily designed by and for humans, incorporating commonsense knowledge regarding objects and their relationships can significantly constrain and enhance the prediction of the scene graph. In this paper, we investigate the application of commonsense knowledge graphs for 3D scene graph prediction on point clouds of indoor scenes. Through experiments conducted on a real-world indoor dataset, we demonstrate that integrating external commonsense knowledge via the message-passing method leads to a 15.0 % improvement in scene graph prediction accuracy with external knowledge and $7.96\%$ with internal knowledge when compared to state-of-the-art algorithms. We also tested in the real world with 10 frames per second for scene graph generation to show the usage of the model in a more realistic robotics setting.


Evaluating the anticipated outcomes of MRI seizure image from open-source tool- Prototype approach

arXiv.org Artificial Intelligence

Clinical neuroscience studies are based on neuroimaging and psychiatric. These studies are very important for genetic data processing, cognitive evaluations, and follow-up. Brain imaging describes the colorful ways to either directly or laterally structure and function the neuroimaging system. Neuroradiologists do the interpretation of structural and functional imaging and clinical analysis of the Brain. Structural imaging deals with the nervous system structure and records the opinion of intracranial complaints of excrescence and injury.


Weighted Sparse Partial Least Squares for Joint Sample and Feature Selection

arXiv.org Artificial Intelligence

Sparse Partial Least Squares (sPLS) is a common dimensionality reduction technique for data fusion, which projects data samples from two views by seeking linear combinations with a small number of variables with the maximum variance. However, sPLS extracts the combinations between two data sets with all data samples so that it cannot detect latent subsets of samples. To extend the application of sPLS by identifying a specific subset of samples and remove outliers, we propose an $\ell_\infty/\ell_0$-norm constrained weighted sparse PLS ($\ell_\infty/\ell_0$-wsPLS) method for joint sample and feature selection, where the $\ell_\infty/\ell_0$-norm constrains are used to select a subset of samples. We prove that the $\ell_\infty/\ell_0$-norm constrains have the Kurdyka-\L{ojasiewicz}~property so that a globally convergent algorithm is developed to solve it. Moreover, multi-view data with a same set of samples can be available in various real problems. To this end, we extend the $\ell_\infty/\ell_0$-wsPLS model and propose two multi-view wsPLS models for multi-view data fusion. We develop an efficient iterative algorithm for each multi-view wsPLS model and show its convergence property. As well as numerical and biomedical data experiments demonstrate the efficiency of the proposed methods.


Learning on Graphs with Out-of-Distribution Nodes

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) are state-of-the-art models for performing prediction tasks on graphs. While existing GNNs have shown great performance on various tasks related to graphs, little attention has been paid to the scenario where out-of-distribution (OOD) nodes exist in the graph during training and inference. Borrowing the concept from CV and NLP, we define OOD nodes as nodes with labels unseen from the training set. Since a lot of networks are automatically constructed by programs, real-world graphs are often noisy and may contain nodes from unknown distributions. In this work, we define the problem of graph learning with out-of-distribution nodes. Specifically, we aim to accomplish two tasks: 1) detect nodes which do not belong to the known distribution and 2) classify the remaining nodes to be one of the known classes. We demonstrate that the connection patterns in graphs are informative for outlier detection, and propose Out-of-Distribution Graph Attention Network (OODGAT), a novel GNN model which explicitly models the interaction between different kinds of nodes and separate inliers from outliers during feature propagation. Extensive experiments show that OODGAT outperforms existing outlier detection methods by a large margin, while being better or comparable in terms of in-distribution classification.