AITopics

2109.14099

Country: North America > United States > Illinois > DeKalb County > DeKalb (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.84)

Non-stationary Gaussian process discriminant analysis with variable selection for high-dimensional functional data

Yu, W, Wade, S, Bondell, H D, Azizi, L

High-dimensional classification and feature selection tasks are ubiquitous with the recent advancement in data acquisition technology. In several application areas such as biology, genomics and proteomics, the data are often functional in their nature and exhibit a degree of roughness and non-stationarity. These structures pose additional challenges to commonly used methods that rely mainly on a two-stage approach performing variable selection and classification separately. We propose in this work a novel Gaussian process discriminant analysis (GPDA) that combines these steps in a unified framework. Our model is a two-layer non-stationary Gaussian process coupled with an Ising prior to identify differentially-distributed locations. Scalable inference is achieved via developing a variational scheme that exploits advances in the use of sparse inverse covariance matrices. We demonstrate the performance of our methodology on simulated datasets and two proteomics datasets: breast cancer and SARS-CoV-2. Our approach distinguishes itself by offering explainability as well as uncertainty quantification in addition to low computational cost, which are crucial to increase trust and social acceptance of data-driven tools.

discriminant analysis, supplementary material, variable selection, (14 more...)

2109.14171

Genre: Research Report (0.50)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.89)
Health & Medicine > Therapeutic Area > Oncology (0.89)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.57)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Federated Learning Algorithms for Generalized Mixed-effects Model (GLMM) on Horizontally Partitioned Data from Distributed Sources

Li, Wentao, Tong, Jiayi, Anjum, Md. Monowar, Mohammed, Noman, Chen, Yong, Jiang, Xiaoqian

Objectives: This paper develops two algorithms to achieve federated generalized linear mixed effect models (GLMM), and compares the developed models' outcomes with each other, as well as with those from the standard R package ('lme4'). Methods: The log-likelihood function of GLMM is approximated by two numerical methods (Laplace approximation and Gauss-Hermite approximation), which supports federated decomposition of GLMM to bring computation to data. Results: Our developed method can handle GLMM to accommodate hierarchical data with multiple non-independent levels of observations in a federated setting. The experiment results demonstrate comparable (Laplace) and superior (Gauss-Hermite) performances with simulated and real-world data. Conclusion: We developed and compared federated GLMMs with different approximations, which can support researchers in analyzing biomedical data to accommodate mixed effects and address non-independence due to hierarchical structures (e.g., institutes, region, country).
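As a rough illustration of why an approximated log-likelihood lends itself to federated decomposition, the sketch below applies a Laplace approximation to a random-intercept logistic model. Each site evaluates its own contribution locally, and only the resulting scalars are summed centrally. The model form (one Gaussian random intercept per site), the function names site_laplace_loglik and federated_loglik, and the data layout are assumptions made for illustration; they are not taken from the paper, and the Gauss-Hermite variant is not shown.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    """Logistic function, written via softplus for stability."""
    return math.exp(-softplus(-x))


def site_laplace_loglik(eta, y, sigma):
    """Laplace approximation to the log marginal likelihood of one site.

    Approximates log of the integral over b ~ N(0, sigma^2) of
    prod_j Bernoulli(y_j | sigmoid(eta_j + b)).
    eta: fixed-effect linear predictors for this site's rows (hypothetical layout)
    y:   0/1 outcomes for the same rows
    """
    b = 0.0
    for _ in range(100):  # Newton iterations towards the mode of the integrand
        p = [sigmoid(e + b) for e in eta]
        grad = sum(yj - pj for yj, pj in zip(y, p)) - b / sigma**2
        hess = -sum(pj * (1.0 - pj) for pj in p) - 1.0 / sigma**2
        step = grad / hess
        b -= step
        if abs(step) < 1e-12:
            break
    # Curvature and log joint density evaluated at the mode
    p = [sigmoid(e + b) for e in eta]
    hess = -sum(pj * (1.0 - pj) for pj in p) - 1.0 / sigma**2
    log_joint = sum(yj * (e + b) - softplus(e + b) for e, yj in zip(eta, y))
    log_joint += -b**2 / (2.0 * sigma**2) - 0.5 * math.log(2.0 * math.pi * sigma**2)
    return log_joint + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(-hess)


def federated_loglik(sites, beta, sigma):
    """Sum of per-site contributions; only scalars would leave each site."""
    total = 0.0
    for X, y in sites:
        eta = [sum(x * b for x, b in zip(row, beta)) for row in X]
        total += site_laplace_loglik(eta, y, sigma)
    return total
```

In this sketch a coordinating server would call an optimizer on federated_loglik, sending candidate values of beta and sigma to the sites and receiving one number back from each, so row-level data never leaves its source.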

approximation, glmm, laplace approximation, (14 more...)

2109.14046

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Louisiana > Saint John the Baptist Parish > Laplace (0.04)
North America > Canada > Manitoba > Winnipeg Metropolitan Region > Winnipeg (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.68)
Health & Medicine > Government Relations & Public Policy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)

Guggilam, Sreelekha, Chandola, Varun, Patra, Abani

Anomaly Detection for High-Dimensional Data Using Large Deviations Principle

Most current anomaly detection methods suffer from the curse of dimensionality when dealing with high-dimensional data. We propose an anomaly detection algorithm that can scale to high-dimensional data using concepts from the theory of large deviations. The proposed Large Deviations Anomaly Detection (LAD) algorithm is shown to outperform state-of-the-art anomaly detection methods on a variety of large and high-dimensional benchmark data sets. Exploiting the ability of the algorithm to scale to high-dimensional data, we propose an online anomaly detection method to identify anomalies in a collection of multivariate time series. We demonstrate the applicability of the online algorithm in identifying counties in the United States with anomalous trends in terms of COVID-19 related cases and deaths. Several of the identified anomalous counties correlate with counties with documented poor response to the COVID pandemic.

dataset, detection, time series, (11 more...)

2109.13698

Country:

North America > United States > New York > Erie County > Buffalo (0.04)
North America > United States > Michigan > Wayne County > Wayne (0.04)
North America > United States > Wyoming > Albany County > Laramie (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.88)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Markovitch, Benny, Fokkema, Marjolein

Improved prediction rule ensembling through model-based data generation

Prediction rule ensembles (PRE) provide interpretable prediction models with relatively high accuracy. PRE obtain a large set of decision rules from a (boosted) decision tree ensemble, and achieve sparsity through application of Lasso-penalized regression. This article examines the use of surrogate models to improve performance of PRE, wherein the Lasso regression is trained with the help of a massive dataset generated by the (boosted) decision tree ensemble. This use of model-based data generation may improve the stability and consistency of the Lasso step, thus leading to improved overall performance. We propose two surrogacy approaches, and evaluate them on simulated and existing datasets, in terms of sparsity and predictive accuracy. The results indicate that the use of surrogacy models can substantially improve the sparsity of PRE, while retaining predictive accuracy, especially through the use of a nested surrogacy approach.

dataset, lasso, surrogate lasso, (14 more...)

2109.13672

Country:

Europe > Netherlands > South Holland > Leiden (0.04)
North America > United States > Wisconsin (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

#artificialintelligence, Sep-27-2021, 23:20:22 GMT

4 Ways That Your Accurate Model May Not Be Good Enough

When we were in school and were given a problem to solve, we usually stopped working on the problem as soon as we found the answer and we recorded that answer on our paper. This might be a fair approach for elementary school assignments, but that approach is not good in higher education or in life. Unfortunately, many people continue this learned behavior into adulthood, at the university and/or on their jobs. Consequently, these people miss new opportunities for learning, discovery, recognition, and advancement. In data science, we are trained to keep searching (at least, I hope that this is true for all of us) even after we find that first model from our data that appears to answer our business question accurately.

analytic model, global minimum, rare condition, (10 more...)

#artificialintelligence

Industry: Education > Educational Setting (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Ajirak, Marzieh, Heiselman, Cassandra, Fuchs, Anna, Heiligenstein, Mia, Herrera, Kimberly, Garretto, Diana, Djuric, Petar

Bayesian Nonparametric Dimensionality Reduction of Categorical Data for Predicting Severity of COVID-19 in Pregnant Women

arXiv.org Artificial Intelligence, Sep-27-2021

The coronavirus disease (COVID-19) has rapidly spread throughout the world and while pregnant women present the same adverse outcome rates, they are underrepresented in clinical research. We collected clinical data of 155 test-positive COVID-19 pregnant women at Stony Brook University Hospital. Many of these collected data are of multivariate categorical type, where the number of possible outcomes grows exponentially as the dimension of data increases. We modeled the data within the unsupervised Bayesian framework and mapped them into a lower-dimensional space using latent Gaussian processes. The latent features in the lower dimensional space were further used for predicting if a pregnant woman would be admitted to a hospital due to COVID-19 or would remain with mild symptoms. We compared the prediction accuracy with the dummy/one-hot encoding of categorical data and found that the latent Gaussian process had better accuracy.
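The dummy/one-hot baseline mentioned above can be sketched as follows. The example also contrasts the width of the one-hot vector, which grows with the sum of the category counts, against the number of joint configurations, which grows with their product. The function names and the toy variables are hypothetical and are not drawn from the clinical data described in the abstract.

```python
def one_hot_encode(rows):
    """One-hot encode a list of equal-length tuples of categorical values.

    Returns the encoded rows and the sorted category levels per column.
    """
    n_cols = len(rows[0])
    levels = [sorted({row[j] for row in rows}) for j in range(n_cols)]
    encoded = []
    for row in rows:
        vec = []
        for j, value in enumerate(row):
            # One indicator per level of column j
            vec.extend(1 if value == level else 0 for level in levels[j])
        encoded.append(vec)
    return encoded, levels


def n_joint_configurations(levels):
    """Number of distinct joint outcomes: the product of the level counts."""
    total = 1
    for column_levels in levels:
        total *= len(column_levels)
    return total


# Toy, made-up records: (binary variable, three-level variable)
rows = [("yes", "mild"), ("no", "severe"), ("yes", "none")]
encoded, levels = one_hot_encode(rows)
```

With two variables of 2 and 3 levels the one-hot vector has 5 entries while there are 6 joint configurations; adding more variables widens this gap multiplicatively, which is the growth the abstract refers to.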

categorical data, covid-19, pregnant women, (10 more...)

doi: 10.23919/EUSIPCO54536.2021.9616021

2011.03715

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.25)
Asia > China (0.04)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Kent, Jonathan S., Li, Bo

DOODLER: Determining Out-Of-Distribution Likelihood from Encoder Reconstructions

arXiv.org Machine Learning, Sep-27-2021

Deep Learning models possess two key traits that, in combination, make their use in the real world a risky prospect. One, they do not typically generalize well outside of the distribution for which they were trained, and two, they tend to exhibit confident behavior regardless of whether or not they are producing meaningful outputs. While Deep Learning possesses immense power to solve realistic, high-dimensional problems, these traits in concert make it difficult to have confidence in their real-world applications. To overcome this difficulty, the task of Out-Of-Distribution (OOD) Detection has been defined, to determine when a model has received an input from outside of the distribution for which it is trained to operate. This paper introduces and examines a novel methodology, DOODLER, for OOD Detection, which directly leverages the traits which result in its necessity. By training a Variational Auto-Encoder (VAE) on the same data as another Deep Learning model, the VAE learns to accurately reconstruct In-Distribution (ID) inputs, but not to reconstruct OOD inputs, meaning that its failure state can be used to perform OOD Detection. Unlike other work in the area, DOODLER requires only very weak assumptions about the existence of an OOD dataset, allowing for more realistic application. DOODLER also enables pixel-wise segmentations of input images by OOD likelihood, and experimental results show that it matches or outperforms methodologies that operate under the same constraints.
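The general idea of using reconstruction failure as an OOD signal can be sketched in a few lines. This is not the DOODLER procedure itself: the reconstruct argument stands in for a trained VAE's encode-then-decode pass, the squared-error score and the quantile-based threshold are assumptions chosen for illustration, and toy_reconstruct is a made-up stand-in that reproduces values in [0, 1] and clips everything else.

```python
def pixel_errors(x, reconstruct):
    """Per-pixel squared reconstruction error for one flattened input."""
    x_hat = reconstruct(x)
    return [(a - b) ** 2 for a, b in zip(x, x_hat)]


def ood_score(x, reconstruct):
    """Mean reconstruction error; higher suggests the input is more likely OOD."""
    errors = pixel_errors(x, reconstruct)
    return sum(errors) / len(errors)


def fit_threshold(id_inputs, reconstruct, quantile=0.95):
    """Pick a score threshold from in-distribution inputs only (assumed rule)."""
    scores = sorted(ood_score(x, reconstruct) for x in id_inputs)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def is_ood(x, reconstruct, threshold):
    """Flag an input whose score exceeds the in-distribution threshold."""
    return ood_score(x, reconstruct) > threshold


def toy_reconstruct(x):
    """Hypothetical stand-in for a trained VAE: exact inside [0, 1], clipped outside."""
    return [min(1.0, max(0.0, v)) for v in x]


threshold = fit_threshold([[0.1, 0.5], [0.9, 0.2]], toy_reconstruct)
```

The per-pixel list returned by pixel_errors is the kind of quantity that could be displayed as a map over the image, which is how a pixel-wise view of OOD likelihood might be obtained in practice.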

dataset, detection, reconstruction error, (13 more...)

2109.13237

Country:

North America > United States > Illinois (0.04)
Europe > Poland (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Feng, Jianxiang, Durner, Maximilian, Marton, Zoltan-Csaba, Balint-Benczedi, Ferenc, Triebel, Rudolph

Introspective Robot Perception using Smoothed Predictions from Bayesian Neural Networks

arXiv.org Artificial Intelligence, Sep-27-2021

This work focuses on improving uncertainty estimation in the field of object classification from RGB images and demonstrates its benefits in two robotic applications. We employ a Bayesian Neural Network (BNN), and evaluate two practical inference techniques to obtain better uncertainty estimates, namely Concrete Dropout (CDP) and Kronecker-factored Laplace Approximation (LAP). We show a performance increase using more reliable uncertainty estimates as unary potentials within a Conditional Random Field (CRF), which is able to incorporate contextual information as well. Furthermore, the obtained uncertainties are exploited to achieve domain adaptation in a semi-supervised manner, which requires less manual efforts in annotating data. We evaluate our approach on two public benchmark datasets that are relevant for robot perception tasks.

dataset, prediction, uncertainty estimate, (13 more...)

2109.12869

Country:

Europe > Germany > Bremen > Bremen (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Svoboda, Emil, Bořil, Tomáš, Rusz, Jan, Tykalová, Tereza, Horáková, Dana, Guttman, Charles R. G., Blagoev, Krastan B., Hatabu, Hiroto, Valtchinov, Vlad I.

Assessing clinical utility of Machine Learning and Artificial Intelligence approaches to analyze speech recordings in Multiple Sclerosis: A Pilot Study

arXiv.org Artificial Intelligence, Sep-27-2021

Background: Early diagnosis together with accurate monitoring of disease progression is an important component of successful management of multiple sclerosis. Prior studies have established that multiple sclerosis is correlated with speech discrepancies. Early research using objective acoustic measurements has discovered measurable dysarthria. Objective: To determine the potential clinical utility of machine learning and deep learning/AI approaches for aiding diagnosis, biomarker extraction and progression monitoring of multiple sclerosis using speech recordings. Methods: A corpus of 65 MS-positive and 66 healthy individuals reading the same text aloud was used for targeted acoustic feature extraction utilizing automatic phoneme segmentation. A series of binary classification models was trained, tuned, and evaluated regarding their Accuracy and area-under-curve. Results: The Random Forest model performed best, achieving an Accuracy of 0.82 on the validation dataset and an area-under-curve of 0.76 across 5 k-fold cycles on the training dataset. 5 out of 7 acoustic features were statistically significant. Conclusion: Machine learning and artificial intelligence approaches to the automatic analysis of voice recordings for aiding MS diagnosis and progression tracking seem promising. Further clinical validation of these methods and their mapping onto multiple sclerosis progression is needed, as well as validation of their utility for English-speaking populations.
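For readers unfamiliar with the two reported metrics, the sketch below computes accuracy and area under the ROC curve for a binary classifier from scratch. The AUC is computed here as the fraction of positive/negative pairs ranked correctly, with ties counted as one half; the function names and the toy labels and scores are illustrative and are unrelated to the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking of positives vs negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores (made up for illustration)
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
```

Accuracy depends on the chosen decision threshold (0.5 in this toy case), whereas the AUC depends only on how the scores rank the two classes, which is one reason the two numbers can differ for the same model.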

csi, motivated, multiple sclerosis, (13 more...)

2109.09844

Country:

Europe > Czechia > Prague (0.06)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology > Multiple Sclerosis (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)