
 Krogh, Anders


A primer on synthetic health data

arXiv.org Artificial Intelligence

Recent advances in deep generative models have greatly expanded the potential to create realistic synthetic health datasets. These synthetic datasets aim to preserve the characteristics, patterns, and overall scientific conclusions derived from sensitive health datasets without disclosing patient identity or sensitive information. Thus, synthetic data can facilitate safe data sharing that supports a range of initiatives, including the development of new predictive models, advanced health IT platforms, and general project ideation and hypothesis development. However, many questions and challenges remain, including how to consistently evaluate a synthetic dataset's similarity and predictive utility relative to the original real dataset, and how to assess the privacy risk when the data are shared. Regulatory and governance issues have likewise not been widely addressed. In this primer, we map the state of synthetic health data, including generation and evaluation methods and tools, existing examples of deployment, the regulatory and ethical landscape, access and governance options, and opportunities for further development.
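
A minimal sketch of one common utility check alluded to above, "train on synthetic, test on real" (TSTR): fit a predictor on the synthetic table and score it on the real one. The file names and the "outcome" column are hypothetical placeholders, not from the paper.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def tstr_auc(real: pd.DataFrame, synthetic: pd.DataFrame, target: str = "outcome") -> float:
    """Train a classifier on the synthetic table, evaluate on the real one."""
    X_syn, y_syn = synthetic.drop(columns=[target]), synthetic[target]
    X_real, y_real = real.drop(columns=[target]), real[target]
    clf = LogisticRegression(max_iter=1000).fit(X_syn, y_syn)
    return roc_auc_score(y_real, clf.predict_proba(X_real)[:, 1])

# Usage (hypothetical files): a TSTR AUC close to the train-on-real AUC
# suggests the synthetic data preserves the predictive signal.
# real = pd.read_csv("real_cohort.csv"); syn = pd.read_csv("synthetic_cohort.csv")
# print(tstr_auc(real, syn))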


The Deep Generative Decoder: MAP estimation of representations improves modeling of single-cell RNA data

arXiv.org Artificial Intelligence

Learning low-dimensional representations of single-cell transcriptomics has become instrumental to its downstream analysis. The state of the art is currently represented by neural network models such as variational autoencoders (VAEs), which use a variational approximation of the likelihood for inference. Here we present the Deep Generative Decoder (DGD), a simple generative model that computes model parameters and representations directly via maximum a posteriori (MAP) estimation. Unlike VAEs, which typically use a fixed Gaussian latent distribution because adding other types is complex, the DGD naturally handles complex parameterized latent distributions. We first show its general functionality on a commonly used benchmark set, Fashion-MNIST. We then apply the model to multiple single-cell data sets, where the DGD learns low-dimensional, meaningful, and well-structured latent representations with sub-clustering beyond the provided labels. The advantages of this approach are its simplicity and its ability to provide representations of much smaller dimensionality than a comparable VAE.
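
A minimal sketch (not the authors' code) of the DGD idea: each sample's representation is a free parameter optimized jointly with the decoder by MAP, i.e. maximizing log p(x | decoder(z)) + log p(z). A standard normal prior and Gaussian likelihood stand in for the paper's parameterized latent distribution; the dimensions and data are placeholders.

import torch
import torch.nn as nn

n_samples, n_genes, latent_dim = 1000, 200, 2
X = torch.randn(n_samples, n_genes)                   # placeholder data matrix

decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                        nn.Linear(64, n_genes))
Z = nn.Parameter(torch.zeros(n_samples, latent_dim))  # one free representation per sample

opt = torch.optim.Adam(list(decoder.parameters()) + [Z], lr=1e-2)
for step in range(500):
    opt.zero_grad()
    recon = decoder(Z)
    nll = ((recon - X) ** 2).sum()                    # Gaussian likelihood up to constants
    neg_log_prior = 0.5 * (Z ** 2).sum()              # standard-normal prior on z (the MAP term)
    loss = nll + neg_log_prior
    loss.backward()
    opt.step()
# Z now holds MAP representations; no encoder network is ever trained.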


Learning with ensembles: How overfitting can be useful

Neural Information Processing Systems

We study the characteristics of learning with ensembles. Solving exactly the simple model of an ensemble of linear students, we find surprisingly rich behaviour. For learning in large ensembles, it is advantageous to use under-regularized students, which actually over-fit the training data. Globally optimal performance can be obtained by choosing the training set sizes of the students appropriately. For smaller ensembles, optimization of the ensemble weights can yield significant improvements in ensemble generalization performance, in particular if the individual students are subject to noise in the training process. Choosing students with a wide range of regularization parameters makes this improvement robust against changes in the unknown level of noise in the training data.
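
A minimal sketch of the setting studied here: an ensemble of linear "students" (ridge regressors), each fit on its own random subset of the training data, with predictions averaged. The small ridge penalty corresponds to the under-regularized students discussed above; the teacher and data sizes are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
w_true = rng.normal(size=20)
y = X @ w_true + 0.5 * rng.normal(size=200)   # noisy linear teacher

def ensemble_predict(X_train, y_train, X_test, k=10, alpha=1e-3, frac=0.7):
    """Average k under-regularized linear students, each on a random subset."""
    preds = []
    for _ in range(k):
        idx = rng.choice(len(X_train), size=int(frac * len(X_train)), replace=False)
        student = Ridge(alpha=alpha).fit(X_train[idx], y_train[idx])
        preds.append(student.predict(X_test))
    return np.mean(preds, axis=0)

X_test = rng.normal(size=(50, 20))
y_hat = ensemble_predict(X, y, X_test)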


Prediction of Beta Sheets in Proteins

Neural Information Processing Systems

Most current methods for prediction of protein secondary structure use a small window of the protein sequence to predict the structure of the central amino acid. We describe a new method for prediction of the non-local structure called β-sheet, which consists of two or more β-strands that are connected by hydrogen bonds. Since β-strands are often widely separated in the protein chain, a network with two windows is introduced. After training on a set of proteins, the network predicts the sheets well, but there are many false positives. By using a global energy function, the β-sheet prediction is combined with a local prediction of the three secondary structures α-helix, β-strand, and coil.
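
A minimal sketch (not the paper's architecture) of the two-window idea: the network sees two sequence windows, possibly far apart in the chain, and scores whether their central residues pair in a β-sheet. The window size and layer widths are illustrative assumptions.

import torch
import torch.nn as nn

n_amino_acids, window = 20, 7   # one-hot alphabet and residues per window

class TwoWindowNet(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        in_dim = 2 * window * n_amino_acids   # both windows, concatenated
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, win_a, win_b):
        # win_a, win_b: (batch, window, n_amino_acids) one-hot encodings
        x = torch.cat([win_a.flatten(1), win_b.flatten(1)], dim=1)
        return self.net(x)                    # pairing probability in (0, 1)

model = TwoWindowNet()
a = torch.zeros(4, window, n_amino_acids)     # placeholder one-hot windows
b = torch.zeros(4, window, n_amino_acids)
print(model(a, b).shape)                      # torch.Size([4, 1])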


Neural Network Ensembles, Cross Validation, and Active Learning

Neural Information Processing Systems

It is well known that a combination of many different predictors can improve predictions. In the neural networks community, "ensembles" of neural networks have been investigated by several authors, see for instance [1, 2, 3]. Most often the networks in the ensemble are trained individually and then their predictions are combined. This combination is usually done by majority vote (in classification) or by simple averaging (in regression), but one can also use a weighted combination of the networks.
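
A minimal sketch of the weighted combination mentioned above: each member network's weight is set from its cross-validation error, so better members count more. The inverse-error weighting and the member architectures are illustrative choices, not the paper's scheme.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)

members = [MLPRegressor(hidden_layer_sizes=(h,), max_iter=2000, random_state=i)
           for i, h in enumerate([5, 10, 20])]

# Estimate each member's generalization error by cross-validation, then
# weight members by inverse error and normalize the weights to sum to one.
errors = np.array([-cross_val_score(m, X, y, scoring="neg_mean_squared_error").mean()
                   for m in members])
weights = (1.0 / errors) / (1.0 / errors).sum()

for m in members:
    m.fit(X, y)
X_new = rng.normal(size=(5, 10))
y_hat = sum(w * m.predict(X_new) for w, m in zip(weights, members))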

