 electropherogram


Simulating realistic short tandem repeat capillary electrophoretic signal using a generative adversarial network

Taylor, Duncan, Humphries, Melissa

arXiv.org Artificial Intelligence

DNA profiles are made up of multiple series of electrophoretic signal measuring fluorescence over time. Typically, human DNA analysts 'read' DNA profiles, using their experience to distinguish instrument noise, artefactual signal, and signal corresponding to DNA fragments of interest. Recent work has developed an artificial neural network (ANN) to classify the fluorescence in DNA profile electrophoretic signal into categories. However, creating the necessarily large amount of labelled training data for the ANN is time-consuming and expensive, and is a limiting factor in the ability to train the ANN robustly. If realistic, prelabelled training data could be simulated, this would remove the barrier to training an ANN with high efficacy. Here we develop a generative adversarial network (GAN), modified from the pix2pix GAN, to achieve this task. We train the GAN on 1078 DNA profiles and achieve the ability to simulate DNA profile information, and then use the generator from the GAN as a 'realism filter' that applies the noise and artefact elements exhibited in typical electrophoretic signal.
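To make the idea of prelabelled simulated data concrete, the following is a minimal sketch of the kind of clean, labelled trace that a realism filter could take as input. It is not the authors' method: the Gaussian peak shape, the function names, the peak positions and heights, the two-class labelling rule, and the additive-noise stand-in for the trained generator are all assumptions made for illustration.

```python
import math
import random


def simulate_labelled_trace(peaks, n_scans=500, sigma=3.0):
    """Return (signal, labels) for an idealised, noise-free trace.

    peaks: list of (centre_scan, height) pairs (hypothetical inputs).
    labels[i] is 1 where scan i lies within 2*sigma of a peak centre
    (treated here as signal of interest), otherwise 0 (baseline).
    """
    signal = [0.0] * n_scans
    labels = [0] * n_scans
    for centre, height in peaks:
        for i in range(n_scans):
            # Assumed Gaussian peak shape; real peaks need not be Gaussian.
            signal[i] += height * math.exp(-0.5 * ((i - centre) / sigma) ** 2)
            if abs(i - centre) <= 2 * sigma:
                labels[i] = 1
    return signal, labels


def add_placeholder_noise(signal, noise_sd=5.0, seed=0):
    """Stand-in for a learned realism filter: adds Gaussian baseline noise only.

    A trained generator would also add artefacts; this placeholder does not.
    """
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_sd) for s in signal]


# Two hypothetical peaks; the labels come for free because we placed the peaks.
clean, labels = simulate_labelled_trace([(100, 800.0), (250, 400.0)])
noisy = add_placeholder_noise(clean)
```

The point of the sketch is that the labels are known by construction, so no analyst has to annotate the trace; the open problem the abstract addresses is replacing the placeholder noise step with something that reproduces realistic instrument noise and artefacts.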


Teaching artificial intelligence to read electropherograms - ScienceDirect

#artificialintelligence

Electropherograms are produced in great numbers in forensic DNA laboratories as part of everyday criminal casework. Before the results of these electropherograms can be used, they must be scrutinised by analysts to determine what the identified data tells us about the underlying DNA sequences and what is purely an artefact of the DNA profiling process. A technique that lends itself well to such a classification task in the face of vast amounts of data is the use of artificial neural networks. These networks, inspired by the workings of the human brain, have been increasingly successful in analysing large datasets, performing medical diagnoses, identifying handwriting, playing games, and recognising images. In this work we demonstrate the use of an artificial neural network which we train to 'read' electropherograms and show that it can generalise to unseen profiles.