High-resolution comparative analysis of great ape genomes


We sequenced and assembled two human, one chimpanzee, and one orangutan genome using high-coverage (>65x) single-molecule, real-time (SMRT) long-read sequencing technology. We also sequenced more than 500,000 full-length complementary DNA samples from induced pluripotent stem cells to construct de novo gene models, increasing our knowledge of transcript diversity in each ape lineage. The new nonhuman ape genome assemblies improve gene annotation and genomic contiguity (by 30- to 500-fold), resulting in the identification of larger synteny blocks (by 22- to 74-fold) when compared to earlier assemblies. Including the latest gorilla genome, we now estimate that 83% of the ape genomes can be compared in a multiple sequence alignment. We observe a modest increase in single-nucleotide variant divergence compared to previous genome analyses and estimate that 36% of human autosomal DNA is subject to incomplete lineage sorting.

How to track and visualize data lineage


Data lineage tracks the flow of information through an organization. The success of an organization depends on the quality, usability, and security of its data, and data lineage is necessary to guarantee all three. For large organizations, it is also a key compliance requirement. With Linkurious, a graph-based approach can be used to address these challenges.
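
As a rough illustration of what a graph-based approach to lineage can look like, the sketch below models datasets as nodes and data flows as directed edges, then walks the graph backwards to find everything a given dataset depends on. It uses plain Python rather than the Linkurious product, and the class name `LineageGraph`, its methods, and the dataset names are all hypothetical.

```python
from collections import deque


class LineageGraph:
    """Toy lineage graph: nodes are datasets, edges are data flows (hypothetical API)."""

    def __init__(self):
        # Maps each dataset to the set of datasets it is directly derived from.
        self._parents = {}

    def add_flow(self, source, target):
        """Record that `target` is derived from `source`."""
        self._parents.setdefault(source, set())
        self._parents.setdefault(target, set()).add(source)

    def upstream(self, dataset):
        """Return every dataset that `dataset` depends on, directly or indirectly."""
        seen = set()
        queue = deque([dataset])
        while queue:
            node = queue.popleft()
            for parent in self._parents.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen


# Example flows (dataset names are made up for illustration).
graph = LineageGraph()
graph.add_flow("crm_export", "customers_clean")
graph.add_flow("customers_clean", "revenue_report")
graph.add_flow("billing_db", "revenue_report")

sources = graph.upstream("revenue_report")
```

Here `sources` contains `crm_export`, `customers_clean`, and `billing_db`, which is the kind of answer a quality or compliance audit of `revenue_report` would start from.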


AAAI Conferences

An important problem in malware forensics is generating a partial ordering of a collection of variants of a malware program, reflecting a history of the malware's evolution as it is adapted by the original or new authors. We present new work extending our results on the malware lineage problem originally presented at FLAIRS 2013. We provide a new algorithm for reconstructing malware lineages with and without branch and merge events. This algorithm incorporates two innovations: the evaluation of candidate evolutionary traces based on candidate sets of feature accretion events, and a machine-learning-inspired approach to reducing overexplanation in the final lineage. The evolutionary trace algorithm is evaluated on several small families of malware whose ground truth lineage is known.
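
To make the idea of a partial ordering over variants concrete, here is a deliberately simplified sketch. It is not the algorithm from the paper: it assumes each variant is described by a set of features and that features are only ever added, never removed, so one variant can precede another only if its feature set is a proper subset of the other's. The function names, variant names, and feature labels are hypothetical.

```python
def accretion_partial_order(variants):
    """Return all pairs (a, b) where a's features are a proper subset of b's.

    Assumption: features only accrete, so a proper subset implies "earlier".
    """
    names = sorted(variants)
    return [(a, b) for a in names for b in names if variants[a] < variants[b]]


def direct_ancestors(variants):
    """Keep only pairs with no intermediate variant between them."""
    order = set(accretion_partial_order(variants))
    return sorted(
        (a, b)
        for (a, b) in order
        if not any((a, c) in order and (c, b) in order for c in variants)
    )


# Made-up family: v2 and v2b both branch from v1, and v3 extends v2.
family = {
    "v1": {"keylog"},
    "v2": {"keylog", "packer"},
    "v2b": {"keylog", "irc"},
    "v3": {"keylog", "packer", "rootkit"},
}

edges = direct_ancestors(family)
```

In this example `edges` links `v1` to both `v2` and `v2b`, and `v2` to `v3`, while `v2` and `v2b` stay unordered relative to each other, which is what makes the result a partial rather than a total order. Real families also lose features and merge code, which this subset test cannot represent.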

Fossils push back origin of key plant groups millions of years


About 250 million years ago, Earth underwent the worst mass extinction event in its history. Researchers have now unearthed fossils of three plant lineages from before this "great dying" at the end of the Permian period. The discovery of conifers, seed ferns, and a group of cycadlike plants called Bennettitales is unexpected and may lead to a revision of plant evolution, because all three lineages were thought to have arisen tens of millions of years later. Their uncovering near the Dead Sea also lends support to a 45-year-old idea about why the tropics tend to have more species than higher latitudes: dry tropical environments may be "cradles" of evolution.

The evolutionary history of dogs in the Americas


Dogs have been present in North America for at least 9000 years. To better understand how present-day breeds and populations reflect their introduction to the New World, Ní Leathlobhair et al. sequenced the mitochondrial and nuclear genomes of ancient dogs (see the Perspective by Goodman and Karlsson). The earliest New World dogs were not domesticated from North American wolves but likely originated from a Siberian ancestor. Furthermore, these lineages date back to a common ancestor that coincides with the first human migrations across Beringia. This lineage appears to have been mostly replaced by dogs introduced by Europeans, with the primary extant lineage remaining as a canine transmissible venereal tumor.