When looking at populations of cells, features such as cell heterogeneity and localization are masked. At the onset of gastrulation, the fly embryo consists of about 6000 cells with distinct gene expression profiles. Karaiskos et al. developed an algorithm to generate an interactive three-dimensional (3D) "virtual embryo," with the expression of more than 8000 genes per cell measured for most cells (see the Perspective by Stadler and Eisen). The virtual embryo offers insights into developmental mechanisms, from local expression of regulators such as transcription factors and long noncoding RNAs to spatial modulation of signaling pathways. Science, this issue p. 194; see also p. 172
We generated single-cell transcriptomes from 38,731 cells during early zebrafish embryogenesis at high temporal resolution, spanning 12 stages from the onset of zygotic transcription through early somitogenesis. We took two complementary approaches to identify the transcriptional trajectories in the data. First, we developed a simulated diffusion-based computational approach, URD, which identified the trajectories describing the specification of 25 cell types in the form of a branching tree. Second, we identified modules of coexpressed genes and connected them across developmental time. Combining the reconstructed developmental trajectories with differential gene expression analysis uncovered gene expression cascades leading to each cell type, including previously unidentified markers and candidate regulators.
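The idea of simulating diffusion to order cells along a branching tree can be illustrated with a small sketch. This is not URD itself: it assumes a hand-written toy tree of eight "cells" in place of a neighbor graph built from expression data, and the function name `simulate_first_visits` and all parameters are hypothetical. Random walks start at a root cell, and the mean step at which each cell is first visited serves as a rough pseudotime-like ordering.

```python
import numpy as np

def simulate_first_visits(edges, n_cells, root=0, n_walks=100, n_steps=500, seed=0):
    """Mean first-visit step per cell over random walks started at `root`.

    Toy illustration only: `edges` is an undirected graph given by hand,
    standing in for a cell-cell neighbor graph (an assumption).
    """
    rng = np.random.default_rng(seed)
    neighbors = [[] for _ in range(n_cells)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Cells never reached within a walk keep the censored value n_steps.
    first_visit = np.full((n_walks, n_cells), float(n_steps))
    for w in range(n_walks):
        cell = root
        first_visit[w, cell] = 0.0
        for step in range(1, n_steps + 1):
            options = neighbors[cell]
            cell = options[rng.integers(len(options))]  # uniform step to a neighbor
            if step < first_visit[w, cell]:
                first_visit[w, cell] = step
    return first_visit.mean(axis=0)

# Toy tree: trunk 0-1-2-3, then two branches 3-4-5 and 3-6-7.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (3, 6), (6, 7)]
pseudotime = simulate_first_visits(toy_edges, n_cells=8)
print(np.round(pseudotime, 1))
```

On a tree, a walk from the root cannot reach a cell without first passing through that cell's ancestors, so the resulting values increase along each root-to-tip path. Real data would require a neighbor graph estimated from expression profiles and a way to handle noise, neither of which is sketched here.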
We propose a probabilistic model for interpreting gene expression levels that are observed through single-cell RNA sequencing. In the model, each cell has a low-dimensional latent representation. Additional latent variables account for technical effects that may erroneously set some observations of gene expression levels to zero. Conditional distributions are specified by neural networks, giving the proposed model enough flexibility to fit the data well. We use variational inference and stochastic optimization to approximate the posterior distribution. The inference procedure scales to over one million cells, whereas competing algorithms do not. Even for smaller datasets, the proposed procedure outperforms state-of-the-art methods such as ZIFA and ZINB-WaVE on several tasks. We also extend our framework to take into account batch effects and other confounding factors, and we propose a natural Bayesian hypothesis framework for differential expression that outperforms the traditional DESeq2.
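The generative structure described above can be sketched as a small simulation. This is a minimal illustration, not the authors' model or code. It makes several assumptions that are not in the abstract: fixed random linear maps stand in for the neural networks, counts follow a gamma-Poisson (negative binomial) distribution, library size is log-normal, and all names such as `simulate_zero_inflated_counts` are hypothetical. No inference is performed; the sketch only samples data in the forward direction.

```python
import numpy as np

def simulate_zero_inflated_counts(n_cells=100, n_genes=20, n_latent=5, seed=0):
    """Sample counts from a toy zero-inflated latent variable model."""
    rng = np.random.default_rng(seed)

    # Low-dimensional latent representation of each cell.
    z = rng.standard_normal((n_cells, n_latent))

    # Assumption: linear maps replace the neural networks of the real model.
    w_mean = 0.5 * rng.standard_normal((n_latent, n_genes))
    w_drop = 0.5 * rng.standard_normal((n_latent, n_genes))

    # Softmax over genes gives each cell's expected expression proportions.
    logits = z @ w_mean
    logits -= logits.max(axis=1, keepdims=True)
    rho = np.exp(logits)
    rho /= rho.sum(axis=1, keepdims=True)

    # Assumption: per-cell library size is log-normal around 2000 counts.
    library = rng.lognormal(mean=np.log(2000.0), sigma=0.3, size=(n_cells, 1))
    mu = library * rho

    # Negative binomial counts drawn as a gamma-Poisson mixture.
    theta = 2.0  # inverse dispersion, chosen arbitrarily
    rate = rng.gamma(shape=theta, scale=mu / theta)
    true_counts = rng.poisson(rate)

    # Technical latent variable: a dropout event forces the observation to zero.
    p_drop = 1.0 / (1.0 + np.exp(-(z @ w_drop - 1.0)))
    dropped = rng.random((n_cells, n_genes)) < p_drop
    observed = np.where(dropped, 0, true_counts)
    return z, true_counts, observed, dropped

z, true_counts, observed, dropped = simulate_zero_inflated_counts()
print("zeros before dropout:", int((true_counts == 0).sum()))
print("zeros after dropout: ", int((observed == 0).sum()))
```

Comparing the two zero counts shows how technical dropout inflates the number of zeros beyond what the count distribution alone produces, which is the effect the additional latent variables are meant to capture.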
Each cell is distinct depending on its age, point in the cell cycle, and composition of genetic mutations. This image hints at our ability to see the genetic "fingerprints" of individual cells, allowing us to discover their subtle differences. Imagine being able to shrink down to a small enough size to peer into the human body at the single-cell level. Now take a deep breath and plunge into that cell to see all of the ongoing biological processes, including the full complement of molecules and their locations within the cell. This has long been the realm of science fiction, but not for much longer.
To resolve cellular heterogeneity, we developed a combinatorial indexing strategy to profile the transcriptomes of single cells or nuclei, termed sci-RNA-seq (single-cell combinatorial indexing RNA sequencing). We applied sci-RNA-seq to profile nearly 50,000 cells from the nematode Caenorhabditis elegans at the L2 larval stage, which provided 50-fold "shotgun" cellular coverage of its somatic cell composition. From these data, we defined consensus expression profiles for 27 cell types and recovered rare neuronal cell types corresponding to as few as one or two cells in the L2 worm. We integrated these profiles with whole-animal chromatin immunoprecipitation sequencing data to deconvolve the cell type–specific effects of transcription factors. The data generated by sci-RNA-seq constitute a powerful resource for nematode biology and foreshadow similar atlases for other organisms.