Population based change-point detection for the identification of homozygosity islands
Prates, Lucas, Lemes, Renan B, Hünemeier, Tábita, Leonardi, Florencia
In diploid organisms, such as humans, each individual's genome is organized into pairs of chromosomes, with one chromosome of each pair inherited from each parent. When an individual is the offspring of biologically related parents, both chromosomes of the same pair can share identical segments, creating long stretches of consecutive homozygosity known as runs of homozygosity (ROH). In recent decades, studies of ROH in human populations have revealed their presence even in cosmopolitan, non-inbred populations, showing an increase in inbreeding levels and a consequent reduction in the genetic diversity of populations that is proportional to the walking distance from Africa, as expected under the out-of-Africa model of human colonization (Ceballos et al., 2018; Kirin et al., 2010; Lemes et al., 2018; Leutenegger et al., 2011; Pemberton et al., 2012). The distribution of ROH along the chromosomes is very uneven, so that some genomic regions show a significant absence of ROH (coldspots) and others a significant excess (ROH islands) (Ceballos et al., 2018). The mechanisms behind the emergence of these regions are still under discussion. For example, there is evidence that ROH islands could represent regions that harbor genes targeted by positive selection, since low-recombination regions are commonly the sites of selective sweeps, in which a new beneficial mutation increases in frequency and becomes fixed, reducing the overall genetic diversity of the region (Ceballos et al., 2018; Pemberton et al., 2012). To detect ROH and ROH islands, the genetic material of individuals from a given population is genotyped, and a set of single nucleotide polymorphisms (SNPs) is obtained. For each individual, each SNP is coded as 1 if it belongs to an ROH for that individual and as 0 otherwise, where a marker is defined as belonging to an ROH if it is surrounded by a region with a high frequency of homozygous SNPs.
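The 0/1 coding described above can be illustrated with a minimal sketch. It is not the authors' procedure: the genotype coding (0, 1 or 2 copies of the alternate allele, with 1 meaning heterozygous), the function names, the window half-width, and the homozygosity threshold are all assumptions chosen for illustration, and practical ROH callers apply more elaborate criteria.

```python
def roh_indicator(genotypes, half_window=2, min_hom_fraction=0.9):
    """Code each SNP as 1 if its surrounding window is mostly homozygous, else 0.

    genotypes: alternate-allele counts per SNP (0, 1 or 2); 1 is heterozygous.
    half_window and min_hom_fraction are hypothetical, illustrative settings.
    """
    is_hom = [g != 1 for g in genotypes]
    indicator = []
    for i in range(len(is_hom)):
        # Window centred on SNP i, truncated at the ends of the chromosome.
        window = is_hom[max(0, i - half_window): i + half_window + 1]
        hom_fraction = sum(window) / len(window)
        indicator.append(1 if hom_fraction >= min_hom_fraction else 0)
    return indicator


def roh_frequency(population):
    """Per-SNP fraction of individuals whose marker is coded as inside an ROH."""
    rows = [roh_indicator(g) for g in population]
    return [sum(column) / len(rows) for column in zip(*rows)]


if __name__ == "__main__":
    individual = [0, 2, 0, 2, 2, 1, 0, 1, 2, 0]
    print(roh_indicator(individual))
    print(roh_frequency([[0, 2, 0, 2, 2], [0, 1, 2, 2, 2]]))
```

Averaging the individual 0/1 vectors across a population gives, for each SNP, the proportion of individuals in which it lies inside an ROH. Under these assumptions, markedly high and low values of that profile would correspond to candidate ROH islands and coldspots.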
Nov-19-2021