Inherited common variation in DNA
Authors:
Cohn, R. D., Scherer, S. W., & Hamosh, A.
Source:
Thompson & Thompson Genetics and Genomics in Medicine
Volume and pages:
9th ed., pp. 47-50
2025-11-26
The original Human Genome Project and the subsequent study of many millions of individuals worldwide have provided vast DNA sequence information. With this information in hand, one can begin to characterize the types and frequencies of common variation found in the human genome and to generate catalogues of the world’s human DNA sequence diversity. Such variants can be classified according to how the DNA sequence differs among the different alleles (Table 1 and Figs. 1 and 2).

Table 1. Common Variation in the Human Genome

Fig. 1. Three polymorphisms in genomic DNA from the segment of the human genome reference assembly shown at the top. The single nucleotide variation (SNV) at position 8 has two alleles, one with a T (corresponding to the reference sequence) and one with a C. There are two indels in this region. At indel A, allele 2 has an insertion of a G between positions 11 and 12 in the reference sequence (allele 1). At indel B, allele 2 has a 2 bp deletion of positions 5 and 6 in the reference sequence.

Fig. 2. Examples of variation in the human genome larger than single nucleotide variants. Clockwise from upper right: The microsatellite locus has three alleles, with four, five, or six copies of a CAA trinucleotide repeat. The inversion variant has two alleles corresponding to the two orientations (indicated by the arrows) of the genomic segment shown in green; such inversions can involve regions up to many megabases of DNA. Copy number variants involve deletion or duplication of hundreds of kilobase pairs to over a megabase of genomic DNA. In the example shown, allele 1 contains a single copy, whereas allele 2 contains three copies of the chromosomal segment containing the F and G genes; other possible alleles with zero, two, four, or more copies of F and G are not shown. The mobile element insertion variant has two alleles, one with and one without insertion of a ~6-kb LINE repeated retroelement; the insertion of the mobile element changes the spacing between the two genes and may alter gene expression in the region.
Single Nucleotide Variants
The simplest and most common of all variants are SNVs. Those that occur at a high population frequency (typically defined as >1% or >5%) have traditionally been called single nucleotide polymorphisms (SNPs) and, more recently, common SNVs. A polymorphic locus characterized by a common SNV usually has only two alleles, corresponding to two different bases at that particular location (see Fig. 1). Common SNVs are observed, on average, once every 1000 bp. However, their distribution is uneven around the genome; many more are found in noncoding parts of the genome, in introns and in sequences that are some distance from protein-coding genes. Nonetheless, a significant number of SNVs, both common and rare, occur in genes and other known functional elements in the genome. Approximately half of these do not alter the predicted amino acid sequence of the encoded protein and thus are termed synonymous, whereas those that do alter the amino acid sequence are called nonsynonymous. Other SNVs are candidates to have significant functional consequences, as they introduce or change a stop codon, or alter a known splice site.
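The distinction between synonymous and nonsynonymous SNVs can be illustrated with a short sketch. It assumes the standard genetic code, a change confined to a single codon, and DNA written on the coding strand; the function name and the "stop gained" and "stop lost" labels are illustrative choices, not terms taken from the text.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snv(ref_codon, position, alt_base):
    """Classify a single-base change within one codon.

    position is 0, 1, or 2 (index of the changed base in the codon).
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:position] + alt_base.upper() + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop gained"   # introduces a stop codon
    if ref_aa == "*":
        return "stop lost"     # changes an existing stop codon
    return "nonsynonymous"


print(classify_coding_snv("CTT", 2, "C"))  # CTT and CTC both encode leucine
print(classify_coding_snv("GAG", 1, "T"))  # glutamate codon becomes a valine codon
print(classify_coding_snv("TGG", 2, "A"))  # tryptophan codon becomes TGA (stop)
```

Splice-site variants are not covered here, because recognizing them requires the exon and intron coordinates of the gene rather than the codon alone.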
The significance for health of the vast majority of common SNVs is unknown and is the subject of ongoing research. The fact that these variants are common does not mean that they are without detrimental or protective effect on health or longevity. What it does mean is that any effect of common SNVs is likely to involve a relatively subtle altering of disease susceptibility rather than be a direct cause of serious illness.
Insertion-Deletion Variants
A second class of variants results from the insertion or deletion of segments (indels) that range from a single base pair up to ~1 kb. Over a million indels have been described among human genomes, numbering in the hundreds of thousands for any one individual. Approximately half of all indels are referred to as simple because they have only two alleles: the presence or absence of the inserted or deleted segment (see Fig. 1).
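As a rough illustration of the three variants described in Fig. 1, the sketch below applies an SNV, an insertion, and a deletion to a short reference string and then labels each resulting allele. The 14-base reference is hypothetical (the figure's actual sequence is not reproduced here); only the positions and the T-to-C and G changes follow the caption, and all function names are illustrative.

```python
# Hypothetical reference; position 8 (1-based) is a T, as in Fig. 1.
REFERENCE = "ACGTTAGTCCAGCT"


def substitute(seq, pos, base):
    """Replace the base at 1-based position pos (an SNV)."""
    return seq[:pos - 1] + base + seq[pos:]


def insert_after(seq, pos, inserted):
    """Insert bases immediately after 1-based position pos."""
    return seq[:pos] + inserted + seq[pos:]


def delete(seq, first, last):
    """Delete 1-based positions first through last, inclusive."""
    return seq[:first - 1] + seq[last:]


def describe(ref, allele):
    """Label an allele relative to the reference by length and mismatches."""
    diff = len(allele) - len(ref)
    if diff > 0:
        return f"insertion of {diff} bp"
    if diff < 0:
        return f"deletion of {-diff} bp"
    mismatches = sum(a != b for a, b in zip(ref, allele))
    if mismatches == 0:
        return "reference"
    if mismatches == 1:
        return "SNV"
    return "multiple substitutions"


snv_allele = substitute(REFERENCE, 8, "C")        # T > C at position 8
indel_a = insert_after(REFERENCE, 11, "G")        # G between positions 11 and 12
indel_b = delete(REFERENCE, 5, 6)                 # 2 bp deletion of positions 5 and 6

for allele in (snv_allele, indel_a, indel_b):
    print(allele, describe(REFERENCE, allele))
```

Each of these loci is "simple" in the sense used above: there are exactly two alleles, the reference and the altered sequence.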
Microsatellite Variants
Other indels, however, are multiallelic because a segment of DNA is present in a variable number of tandem copies at a particular location. The term satellite comes from the early observation that this fraction of DNA has a different density, causing it to separate as a distinct fraction during centrifugation. Sometimes called variable number of tandem repeats, these microsatellites are highly vulnerable to mutation. They consist of DNA cassettes composed of units of a few nucleotides, such as TG, CAA, or AAAT, repeated in tandem between one and a few dozen times (see Fig. 2). Microsatellites are also referred to as short tandem repeats (STRs), and the number of repeated units defines the different alleles. A microsatellite locus often has many alleles (repeat lengths) that can be rapidly evaluated by standard laboratory procedures to distinguish different individuals and to infer familial relationships (Fig. 3). Many tens of thousands of microsatellite loci are known throughout the human genome.
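The idea that a microsatellite allele is defined by its repeat count can be sketched as follows. The flanking sequences are hypothetical, and the function simply reports the longest uninterrupted run of a given motif; it is an illustration, not a description of how laboratory genotyping is performed.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


# Three alleles of a CAA trinucleotide repeat, as in Fig. 2, placed between
# hypothetical flanking sequences.
LEFT_FLANK, RIGHT_FLANK = "GGTT", "TTGG"
alleles = {
    copies: LEFT_FLANK + "CAA" * copies + RIGHT_FLANK
    for copies in (4, 5, 6)
}

for copies, seq in alleles.items():
    print(copies, len(seq), longest_repeat_run(seq, "CAA"))
```

Because each additional CAA unit adds 3 bp, the alleles differ in total length, which is the property that allows them to be separated by size.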

Fig. 3. A schematic of a hypothetical microsatellite marker in human DNA. The different-sized alleles (numbered 1–7) correspond to fragments of genomic DNA containing different numbers of copies of a microsatellite repeat, and their relative lengths are determined by separating them by gel electrophoresis. The shortest allele (allele 1) migrates toward the bottom of the gel, whereas the longest allele (allele 7) remains closest to the top. Left, for this multiallelic microsatellite, each of the six unrelated individuals has two different alleles. Right, within a family, the inheritance of alleles can be followed from each parent to each of the three children.
Microsatellites are particularly useful for genetic mapping. Determining the alleles at multiple microsatellite loci is currently the method of choice for DNA fingerprinting used for identity testing. For example, the US Federal Bureau of Investigation (FBI) currently uses 20 STRs for its DNA fingerprinting panel. Two individuals (other than monozygotic twins) are so unlikely to have exactly the same alleles at all 20 loci that the panel will allow effectively definitive determination of whether samples came from the same individual. The information is stored in the FBI’s Combined DNA Index System (CODIS).
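To see why a match at all 20 loci is so improbable, consider a simplified calculation. The sketch below assumes Hardy-Weinberg genotype proportions at each locus and independence between loci, and it uses made-up allele frequencies; none of these numbers come from the text or from any real panel, and real forensic calculations apply additional corrections.

```python
import math


def genotype_frequency(p, q=None):
    """Expected genotype frequency under Hardy-Weinberg proportions.

    Homozygote (one allele frequency given): p * p.
    Heterozygote (two allele frequencies given): 2 * p * q.
    """
    return p * p if q is None else 2 * p * q


def random_match_probability(genotype_frequencies):
    """Multiply per-locus genotype frequencies, assuming independent loci."""
    return math.prod(genotype_frequencies)


# Hypothetical profile: heterozygous at each of 20 loci, with both alleles
# at a population frequency of 0.1 (illustrative values only).
per_locus = [genotype_frequency(0.1, 0.1) for _ in range(20)]
rmp = random_match_probability(per_locus)
print(f"{rmp:.3e}")
```

Under these assumptions each locus contributes a genotype frequency of 0.02, and the product over 20 loci is on the order of 1 in 10^34, far smaller than the reciprocal of the world population.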
Mobile Element Insertion Variants
Nearly half of the human genome consists of dispersed families of repetitive elements (see Chapter 2). Although most of the copies of these repeats are stationary, some of them are mobile and contribute to human genetic diversity through the process of retrotransposition. As introduced in Chapter 3 in the context of processed pseudogenes, this involves transcription into an RNA, reverse transcription into a DNA sequence, and insertion (i.e., transposition) into another site in the genome. The two most common mobile element families are the Alu and long interspersed nuclear element (LINE) families of repeats, and nearly 10,000 mobile element insertion variants have been described in different populations. Each polymorphic locus consists of two alleles, one with and one without the inserted mobile element (see Fig. 2). Mobile element variants are found on all human chromosomes; although most are found in nongenic regions, a small proportion of them are found within genes. For many of these loci, the insertion allele has a frequency of greater than 10% in various populations.
Copy Number Variants
Another important type of human polymorphism includes CNVs, which are conceptually related to indels and microsatellites but involve larger segments of the genome, operationally defined as from 1000 bp to ~3 million bp (i.e., the span between limits of sequencing detection and cytogenetic analysis, respectively). In the general population, variants larger than 500 kb are found in 5% to 10% of individuals, and those encompassing more than 1 Mb in 1% to 2%. The largest CNVs are sometimes in regions of the genome characterized by repeated blocks of homologous sequences called segmental duplications (or segdups). The importance of these regions in mediating duplication and deletion of the corresponding segments is discussed further in Chapter 6 in the context of various chromosomal syndromes.
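The operational size boundaries quoted above can be written down as a small helper. This is only a sketch for a simple gain or loss of sequence: the cutoffs are the approximate values given in the text (~1 kb and ~3 million bp), the function name and labels are illustrative, and real classification also depends on the detection method used.

```python
INDEL_MAX_BP = 1_000        # indels range up to ~1 kb
CNV_MAX_BP = 3_000_000      # CNVs operationally extend to ~3 million bp


def classify_by_size(length_bp):
    """Label a gained or lost segment using the text's approximate cutoffs."""
    if length_bp < 1:
        raise ValueError("segment length must be at least 1 bp")
    if length_bp < INDEL_MAX_BP:
        return "indel"
    if length_bp <= CNV_MAX_BP:
        return "CNV"
    return "above the operational CNV range"


for size in (2, 999, 1_000, 500_000, 5_000_000):
    print(size, classify_by_size(size))
```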
As with indels, smaller CNVs may have only two alleles (i.e., the presence or absence of a segment). Some large CNVs have multiple alleles due to the presence of different numbers of tandem copies of a DNA segment (see Fig. 2). In terms of genome diversity, the amount of DNA involved in CNVs vastly exceeds the amount that differs because of SNVs. Compared to the reference genome, the content of any given individual’s genome can differ by as much as 30 Mb because of copy number and indel differences.
Notably, since their variable segments can include from one to several dozen genes, CNV loci are frequently implicated in traits that involve altered gene dosage. When a CNV is frequent enough, it represents a background of common variation that must be understood to properly interpret alterations in copy number for medical purposes. As with all DNA variation, the significance of different CNV alleles in health and disease susceptibility is the subject of intensive investigation.
Inversions
A final group of structural variants is inversions. These regions of the genome, from a few base pairs up to several Mb, are found in either of two orientations (see Fig. 2). Most inversions are characterized by regions of sequence homology at the edges of the inverted segment, implicating a process of homologous recombination in their origin. Regardless of orientation, an inversion that does not involve a gain or loss of DNA is balanced. Some can achieve substantial frequencies in the general population. However, anomalous recombination can result in the duplication or deletion of DNA located between the regions of homology, a process associated with clinical disorders that we will explore further in Chapters 5 and 6.
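At the sequence level, the two orientations of an inverted segment can be sketched as follows. When a segment of double-stranded DNA is flipped, the sequence read along the reference strand becomes the reverse complement of the original segment. The example sequence and function name are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def invert_segment(sequence, start, end):
    """Invert sequence[start:end] (0-based, end-exclusive).

    The inverted orientation reads as the reverse complement of the segment.
    """
    segment = sequence[start:end]
    inverted = segment.translate(COMPLEMENT)[::-1]
    return sequence[:start] + inverted + sequence[end:]


allele_1 = "TTTACGGCCC"                      # hypothetical sequence
allele_2 = invert_segment(allele_1, 3, 7)    # flips the segment ACGG

print(allele_1, allele_2)
print(len(allele_1) == len(allele_2))        # balanced: no gain or loss of DNA
```

Inverting the same segment a second time restores the original allele, which reflects the fact that the locus has just two orientations.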