The nature of genetic variation
Author:
Cohn, R. D., Scherer, S. W., & Hamosh, A.
Source:
Thompson & Thompson Genetics and Genomics in Medicine
Volume and page:
9th ed., pp. 45-47
2025-11-26
As described in Chapter 2, a segment of DNA occupying a particular position or location on a chromosome is a locus (plural loci). A locus may be large, perhaps containing many genes, such as the major histocompatibility complex locus involved in the response of the immune system to foreign substances; it may be a single gene, such as the β-globin locus we introduced in Chapter 3; or it may be just a single base in the genome, as in the case of a single nucleotide variant (SNV). Alternative versions of the DNA sequence at a locus are called alleles. For many genes, there is a single prevailing allele, usually present in more than half of the individuals in a population, that geneticists call the wild-type or common allele. (In lay parlance, this is sometimes referred to as the normal allele; however, because genetic variation is itself very much normal, the existence of different alleles in normal individuals is commonplace. Thus one should avoid using normal to designate the most common or major allele.) The other versions of the gene are variant alleles that differ from the wild-type allele because a mutation has changed the nucleotide sequence or arrangement of DNA. Note that the terms mutation and mutant apply to DNA, but not to individuals. They denote a change in sequence without any connotation with respect to the function or fitness of that change.
The frequency of different variants can vary widely in different populations, as we will explore in Chapter 10. If a locus in a population has two or more relatively common alleles (typically defined by convention as having an allele frequency >1%), the locus is said to exhibit polymorphism (literally “many forms”) in that population; thus such a locus is polymorphic. Most variant alleles, however, are rare; some are so rare as to be found in only a single family and are known as private alleles. Common jargon in genetics came to use polymorphism in reference to a variant rather than a locus, but following expert guidance, for clarity, we suggest use of common variant (rather than polymorphism) or rare variant (rather than mutation). An exception is for the use of single nucleotide polymorphism (SNP) in the context of microarrays, where it is strongly entrenched in the lexicon.
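The frequency-based definition of polymorphism can be illustrated with a minimal sketch. It assumes that alleles have been observed on a sample of chromosomes at one locus and applies the conventional >1% cutoff mentioned above; the function names and the sample counts are hypothetical and serve only as illustration.

```python
from collections import Counter

COMMON_THRESHOLD = 0.01  # conventional >1% allele-frequency cutoff from the text


def allele_frequencies(observed_alleles):
    """Return {allele: frequency} for alleles sampled at a single locus."""
    counts = Counter(observed_alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def is_polymorphic(observed_alleles, threshold=COMMON_THRESHOLD):
    """A locus is polymorphic if two or more alleles exceed the threshold."""
    freqs = allele_frequencies(observed_alleles)
    common = [allele for allele, f in freqs.items() if f > threshold]
    return len(common) >= 2


# Hypothetical sample of 1,000 chromosomes at one single-base locus.
sample = ["A"] * 950 + ["G"] * 45 + ["T"] * 5
freqs = allele_frequencies(sample)
polymorphic = is_polymorphic(sample)
```

In this made-up sample, A and G both exceed 1%, so the locus counts as polymorphic, while T (0.5%) would be a rare variant. The same locus could fall below the threshold in a different population, which is why the definition is always population-specific.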
The Concept of Variation
In this chapter we begin by exploring the nature of genomic variation, ranging from the change of a single nucleotide to alterations of an entire chromosome. To recognize a change means that there has to be a gold standard, compared to which the variant shows a difference. As we saw in Chapter 2, there is no single individual whose genome sequence could serve as such a definitive standard for the human species, and thus one arbitrarily designates the most common sequence or arrangement in a population at any one position in the genome as the so-called reference sequence. As more and more genomes from individuals around the globe are sampled (and thus as more and more variation is detected among the currently 7.9 billion genomes that make up our species), this reference genome is subject to constant evaluation and change. Indeed, a number of international collaborations share and update data on the nature and frequency of DNA variation in different populations in the context of the reference human genome sequence and make the data available through publicly accessible databases that serve as essential resources for scientists, physicians, and other health care professionals (Table 1). As we learn more about variation and, in particular, as long-read sequencing allows us to fill holes in the reference genome, updated genome builds are released by the Genome Reference Consortium; the current reference is GRCh38. Because errors are corrected and new sequences are added with each build, the coordinates of a variant can differ between builds, so it is very important to always specify the build used to annotate a genomic variant.
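One way to honor the advice about always specifying the build is to make the build part of the variant record itself. The sketch below is only an illustration: the class name, the field names, the label format, and the coordinates are all hypothetical and do not follow any official nomenclature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomicVariant:
    build: str   # reference build the coordinates refer to, e.g. "GRCh38"
    chrom: str   # chromosome name
    pos: int     # 1-based position on that build
    ref: str     # reference allele
    alt: str     # alternate allele

    def label(self) -> str:
        """Ad hoc label that always carries the build (not an official format)."""
        return f"{self.build}:chr{self.chrom}:{self.pos}:{self.ref}>{self.alt}"


def same_variant(a: GenomicVariant, b: GenomicVariant) -> bool:
    """Compare two variants, refusing to compare across different builds."""
    if a.build != b.build:
        raise ValueError(f"cannot compare {a.build} with {b.build} coordinates")
    return (a.chrom, a.pos, a.ref, a.alt) == (b.chrom, b.pos, b.ref, b.alt)


# Hypothetical coordinates, chosen only for illustration.
v38 = GenomicVariant("GRCh38", "7", 1000, "A", "G")
v37 = GenomicVariant("GRCh37", "7", 1000, "A", "G")
```

Here `same_variant(v38, v37)` raises an error instead of silently treating identical numbers on two different builds as the same position.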

Table 1. Useful Databases of Information on Human Genetic Diversity
Variants are sometimes classified by the size of the altered DNA sequence and, at other times, by the functional effect of the change on gene expression. Although classification by size is somewhat arbitrary, it can be helpful conceptually to recognize the spectrum of changes at three different levels:
• Variation in chromosome number that leaves chromosomes intact but changes the number of chromosomes in a cell (aneuploidy)
• Alterations that change only a portion of a chromosome and might involve an unbalanced change of a subchromosomal segment or a structural rearrangement involving parts of one or more chromosomes (regional variation or copy number variation [CNV])
• Alterations of the sequence of DNA, involving the substitution, deletion, or insertion of DNA, range from an SNV through small repetitive units (such as trinucleotide repeats) and insertion-deletion variants (indels) up to an arbitrarily set (and evolving) limit of approximately 1 kb where such a change becomes a CNV. The basis for and consequences of this third type of variation are the principal focus of this chapter, whereas both chromosome and regional variation will be presented at length in Chapters 5 and 6.
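The size-based end of this spectrum can be sketched in a few lines. The example assumes each variant is described by a reference and an alternate allele string, and it treats the approximate 1 kb boundary as an adjustable parameter because the text describes it as arbitrary and evolving. The function name and category labels are hypothetical, and real classification also considers rearrangements that change no length at all.

```python
CNV_CUTOFF_BP = 1000  # approximate, arbitrary 1 kb boundary from the text


def classify_by_size(ref, alt, cnv_cutoff_bp=CNV_CUTOFF_BP):
    """Rough size-based label for a sequence change (simplified sketch)."""
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to classify")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    size_change = abs(len(alt) - len(ref))
    if size_change >= cnv_cutoff_bp:
        return "CNV-scale"
    if size_change > 0:
        return "indel"
    # Same length, more than one base replaced.
    return "multi-nucleotide substitution"


snv = classify_by_size("A", "G")
small_deletion = classify_by_size("ACAG", "A")
# 400 extra copies of a trinucleotide: 1,200 inserted bases.
large_expansion = classify_by_size("A", "A" + "CAG" * 400)
```

Changing `cnv_cutoff_bp` moves the boundary between indel and CNV, which mirrors the point that the limit is a convention and not a biological threshold.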
The functional consequences of DNA mutations, even those that change a single base pair, run the gamut from being completely innocuous to causing serious illness, all depending on the location, nature, and size of the resulting variant. For example, even a change within a coding exon of a gene may have no effect on how a gene is expressed if the change does not alter the primary amino acid sequence of the polypeptide product; even if it does, the resulting change in the encoded amino acid sequence may not alter the functional properties of the protein. Not all variants, therefore, manifest in a clinical phenotype, though they will be reflected as DNA sequence variants.
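The point that a coding change may or may not alter the amino acid sequence can be made concrete with the standard genetic code. The sketch below builds the standard codon table and labels a single-codon change; the function name and category labels are illustrative choices, and the sketch ignores splicing, regulatory effects, and anything beyond the one codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Label the effect of replacing one codon with another."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # DNA differs, encoded amino acid does not
    if alt_aa == "*":
        return "nonsense"     # new stop codon
    if ref_aa == "*":
        return "stop-loss"    # stop codon replaced by an amino acid
    return "missense"         # different amino acid


silent = classify_codon_change("GAG", "GAA")    # Glu -> Glu
missense = classify_codon_change("GAG", "GTG")  # Glu -> Val
nonsense = classify_codon_change("GAG", "TAG")  # Glu -> stop
```

A missense label alone does not say whether the protein's function changes; as the paragraph above notes, an altered amino acid may still leave the protein's functional properties intact.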
The Concept of Common Variants
The DNA sequence of a given region of the genome is remarkably similar among chromosomes carried by many different individuals from around the world. In fact, any randomly chosen segment of human DNA of ~1000 bp in length, on average, will differ by only one base pair between the homologous segments inherited from that individual’s parents (assuming the parents are unrelated). However, across all human populations, hundreds of millions of single nucleotide differences and over a million more complex variants have been identified and catalogued. Because of limited sampling, these figures are likely to underestimate the true extent of genetic diversity in our species. Many populations have yet to be adequately studied. Even in those that have been well studied, the number of individuals examined is too small to reveal most variants with minor allele frequencies below 1% to 2%. Thus, as more people are included in variant discovery projects, additional (and rarer) variants will certainly continue to be uncovered.
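To get a feel for the scale implied by one difference per ~1000 bp, the rate can be applied to a whole haploid genome. The genome length used below (about 3 billion bp) is an assumption that is not stated in this passage, and the result is only an order-of-magnitude estimate of single-base differences between two homologous copies.

```python
BP_PER_DIFFERENCE = 1000           # from the text: ~1 difference per 1000 bp
HAPLOID_GENOME_BP = 3_000_000_000  # assumption: roughly 3 billion bp


def expected_differences(length_bp, bp_per_difference=BP_PER_DIFFERENCE):
    """Average number of single-base differences over a stretch of DNA."""
    return length_bp / bp_per_difference


per_kilobase = expected_differences(1000)
genome_wide = expected_differences(HAPLOID_GENOME_BP)
```

Under these assumptions, two homologous genome copies differ at roughly 3 million positions, which is small relative to genome size yet large in absolute terms.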
Whether a variant is formally considered common or not depends entirely on whether its frequency in a population exceeds a certain threshold, such as 1% of the alleles in that population. It does not depend on what kind of mutation caused it, how large a segment of the genome is involved, or whether it has a demonstrable effect on the individual. Although most common sequence variants are located between genes or within introns and are most often inconsequential to the functioning of any gene, others may be located in the coding sequence of genes themselves and result in different protein variants that may lead in turn to distinctive differences in human populations. Still others are in regulatory regions and may have important effects on transcription or RNA stability.
One might expect that deleterious variants that cause rare monogenic diseases are unlikely to become common variants. Although it is true that the alleles responsible for most clearly inherited clinical conditions are rare, some alleles that have a profound effect on health are relatively common. Examples include alleles of genes encoding enzymes that metabolize drugs (e.g., sensitivity to abacavir in some individuals infected with human immunodeficiency virus), the sickle cell allele in African populations and others of African and Mediterranean ancestry, and the p.Phe508del variant in CFTR that causes cystic fibrosis. Nonetheless, these are exceptions. As more and more genetic variation is discovered and catalogued, it is clear that the vast majority of variants in the genome, whether common or rare, reflect differences in DNA sequence that have no overt significance to health.
Common variants are key elements for the study of human and medical genetics. The ability to distinguish different inherited forms of a gene or different segments of the genome provides critical tools for a wide array of applications, both in research and in clinical practice (see Box 1).

Box 1. INHERITED VARIATION IN HUMAN AND MEDICAL GENETICS