The Hardy-Weinberg law

To study genetic deterioration and its consequences for the process of extinction, we must understand how the genetic structure of a population can change over time. The simplest model of these dynamics is the so-called Hardy-Weinberg law. To illustrate and derive it, we first need to introduce some basic concepts of population genetics.

As we said in the previous section, errors may occur when a gene is copied during DNA replication. The location of a gene along the chromosome is called its locus. Because of mutations, different variants of a gene, called alleles, can occur at a given locus. For simplicity, from now on we assume that there are only two alleles, denoted by A and a. Suppose we are dealing with organisms with diploid cells, that is, cells with pairs of chromosomes, so that at one locus of the chromosome pair an organism can be of type AA, aa or Aa. These variants of an organism of the same species are called genotypes. Genotypes AA and aa are called homozygotes (organisms derived from homogeneous zygotes), while genotype Aa is called a heterozygote (an organism derived from a heterogeneous zygote).

The expression of the genotype in the visible characters of an organism is called the phenotype. Genotype and phenotype are not always in one-to-one correspondence, because different genotypes may correspond to the same phenotype. For example, in humans the “eye color” phenotype can be very roughly classified into two categories (brown eyes and blue-green eyes), which are controlled by two alleles (allele A for brown and allele a for blue-green). AA homozygotes and Aa heterozygotes both have brown eyes, while only aa homozygotes have blue-green eyes. In this case we say that the A allele is dominant and the a allele is recessive. Usually, the dominant allele is indicated with a capital letter and the recessive one with a small letter. Dominance is not always complete: often the phenotype of the heterozygote is intermediate between the phenotypes of the two homozygotes. For example, the coat of a heterozygous mammal with one black-coated parent and one white-coated parent could be grey.

In a population of total size $N$, denote the numbers of individuals of genotype AA, Aa and aa by $N_{AA}$, $N_{Aa}$ and $N_{aa}$, respectively. We then define the genotype frequencies of homozygous dominants, heterozygotes and homozygous recessives as

$\displaystyle D = \dfrac{N_{AA}}{N}, \qquad
H = \dfrac{N_{Aa}}{N}, \qquad
R = \dfrac{N_{aa}}{N}.
$

Their sum is obviously equal to one because

$\displaystyle N = N_{AA} + N_{Aa} + N_{aa}.
$

Since the cells are diploid, a population of size $N$ carries $2N$ copies of the gene, each of type A or a. The proportions of each allele in the population are called gene frequencies or allele frequencies. Counting two copies for each homozygote and one for each heterozygote, the frequency $p$ of allele A and the frequency $q$ of allele a are given by

$\displaystyle p = \dfrac{{2{N_{AA}} + {N_{Aa}}}}{{2N}}, \qquad
q = \dfrac{{2{N_{aa}} + {N_{Aa}}}}{{2N}}
$

and of course $p + q = 1$.

Therefore the allele frequencies are completely determined by the genotype frequencies, because

$\displaystyle p = D + \frac{H}{2}, \qquad
q = R + \frac{H}{2}.
$
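The definitions above can be checked numerically with a minimal Python sketch. The function names and the genotype counts below are hypothetical, chosen only for illustration; exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def genotype_frequencies(n_AA, n_Aa, n_aa):
    """Return (D, H, R) computed from the genotype counts."""
    n = n_AA + n_Aa + n_aa
    if n <= 0:
        raise ValueError("the population must contain at least one individual")
    return Fraction(n_AA, n), Fraction(n_Aa, n), Fraction(n_aa, n)


def allele_frequencies(D, H, R):
    """Return (p, q) using p = D + H/2 and q = R + H/2."""
    return D + H / 2, R + H / 2


# Two made-up populations of 100 individuals each (illustrative counts only).
pop1 = genotype_frequencies(30, 40, 30)
pop2 = genotype_frequencies(50, 0, 50)

for D, H, R in (pop1, pop2):
    p, q = allele_frequencies(D, H, R)
    print(f"D={D}, H={H}, R={R} -> p={p}, q={q}, p+q={p + q}")
```

In this example the two populations have different genotype frequencies but the same allele frequencies, $p = q = 1/2$.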

The converse is not true: in general, the genotype frequencies cannot be derived unambiguously from the allele frequencies. To do so, we have to introduce explicit assumptions on the demographic and genetic mechanisms that govern the dynamics of the population. The Hardy-Weinberg law is just a relationship between allele frequencies and genotype frequencies which, in its simplest formulation, is based on the following assumptions. First, suppose we consider a locus with two alleles A and a (the extension to the more complex case of three or more alleles is left to the reader as an exercise). Further assume that

Figure 16: Life cycle scheme of a semelparous population with random union of gametes.
\includegraphics[width=\linewidth]{allee-genetica-schema-cicloHW.eps}

Fig. 16 shows a schematic diagram of the life cycle of a semelparous population with random union of gametes, which we will use to derive the Hardy-Weinberg law. Let $N_t$ be the number of adults of the $t$-th generation and let $D_t$, $H_t$ and $R_t$ be the genotype frequencies of adults in the $t$-th generation. Then the allele frequencies in adults are

$\displaystyle p_t = D_t + \frac{H_t}{2}, \qquad
q_t = R_t + \frac{H_t}{2} .
$

Since there is no selection, all genotypes are equally fertile. The number of gametes (haploid cells) carrying allele A is therefore $f(D_t + H_t/2) N_t$ and the number of gametes carrying allele a is $f(R_t + H_t/2) N_t$, where $f$ indicates fertility. Dividing by the total number of gametes, $f N_t$, shows that the allele frequencies in the gametes are $p_t$ and $q_t$, the same as in the adults. As the population is very large, we can use the law of large numbers to identify frequencies with probabilities and state that, because of the random union of gametes,
$\diamond$ the probability that a zygote AA is formed is $p_t^2$;
$\diamond$ the probability that a zygote Aa (or equivalently aA) is formed is $2p_tq_t$;
$\diamond$ the probability that a zygote aa is formed is $q_t^2$.
Because the number of zygotes is very large (this hypothesis is essential), we can identify these probabilities with the genotype frequencies in the zygotes. Finally, since there is no selection, we can state that the genotype frequencies in adults of generation $t + 1$ are equal to the frequencies in the zygotes. Therefore we conclude that
$\displaystyle D_{t + 1} = p_t^2, \qquad
H_{t + 1} = 2 p_t q_t = 2 p_t \left( 1 - p_t \right), \qquad
R_{t + 1} = q_t^2 = \left( 1 - p_t \right)^2 .
$
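The factor 2 in the heterozygote frequency comes from the two ordered pairings, A with a and a with A. As a minimal sketch, the zygote probabilities can be recovered by enumerating all ordered pairs of gametes drawn independently; the function name and the value $p_t = 3/10$ are assumptions made only for this illustration.

```python
from fractions import Fraction
from itertools import product


def zygote_probabilities(p):
    """Sum the probabilities of all ordered gamete pairs, grouped by genotype."""
    gamete_prob = {"A": p, "a": 1 - p}
    probs = {"AA": 0, "Aa": 0, "aa": 0}
    for g1, g2 in product("Aa", repeat=2):
        # Sorting maps both ordered pairs (A, a) and (a, A) to the genotype "Aa".
        genotype = "".join(sorted((g1, g2)))
        # Independent draws: the pair probability is the product of the two.
        probs[genotype] += gamete_prob[g1] * gamete_prob[g2]
    return probs


p = Fraction(3, 10)  # illustrative allele frequency, not taken from the text
zygotes = zygote_probabilities(p)
print(zygotes)
```

With this value of $p_t$ the enumeration gives $9/100$, $42/100$ and $49/100$ for AA, Aa and aa, which agree with $p_t^2$, $2p_tq_t$ and $q_t^2$.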

The frequencies $p_t^2$, $2p_tq_t$ and $q_t^2$ are called Hardy-Weinberg frequencies. Note that, in general, they differ from the genotype frequencies at generation $t$. The allele frequencies, instead, do not change between generations $t$ and $t + 1$, because
$\displaystyle p_{t + 1} = D_{t + 1} + \frac{H_{t + 1}}{2} = p_t^2 + p_t \left( 1 - p_t \right) = p_t
$

$\displaystyle q_{t + 1} = R_{t + 1} + \frac{H_{t + 1}}{2} = \left( 1 - p_t \right)^2 + p_t \left( 1 - p_t \right) = 1 - p_t = q_t .
$

Owing to this second result, the genotype frequencies at generation $t + 2$ can be deduced with exactly the same reasoning as above. Given the time invariance of the allele frequencies, they will again be equal to the Hardy-Weinberg frequencies, and the same holds for all subsequent generations.
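The recursion can be iterated numerically to see both results at once. The following is a minimal sketch: the function name and the initial frequencies (a population with no heterozygotes) are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def hardy_weinberg_step(D, H, R):
    """Map the genotype frequencies of generation t to those of generation t + 1."""
    p = D + H / 2  # allele frequency of A in adults (and in gametes)
    q = R + H / 2  # allele frequency of a
    return p * p, 2 * p * q, q * q


# Made-up initial condition: only homozygotes, in equal proportions.
freqs = (Fraction(1, 2), Fraction(0), Fraction(1, 2))
history = [freqs]
for _ in range(3):
    freqs = hardy_weinberg_step(*freqs)
    history.append(freqs)

for t, (D, H, R) in enumerate(history):
    print(f"t={t}: D={D}, H={H}, R={R}, p={D + H / 2}")
```

Under these assumptions the genotype frequencies change between $t = 0$ and $t = 1$ and then stay fixed, while $p$ is the same in every generation.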

We can therefore state the Hardy-Weinberg law as follows: under the hypotheses listed above, the genotype frequencies converge, in a single generation, to the equilibrium values

$\displaystyle D = p_0^2, \qquad H = 2{p_0}{q_0}, \qquad R = q_0^2
$

where $p_0$ and $q_0$ indicate the initial allele frequencies.

The Hardy-Weinberg law is fairly robust: it holds under less restrictive assumptions than those listed above, although its derivation becomes harder. In particular, it is not necessary to assume discrete generations or equal genotype frequencies in males and females; moreover, instead of random union of gametes released in the environment, one can assume random mating of males and females and internal fertilization. Under these less restrictive assumptions there is still convergence towards the Hardy-Weinberg frequencies, although convergence can take longer than one generation. It is important to remark that, if an allele is present at very low frequency in the initial population, the Hardy-Weinberg law implies that this allele will remain in the population indefinitely. Heterozygous individuals will therefore always be present, transmitting both alleles to the next generation. However, this conclusion rests crucially on the assumption that the population is very large, which does not hold at all for populations threatened by extinction. To analyse the problem of extinction risk, one must therefore go beyond the findings summarized in the Hardy-Weinberg law.
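To see why the large-population assumption matters, one can replace the deterministic step with random sampling of a finite number of gametes. The sketch below is not part of the derivation above: it assumes, as an illustration only, that each generation of $N$ adults is formed by drawing $2N$ gametes independently with probability $p_t$ of carrying A (a Wright-Fisher-style sampling scheme). The function names, population sizes and initial frequency are hypothetical.

```python
import random


def sample_next_allele_frequency(p, n_individuals, rng):
    """Draw 2N gametes at random and return the frequency of A among them."""
    n_genes = 2 * n_individuals
    n_A = sum(1 for _ in range(n_genes) if rng.random() < p)
    return n_A / n_genes


def simulate(p0, n_individuals, generations, seed=0):
    """Return the allele-frequency trajectory [p_0, p_1, ..., p_generations]."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    trajectory = [p0]
    for _ in range(generations):
        p_next = sample_next_allele_frequency(trajectory[-1], n_individuals, rng)
        trajectory.append(p_next)
    return trajectory


# Illustrative comparison of a small and a larger population (made-up sizes).
for n in (10, 1000):
    final_p = simulate(0.05, n, generations=50)[-1]
    print(f"N={n}: frequency of A after 50 generations = {final_p}")
```

In this sketch the allele frequency is no longer constant from one generation to the next. Once it reaches 0 or 1 it stays there, so in a finite population a rare allele can be lost by chance, which the Hardy-Weinberg law does not allow.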