As we said in the previous section, errors may occur when a gene is copied during DNA replication. The location of a gene along the chromosome is called its locus. Because of mutations, different variants of a gene, called alleles, can occur at a given locus. For simplicity, we will assume from now on that there are only two alleles, denoted by A and a. Suppose that we are dealing with organisms with diploid cells (chromosomes come in pairs), so that at one locus of the chromosome pair an organism can be of type AA, aa or Aa. These variants of an organism of the same species are called genotypes. Genotypes AA and aa are called homozygotes (organisms derived from homogeneous zygotes), while genotypes Aa are called heterozygotes (organisms derived from heterogeneous zygotes).

The expression of the genotype in the visible characters of an organism is called the phenotype. Genotype and phenotype are not always in one-to-one correspondence, because different genotypes may correspond to the same phenotype. For example, in humans the “eye color” phenotype can be very roughly classified into two categories (brown eyes and blue-green eyes), which are controlled by two alleles (allele A for brown and allele a for blue-green). AA homozygotes and Aa heterozygotes both have brown eyes, while only aa homozygotes have blue-green eyes. In this case we say that the A allele is dominant and the a allele is recessive. Usually, the dominant allele is indicated with a capital letter and the recessive allele with a small letter.

Dominance is not always complete: often the phenotype of the heterozygote is intermediate between the phenotypes of the two homozygotes. For example, the coat of a heterozygous mammal with one black-coated parent and one white-coated parent could be grey.
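In a population of total size $N$, denote the numbers of individuals of genotypes AA, Aa and aa by $N_{AA}$, $N_{Aa}$ and $N_{aa}$, respectively. Then we define the genotype frequencies of homozygous dominants, heterozygotes and homozygous recessives as

$$D = \frac{N_{AA}}{N}, \qquad H = \frac{N_{Aa}}{N}, \qquad R = \frac{N_{aa}}{N}.$$

Therefore the allele frequencies are solely determined by the genotype frequencies, because the frequencies $p$ of allele A and $q$ of allele a are

$$p = \frac{2N_{AA} + N_{Aa}}{2N} = D + \frac{H}{2}, \qquad q = \frac{2N_{aa} + N_{Aa}}{2N} = R + \frac{H}{2}.$$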
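The relation between genotype counts and allele frequencies can be checked with a few lines of Python. This is only a minimal sketch: the function names and the example counts are made up for illustration and are not part of the text. It relies on the fact that each diploid individual carries two gene copies, so a homozygote contributes two copies of its allele and a heterozygote one copy of each.

```python
from fractions import Fraction


def genotype_frequencies(n_AA, n_Aa, n_aa):
    """Return the frequencies of genotypes AA, Aa and aa as exact fractions."""
    total = n_AA + n_Aa + n_aa
    if total <= 0:
        raise ValueError("the population must contain at least one individual")
    return Fraction(n_AA, total), Fraction(n_Aa, total), Fraction(n_aa, total)


def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return the frequencies of alleles A and a computed from genotype counts."""
    freq_AA, freq_Aa, freq_aa = genotype_frequencies(n_AA, n_Aa, n_aa)
    # Heterozygotes carry one copy of each allele, hence the factor 1/2.
    freq_A = freq_AA + freq_Aa / 2
    freq_a = freq_aa + freq_Aa / 2
    return freq_A, freq_a


# Illustrative counts (hypothetical): 50 AA, 30 Aa and 20 aa individuals.
freq_A, freq_a = allele_frequencies(50, 30, 20)
print(freq_A, freq_a)  # 13/20 7/20
```

Exact fractions are used here so that the two allele frequencies sum to exactly one, without floating-point rounding.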
Fig. 16 reports a schematic diagram of the life cycle of a semelparous population with random union of gametes, which we will use to derive the Hardy-Weinberg law. Let $N_t$ be the number of adults of the $t$-th generation and let $D_t$, $H_t$ and $R_t$ be the frequencies of genotypes AA, Aa and aa, respectively, in adults of the $t$-th generation. Then the frequencies of alleles A and a in adults are

$$p_t = D_t + \frac{H_t}{2}, \qquad q_t = R_t + \frac{H_t}{2}.$$
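We can therefore state the Hardy-Weinberg law by saying that, under the hypotheses listed above, the frequencies of genotypes AA, Aa and aa converge, in a single generation, to the equilibrium values

$$D = p^2, \qquad H = 2pq, \qquad R = q^2,$$

where $p$ and $q = 1 - p$ are the frequencies of alleles A and a, which do not change from one generation to the next.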
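The one-generation convergence can be illustrated numerically. The sketch below assumes an effectively infinite population with random union of gametes, so that a zygote receives allele A from each gamete with probability equal to the frequency of A among adults. The function name and the starting frequencies are hypothetical choices made for this example.

```python
def next_generation(freq_AA, freq_Aa, freq_aa):
    """Genotype frequencies after one round of random union of gametes.

    Assumes an effectively infinite population, so sampling noise is ignored.
    """
    # Allele frequencies among the gametes produced by the adults.
    freq_A = freq_AA + freq_Aa / 2
    freq_a = freq_aa + freq_Aa / 2
    # Two gametes unite at random: AA, Aa (in either order) and aa.
    return freq_A * freq_A, 2 * freq_A * freq_a, freq_a * freq_a


# Hypothetical starting point far from equilibrium: no heterozygotes at all.
start = (0.5, 0.0, 0.5)
first = next_generation(*start)
second = next_generation(*first)
print(first)   # (0.25, 0.5, 0.25)
print(second)  # (0.25, 0.5, 0.25)
```

After the first step the genotype frequencies no longer change, and the frequency of allele A (0.5 in this example) is the same before and after each step.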
The Hardy-Weinberg law is fairly robust: it holds under less restrictive assumptions than those listed above, although its derivation becomes harder. In particular, it is not necessary to assume discrete generations or equal genotype frequencies in males and females; moreover, instead of random union of gametes released in the environment, one can assume random mating of males and females with internal fertilization. Under these less restrictive assumptions there is still convergence towards the Hardy-Weinberg frequencies, although convergence can take longer than one generation.

It is important to remark that, if an allele is present at very low frequency in the initial population, the Hardy-Weinberg law implies that this allele will remain in the population indefinitely. Heterozygous individuals will therefore also be present indefinitely, transmitting both alleles to the next generation. However, this conclusion depends crucially on the assumption that the population is very large, which certainly does not hold for populations that are threatened by extinction. To analyse the problem of extinction risk, one must then go beyond the findings summarized in the Hardy-Weinberg law.
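To see why the large-population assumption matters, one can simulate random union of gametes in a small population. The sketch below is an illustration under stated assumptions and not the method of the text: it uses a simple binomial sampling scheme (often called a Wright-Fisher-type model) with constant population size, in which each of the gene copies of the next generation is drawn independently. The function name, parameters and example values are hypothetical.

```python
import random


def simulate_allele_frequency(n_adults, initial_freq_A, generations, seed=0):
    """Track the frequency of allele A in a finite diploid population.

    Assumption: constant population size, and each of the 2 * n_adults gene
    copies of the next generation is allele A with probability equal to the
    current frequency of A (random union of gametes).
    """
    rng = random.Random(seed)
    n_copies = 2 * n_adults
    freq_A = initial_freq_A
    trajectory = [freq_A]
    for _generation in range(generations):
        # Count how many of the sampled gene copies are allele A.
        copies_of_A = sum(1 for _copy in range(n_copies) if rng.random() < freq_A)
        freq_A = copies_of_A / n_copies
        trajectory.append(freq_A)
    return trajectory


# Hypothetical example: 10 adults, allele A initially at frequency 0.05.
trajectory = simulate_allele_frequency(n_adults=10, initial_freq_A=0.05,
                                       generations=200, seed=1)
print(trajectory[-1])
```

In this scheme a frequency of 0 or 1 can never change again, so once an allele is lost by chance it stays lost. With a small number of adults, runs of this kind typically end with one of the two alleles missing, in contrast with the indefinite persistence predicted for very large populations.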