BIO323H1F - Population Genetics


Population Genetics

Assume that there is polymorphism in a population, i.e., there are two alleles (alternate forms of a gene controlling a trait) at a single locus, which we will call A and B.

By convention, let p equal the frequency of A, and q the frequency of B. Because there are only two alleles at this locus, p + q = 1 (or 100%). Note, however, p and q can be any frequency between 0 and 1 (or 0% and 100%) [if they are either 0 or 1 then the locus is not polymorphic in that population].

Similarly, the frequency of A in male gametes (sperm) produced in that population will be p, and the frequency of A in female gametes (unfertilized eggs) will also be p. Thus, if gametes combine at random, a sperm carrying A will fertilize an egg carrying A a fraction p x p of the time, and therefore p x p, or p^2, of the offspring produced in the population will be homozygous for the A allele. Likewise, q^2 of the offspring will be homozygous for B, and 2pq will be heterozygous. This is true regardless of what the values of p or q are.

E.g., in humans there are several blood type polymorphisms. One, the M-N polymorphism, has been well studied. In a sample of 1200 Swedes the frequency of M was found to be 0.596 (i.e., 59.6% of the gametes produced were carrying the M allele); the frequency of N was 0.404. Thus, in that population we would expect 0.355 (or 35.5%) of the individuals to be homozygous MM (i.e., p^2), 0.163 (16.3%) to be homozygous NN (i.e., q^2), and 0.482 (48.2%) (i.e., 2pq) to be heterozygous. Note that 0.355 + 0.163 + 0.482 = 1.00, so everyone is accounted for.
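The M-N arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `hardy_weinberg` is an illustrative choice, not something from the course material, and it assumes a single locus with two alleles and random mating.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies (AA, AB, BB) for allele frequency p.

    Assumes one locus, two alleles, and random union of gametes.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


# Swedish M-N sample from the text: frequency of M = 0.596
mm, mn, nn = hardy_weinberg(0.596)
print(f"MM = {mm:.3f}, MN = {mn:.3f}, NN = {nn:.3f}")
```

The three values always sum to 1 because (p + q)^2 = 1.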

Further, those allele frequencies and genotype frequencies would stay the same, from generation to generation, in the absence of processes that would make them change, processes such as:

   1. non-random mating

   2. natural selection

   3. mutation

   4. gene flow or migration (immigration and/or emigration between populations with different allele frequencies)

   5. chance deviations or drift.

This is the Hardy-Weinberg Law or Hardy-Weinberg Principle, and can be stated briefly as follows:

HARDY-WEINBERG LAW: In the absence of the evolutionary processes--mutation, migration, drift, and selection--gene or allele frequencies will remain constant generation to generation, and if mating is random, genotype frequencies also will remain constant, and equilibrium genotype frequencies are attained in a single generation of random mating.

SOME DEFINITIONS:

Population genetics: Examines the statistical consequences of Mendelism.

Locus: A site on a chromosome, or the gene that occupies that site.

Gene: A nucleotide sequence on a DNA molecule that encodes a product that has a distinct function in an organism; the functional unit of heredity.

Allele: One of two or more forms of a gene, presumably differing in DNA nucleotide sequence.

In a cross between two individuals, there are a limited number of possible expected ratios of genotypes or of allele frequencies (e.g., p = 1.0, 0.0, 0.5, 0.25, 0.75) in offspring. However, in a population p can be any value.

                    Number having blood group
Population          M      MN     N      Total
Eskimo              475    89     5      569
Pueblo Indian       83     46     11     140
Russians            195    215    79     489
English             121    200    101    422
Papuans             14     48     138    200

These data can be used to calculate allele frequencies as well as H-W expected genotype frequencies. E.g., with p the frequency of M, p for Eskimo = [475 + (0.5)(89)]/569 = 0.91 and q = 0.09

and for Papuans p = [14 + (0.5)(48)]/200 = 0.19 and q = 0.81

Expected number of MN in Eskimo, then, would be 2pq (569), or (2)(0.91)(0.09)(569) = 93

and for Papuans, (2)(0.19)(0.81)(200) = 62.
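The same calculation can be sketched in Python from the blood group counts. The function names are illustrative assumptions. Note that the code carries p and q at full precision, so for the Eskimo sample it gives roughly 90 expected heterozygotes rather than the 93 obtained above from p and q rounded to two decimals.

```python
def allele_freqs(n_m, n_mn, n_n):
    """Return (p, q): frequencies of the M and N alleles from phenotype counts."""
    total = n_m + n_mn + n_n
    # Each M individual carries two M alleles; each MN individual carries one.
    p = (n_m + 0.5 * n_mn) / total
    return p, 1.0 - p


def expected_heterozygotes(n_m, n_mn, n_n):
    """H-W expected number of MN individuals, 2pq times the sample size."""
    p, q = allele_freqs(n_m, n_mn, n_n)
    return 2.0 * p * q * (n_m + n_mn + n_n)


p_eskimo, q_eskimo = allele_freqs(475, 89, 5)
p_papuan, q_papuan = allele_freqs(14, 48, 138)
exp_eskimo = expected_heterozygotes(475, 89, 5)
exp_papuan = expected_heterozygotes(14, 48, 138)

print(f"Eskimo:  p = {p_eskimo:.3f}, expected MN = {exp_eskimo:.1f}")
print(f"Papuans: p = {p_papuan:.3f}, expected MN = {exp_papuan:.1f}")
```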

What would be the effect of intense outbreeding, e.g., if all matings were between a Papuan and an Eskimo? I.e., how important is random mating?

Evolutionary Processes:

   1. Mutation

   2. Drift (chance deviations from H-W expected)

   3. Gene flow (immigration + emigration)

   4. Selection

Mutation: Mutation rates are low, and although mutation is the ultimate source of genetic variation in populations, in any given generation it has little effect on H-W frequencies.

Genetic Drift: (Random fixation; founder effects; bottlenecks). Small populations.

H-W is a stochastic law. The number of copies of an allele among n sampled gametes should follow a binomial distribution, so the variance of the sampled allele frequency is:

s^2 = pq/n

and the standard deviation of the allele count is s = sqrt(npq). Thus, with p = q = 0.5,

if n = 10, s = 1.58 (16% of n)

and n = 100, s = 5.00 (5%)

n = 10,000, s = 50.0 (0.5%)
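A short Python sketch of the binomial sampling error, assuming n independent gamete draws. The helper names are illustrative assumptions. It reports both the standard deviation of the allele count, sqrt(npq), and of the allele frequency, sqrt(pq/n).

```python
import math


def allele_count_sd(p, n):
    """Standard deviation of the number of A copies among n sampled gametes."""
    return math.sqrt(n * p * (1.0 - p))


def allele_freq_sd(p, n):
    """Standard deviation of the sampled allele frequency, sqrt(pq/n)."""
    return math.sqrt(p * (1.0 - p) / n)


for n in (10, 100, 10_000):
    print(f"n = {n:>6}: sd of count = {allele_count_sd(0.5, n):.2f}, "
          f"sd of frequency = {allele_freq_sd(0.5, n):.3f}")
```

The count's standard deviation grows with n, but as a fraction of n it shrinks, which is why drift matters most in small populations.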

Monte Carlo simulation, and Drosophila experiment.
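The notes do not give the simulation itself. One common way to set up such a Monte Carlo experiment is sketched below, under the assumption that each generation is formed by drawing a fixed number of gametes at random from the previous generation's allele frequency. The function name, the seeds, and the population sizes are all illustrative choices, not values from the lecture.

```python
import random


def simulate_drift(n_gametes, p0=0.5, generations=20, seed=None):
    """Allele frequency trajectory under pure binomial sampling (no selection,
    mutation, or migration)."""
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each sampled gamete carries allele A with probability p.
        copies = sum(1 for _ in range(n_gametes) if rng.random() < p)
        p = copies / n_gametes
        trajectory.append(p)
    return trajectory


# Compare the spread of final frequencies across replicate populations.
for n in (10, 100, 1000):
    finals = [simulate_drift(n, seed=s)[-1] for s in range(10)]
    print(f"n = {n:>4}: final p ranges from {min(finals):.3f} to {max(finals):.3f}")
```

Once p reaches 0 or 1 in this sketch it stays there, which corresponds to random fixation.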

Frequency of alleles at two loci relative to population size in house mice (Mus musculus) from Austin, Texas

                           N             Allele freq.         Variance of allele freq.
Population size            (of pops.)    Est-3b    Hbb(s)     Est-3b    Hbb(s)
Small (median size 10)     29            0.418     0.849      0.0506    0.1883
Large (median size 200)    13            0.372     0.843      0.0125    0.0083

Selander ("Behavior and genetic variation in natural populations," Am. Zool. 10:53-66, 1970)

Founder effects (and bottlenecks)

Blood types of Dunkers and non-Dunkers from Pennsylvania and Germany

                  Dunkers    non-Dunkers
Frequency (%)                Americans    Germans
A                 38         25           29
B                 3          4            7
M                 66         54           55

Differences in the frequency of the M allele in three generations of Dunkers in Pennsylvania

Age group (years)    Frequency of M
3-27                 0.74
28-55                0.66
55+                  0.55

Frequencies of A and B do not differ among generations (i.e., a founder effect).

Effective population size and inbreeding

   The concept of an effective population size allows consideration of an ideal population of size N in which all parents have an equal expectation of being the parents of any progeny.

   The probability that any one breeding individual produces a particular gamete in a hypothetical pool is 1/N when N is the number of breeding individuals in a population. [We assume here that we are dealing with a diploid monoecious individual, that is, that self-fertilization is a possibility. This is not a necessary assumption if we go back one generation.]

   The probability that an allele in an individual in generation t came from a given individual in generation t - 1 is 1/N, where N is the number of individuals in the population; the probability that two alleles came from the same given individual in generation t - 1 is (1/N)^2, and because there are N individuals in the population, the probability that they came from the same individual, whichever one it is, is N(1/N)^2 or 1/N. Ne (effective population size) is the number of individuals contributing genes to the following generation. With separate sexes, the probability that 2 alleles in generation t both came from males in generation t - 1 is (1/2)(1/2) = 1/4, and the probability that they came from the same male is 1/(4Nm), where Nm is the number of males in the population. Similarly, the probability that 2 alleles came from the same female is 1/(4Nf), where Nf is the number of females. Thus, 1/Ne = 1/(4Nf) + 1/(4Nm), or Ne = 4NfNm/(Nf + Nm).
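The formula Ne = 4NfNm/(Nf + Nm) is easy to explore numerically. The sketch below uses an illustrative function name and made-up herd sizes to show how an unequal sex ratio shrinks the effective size.

```python
def effective_size(n_females, n_males):
    """Effective population size Ne = 4 Nf Nm / (Nf + Nm) for separate sexes."""
    if n_females <= 0 or n_males <= 0:
        raise ValueError("need at least one breeding individual of each sex")
    return 4.0 * n_females * n_males / (n_females + n_males)


# Hypothetical populations of 100 breeding individuals each
print(effective_size(50, 50))  # equal sex ratio: Ne equals the census size
print(effective_size(99, 1))   # one breeding male: Ne falls below 4
```

With a single breeding male, Ne can never exceed 4 no matter how many females there are, because half of all genes pass through that one individual.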

Gene Flow (Migration)

   The inbreeding coefficient is: F = 1/(2Ne)

where F is the proportion of individuals in a population whose two alleles at a given locus are identical by descent. In a large population, where Ne is a large number, F is small. When F is large, the population is inbred.

   There is a second part to the equation, because some of the individuals in the population, i.e., (1 - 1/(2Ne))F1, are not newly formed identical homozygotes, but are identical homozygotes from a previous generation. Thus, in the second generation:

   F2 = 1/(2Ne) + F1(1 - 1/(2Ne))

and in the tth generation:

   F_t = 1/(2Ne) + F_{t-1}(1 - 1/(2Ne))

This will go to 1 at a rate depending on Ne.
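How fast F approaches 1 can be seen by iterating the recursion. The following is a minimal sketch with an illustrative function name and example values of Ne. It starts from F = 0, which is an assumption about the founding generation.

```python
def inbreeding_series(ne, generations, f0=0.0):
    """Iterate F_t = 1/(2Ne) + F_{t-1} * (1 - 1/(2Ne)) starting from f0."""
    step = 1.0 / (2.0 * ne)
    f = f0
    series = [f]
    for _ in range(generations):
        f = step + f * (1.0 - step)
        series.append(f)
    return series


for ne in (10, 100, 1000):
    f_50 = inbreeding_series(ne, 50)[-1]
    print(f"Ne = {ne:>4}: F after 50 generations = {f_50:.3f}")
```

Rearranging the recursion gives 1 - F_t = (1 - 1/(2Ne))^t (1 - F_0), so the non-inbred fraction decays geometrically at a rate set by Ne.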

Migration: If immigration occurs at a rate M (= the number of immigrants/Ne), then only (1 - M) of the individuals in a population carry the native genes, and only (1 - M)^2 of the zygotes could possibly be identical homozygotes, reducing the size of F thus:

   F_t = [1/(2Ne) + F_{t-1}(1 - 1/(2Ne))](1 - M)^2

With immigration there will be an equilibrium at which the production of identical homozygotes by inbreeding is balanced by immigration introducing non-identical alleles. At the equilibrium value of F, F_t = F_{t-1} = F, so:

    F = [1/(2Ne) + (1 - 1/(2Ne))F](1 - M)^2

which reduces to:

   F = (1 - M)^2/(1 + 4NeM - 2M - 2NeM^2 + M^2)

If M is low, which is probable, then M^2 and 2M are very small and (1 - M)^2 is close to 1, which means that F can be approximated by:

   F = 1/(1 + 4NeM)

Since M is a rate (number of migrants in the population of size Ne, say 1/1000) the product NeM will be the number of migrants. Thus, if there is only 1 migrant/generation,

   F = 1/(1+4) or 0.20. If there are 5 migrants, F is 1/21, or 0.05.
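To see how good the approximation is, the equilibrium can also be obtained by solving the recursion F = [1/(2Ne) + (1 - 1/(2Ne))F](1 - M)^2 directly for F. The sketch below does that and compares it with 1/(1 + 4NeM). The function names and the choice Ne = 1000 are illustrative assumptions.

```python
def f_equilibrium_exact(ne, m):
    """Fixed point of F = [1/(2Ne) + (1 - 1/(2Ne)) F] (1 - M)^2."""
    a = 1.0 / (2.0 * ne)
    keep = (1.0 - m) ** 2
    return keep * a / (1.0 - keep * (1.0 - a))


def f_equilibrium_approx(ne, m):
    """Small-M approximation F = 1 / (1 + 4 Ne M)."""
    return 1.0 / (1.0 + 4.0 * ne * m)


ne = 1000
for migrants in (1, 5):
    m = migrants / ne
    print(f"{migrants} migrant(s)/generation: "
          f"exact F = {f_equilibrium_exact(ne, m):.4f}, "
          f"approx F = {f_equilibrium_approx(ne, m):.4f}")
```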

Gene flow (Migration): Can have a huge effect if it is among populations where allele frequencies are different. See Papuans and Eskimo above.

Individuals move from one population to another & interbreed.

Let m be the proportion of individuals in a population that are migrants (immigrants). In the next generation (1 - m) of the genes in the population are descendants of residents, and m are descendants of migrants.

Assume that in the surrounding populations (from which the immigrants come) a certain allele, A1, has an average frequency of P, while in the local population it has a frequency of p0. In the next generation the frequency of A1 in the local population will be:

   p1 = (1 - m)p0 + mP

      = p0 - m(p0 - P)

that is, the new allelic frequency will be the original frequency multiplied by the proportion of reproducing individuals that are native (1 - m), plus the proportion of reproducing migrant individuals (m) multiplied by the allelic frequency among the migrants (P). Equivalently, it is the original frequency minus the proportion of migrants multiplied by the difference in allelic frequency between the residents and migrants, i.e. (p0 - P).

The change (Δp) in allelic frequency is:

   Δp = p1 - p0

      = p0 - m(p0 - P) - p0

      = -m(p0 - P)

   Hence, the greater the proportion of migrants and the greater the difference in the frequencies of the A1 allele between the local population and the immigrant population, the greater the magnitude of Δp will be.

   Δp will be 0 either when (1) m = 0 (i.e. there is no migration) or

      when (2) p0 - P = 0 (i.e. the frequency of A1 is the same in both the native and migrant populations).

Since   p1 = p0 - m(p0 - P)

    p1 - P = p0 - m(p0 - P) - P

           = p0 - mp0 - P + mP

           = (1 - m)p0 - (1 - m)P

           = (1 - m)(p0 - P)

and   p2 - P = (1 - m)^2 (p0 - P)

and after t generations:

   p_t - P = (1 - m)^t (p0 - P)

    p_t = (1 - m)^t (p0 - P) + P

and

    (1 - m)^t = (p_t - P)/(p0 - P)
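The generation-by-generation recursion and the closed form can be checked against each other numerically. This is a minimal sketch: the function names and the starting values (p0 = 0.9, P = 0.1, m = 0.05) are made up for illustration.

```python
def migrate(p0, source_p, m, generations):
    """Iterate p_next = (1 - m) * p + m * P and return every generation's p."""
    p = p0
    freqs = [p]
    for _ in range(generations):
        p = (1.0 - m) * p + m * source_p
        freqs.append(p)
    return freqs


def closed_form(p0, source_p, m, t):
    """p_t = (1 - m)^t (p0 - P) + P."""
    return (1.0 - m) ** t * (p0 - source_p) + source_p


# Hypothetical example values
freqs = migrate(0.9, 0.1, 0.05, 10)
print(f"iterated p after 10 generations:    {freqs[-1]:.4f}")
print(f"closed-form p after 10 generations: {closed_form(0.9, 0.1, 0.05, 10):.4f}")
```

Both routes give the same value, and the local frequency moves steadily toward P, the frequency in the source populations.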

EXAMPLE:   Blacks were brought to the US ca. 300 years ago (ca. 10 generations) as slaves. There has been a certain amount of racial mixture on this Continent, and children of mixed parents are conventionally called "black."

   In west African blacks the frequency of the Rh- allele (Ro) (= d) = 0.630

   In American blacks from Oakland, CA, = 0.446

   In American Caucasians = 0.028

Substituting into (1 - m)^t = (p_t - P)/(p0 - P), where t = 10, p_t = 0.446, p0 = 0.630, and P = 0.028:

   (1 - m)^10 = (0.446 - 0.028)/(0.630 - 0.028)

    = 0.418/0.602 = 0.694

    1 - m = 0.694^(0.1) = 0.964

    - m = 0.964 - 1

    m = 0.036

Therefore, the gene flow from U.S. Caucasians into U.S. Blacks has occurred at a rate equivalent to an average of 3.6% per generation.

   Ten generations have left (1 - m)^10 = 0.694 of all the genes in U.S. Blacks derived from their African ancestors, and 1 - 0.694 = 0.306 derived from their Caucasian ancestors, or slightly more than 30% of the total. These numbers vary geographically, with more Caucasian genes in populations in Oakland, CA than, say, in rural Georgia.

   Blacks (Claxton, Georgia) Ro = 0.533

   Whites (Claxton, Georgia) Ro = 0.022

    (1 - m)^10 = (0.533 - 0.022)/(0.630 - 0.022) = 0.511/0.608 = 0.840

    1 - m = 0.840^(0.1) = 0.983

    and     -m = 0.983 - 1 , m = 0.017 or 1.7% per generation

    1 - (1 - m)^10 = 1 - 0.840 = 0.160, i.e., 16% of the genes in Claxton Blacks are of white origin.
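Both worked examples follow the same steps, so they can be wrapped in one small function. The sketch below assumes a constant migration rate over t = 10 generations, as the notes do. The function name is an illustrative choice.

```python
def migration_rate(p_t, p0, source_p, generations):
    """Solve (1 - m)^t = (p_t - P)/(p0 - P) for m.

    Returns (m, retained), where retained = (1 - m)^t is the share of genes
    still derived from the original population.
    """
    retained = (p_t - source_p) / (p0 - source_p)
    m = 1.0 - retained ** (1.0 / generations)
    return m, retained


# Frequencies taken from the text; p0 = 0.630 is the west African value.
m_oakland, kept_oakland = migration_rate(0.446, 0.630, 0.028, 10)
m_claxton, kept_claxton = migration_rate(0.533, 0.630, 0.022, 10)

print(f"Oakland: m = {m_oakland:.3f}, introduced share = {1 - kept_oakland:.3f}")
print(f"Claxton: m = {m_claxton:.3f}, introduced share = {1 - kept_claxton:.3f}")
```

The introduced share is 1 - (1 - m)^t, not 1 - m, because it accumulates over all t generations.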