Dr. Rob Dillon, Coordinator

Monday, January 14, 2019

The best estimate of the effective size of a gastropod population, of any sort, anywhere, ever

First, a quick refresher on week #4 of your population genetics class.  One of the assumptions of Hardy-Weinberg equilibrium is that population size is effectively infinite.  If that assumption is met (together with all the other ones) gene frequencies will not change.  But if the population is small, gene frequencies will change by sampling error, for the same reason that if I flip a fair coin twice, there is a 50% chance that either the head or the tail will not be represented.  That phenomenon is called “genetic drift.”
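For anyone who would rather check the coin-flip arithmetic than take my word for it, here is a minimal Python sketch.  The function names are hypothetical, invented for illustration, and the only model assumed is simple binomial sampling of gene copies at a two-allele locus.

```python
def prob_allele_lost(p: float, n_copies: int) -> float:
    """Chance that a random draw of n_copies gene copies is missing one of two alleles.

    p is the frequency of one allele; the other allele has frequency 1 - p.
    Assumes independent (binomial) sampling of gene copies.
    """
    return p ** n_copies + (1.0 - p) ** n_copies


def drift_variance(p: float, n_individuals: int) -> float:
    """Binomial variance in allele frequency after one round of sampling.

    A diploid population of n_individuals carries 2 * n_individuals gene copies.
    """
    return p * (1.0 - p) / (2 * n_individuals)


# Two flips of a fair coin: all heads or all tails.
print(prob_allele_lost(0.5, 2))

# Same starting frequency, very different amounts of drift.
print(drift_variance(0.5, 50))
print(drift_variance(0.5, 100_000))
```

Two flips of a fair coin return 0.5 from the first function, which is the 50% figure above.  The second function shows why population size matters: the expected variance shrinks in direct proportion to the number of gene copies sampled.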

So, suppose we were standing on the edge of a huge field of peas, say 100,000 plants, polymorphic at the seed color locus – green and yellow – meeting all the Hardy-Weinberg assumptions (no selection, no migration, random mating and all that, Note 1).  And suppose we sampled 100 plants, and shucked a bunch of pods, and estimated the frequencies [2] of the green and yellow genes in the remaining 99,900.  And then we waited a year, let all 99,900 plants reproduce, cover the field with a new generation, and sampled another 100.  And we didn’t obtain exactly the same gene frequencies in year 2 that we did in year 1.

Much of the reason that the year-2 gene frequencies didn’t match the year-1 frequencies would certainly be our own sampling error – that we sampled a finite number from year 1 and a finite number from year 2.  But part of that allele frequency variance (new term) might also be due to the sampling error of the peas themselves – only a finite number of peas reproduced to yield the year-2 population.  Might we have been wrong about that 99,900 estimate?  Maybe, effectively, there weren’t 100,000 peas in the field at all?
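The logic of that paragraph can be sketched in a few lines of Python.  I am assuming here the standardized variance Fc usually credited to Nei and Tajima, and one common form of the sampling correction from the temporal-method literature, Ne ≈ t / (2[Fc − 1/(2S0) − 1/(2St)]).  The function names and the example frequencies are hypothetical, and this is a sketch under those assumptions, not the calculation behind any published estimate.

```python
import math


def standardized_variance(freqs_0, freqs_t):
    """Fc for one locus: mean over alleles of (x - y)^2 / ((x + y)/2 - x*y).

    freqs_0 and freqs_t are allele frequencies in the first and second sample.
    """
    terms = []
    for x, y in zip(freqs_0, freqs_t):
        denom = (x + y) / 2 - x * y
        if denom > 0:  # skip alleles absent from (or fixed in) both samples
            terms.append((x - y) ** 2 / denom)
    if not terms:
        raise ValueError("no allele varies in either sample")
    return sum(terms) / len(terms)


def temporal_ne(fc, s_0, s_t, generations=1):
    """Two-sample estimate of Ne (assumed formula, see the text above).

    s_0 and s_t are the numbers of individuals in each sample.
    Returns infinity when sampling error alone can explain the change.
    """
    sampling_noise = 1 / (2 * s_0) + 1 / (2 * s_t)
    drift_signal = fc - sampling_noise
    if drift_signal <= 0:
        return math.inf
    return generations / (2 * drift_signal)


# Hypothetical frequencies of the two seed color alleles in years 1 and 2.
fc_big_shift = standardized_variance([0.60, 0.40], [0.50, 0.50])
fc_small_shift = standardized_variance([0.52, 0.48], [0.50, 0.50])

print(temporal_ne(fc_big_shift, 100, 100))
print(temporal_ne(fc_small_shift, 100, 100))
```

With these made-up numbers, a shift from 0.60 to 0.50 between two samples of 100 implies an Ne of roughly 17.  A shift from 0.52 to 0.50 is smaller than our own sampling error would produce by itself, so the estimate comes back infinite.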

[Figure: The Duck Pond at Quarterman Park]

So effective population size (Ne) is the size of an idealized population that demonstrates the same allelic frequency variance as the population under study.  There are many possible reasons why Ne is always less than or equal to N, the headcount (or stem count) population size.  Often much less.

Well, I hate to be pedantic, but there’s another (equivalent, but somewhat more difficult) definition of Ne.

Suppose we were to sample 100 peas from our population at a single date but analyzed gene frequencies at two polymorphic loci, not just one.  So, say seed color (green/yellow) and seed shape (wrinkled/smooth).  There is no reason to expect any relationship between seed color and seed shape – knowing green shouldn’t allow you to predict wrinkled.  That is true regardless of whether the seed color locus and the seed shape locus are on the same chromosome or not, because (in a large, randomly-breeding population) crossing-over would ultimately erase any initial relationship between the alleles.

But what if you did find a relationship between seed color and seed shape in the pea field?  This is often called “gametic phase disequilibrium” to distinguish it from the (hard) linkage disequilibrium you might discover between loci on the same chromosome using a controlled cross.  That would mean that the population wasn’t infinitely large and randomly-breeding.

So finally.  Effective population size is the size of an idealized population that demonstrates the same allelic frequency variance or the same gametic phase disequilibrium as the population under study.  Geeze, that turned out to take longer to explain than I imagined when I started this essay eight paragraphs ago.  I appreciate your forbearance.
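The second definition can be sketched the same way.  I am assuming the first-order approximation often written for unlinked loci in a randomly-mating population, E(r²) ≈ 1/(3Ne) + 1/S, where r² is the squared correlation between alleles at two loci, averaged over pairs of loci, and S is the number of individuals sampled.  The function names and numbers below are hypothetical, and real software applies bias corrections that this sketch ignores.

```python
import math


def r_squared(p_ab, p_a, p_b):
    """Squared correlation between alleles at two loci.

    p_ab is the frequency of gametes carrying both alleles (say green and
    wrinkled); p_a and p_b are the frequencies of each allele separately.
    """
    d = p_ab - p_a * p_b  # disequilibrium: zero if the loci are independent
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def ld_ne(mean_r2, n_individuals):
    """One-sample estimate of Ne from mean r^2 (assumed approximation).

    Returns infinity when the observed r^2 is no larger than the value
    expected from sampling n_individuals alone.
    """
    drift_signal = mean_r2 - 1 / n_individuals
    if drift_signal <= 0:
        return math.inf
    return 1 / (3 * drift_signal)


# Knowing green tells you nothing about wrinkled: r^2 is zero.
print(r_squared(0.25, 0.5, 0.5))

# An excess of green-wrinkled gametes: r^2 is positive.
print(r_squared(0.30, 0.5, 0.5))

# Hypothetical mean r^2 values from a sample of 100 individuals.
print(ld_ne(0.0133, 100))
print(ld_ne(0.0090, 100))
```

A mean r² of 0.0133 in a sample of 100 works out to an Ne of about 100 under this approximation.  A mean r² of 0.009 falls below the sampling expectation of 0.01, so the estimate is infinite.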

Effective population size is a parameter in every model of theoretical population genetics ever published, neutral or otherwise.  That number is really, REALLY important.  And also, really difficult to obtain.

I only know of ten estimates of Ne in gastropod populations ever published, in total, for all environments: 2 marine, 4 terrestrial, and 4 freshwater [3].  And frankly, many of those ten are pretty darn spurious.

OK, I’m going to change subjects entirely here.  But don’t forget all the boring population genetics stuff you had to slog through above, because you’re going to need it again, shortly.

I taught Genetics Laboratory 305L at The College of Charleston for 33 years.  And Investigation #9 in my Genetics 305L lab manual was “The analysis of genetic polymorphism in a natural population using allozyme electrophoresis.”  I needed a sample of 31 individual somethings from some polymorphic population for each section, which, toward the latter half of my career [4], became three sections per semester, two semesters per year: that’s N = 186 somethings.  And after years of messing around with pleurocerid snails and Mercenaria hard clams, those somethings became Physa acuta.

Now my second-favorite population of Physa in the world [5] inhabits the Duck Pond at Quarterman Park in North Charleston.  Amy Wethington and I first sampled that population (“NPK”) in 1990 in conjunction with our study of sea island biogeography [6], and I knew it was polymorphic at three allozyme-encoding loci: Isocitrate dehydrogenase (Isdh, three alleles), 6-phosphogluconate dehydrogenase (6pgd, two alleles) and Esterase-3 (Est-3, two alleles, Note 7).

So, in May of 2009 I collected my first big sample of Duck Pond Physa for the students working on Genetics Lab Investigation #9.  In many respects the one-hectare pond at Quarterman Park is perfect Physa habitat – shallow, quiet, and very rich, fed by runoff from the neighborhood.  Every day the kids bring bags of stale bread to feed the ducks, which they empty and throw into the pond with all the other picnic garbage.  Occasional whiffs of sewage.  Physa paradise.

[Figure: Physa paradise]

Only three factors keep that pond from becoming a block of Physa solid enough to walk across.  One is the flocks of ducks and geese, which probably prefer the Physa over the bread.

A second is the general lack of habitat.  Some 15 – 20 years ago the City of North Charleston undertook a complete renovation of the Duck Pond, draining it and shoring up the walls with bulkheads.  So, the modern pond has no vegetation of any sort, nor indeed any littoral zone.  Physa are common only on the floating allochthonous debris and garbage that accumulates at the eastern end, by the drain, and rather rare elsewhere.

And a third is the summer maximum temperatures, which can be brutal.  The City of North Charleston also installed a couple fountain pumps during those renovations a few years ago, but during the hot months, I feel certain that all aerobic life must be confined to the top centimeter or two of the pond.  And concentrated at the eastern end.

All those stipulations registered, I had no problem collecting several hundred Physa at the Quarterman Park Duck Pond in the spring of 2009.  I knelt on the walkway at the eastern end of the pond and hand-plucked Physa off the floating debris.  Or sometimes I found it easier to wash higher concentrations of snails off large sticks and bread bags into my bucket.  The entire sample didn’t take more than 30 minutes to collect.

And in the fall of 2009, the students enrolled in my three sections of Genetics Lab 305L estimated gene frequencies at three loci in 93 of them.  And ditto for another N = 93 in the spring of 2010.  And I went back to the Duck Pond to fetch more.

This went on for seven years, from 2009 to 2015.  The actual date of the sampling varied a bit from year to year, depending on the density of the snail population, which in turn, seemed to vary with the weather.  I could always find at least a few Physa in the Duck Pond – any day, 12 months per year.  But to collect the hundreds I needed annually, I needed a bloom.

After a few years of experience, I began to notice a relationship between Physa blooms and the blooming of the azaleas.  In the Charleston area, as I am sure elsewhere, azaleas bloom in response to a pulse of warmth and sunshine – the stronger the pulse, the more brilliant the display.  The bloom typically takes place in March here and lasts for several weeks.  My annual observations suggested that the Physa bloom at the Quarterman Park Duck Pond typically commenced around the week the azaleas dropped their flowers – in early to mid-April.  The population typically expanded through May, contracted in June, and died back almost entirely in the summer.

The only exception happened in 2012, when no Physa bloom occurred at all.  The spring of 2012 was exceptionally hot in Charleston – the mean March temperature (65.3 degrees F) was the second-highest value in the 80-year record of the National Weather Service.  I was never able to make a collection that year.  I visited the pond every couple weeks from March to July, and could always find a few snails, but never in the quantity that would prompt me to get on my knees and start washing bread bags into buckets.

So I was relieved of my duties at the College of Charleston in February of 2016, and ultimately banned from campus for a Woodrow Wilson quote [8], bringing my study to an end with the 2015 field season.  The paper reporting the results obtained by myself and my team of 540 undergraduates was published early last year in Ecology and Evolution, citation from Note [9] below.

I should thank my good friend and former student Dr. John Robinson for putting me onto a really excellent freeware resource called “NeEstimator,” developed by Chi Do and colleagues [10].  The software calculates effective population size using both allelic frequency variance (the so-called “two-sample methods”) and gametic phase disequilibrium (“one-sample methods”).  All the statistics and other gory details are available in my paper.

The bottom line turned out to be that Ne for the Physa population at the Quarterman Park Duck Pond was infinite in 2009 and 2010, dipped in 2011 to somewhere around Ne = 100, dipped again between 2011 and 2013 to around Ne = 50, popped up to around Ne = 200 in 2014, and then rebounded to infinite again in 2015.  These results are remarkably consistent across both my one-sample and my two-sample analyses, which are independent, and really tend to strengthen their mutual credibility.

I suppose the first explanation that might occur to one would be a population bottleneck in 2012.  But bottleneck effects are notoriously long-lasting… once allelic diversity is lost, it takes many, many generations to regain it.

I think the key factor in the volatility of Ne demonstrated in this study may be cryptic population subdivision.  In retrospect, the striking dip in apparent population census size I observed in 2012 may have been localized at the east end of the pond, and its subsequent recovery due to immigration from elsewhere within a Physa population subdivided by distance.

Some of the most influential studies of population subdivision ever published have been conducted using land snail models.  Cain and Currey (1963) described small-scale variation in the frequencies of shell color morphs in the English land snail Cepaea as “area effects,” attributing the phenomenon to genetic drift [11].  Could freshwater gastropod populations perceive their environments much differently than Cepaea?

At minimum, these results should be received as a cautionary tale by those researchers, including yours truly, a sinner, who would represent the evolutionary relationships between populations by single samples, even as large as 200, even with multiple polymorphic loci, collected from single sites at single dates.

And as for the practice of sampling individual genes from individual snails from individual populations to represent an entire biological species?  That’s just plain White-House-stupid.


[1] I know that garden peas are self-pollinating.  Give me this one, for the sake of the example, OK?  Geeze, you must have really irritated your tenth-grade biology teacher.

[2] And I also realize that, because of dominance, you’d have to assume HWE to get gene frequencies at the seed color locus in garden peas.  Doggone it, now you’re beginning to piss me off.

[3] See the introduction section of my paper from note [9] below for the references.

[4] From 1983 into the mid-1990s, I only taught one section per semester – perhaps 15 students.  But the number of lab sections I taught per semester increased from two to three in the latter half of my career, as my lecture sections were assigned to adjunct faculty more sensitive to the self-esteem of the customers.

[5] My first-favorite Physa population inhabits the pond at Charles Towne Landing State Park.  See:
  • To Identify a Physa, 1989 [3Oct18]
  • Albinism and sex allocation in Physa [5Nov18]

[6] Dillon, R.T., and A.R. Wethington (1995) The biogeography of sea islands: Clues from the population genetics of the freshwater snail, Physa heterostropha. Systematic Biology 44:401-409.  [PDF]

[7] Dillon, R.T., and A.R. Wethington (1994) Inheritance at five loci in the freshwater snail, Physa heterostropha. Biochemical Genetics 32:75-82. [PDF]

[8] Who Decides What Must Be on a Syllabus?  Inside Higher Ed, 8Aug16.  [html]

[9] Dillon, R. T. (2018) Volatility in the effective size of a freshwater gastropod population.  Ecology and Evolution 8: 2746 - 2751. [https://doi.org/10.1002/ece3.3912]  [PDF]

[10] Do, C., Waples, R., Peel, D., Macbeth, G., Tillett, B., & Ovenden, J.  (2014) NeEstimator v2: Re-implementation of software for the estimation of contemporary effective population size (Ne) from genetic data. Molecular Ecology Resources, 14, 209–214. https://doi.org/10.1111/1755-0998.12157

[11] Cain, A. J. & Currey, J. D. (1963) Area effects in Cepaea. Phil. Trans. R. Soc. London Series B 246: 1-81.  Cain, A. J. & Currey, J. D. (1968) Studies on Cepaea III: Ecogenetics of a population of Cepaea nemoralis (L) subject to strong area effects.  Phil. Trans. R. Soc. London Series B: 253, 447-482.  Ochman, H., J. S. Jones & R. K. Selander (1983) Molecular area effects in Cepaea.  PNAS 80: 4189 – 4193.