Dr. Rob Dillon, Coordinator

Tuesday, March 9, 2021

A Gene Tree for the Worldwide Viviparidae

Editor’s Note – This essay was subsequently published as: Dillon, R.T., Jr. (2023c)  A gene tree for the worldwide Viviparidae.  Pp 75 – 84 in The Freshwater Gastropods of North America Volume 7, Collected in Turn One, and Other Essays.  FWGNA Project, Charleston, SC.

I may be mellowing in my old age.  I’m not sure.  Isaac Lea and his 505 species of pleurocerids used to infuriate me, as opposed to merely driving me nuts [1].  Frank Collins Baker's 30 species and subspecies of fossarine lymnaeids were a source of tremendous frustration for most of my 40-year career; now they are but a mild irritant [2].  I have come to understand that we scientists, no different from everybody else, can only operate by the standards of our day.  Isaac Lea’s science was good by the standards of 1862, and Baker’s great by the standards of 1911.  I’m trying to give everybody the same breaks I hope I myself will be granted 100 years from now.

So I used to find gene trees very nearly intolerable.  They started popping up in the peer-reviewed literature of the 1990s, and forests of them were disgorged from the printers of earnest graduate students in the 2000s and straight into some of the best journals publishing at that time, each pocked and wilted by some or all of the same seven cankers, which I will enumerate below, on my way to overlooking at least some of the rot:

(1) The typical gene tree is typological.  During the first 10 – 15 years of the fad, the community of molecular phylogenetics rarely demonstrated any appreciation of intrapopulation or interpopulation genetic variance whatsoever.  The authors of gene trees all seemed to follow what I have described as the U1S2NMT3 rule [3], with usually one, sometimes two, never more than three individuals per population, and usually one, sometimes two, never more than three populations per species, and so forth.  At these tiny sample sizes, the significance of any sequence variation that may be brought to light cannot be assessed. 

Figure S7 of Stelbrink et al. [7]

(2) By the mid-2000s, however, sample sizes were occasionally N > 3.  At that point it became clear that intrapopulation sequence variation could surpass interspecific variation, at least in the mtDNA of freshwater and terrestrial gastropods [4].  The authors of U1S2NMT3 gene trees could not accommodate this important finding and so ignored it.

(3) Typical gene trees are rarely standardized.  Their authors often demonstrate no appreciation for original descriptions or type localities, identifying the U1S2NMT3 snails they select for sequencing by unspecified criteria.

(4) Computational phylogenetics is a nightmare fifty-years running, from which evolutionary science cannot seem to wake.  Dozens of methods have been proposed to germinate and fertilize gene trees, each with a unique set of assumptions, through which the phylogenetics community, riven by fad and fashion, cycles endlessly.  Even unto this day, the authors of gene trees typically run multiple data analyses, and publish multiple trees, from which they select their favorite.

(5) Gene trees are not species trees, or population trees, for that matter [5].  The phenomenon of incomplete lineage sorting guarantees that some fraction of the gene trees produced for any set of populations will yield a misleading picture of their evolutionary relationship.  Indeed, coalescence theory can be used to predict what fraction of gene trees will be flat wrong.  The molecular phylogenetics community has no way to deal with this problem, so they have doubled down on it.

(6) Finding themselves unable to accommodate reproductive isolation in their evolutionary models, the phylogenetic community has redefined the word “species” away from a rigorously-scientific biological concept to something (anything!) their technique can actually measure.  By 2007, the USNM cladist Kevin De Queiroz was listing five species concepts, either appropriated by molecular phylogeneticists or ginned up afresh, all of which depended on arbitrary judgement calls about the significance of branch lengths and clumps [6].

(7) And then they argued that the usually-one, sometimes-two, and never-more-than-three random snails they plucked out of random creeks and identified arbitrarily constituted endangered species by criteria only they themselves could judge and wrote grant proposals to natural resource agencies to find more such precious jewel boxes of biodiversity and got them funded.  And repeat.
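To put a number on canker (5): in the simplest case of three populations, the textbook coalescent result says that the chance a single-locus gene tree mismatches the population tree is two-thirds of exp(-T), where T is the length of the internal branch measured in units of 2N generations.  What follows is only a sketch under that assumption (a diploid nuclear locus, constant population size, no gene flow).  The function and parameter names are my own, invented for illustration, and a mitochondrial gene would need a different scaling of N.

```python
import math


def discordance_probability(internal_branch_generations, effective_size):
    """Chance that one gene tree disagrees with a three-population tree.

    Assumes the standard three-taxon coalescent result:
        P(discordant) = (2/3) * exp(-T),  with T = t / (2 * N)
    where t is the internal branch length in generations and N is the
    (diploid) effective population size.  Both names are hypothetical.
    """
    if effective_size <= 0:
        raise ValueError("effective_size must be positive")
    if internal_branch_generations < 0:
        raise ValueError("internal_branch_generations cannot be negative")
    # Internal branch length in coalescent units.
    t_coalescent = internal_branch_generations / (2.0 * effective_size)
    return (2.0 / 3.0) * math.exp(-t_coalescent)


if __name__ == "__main__":
    # Made-up scenarios: short, medium and long internal branches, N = 1000.
    for generations in (0, 2000, 20000):
        p = discordance_probability(generations, 1000)
        print(f"{generations:>6} generations: P(discordant) = {p:.3f}")
```

With an internal branch of zero length the gene tree is wrong two times in three, which is no better than guessing.  With T = 1 it is still wrong about a quarter of the time.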

By the mid-2000s, the technology had reached the point that grad students were sequencing 40 – 50 individual snails per dissertation, from which they were generating 40 – 50 gene trees under 40 – 50 different analytical assumptions, picking the result most likely to lead to additional funding.  I would sit in darkened seminar rooms for hours, watching gigantic gene trees labelled with tiny fonts, branches lit up in T-Mobile fuchsia and Mountain Dew green, and I would get such a headache.  Molecular phylogenetics is the rap music of evolutionary science, I thought – a tragic fad, a quick and easy path to job security in academia, surely it will go away.  But it hasn’t.  Neither has.

But in the last 5-10 years I have come to understand that the community of molecular phylogenetics looks at our science from a different perspective.  I look up and cannot imagine how one would begin to construct an evolutionary model of higher taxa until one had a strong model of the evolution of populations and species.  They look down and think of species and populations as following from higher taxa.  I want to make the gene tree the last chapter of my dissertation, they want to make it the first of theirs.  I think of gene trees as dependent variables, they think of them as independent variables.

From Fig. 2 of Stelbrink et al. [7]

And sample sizes have improved, as sequencing technology has advanced.  I do think I see more standardization by type locality, more calibration by intra- and interpopulation variance, methodologies to identify “bar-coding gaps,” and so forth.  At least some fraction of the molecular phylogenetics community now seem to offer their gene trees as weak, null models of population relationship, not as dowsing rods to wellsprings of imperiled biodiversity or guideposts to the promised land of rank-free classification systems.  And as that they are useful.   I guess I’ve become more forgiving.

So early last year a U1S2NMT3 gene tree of the old school crossed my desk, and it did not infuriate me nearly as much as it would have ten years ago [7].  The sequence data analyzed came from a sample of viviparids worldwide in scope, which is always commendable, collected by 16 authors from 11 different countries.  And I guess I can forgive its young first author, Björn Stelbrink (who lists both German and Swiss addresses) for ignoring intraspecific variance, given 61 nominal species in 24 nominal genera over five continents.  Doggone tree is big enough.  T-Mobile fuchsia would have been next [8].

Stelbrink and his colleagues sequenced three genes, mitochondrial CO1 and nuclear 28S and H3, for 193 individual snails.  And depicted above is the bottom half of their “BEAST-MCC” tree [9] from concatenated sequences, with “Bellamyinae – clade B” trimmed off.  The branches I have deleted, 40 tips assigned to 13 genera, are entirely Asian and African, and do not concern us in North America.

And here is the first thing to notice.  The two big oriental species widely introduced in the USA, labelled Cipangopaludina chinensis and Cipangopaludina japonica above, did not cluster with Bellamya (ss) in that mixed Asian/African "Bellamyinae - clade B" I trimmed off the top.  They clustered in the (entirely Asian) “Bellamyinae – clade A.”

Cipangopaludina was proposed by Harold Hannibal in 1912 as a subgenus of Pilsbry’s (1901) Idiopoma for Reeve’s (1863) Paludina malleata of Japan [10].  Hannibal was working from California populations of what we would today call C. japonica [11].  Clench and Fuller (1965) considered Idiopoma a simple synonym of Viviparus, continuing to accord Cipangopaludina subgeneric rank [12].  Gary Pace [13] was the first to elevate Cipangopaludina to the full genus level, working with the native fauna of Taiwan.  When Burch [14] followed Pace, the rank of Hannibal’s Cipangopaludina was firmly fixed at the genus level.  And our two introduced species entered the canon as Cipangopaludina japonica and Cipangopaludina chinensis.

Those two species were universally assigned to the genus Cipangopaludina from around 1980 until 2000, when Doug Smith [15] pointed out that the characters by which Hannibal had described his genus were weak and variable, and that the rest of the world, including most workers in the countries from whence our two large viviparids must have been exported, were referring them to the genus Bellamya.

The Frenchman Félix Pierre Jousseaume proposed the genus Bellamya in 1886 to hold his typical B. bellamya, a large viviparid sent to him from Senegal [16].  Stelbrink and colleagues did not sample Jousseaume’s typical species (darn it), but they did include on their tree 15 other Bellamya samples from around Africa, and all of them clustered tightly in Clade B.

Cipangopaludina, again [17]

So the gene tree reproduced above now convinces me that our introduced species are best allocated back to Cipangopaludina, as they were classified from 1980 – 2000, rather than to Bellamya (ss), a genus name that probably ought to be reserved for African taxa.  It bothers me, a little bit [18], that Hannibal never visited Japan, proposing Cipangopaludina from America with no reference to any other viviparid taxa that might inhabit the waters of their native East Asia.  But subjectively, looking at the Stelbrink gene tree, I do see the appeal of separating Bellamyinae Clade A and Clade B as different genera.  And of the eight genus names applied to Clade A (Torotaia, Angulyagra, Sinotaia, Anularya, Celetaia, Margarya, Heterogen and Cipangopaludina), Hannibal’s Cipangopaludina is second-oldest [19].  And certainly, the first-most familiar here in America.

One other little feature of Stelbrink’s gene tree we might notice, before closing this month’s essay.  Viviparus georgianus was described by Isaac Lea from Hopeton, Georgia in 1834, and up until the 20th century was unknown in waters further north.  Clench [20] dated its arrival in the northeastern United States to Boston in 1916, from whence it has spread widely, across New England and New York through Michigan, Illinois and Wisconsin, even into southern Canada. 

Arthur Clarke [21] noted the striking similarity between Canadian populations of V. georgianus and the European Viviparus viviparus, speculating that Canadian Viviparus populations might represent a cryptic invasion from Europe.  The sequence data do not bear this hypothesis out, however.  The gene tree above depicts three European species, V. viviparus, V. ater, and V. contectus, as strikingly divergent from the North American V. georgianus, V. subpurpureus, and Tulotoma magnifica.  The Stelbrink results do not directly address Clarke’s cryptic invasion hypothesis, as the individual V. georgianus Stelbrink sequenced seems to have been collected from Alabama [22].  But GenBank also holds several CO1 sequences from V. georgianus populations sampled in Canada and New York, and all those demonstrate 98-99% matches to Stelbrink’s Alabama sequence, but only roughly 85% similarity to the nearest European Viviparus.
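For readers who wonder what a "98-99% match" means, here is a minimal sketch of percent identity between two sequences that have already been aligned to the same length.  I am assuming the simplest possible definition (identical sites divided by sites compared, skipping gaps); GenBank's own search tools may count things differently.  The function name and the toy sequences are made up for illustration, and are not real CO1 data.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of compared sites at which two aligned sequences agree.

    Assumes seq_a and seq_b are already aligned (equal length).  Sites
    where either sequence has a gap character are skipped entirely.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # ignore alignment gaps
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy ten-base fragments, invented for this example.
    reference = "ACGTACGTAC"
    near_copy = "ACGTACGTAA"  # differs at one site out of ten
    print(percent_identity(reference, near_copy))
```

Over a typical CO1 fragment of several hundred bases, a 98-99% match means a handful of differences, while 85% means dozens.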

So gene trees, even old-school U1S2NMT3 specimens such as we have reviewed in the present blog post, can offer some insight into coarse genetic relationships among far-flung taxa.  But how about more interesting and important questions?  Stelbrink and his colleagues also seem to have included six nominal species of Campeloma in their analysis, one snail each, in that orange block near the bottom of their tree.  Can their cute little data set tell us anything about population relationships among the North American Campeloma?  Tune in next time.


[1] This is a reference back to my series of essays on Pleurocera troostiana:

  • Isaac Lea Drives Me Nuts [5Nov19]

[2] We’re going to have much more to say about the crappy little amphibious lymnaeids often referred to as “fossarines” in coming months.  But for now, see:

  • The Legacy of Frank Collins Baker [20Nov06]
  • The Classification of The Lymnaeidae [28Dec06]
  • The Lymnaeidae 2012: Fossarine Football [7Aug12]

[3] For coinage of the “U1S2NMT3 Rule,” see:

  • The Lymnaeidae 2012: Stagnalis yardstick [4June12]

[4] For all you could ever want to know about intrapopulation mtDNA sequence divergence, see:

  • Mitochondrial Superheterogeneity: what we know [15Mar16]
  • Mitochondrial Superheterogeneity: What it means [6Apr16]
  • Mitochondrial Superheterogeneity and Speciation [3May16]
  • Mitochondrial heterogeneity in Marstonia lustrica [3Aug20]

[5] For more about gene trees and reproductive isolation, see:

[6] Five of the 11 species concepts listed in de Queiroz Table 1: Evolutionary, Phylogenetic (Hennigian), Phylogenetic (Monophyletic), Phylogenetic (Genealogical), and Phylogenetic (Diagnosable).  See: De Queiroz, K. (2007)  Species concepts and species delimitation.  Syst. Biol. 56: 879 – 886.

[7] Stelbrink, B., R. Richter, F. Köhler, F. Riedel, E. Strong, B. Van Bocxlaer, C. Albrecht, T. Hauffe, T. Page, D. Aldridge, A. Bogan, L-N. Du, M. Manuel-Santos, R. Marwoto, A. Shirokaya, and T. Von Rintelen (2020)  Global diversification dynamics since the Jurassic: Low dispersal and habitat-dependent evolution explain hotspots of diversity and shell disparity in river snails (Viviparidae).  Systematic Biology 69: 944 – 961.

[8] In fact, both Stelbrink’s Figure 3 (unconstrained versus fossil-constrained tree) and his Figure 4 (Ancestral state estimation tree) did go T-Mobile-Fuchsia on us, plus Mountain-Dew-Green and a Change-of-Life-Blue that reminded me of a pants suit my wife bought at a yard sale in 1997.  Ouch.

[9] The most recent cycle of fad and fashion in computational phylogenetics seems to have brought to the top a cross-platform program called “BEAST,” Bayesian Evolutionary Analysis Sampling Trees.  The Stelbrink tree was generated using the “MCC” algorithm, Maximum Clade Credibility.  Oh good, that’s what's wanted around here.  Maximum credulity.

[10] Hannibal, H. (1912)  A synopsis of the Recent and Tertiary land and fresh water Mollusca of the Californian Province, based on an ontogenetic classification.  Proceedings of the Malacological Society of London 10: 112 – 211.

[11] Large, exotic viviparids, initially identified as Paludina japonica, were first recorded in San Francisco fish markets around 1892.  The first record of a naturalized population seems to be Stern’s 1901 report from the San José area.  For more, see:  Hannibal H. (1911) Further notes on Asiatic Viviparas in California.  Nautilus 25: 31 – 32.

[12] Clench, W.J. & S.L.H. Fuller (1965) The genus Viviparus (Viviparidae) in North America. Occ. Pap. Moll., 2(32): 385-412.

[13] Pace, G. L. (1973) Freshwater Snails of Taiwan (Formosa).  Malacological Review Supplement 1: 1 – 118.

[14] This is a difficult work to cite.  J. B. Burch's North American Freshwater Snails was published in three different ways.  It was initially commissioned as an identification manual by the US EPA and published by the agency in 1982.  It was also serially published in the journal Walkerana (1980, 1982, 1988) and finally as a stand-alone volume in 1989 (Malacological Publications, Hamburg, MI).

[15] Smith, D.G. (2000) Notes on the taxonomy of introduced Bellamya (Gastropoda: Viviparidae) species in northeastern North America. Nautilus 114: 31-37.

[16] Jousseaume, F. (1886) Coquilles du Haut-Sénégal.  Bulletin de la Société Zoologique de France 11: 471 – 502.

[17] The photo above of the impressive Cipangopaludina japonica die-off was snapped 12Jan21 by Mr. Brandon Jones, the Catawba Riverkeeper, on the muddy shore of the Fishing Creek Dam tailrace, at Great Falls, SC.  This point is about 30 km upstream from the Wateree Dam, which I described as:

[18] It shouldn’t.  Remote description of distant biotas has been standard operating procedure in systematic biology for 200 years.  The Europeans got all our pretty seashells, darn them.

[19] Actually, Margarya was described by Nevill in 1877.  If I were a Pharisee, I would demand that our two big viviparid guests here in North America be identified as Margarya japonica and Margarya chinensis.  But let’s see how opinion evolves in The Orient.  I suppose it is possible that some actual biology might come into play here, as well.

[20] Clench, W.J. (1962) A catalogue of the Viviparidae of North America with notes on the distribution of Viviparus georgianus Lea. Occ. Pap. Moll., 2(27): 261-287.

[21] Clarke, A.  (1981)  The Freshwater Molluscs of Canada.  National Museums of Canada.  445 pp.

[22] I am not entirely sure on Stelbrink’s V. georgianus locality.  There seems to be an error on that line in his Supplementary Table S1.


  1. If Bellamyinae Clade A is to be a single genus, the oldest available name is Margarya Nevill, 1877. If you break Clade A into two genera (which seems reasonable), you could have Cipangopaludina (with the type japonica in that clade) and Margarya, which would include chinensis.

    1. Both Margarya AND Cipangopaludina? Interesting idea. On the downside, however, the sequence data only seem to suggest 4 - 5 species in Clade A. How many genera do we need for four species?