The first FWGNA project was the “digitization” of museum collections. The year was 1999, and at the dawn of the worldwide web, only two national collections of freshwater gastropods were searchable online: those of the Academy of Natural Sciences of Philadelphia (ANSP) and the Florida State Museum (FLMNH). So the first NSF proposal a committee of nine of us wrote – Phase I of three projected phases – was to unify the freshwater gastropod holdings of 21 North American museums into a single, online database of approximately 200,000 records. A Phase II field survey and a Phase III monographic review were set to follow.
That proposal was not funded. But progress in the national museums continued, difficult though it was for me to understand at the time. I revisited the subject of the online availability of freshwater gastropod collections ten years later, in April of 2009. And at that point, the number of national or regional mollusk collections searchable online had grown to ten. To quote myself directly: “I’m impressed!”
Which of those ten might be the most useful for the FWGNA Project? Research suggests that the world is inhabited by mollusks that are not freshwater snails. And although Class = Gastropoda is a common search criterion in almost all malacological databases, Habitat or Environment = freshwater is surprisingly rare. So my first idea was to compare the freshwater gastropod fractions of the ten museums by searching for “Family = Physidae.” But as of 2009, several important museum databases were not even effectively searchable by family. So as a crude estimate of the freshwater gastropod holdings of the ten databases available online as of 2009, I executed a simple search for “Campeloma.”
My results, published on this blog 15Apr2009, showed the University of Michigan Museum of Zoology (UMMZ) in the lead at Campeloma = 2,456 records, followed by the FLMNH Campeloma = 1,414, then ANSP = 890, and Harvard’s Museum of Comparative Zoology (MCZ) = 488.
Brand new, as of 2009, was the Global Biodiversity Information Facility (GBIF), hosted in Copenhagen. Many of our colleagues felt as though the GBIF was the wave of the future. And quite a few prominent North American museums were cooperating, including the USNM, ANSP, and FLMNH. My query of the GBIF database (executed 26May09) returned Campeloma = 3,210.
So another ten years have passed. And as impressive as C = 3,210 most certainly is, how does that statistic compare with the online freshwater gastropod records retrievable today? Spoiler alert! C = 11,874.
In February of 2010 a workshop was held at the National Evolutionary Synthesis Center in Durham, NC, ultimately yielding “A Strategic Plan for Establishing a Network Integrated Biocollections Alliance.” And shortly thereafter, the NSF began accepting proposals to its new “Advancing Digitization of Biodiversity Collections Program.” The effect has been remarkable.
Initial projects were “thematic” around the various Kingdoms and Phyla of Biology, rather reminiscent of the chromosomally-based approach pioneered by the Human Genome project. The “Thematic Collection Network” most directly relevant to our interests here was “Invert-E-Base,” kick-started in 2014 by a consortium that included our colleagues Rudiger Bieler of the Field Museum of Natural History in Chicago (FMNH), Diarmaid O’Foighil of the UMMZ, and Elizabeth Shea of the Delaware Museum of Natural History (DMNH). Ultimately Invert-E-Base grew to involve 18 US museums, universities and other institutions, including many with substantial freshwater gastropod holdings, such as the Carnegie Museum of Natural History (CMNH), the Illinois Natural History Survey (INHS), and the North Carolina Museum of Natural Sciences (NCSM).
Meanwhile, NSF also funded “Integrated Digitized Biocollections” (iDigBio) to serve as a hub for all the data being collected by the various thematic collection networks. Additional contributions rolled in from all the other museums where digitization efforts had been proceeding independently, such as ANSP, USNM, FLMNH, MCZ, and so forth.
Today the iDigBio database includes 4.3 million gastropod records held by scores of institutions worldwide, including The Canadian Museum, the British Museum, the Australian Museum, all over Europe – everywhere. My search of the iDigBio database for Genus = Campeloma last week returned that eye-popping 11,874 figure quoted above. And how about Family = Physidae AND Continent = North America? Drum roll, please. The iDigBio database boasts 31,417 records of the North American Physidae.
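For readers who would rather script such searches than click through the portal, here is a minimal sketch of how the two queries above might be expressed. It assumes the iDigBio v2 search endpoint at `search.idigbio.org`, an `rq` ("record query") parameter holding JSON, lowercase indexed values for the fields `genus`, `family`, and `continent`, and an `itemCount` key in the response. All of those details are my assumptions, not anything taken from the searches reported here, so check them against the current iDigBio API documentation before relying on them. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the iDigBio v2 record-search endpoint.
SEARCH_URL = "https://search.idigbio.org/v2/search/records"


def build_query_url(limit=1, **terms):
    """Build a search URL whose `rq` parameter carries field=value terms.

    Values are lowercased on the assumption that the index stores them
    in lowercase.
    """
    rq = {field: value.lower() for field, value in terms.items()}
    rq_json = json.dumps(rq, sort_keys=True, separators=(",", ":"))
    return SEARCH_URL + "?" + urlencode({"rq": rq_json, "limit": limit})


def fetch_count(url):
    """Fetch the URL and return the total number of matching records.

    Assumes the JSON response reports that total under "itemCount".
    Requires network access.
    """
    with urlopen(url) as response:
        return json.load(response)["itemCount"]


# The two searches described in the text.
campeloma_url = build_query_url(genus="Campeloma")
physid_url = build_query_url(family="Physidae", continent="North America")

if __name__ == "__main__":
    print(campeloma_url)
    print(physid_url)
```

Calling `fetch_count(campeloma_url)` would then return whatever total the database holds on the day you ask, which will almost certainly differ from the figures quoted above, since records are added continually.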
Here are the top ten museums, ranked by their North American physid holdings, as I retrieved them through iDigBio last week. The links are to their local online search facilities, if maintained, which tend to hold more current data.
- UMMZ 5,492
- NMNH 5,417
- MCZ 3,619
- ANSP 3,415
- FMNH 2,726
- FLMNH 2,083
- INHS 1,717
- CMNH 1,051 (No local search)
- NCSM 824
- DMNH 653 (No local search)
“Of the 6.2 million cataloged lots (of mollusks), 4.5 million (73%) have undergone some form of data digitization (which includes all forms of digitization, e.g. ledger records entered, transcribed, or imported into word processor, spread sheet, or relational database formats). About 1.1 million (25%) of digitized records have been georeferenced, which represents 18% of all cataloged lots. Only 20 collections (less than 25%) claim to be fully Darwin Core compliant, however, 34 of the 66 collections with some form of digitization are searchable online through iDigBio, Arctos, or other portals, or directly through institutional websites.”
That's great, but there’s certainly still work to be done. Prominent among those institutions not searchable online at present is the Ohio Museum of Biodiversity (OSUM) in Columbus, which boasts very significant freshwater gastropod holdings. Our good buddy Tom Watters is gittin’ ‘er done, even as we speak.
I should conclude with a word of warning. One of the many criticisms leveled at our FWGNA proposal way back in 1999 was simply the question of data quality. How would we know that all those collections of freshwater snails we were proposing to digitize really were what their museum labels said they were? Darn good question.
The reason that I have become such an avid customer of online databases over the last 20 years is that I am preparing to visit the actual collections themselves. I print shopping lists, fasten them to an old-fashioned clipboard, and walk around the actual, physical collections, inspecting every lot personally. Only after I have personally verified a record is it added to the FWGNA database. One at a time.
The more powerful a tool, the more dangerous it becomes. You can hurt yourself with a saw, you can kill yourself with a chainsaw. I feel sure that 100% of my readership is acutely aware that a simple Google search is simultaneously very powerful, and very dangerous, and all of us know how to use Google safely. Exactly the same caveats pertain to the NCBI GenBank, and to the iDigBio database. Like my Momma used to say, “You be careful with that thing now, you hear?”
For more about the history of the FWGNA Project, see:
- Twenty Years of FWGNA [11July18]
My 2009 museum survey was a two-parter. See:
A Strategic Plan for Establishing a Network Integrated Biocollections Alliance:
- NIBA Brochure [pdf]
Sort-of obsolete, but nevertheless interesting:
- Welcome to Invert-E-Base [html]
Integrated Digitized Biocollections:
- iDigBio [html]
Sierwald, P., R. Bieler, E.K. Shea, and G. Rosenberg (2018) Mobilizing Mollusks: Status Update on Mollusk Collections in the U.S.A. and Canada. American Malacological Bulletin 36(2):177-214. https://doi.org/10.4003/006.036.020
Invert-E-Base is probably the most important of the data aggregators but you label it as "Sort-of-Obsolete." Dude, please, get with the program.