Editor’s Note – This essay was subsequently published as: Dillon, R.T., Jr. (2023c) Twenty Years of Progress in the Museums. Pp 9 – 13 in The Freshwater Gastropods of North America Volume 7, Collected in Turn One, and Other Essays. FWGNA Project, Charleston, SC.
The first FWGNA project was the “digitization” of museum
collections. The year was 1999, and at
the dawn of the worldwide web, only two national collections of freshwater
gastropods were searchable online: those of the Academy of Natural Sciences of
Philadelphia (ANSP) and the Florida State Museum (FLMNH). So the first NSF proposal a committee of nine
of us wrote – Phase I of three projected phases – was to unify the freshwater
gastropod holdings of 21 North American museums into a single, online database
of approximately 200,000 records. A Phase
II field survey and a Phase III monographic review were set to follow [1].
That proposal was not funded. But progress in the national museums
continued, difficult though it was for me to understand at the time. I revisited the subject of the online
availability of freshwater gastropod collections ten years later, in April of
2009 [2]. And at that point, the number
of national or regional mollusk collections searchable online had grown to
ten. To quote myself directly: “I’m
impressed!”
Which of those ten might be the most useful for the FWGNA Project? Research suggests that the world is inhabited
by mollusks that are not freshwater snails.
And although Class = Gastropoda is a common search criterion in almost all
malacological databases, Habitat or Environment = freshwater is surprisingly
rare. So my first idea was to compare
the freshwater gastropod fractions of the ten museums by searching for “Family =
Physidae.” But as of 2009, several
important museum databases were not even effectively searchable by family. So as a crude estimate of the freshwater
gastropod holdings of the ten databases available online as of 2009, I executed
a simple search for “Campeloma.”
My results, published on this blog 15Apr2009, showed the
University of Michigan Museum of Zoology (UMMZ) in the lead at Campeloma = 2,456
records, followed by the FLMNH at Campeloma = 1,414, then ANSP at 890, and Harvard’s
Museum of Comparative Zoology (MCZ) at 488.
Brand new, as of 2009, was the Global Biodiversity
Information Facility (GBIF), hosted at Copenhagen [2]. Many of our colleagues felt as though the
GBIF was the wave of the future. And
quite a few prominent North American museums were cooperating, including the
USNM, ANSP, and FLMNH. My query of the
GBIF database (executed 26May09) returned Campeloma = 3,210.
So another ten years have passed. And as impressive as C = 3,210 most certainly
is, how does that statistic compare with the online freshwater gastropod
records retrievable today? Spoiler
alert! C = 11,874.
In February of 2010 a workshop was held at the National
Evolutionary Synthesis Center in Durham, NC, ultimately yielding “A Strategic
Plan for Establishing a Network Integrated Biocollections Alliance” [3]. And shortly thereafter, the NSF began
accepting proposals to its new “Advancing Digitization of Biodiversity
Collections Program.” The effect has
been remarkable.
Initial projects were organized “thematically” around the various Kingdoms
and Phyla of Biology, rather reminiscent of the chromosomally-based approach pioneered
by the Human Genome Project. The “Thematic
Collection Network” most directly relevant to our interests here was “Invert-E-Base,”
kick-started in 2014 by a consortium that included our colleagues Rudiger
Bieler of the Field Museum of Natural History in Chicago (FMNH), Diarmaid
O’Foighil of the UMMZ, and Elizabeth Shea of the Delaware Museum of Natural
History (DMNH) [4]. Ultimately Invert-E-Base grew to involve 18 US museums,
universities and other institutions, including many with substantial freshwater
gastropod holdings, such as the Carnegie Museum of Natural History (CMNH), the
Illinois Natural History Survey (INHS), and the North Carolina Museum of
Natural Sciences (NCSM).
Meanwhile, NSF also funded “Integrated Digitized
Biocollections” (iDigBio) to serve as a hub for all the data
being collected by the various thematic collection networks [5]. Additional contributions
rolled in from all the other museums where digitization efforts had been
proceeding independently, such as ANSP, USNM, FLMNH, MCZ, and so forth.
Today the iDigBio database includes 4.3 million gastropod
records held by scores of institutions worldwide, including the Canadian
Museum, the British Museum, the Australian Museum, all over Europe –
everywhere. My search of the iDigBio
database for Genus = Campeloma last week returned that eye-popping 11,874
figure quoted above. And how about
Family = Physidae AND Continent = North America? Drum roll, please. The iDigBio database boasts 31,417 records of
the North American Physidae.
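For readers who would rather script those two searches than click through the portal, here is a minimal sketch that only builds the query URLs and sends nothing over the network. The endpoint, the `rq` and `limit` parameters, the field names (`genus`, `family`, `continent`) and the lowercase values are all my assumptions about the iDigBio v2 search API, not details taken from this essay, so check them against the current iDigBio documentation before relying on them. The helper `build_query_url` is hypothetical.

```python
import json
from urllib.parse import urlencode

# ASSUMPTION: base URL of the iDigBio v2 record search service.
SEARCH_URL = "https://search.idigbio.org/v2/search/records"


def build_query_url(filters, limit=1):
    """Build a search URL from a dict of field filters (hypothetical helper).

    ASSUMPTION: the service accepts a JSON "record query" in the rq
    parameter and a limit parameter capping the records returned.
    """
    rq = json.dumps(filters, sort_keys=True)
    return SEARCH_URL + "?" + urlencode({"rq": rq, "limit": limit})


# Genus = Campeloma
campeloma_url = build_query_url({"genus": "campeloma"})

# Family = Physidae AND Continent = North America
physid_url = build_query_url(
    {"family": "physidae", "continent": "north america"}
)

print(campeloma_url)
print(physid_url)
```

Nothing here validates a single record, of course. It only shows how the two searches described above might be expressed as reusable queries.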
Here are the top-ten museums, ranked by their North American
physid holdings, as I retrieved them through iDigBio last week. The links are to their local online search
facilities, if maintained, which tend to hold more current data.
- UMMZ 5,492
- NMNH 5,417
- MCZ 3,619
- ANSP 3,415
- FMNH 2,726
- FLMNH 2,083
- INHS 1,717
- CMNH 1,051 (No local search)
- NCSM 824
- DMNH 653 (No local search)
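A quick bit of arithmetic puts that list in perspective. The sketch below uses only the ten counts listed above and the 31,417 total quoted earlier, and assumes (as the text implies) that all of them came from the same iDigBio query.

```python
# North American physid records per museum, as listed above.
holdings = {
    "UMMZ": 5492,
    "NMNH": 5417,
    "MCZ": 3619,
    "ANSP": 3415,
    "FMNH": 2726,
    "FLMNH": 2083,
    "INHS": 1717,
    "CMNH": 1051,
    "NCSM": 824,
    "DMNH": 653,
}

# Total North American Physidae records reported for iDigBio.
TOTAL = 31417

top_ten = sum(holdings.values())
share = top_ten / TOTAL

# Print each museum's count and its fraction of the iDigBio total.
for museum, n in sorted(holdings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{museum:6s} {n:6,d} {n / TOTAL:6.1%}")

print(f"Top ten combined: {top_ten:,} of {TOTAL:,} ({share:.1%})")
```

By this reckoning the ten museums together account for 26,997 of the 31,417 records, or roughly 86 percent of the North American physid lots retrievable through iDigBio.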
Sierwald and colleagues [6] summarized the state of mollusk collection digitization in the U.S.A. and Canada as follows:
“Of the 6.2 million cataloged lots (of mollusks), 4.5 million (73%) have undergone some form of data digitization (which includes all forms of digitization, e.g. ledger records entered, transcribed, or imported into word processor, spread sheet, or relational database formats). About 1.1 million (25%) of digitized records have been georeferenced, which represents 18% of all cataloged lots. Only 20 collections (less than 25%) claim to be fully Darwin Core compliant, however, 34 of the 66 collections with some form of digitization are searchable online through iDigBio, Arctos, or other portals, or directly through institutional websites.”
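The percentages in that passage can be checked against the rounded lot counts it quotes. This is a small sketch using only the figures in the quotation. The inputs are rounded to the nearest tenth of a million, so small differences from the published percentages are to be expected.

```python
# Figures as quoted above (lot counts rounded to 0.1 million).
cataloged = 6.2e6
digitized = 4.5e6
georeferenced = 1.1e6
online, some_digitization = 34, 66


def pct(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


print(f"digitized / cataloged:      {pct(digitized, cataloged):.1f}%")
print(f"georeferenced / digitized:  {pct(georeferenced, digitized):.1f}%")
print(f"georeferenced / cataloged:  {pct(georeferenced, cataloged):.1f}%")
print(f"online / some digitization: {pct(online, some_digitization):.1f}%")
```

The first and third ratios round to the quoted 73% and 18%. The second comes out a little under the quoted 25% when computed from the rounded counts, presumably because the original figures were more precise. The last line shows that just over half of the collections with any digitization are searchable online.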
That's great, but there’s certainly still work to be
done. Prominent among those institutions
not searchable online at present is the Ohio Museum of Biodiversity (OSUM) in
Columbus, which boasts very significant freshwater gastropod holdings. Our good buddy Tom Watters is gittin’ ‘er
done, even as we speak.
I should conclude with a word of warning. One of the many criticisms leveled at our
FWGNA proposal way back in 1999 was simply the question of data quality. How would we know that all those collections
of freshwater snails we were proposing to digitize really were what their
museum labels said they were? Darn good
question.
The reason that I have become such an avid customer of
online databases over the last 20 years is that I am preparing to visit the
actual collections themselves. I print shopping
lists, fasten them to an old-fashioned clipboard, and walk around the actual,
physical collections, inspecting every lot personally. Only after I have personally verified a record
is it added to the FWGNA database. One
at a time.
The more powerful a tool, the more dangerous it
becomes. You can hurt yourself with a saw; you can kill
yourself with a chainsaw. I feel sure
that 100% of my readership is acutely aware that a simple Google search is
simultaneously very powerful, and very dangerous, and all of us know how to use
Google safely. Exactly the same caveats pertain
to the NCBI GenBank, and to the iDigBio database. Like my Momma used to say, “You be careful
with that thing now, you hear?”
Notes
[1] For more about the history of the FWGNA Project, see:
- Twenty Years of FWGNA [11July18]
[2] My 2009 museum survey was a two-parter. See:
[3] A Strategic Plan for Establishing a Network Integrated
Biocollections Alliance:
- NIBA Brochure [pdf]
[4] Sort-of obsolete, but nevertheless interesting:
- Welcome to Invert-E-Base [html]
[5] Integrated Digitized Biocollections:
- iDigBio [html]
[6] Sierwald, P., R. Bieler, E.K. Shea and G. Rosenberg
(2018) Mobilizing Mollusks: Status Update on Mollusk Collections in the U.S.A.
and Canada. American Malacological
Bulletin 36(2):177-214. https://doi.org/10.4003/006.036.020
Invert-E-Base is probably the most important of the data aggregators but you label it as "Sort-of-Obsolete." Dude, please, get with the program.