Ideal Interfaces to Natural History Collections Databases

Introduction

Great movements are afoot in natural history museums across the country and around the world to digitize the valuable information held in extensive specimen collections and field notes, and to jump-start research to an extent not previously possible by making this information available online. An excellent example is the MaNIS project, which provides access to the mammal specimen collection databases of more than 16 museums from a single website.

But while pulling all of this data together is a truly marvelous feat of technical and social engineering, it is also vitally important that researchers be able to find the data they need in these vast cyber systems. The MaNIS Interface Design project is an example of how carefully designed browsing and searching interfaces help scientists use the collections databases effectively. And wouldn't it be even better if other folks - teachers, students, naturalists - could also find these systems useful?

Motivations for the Project

The overall motivation is to find the best way to display all the data that has been pulled together. "A huge motivation for the BNHM website (and BNHM in general) is to create a public presence for the stodgy old research museums. And how can we provide ways for people to appreciate what is inside our doors, without actually letting them in our doors?" says John Deck, BNHM informatics coordinator.

"The purpose [of the BNHM mapping tool] is to give people an idea of what we do by offering up some easy-cheezy online tools and for the serious user, provide the download data option and a link to DIVA," says John. DIVA is a desktop application for comparing collection localities with climate data in order to determine the climatic parameters that might define good habitat for a species, and then to predict its range. Part of the purpose of this project is to determine which mapping functions belong online, and which should be relegated solely to the desktop.

Similarly, while the databases hold specimen information, amateur naturalists may be interested first in species accounts. Another purpose of this project is to identify who is interested in what, and how an amateur's natural interests might lead them on to specimen information in a meaningful way, and thus to an appreciation of the existence of these collections.

The BNHM search interface focuses on providing a public face to the collections, and a breadth of search for researchers. For search features that are specific to a particular collection and of interest to its scientific research users, individual members of the BNHM, such as the MVZ and UCMP, also provide their own search interfaces. Projects like MaNIS are likewise designed for scientific researchers, pulling together collections of the same type of specimens from around the world. What does each of these interfaces do best? What can be standard across them? And how can the people using them be helped to understand which one they want to use?

Project Goals

The task this summer, June to August 2004, is to generalize beyond the MaNIS Interface Design project in order to identify characteristics and functions required of the ideal online interface for natural history collections at UC Berkeley, including MaNIS, the MVZ online interface, and the BNHM interface.

User groups interviewed for this study will be:

1) museum curators
2) scientists with research interests in the museums
3) amateur naturalists
4) curriculum developers and high-school students

Data for this study will be derived from:

1) interviews with users, built around interface mock-ups and discussion of their needs. This includes finding out which browsers are in use, which fields are necessary for queries, and which output format is most useful.

2) analysis of web-usage logs to determine most popular queries (using IP stats and knowledge of existing interfaces to filter data 'noise').
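As a rough illustration of the log analysis in item 2, here is a minimal Python sketch. It assumes a simplified, hypothetical log format of one "<ip> <request-url>" pair per line; real server logs, the actual query field names, and the list of addresses to exclude would all differ.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Hypothetical: addresses of staff machines or known crawlers to ignore as "noise".
EXCLUDED_IPS = {"10.0.0.5"}

def popular_queries(log_lines, excluded_ips=EXCLUDED_IPS):
    """Count (field, value) search terms in lines of the form '<ip> <request-url>'."""
    counts = Counter()
    for line in log_lines:
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip malformed lines
        ip, url = parts
        if ip in excluded_ips:
            continue  # filter out requests we do not consider real use
        params = parse_qs(urlsplit(url.strip()).query)
        for field, values in params.items():
            for value in values:
                # Lower-case values so "Peromyscus" and "peromyscus" count together.
                counts[(field, value.lower())] += 1
    return counts

# Made-up sample lines, for illustration only.
sample = [
    "192.0.2.1 /search?genus=Peromyscus&state=California",
    "192.0.2.2 /search?genus=peromyscus",
    "10.0.0.5 /search?genus=Test",
]
top = popular_queries(sample).most_common(1)
```

Counting field and value pairs rather than whole query strings makes it easier to see which fields people actually fill in, which bears directly on the "necessary fields for query" question above.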

The report will include sample prototype mock-ups that would support each of the user communities. The prototype mock-ups will include discussion of the preferred placement of taxonomic, temporal, and spatial query and display functions to support each user community.

The prototype mock-ups delivered at the end of the summer will come at a good time, as interface development will be starting for HerpNet, ORNIS, and Invasive Species RCN. User interface development is also continuing for the BNHM and MVZ online interfaces, especially with the coming integration of GIS services into existing query engines. This study will be applicable as well to the UCJEPS, Essig, Paleontology, and Botanical Garden online interfaces.

The Nature of the Beast

Database records about the specimens held in a natural history museum's collection can be described as metadata. <unfinished> Describe the types of data we're talking about here. Use the descriptors from Marti's paper about Flamenco - it's somewhat faceted, some facets are nested (hierarchical), others are flat...Describe the metadata/facets of taxonomic databases using the terms in "Faceted Metadata for Image Searching & Browsing" paper - p. 221 of the 213 reader. <unfinished/>

Interesting Interface Design Questions

Key, interesting questions motivating this work with search/browse interfaces for natural history specimen collection data include:

What can the interface contribute to the goal of making the collections meaningful for the public?
What is the best way to specify a taxon? This includes asking how the interface can best a) support navigation within the nested structure of taxonomic names, and b) support the location of records using synonyms for the same taxa, a particular issue when databases from multiple institutions with varying resources for updating records are brought together in one interface, as with MaNIS.
What is the best way to specify a location? Text-based geography metadata has spelling and update issues similar to those of taxonomy, and these have traditionally been addressed with gazetteers. Georeferencing now allows synonymous locations to be identified much more readily, but opens up new, interesting questions - how to design an interface, with maps or nested standardized terms, that allows the user to specify the geographic area of interest?
How can numerous results be presented effectively? This is a question all search interfaces either confront or ignore. The structure of the data in the specimen databases in combination with task analysis suggests that with thoughtful design of conditional formatting and hyperlinked objects within the display, it may be possible to provide meaningful summary views of large result sets which simultaneously support users in refining their queries to specify smaller result sets.
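To make the idea of a summary view that also supports refinement a little more concrete, here is a minimal Python sketch. The record fields (family, state, decade) and the sample records are hypothetical stand-ins, not the actual schema of any of the collections databases.

```python
from collections import Counter

def summarize(records, facets=("family", "state", "decade")):
    """Return per-facet value counts for a result set."""
    summary = {facet: Counter() for facet in facets}
    for record in records:
        for facet in facets:
            value = record.get(facet)
            if value is not None:
                summary[facet][value] += 1
    return summary

def refine(records, facet, value):
    """Keep only the records matching one facet value (what a hyperlinked count would do)."""
    return [r for r in records if r.get(facet) == value]

# Made-up records, for illustration only.
records = [
    {"family": "Sciuridae", "state": "California", "decade": 1930},
    {"family": "Sciuridae", "state": "Oregon", "decade": 1930},
    {"family": "Felidae", "state": "California", "decade": 1950},
]
overview = summarize(records)
california_only = refine(records, "state", "California")
```

Each count in the overview could be displayed as a hyperlink that calls something like refine, so that the summary of a large result set doubles as the control for narrowing it.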

The questions of specifying a taxon and a geography both relate to the issue of navigating within a tree, hierarchy, or nested structure. This issue has been addressed in a variety of ways, as it comes up for filesystems, faceted metadata terminology taxonomies (taxonomy in the Library Science sense), the content of informational websites, and a variety of other places. In this context, we can assume neither intense familiarity with the structure (as users might have with the file system on their own computers) nor ease in predicting the contents of a category from the term describing it (as one often can with hierarchies of English terms, where a term describes the general case and the meanings of the nested terms are subsets of it - terminology taxonomies and website navigation rely on this). While some users may be very familiar with some parts of the taxonomic and geographic structures, the system should work for those who aren't - it should not assume expert users - and so it must use something other than these better-known strategies. For the non-expert user, finding the correct term may be roughly analogous to trying to find a file on someone else's computer by navigating the file system.
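One way to help a user who does not know the structure is to let them type a term and then show where it sits in the tree, along with its neighbors. The Python sketch below assumes a nested-dictionary representation and a tiny illustrative fragment of mammal classification; both are assumptions for the example, not how any of the existing interfaces store their data.

```python
# Illustrative fragment only; a real tree would be generated from the data
# and from standard authorities.
TAXONOMY = {
    "Mammalia": {
        "Rodentia": {
            "Sciuridae": {"Sciurus": {}},
            "Cricetidae": {"Peromyscus": {}},
        },
        "Carnivora": {"Felidae": {"Puma": {}}},
    }
}

def find_path(tree, name, trail=()):
    """Return the chain of names from the root down to `name`, or None if absent."""
    for node, children in tree.items():
        here = trail + (node,)
        if node.lower() == name.lower():
            return here
        found = find_path(children, name, here)
        if found:
            return found
    return None

def children_of(tree, path):
    """List the names nested directly under the node reached by `path`."""
    node = tree
    for name in path:
        node = node[name]
    return sorted(node)

path = find_path(TAXONOMY, "peromyscus")
siblings_context = children_of(TAXONOMY, path[:-1])
```

Showing the path gives the non-expert the context that the term alone does not, and listing what sits under the parent lets them browse sideways without having to guess at names.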

Also, the structure of the scientific taxonomy, and even of a geographic hierarchy, changes over time. The tree that the search interface helps the user to navigate must be generated from the data and standard authorities, but it cannot be invariant. Users need to be supported gracefully through these changes even if they have expertise in the relevant section of the hierarchy. Moreover, some users will be interested in these changes over time, and some collections available through the interface will not always be up-to-date with the changes, so the system as a whole cannot assume the simplicity of a single, most-current structure, but must be robust to these changes over time.
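A minimal Python sketch of one way to stay robust to such changes: leave each record's name as the holding institution recorded it, and resolve both the query and the stored names through a synonym table at search time. The table, the record layout, and the institution labels are hypothetical; the single entry shown uses the older name Felis concolor for what is now Puma concolor.

```python
# Hypothetical synonym table mapping an older name to its replacement.
SYNONYMS = {"Felis concolor": "Puma concolor"}

def resolve(name, synonyms=SYNONYMS):
    """Follow synonym links until a name with no replacement is reached."""
    seen = set()
    while name in synonyms and name not in seen:
        seen.add(name)  # guard against accidental cycles in the table
        name = synonyms[name]
    return name

def matching_records(records, query_name):
    """Find records whose stored name resolves to the same name as the query."""
    target = resolve(query_name)
    return [r for r in records if resolve(r["name"]) == target]

# Made-up records from two institutions with different update schedules.
records = [
    {"id": 1, "name": "Felis concolor", "institution": "A"},
    {"id": 2, "name": "Puma concolor", "institution": "B"},
    {"id": 3, "name": "Sciurus griseus", "institution": "B"},
]
hits_old = matching_records(records, "Felis concolor")
hits_new = matching_records(records, "Puma concolor")
```

Because the stored names are never overwritten, users who are interested in the history of a name can still see what each collection recorded, while a search under either name finds both records.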

Finally, for those users coming from the public, common names offer an alternative to the scientific terminology: the terms are drawn from everyday language, and so can provide some predictability. However, the mapping between the structure of common names and scientific names is inconsistent and irregular. Common names provide a reasonable strategy for public users, but the system must do a lot of work to support them well. <unfinished>
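The irregular mapping can be pictured as a many-to-many relationship: one species may have several common names, and one common name may point to several species. The Python sketch below uses a handful of well-known examples and a hypothetical pair-list format; it is an illustration of the data structure, not a proposal for a name authority.

```python
from collections import defaultdict

# Illustrative (common name, scientific name) pairs.
PAIRS = [
    ("mountain lion", "Puma concolor"),
    ("cougar", "Puma concolor"),
    ("puma", "Puma concolor"),
    ("robin", "Turdus migratorius"),
    ("robin", "Erithacus rubecula"),
]

def build_indexes(pairs):
    """Index the pairs in both directions, since neither side is unique."""
    by_common = defaultdict(set)
    by_scientific = defaultdict(set)
    for common, scientific in pairs:
        by_common[common.lower()].add(scientific)
        by_scientific[scientific].add(common.lower())
    return by_common, by_scientific

def candidates(common_name, by_common):
    """Return every scientific name a common name might refer to."""
    return sorted(by_common.get(common_name.strip().lower(), ()))

by_common, by_scientific = build_indexes(PAIRS)
robin_matches = candidates("Robin", by_common)
```

When a lookup returns more than one candidate, the interface would need to ask the user which one they mean rather than guess, which is part of the work the system must do to support common names well.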

Defining the ideal interface

to search and browse online databases of natural history specimen collections.

Rebecca Shapley
Berkeley Natural History Museums and Museum of Vertebrate Zoology
Summer 2004

The Project

Introduction

Motivations for the Project

Project Goals

The Nature of the Beast

Interesting Interface Design Questions

Scope

Biodiversity Informatics
