Digitization Strategy for the Scientific Collections at the ZFMK
The digitization strategy describes the planning and implementation of the digitization of the scientific collections and research data to build up a digital catalogue as part of the virtual collection and research environment at the ZFMK.
The digitisation strategy aims for long-term and sustainable processes and methods for the creation and handling of digital assets. It describes the goals and infrastructures, states the necessary resources, and defines responsibilities. The strategy paper is adapted to the ongoing digitization process and is further developed in line with growing knowledge, technical changes and changing resources. In addition, the strategy at ZFMK is discussed with other natural history collections.
What is Collection Digitization?
Digitisation means the collection of information about individual collection objects, collective samples or a collection. Digitization in the context of scientific collections covers the capture and computer-supported availability of object-related data (labels, images, inventory books) from collections and collection-based research using a digital catalogue (collection management and digital asset management systems). A virtual collection is built by adding additional documents, data (e.g. gene sequences), and media (e.g. CT stacks) to the catalogue. Digitisation comprises not only the transformation of data about physical objects into bytes but also collection revision and improvement of the inventory, as well as collection-based research that uses the digital data as its basis. Digitization affects access to the collections and the daily interaction with them.
Digitization eases access to the collections and their objects and provides equal access to knowledge independently of cultural, geographical or economic boundaries. Future generations of researchers can build upon earlier results in the long term, as all associated knowledge is accessible in the virtual research environment. Digital representations allow access to the "deep data" of an object, such as the details and history of collection objects, without touching the physical objects. Daily curatorial interaction with collection objects is facilitated, including, among others, the documentation relating to legal aspects (ABS/Nagoya), loan traffic, reporting, etc.
The digital catalogue, together with the research results, will be made available to the public in order to make the collections and the collection-based research at the ZFMK visible to other scientists, the interested public and political decision-makers. Virtual collections can be created according to different aspects.
Vision for Collection Digitization at ZFMK
Collection digitization at ZFMK:
- is an integral part of collection-based research and collection work and is closely interlinked with research at the ZFMK
- makes the collections digitally accessible
- increases the visibility of the collections for users, stakeholders, and decision-makers
- supports the national and international networking of research museums into a virtual research infrastructure, linked by collection content, research topics, etc.
- is sustainable because the content is maintained and kept up to date
- enables individual object biographies to be traced
- is supported by physical and virtual collection management
- allows the deposit and administration of documents for the acquisition and legal status of the objects
- permits institute-based adherence to good scientific practice according to the FAIR Principles (Wilkinson et al. (2016): The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data 3, doi: 10.1038/sdata.2016.18).
Consequences of Digitisation
Each digitally available object increases the visibility and usefulness of the collection for the public and specific users. Digitisation has consequences for the daily work and handling of the objects, as access to them is primarily computer-based. If it is implemented efficiently, with good planning and in proportion, digitisation becomes sustainable: the data and work flows in the individual collections are adapted in such a way that data do not become obsolete immediately and new entries are quickly available.
Digital collection information must be maintained, similar to the curatorial management of physical collection records. This results in additional work through the curation of the digital copies (e.g. after sorting and collection merging, loan consignments, changes in destination). In the course of the curatorial and research activities at the ZFMK (e.g. taxonomic revisions, selected collections worldwide), the collections are continuously digitised, thus expanding the digital catalogue of holdings and maintaining the existing data.
Aims and Dimension of Digitization at ZFMK
Data on individual collection items typically include labels, entries in collection books, and field diaries that contain information about the collection event (collector name, date, location, etc.) and the individual collection item itself (species name, numbers, etc.). Digitization of label information includes the recording of content (and its exact form, e.g. handwriting) and storage in a database. Digitization can also include the recording of images, barcodes and other media that are added to the database entries as references (G Nelson, D Paul, G Riccardi, AR Mast (2012): Five task clusters that enable efficient and effective digitization of biological collections. ZooKeys 209: 19-45. DOI: 10.3897/zookeys.209.3135).
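The kinds of data listed above can be pictured as a small record structure. The following is only a minimal sketch: the class names, field names and the placeholder values are hypothetical and do not reflect the actual schema of the collection database.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CollectionEvent:
    """When, where and by whom an item was collected (hypothetical fields)."""
    collector_name: str
    date: str        # kept as text, because label dates are often incomplete
    location: str


@dataclass
class CollectionItem:
    """One database entry for a single collection item (hypothetical layout)."""
    species_name: str
    catalogue_number: str
    event: CollectionEvent
    label_text: str = ""                 # label content, transcribed as written
    label_form: Optional[str] = None     # exact form of the label, e.g. "handwritten"
    media_refs: List[str] = field(default_factory=list)  # images, barcodes, other media

    def add_media(self, reference: str) -> None:
        """Attach a media reference to the entry, ignoring duplicates."""
        if reference not in self.media_refs:
            self.media_refs.append(reference)


# Placeholder values only; they do not describe a real specimen.
item = CollectionItem(
    species_name="Example species",
    catalogue_number="EXAMPLE-0001",
    event=CollectionEvent("Example Collector", "1950-06", "Example locality"),
    label_text="Example label text",
    label_form="handwritten",
)
item.add_media("label_image_0001.jpg")
item.add_media("label_image_0001.jpg")  # second call is ignored as a duplicate
```

Keeping media as references rather than embedded content mirrors the description above, in which images and other media are added to the database entries as references.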
Digitization at the ZFMK also includes the routine generation of barcodes from new and existing collection objects (types, selected collections). The infrastructure built up within the GBOL project is used for this purpose. This is part of the "Collection Accession Unit" (CAU), which will be developed in parallel.
- Completion of a species catalogue of all ZFMK collections within the next three years (2022).
- Linking research data (molecular, morphometric, microscopic, acoustic, etc.) and digital representations (descriptive texts, photographs, sound recordings) with the digital inventory catalogue.
- Digital collection data are accessible and searchable via the online web portal "ZFMK Digital Collection Catalogue" and programming interfaces.
- Establishment of a loan system linked to the digital collection data, via which the collections can be loaned.
- Digital curation of the catalogue should be simple and sustainable wherever possible.
The proposed objectives can only be fully achieved in the long term. Digitisation is an iterative process: depending on the status of the respective collection and the specific conditions in the collection, short to medium-term partial objectives must be defined for the realisation of digitisation, which are continuously revised and adapted to new conditions. New ones will be defined as soon as the previous ones are reached. The subgoals determine the degree of digitisation, the resources required and the measures to be taken.
The following sub-objectives are envisaged:
- Type catalogue: types are recorded and checklists published.
- Partial inventory: specimens are grouped into larger units (e.g. system boxes, insect boxes or individual collections), and data are recorded for these units (large taxonomic groups, geographical regions). (High-resolution) images of the units may exist.
- Species catalogue: complete list of all species represented in the collection.
- Full inventory: Determination of the vouchers at species level. A set of metadata (e.g. locality, georeferences, collectors) is recorded (digitisation of locality data).
- Deep digitisation: Recording of all available information. Images of labels and specimens (full informational capture) and barcodes, e.g. of holotypes, exist.
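The sub-objectives above can be read as successive degrees of digitisation. The sketch below models them as an ordered enumeration and looks up the next sub-objective once the current one is reached. The strictly linear order and all names are assumptions made for illustration; in practice the sequence depends on the status of the respective collection.

```python
from enum import IntEnum
from typing import Optional


class DigitisationLevel(IntEnum):
    """Degrees of digitisation, ordered as in the list above (assumed order)."""
    NONE = 0
    TYPE_CATALOGUE = 1
    PARTIAL_INVENTORY = 2
    SPECIES_CATALOGUE = 3
    FULL_INVENTORY = 4
    DEEP_DIGITISATION = 5


def next_subgoal(reached: DigitisationLevel) -> Optional[DigitisationLevel]:
    """Return the next sub-objective, or None once deep digitisation is reached."""
    if reached is DigitisationLevel.DEEP_DIGITISATION:
        return None
    return DigitisationLevel(reached + 1)


# Hypothetical example: a collection whose species catalogue is complete.
upcoming = next_subgoal(DigitisationLevel.SPECIES_CATALOGUE)
```

Because the levels are integers, two collections can also be compared directly, for instance to list every collection that has not yet reached the full inventory.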
Deep digitisation cannot be carried out for all collection areas, but only for collections with a high degree of completeness (e.g. local, regional and national collections) or for certain well-represented systematic groups (e.g. fish, cellar spiders, cockchafers, etc.). The scope of metadata collection follows standards such as those proposed within TDWG for the Digital Specimen (AA Hardisty, M Keping, G Nelson, J Fortes (2019): 'openDS' - A New Standard for Digital Specimens and Other Natural Science Digital Object Types. BISS 3: e37033. DOI: 10.3897/biss.3.37033).
If not done before, tissue samples can be given to the biobank during digitization to avoid repeated handling of the material.
We distinguish three different approaches for the implementation of digitization. In Inventory Digitization (or "Retro Digitization" in DCOLL), existing collection vouchers are digitized. In contrast to this, but building on it, is Access Digitization: here all newly incoming objects are digitized immediately. For the latter, characteristic section-specific workflows and data flows must be developed. In the third approach, Revision-Based Digitization, complete taxonomic groups are fully digitized. The recording is carried out by internal or external co-workers in the course of taxonomic or eco-faunistic revisions, supported e.g. by SYNTHESYS+.
Inventory Digitization at ZFMK is carried out in different phases and is feasible and reasonable in different variants, depending on section and collection type:
- The goal of the first phase is to get an overview of the species and types available at ZFMK (Type catalogue and Partial inventory).
- In the second phase, the collections are fully inventoried.
- The third phase comprises in-depth digitisation of selected collections (e.g. local and regional collections, molecular collections, digitisation on demand, certain well-represented systematic groups). Deep digitisation of all parts of the collection is not possible in the short or medium term.
Depending on collection size, available resources and potential external funding, the individual phases will be approached separately.
Additional funding must be raised for individual, particularly rich sections, such as the Lepidoptera or Coleoptera collections.
Digital Infrastructure and Collection Management System
The digital infrastructure at ZFMK is composed as follows:
- Collection Management Framework Diversity Workbench
- ZFMK Digital Collection Catalogue
- Multimedia Asset Management Software easyDB
- Research data repository MorphDBase
- BioCASe Provider Software
- Various scripts for import, transformation and provision of data (publicly available on GitHub and the ZFMK GitLab)
All these systems are maintained by the biodiversity informatics section at ZFMK.
All data referring to collection objects are stored in the Diversity Workbench's collection management module Diversity Collection. Multimedia data are stored in the digital asset management system easyDB.
All collection-related data will be published under the Creative Commons License CC BY-SA. Metadata, i.e. information about the data, are published using a license waiver (CC0). Specific licenses will be chosen for media.
The number of entries in the collection database is calculated on a monthly basis and can be accessed at https://collections.zfmk/statistics.
Access to Data
The Collection Management Framework Diversity Workbench is used to collect data and manage the digital collection information. This requires the installation of the Diversity Workbench client and a login to the database. There is an internal and an external access point to the database. Digital collection data are published and searchable via the web in the ZFMK Digital Collection Catalogue. It represents the public portal of the collection data of the ZFMK.
There is also an interface for the machine-based retrieval of collection data: http://id.zfmk.de/collection_ZFMK/ (Example). The BioCASe Provider Software provides the collection data to GBIF, GFBio and Europeana.
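Machine-based retrieval of a single entry could be scripted roughly as follows. This is a hedged sketch only: the text gives just the base address, so the assumption that an object identifier is appended directly to it, the function names and the placeholder identifier are all hypothetical, and the format of the response is not specified here.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://id.zfmk.de/collection_ZFMK/"  # base address given in the text


def build_object_url(object_id: str, base_url: str = BASE_URL) -> str:
    """Build the address of one object (assumed scheme: base address + identifier)."""
    # Percent-encode the identifier so that spaces or slashes cannot alter the path.
    return base_url.rstrip("/") + "/" + quote(object_id, safe="")


def fetch_object(object_id: str, timeout: float = 10.0) -> bytes:
    """Download the raw response for one object; the caller decides how to parse it."""
    with urlopen(build_object_url(object_id), timeout=timeout) as response:
        return response.read()


# Placeholder identifier, not a real catalogue number; no request is sent here.
example_url = build_object_url("EXAMPLE 1")
```

Separating URL construction from the download keeps the assumed address scheme in one place, so it can be corrected without touching the retrieval code.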
Distribution of Tasks for the Digitization
The efficiency and implementation of the digitisation strategy depend on the technical and personnel infrastructure. This results in the following framework conditions and tasks for the different departments and sections at ZFMK:
- All collection sections of the zte, as well as Biobank and Biohistorikum
- Collection Managers
- Biodiversity Informatics
- Directorate and Administration
Tasks of the Sections with Scientific Collections (zte)
- Curation of the collections.
- Updating the processing status of collection voucher material (with the help of external taxonomic specialists).
- Decision on implementation and subgoals/depth of digitisation in the individual parts of the collection (in cooperation with the Collection Managers).
- Cross-museum revision-based digitisation.
- Determining the focus for meaningful deep digitization.
- Curation of the digital collection data.
- Input/import of data into the DiversityWorkbench.
- Identification and assistance in the provision of taxonomic catalogues as a reference system for collection data.
- Networking at national and international level for collection management and digitization.
- Development of research approaches/ideas for the use of digital representations.
Tasks of Biodiversity Informatics Section
- Provision of access to the collection management systems.
- User introduction, training and helpdesk for collection management and digital asset management system.
- Support for digitization and import/input in DiversityWorkbench.
- Support in the provision of taxonomic catalogs (import, interfaces to other systems).
- Publication and dissemination of collection data on GBIF, GFBio, Europeana, etc.
- Creation of statistics for digitization.
- Operation and further development of the relevant IT infrastructure, the applications, and the online ZFMK Collection Catalogue.
- Privacy-relevant aspects (in cooperation with the Collection Managers and the Data Security Officer).
- Support ABS/Nagoya implementation (in cooperation with the Collection Managers).
- Cooperation at national (DCOLL and NFDI) and international level (CETAF-ISTC): Digitisation projects, standards, joint developments.
- Provision of the technical infrastructure (intranet, hardware component, storage space).
- Operation of servers for collection management system and digital asset management.
- Updating the operating systems.
- Backup and long-term archiving of data.
- Access to the collection management system from the intranet and Internet.
Directorate and Administration
Provision of resources for the implementation of digitisation. This currently includes (as of 2019):
- Assistants for collection inventory and data entry (at least one assistant per scientific collection).
- Financing of essential hardware (cameras, insect boxes, etc.).