The 'German Network for Bioinformatics Infrastructure – de.NBI' is a national, academic and non-profit infrastructure supported by the Federal Ministry of Education and Research providing bioinformatics services to users in life sciences research and biomedicine in Germany and Europe. The partners organize training events, courses and summer schools on tools, standards and compute services provided by de.NBI to help researchers exploit their data more effectively.
Data Observation Network for Earth (DataONE) is the foundation of innovative environmental science through a distributed framework and sustainable cyberinfrastructure that meets the needs of science and society for open, persistent, robust, and secure access to well-described and easily discovered Earth observational data.
This document specifies a method, known as DataCrate, of organising file-based data with associated metadata in both human- and machine-readable formats, based on the schema.org linked-data vocabulary, supplemented with terms from the SPAR ontologies and [PCDM] where schema.org does not have coverage. The motivation for this work comes from the research domain.
A DataCrate is a dataset: a set of files contained in a single directory. There are two ways of organizing a DataCrate.
For working data, or data that does not need to be distributed with checksums, a Working DataCrate is a plain-old directory containing payload data files, with two metadata files at the root: one for humans and one for machines.
For distribution or archiving, where integrity is important, a Bagged DataCrate is a BagIt bag conforming to the DataCrate BagIt profile with the payload files in the /data directory. A Bagged DataCrate has a clear separation between metadata and payload, and can be integrity-checked using the checksums in the BagIt manifest.
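The integrity check against the BagIt manifest can be sketched in Python as follows. This is a minimal illustration, not a full BagIt or DataCrate validator: it assumes a SHA-256 payload manifest named manifest-sha256.txt whose lines have the form "checksum, whitespace, path", and it ignores bagit.txt, tag manifests, and path encoding rules. The helper names write_manifest, verify_manifest and demo, and the sample file readings.csv, are hypothetical and chosen only for this example.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List, Tuple


def write_manifest(bag_root: Path, algorithm: str = "sha256") -> Path:
    """Write a BagIt-style payload manifest covering every file under data/."""
    manifest = bag_root / f"manifest-{algorithm}.txt"
    lines = []
    for path in sorted((bag_root / "data").rglob("*")):
        if path.is_file():
            digest = hashlib.new(algorithm, path.read_bytes()).hexdigest()
            # Paths are recorded relative to the bag root, with forward slashes.
            lines.append(f"{digest}  {path.relative_to(bag_root).as_posix()}")
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(bag_root: Path, algorithm: str = "sha256") -> List[str]:
    """Return the payload paths that are missing or whose checksum differs."""
    failures = []
    manifest = bag_root / f"manifest-{algorithm}.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Split on the first run of whitespace so paths may contain spaces.
        expected, rel_path = line.split(maxsplit=1)
        target = bag_root / rel_path
        if not target.is_file():
            failures.append(rel_path)
            continue
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected.lower():
            failures.append(rel_path)
    return failures


def demo() -> Tuple[List[str], List[str]]:
    """Build a tiny bag, verify it, corrupt the payload, and verify again."""
    with tempfile.TemporaryDirectory() as tmp:
        bag = Path(tmp)
        (bag / "data").mkdir()
        payload = bag / "data" / "readings.csv"
        payload.write_text("t,value\n0,1.5\n", encoding="utf-8")
        write_manifest(bag)
        before = verify_manifest(bag)
        # Simulate corruption after the manifest was written.
        payload.write_text("t,value\n0,9.9\n", encoding="utf-8")
        after = verify_manifest(bag)
    return before, after


if __name__ == "__main__":
    print(demo())
```

Because the manifest lives outside the /data directory, the check touches only payload files, which reflects the separation between metadata and payload described above. A Working DataCrate carries no such manifest, so this check does not apply to it.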
We want to help make data more accessible and more useful; our purpose is to develop and support methods to locate, identify and cite data and other research objects.
Why is it so important to cite data? Books and journal articles have long benefited from an infrastructure that makes them easy to cite, a key element in the process of research and academic discourse. We believe that you should cite data in just the same way that you can cite other sources of information, such as articles and books.