This document specifies DataCrate, a method of organising file-based data with associated metadata in both human- and machine-readable formats. It is based on the schema.org linked-data vocabulary, supplemented with terms from the SPAR ontologies and [PCDM] where schema.org does not have coverage. The motivation for this work comes from the research domain.
A DataCrate is a dataset: a set of files contained in a single directory. There are two ways of organising a DataCrate.
For working data, or data that does not need to be distributed with checksums, a Working DataCrate is a plain directory containing payload data files, with two metadata files at the root: one for humans and one for machines.
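The Working DataCrate layout can be checked with a few lines of code. The following is a minimal sketch, not part of the specification: the function name is hypothetical, and the file names `CATALOG.html` and `CATALOG.json` are assumptions, since this section does not name the two metadata files.

```python
from pathlib import Path

# Assumed names: this section only says there is one metadata file
# for humans and one for machines at the root of the directory.
HUMAN_METADATA = "CATALOG.html"
MACHINE_METADATA = "CATALOG.json"


def looks_like_working_datacrate(root, human=HUMAN_METADATA, machine=MACHINE_METADATA):
    """Return True if `root` is a directory with both metadata files at its top level."""
    root = Path(root)
    return root.is_dir() and (root / human).is_file() and (root / machine).is_file()
```

The check only tests for the presence of the two files. It does not validate their contents against schema.org or any other vocabulary.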
For distribution or archiving, where integrity is important, a Bagged DataCrate is a BagIt bag conforming to the DataCrate BagIt profile, with the payload files in the /data directory. A Bagged DataCrate has a clear separation between metadata and payload, and can be integrity-checked using the checksums in the BagIt manifest.
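The integrity check mentioned above can be illustrated with a short sketch. It assumes the usual BagIt layout, in which a file named `manifest-<algorithm>.txt` at the bag root lists one checksum and one payload path per line. The function name `verify_bag_payload` is hypothetical, SHA-256 is an assumed default, and the sketch does not check conformance to the DataCrate BagIt profile or decode percent-encoded path names.

```python
import hashlib
from pathlib import Path


def verify_bag_payload(bag_root, algorithm="sha256"):
    """Return the payload paths that are missing or whose checksum does not match."""
    bag_root = Path(bag_root)
    manifest = bag_root / f"manifest-{algorithm}.txt"
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Each manifest line is "<checksum> <relative path>".
        expected, rel_path = line.split(maxsplit=1)
        target = bag_root / rel_path
        if not target.is_file():
            failures.append(rel_path)
            continue
        # Reads the whole file into memory; fine for a sketch, not for huge payloads.
        digest = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if digest != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty return value means every file listed in the manifest is present and matches its recorded checksum. For real bags, an established BagIt library is likely a better choice than this hand-rolled check.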