This document specifies DataCrate, a method of organising file-based data with associated metadata in both human-readable and machine-readable formats. It is based on the schema.org linked-data vocabulary, supplemented with terms from the SPAR ontologies and [PCDM] where schema.org does not have coverage. The motivation for this work comes from the research domain.
A DataCrate is a dataset: a set of files contained in a single directory. There are two ways of organising a DataCrate.
For working data, or data that does not need to be distributed with checksums, a Working DataCrate is a plain directory containing payload data files, with two metadata files at the root: one for humans and one for machines.
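As a rough illustration of the Working DataCrate layout, the following Python sketch checks that a directory has both metadata files at its root. The file names `CATALOG.html` and `CATALOG.json` and the function name `is_working_datacrate` are assumptions made for this example; the text above does not name the two files.

```python
from pathlib import Path

# Assumed file names: the text above only says there is one metadata
# file for humans and one for machines at the root of the directory.
HUMAN_METADATA = "CATALOG.html"
MACHINE_METADATA = "CATALOG.json"


def is_working_datacrate(root, human=HUMAN_METADATA, machine=MACHINE_METADATA):
    """Return True if `root` is a directory with both metadata files at its top level."""
    root = Path(root)
    return (
        root.is_dir()
        and (root / human).is_file()
        and (root / machine).is_file()
    )
```

The check only looks at the layout. It does not validate the content of either metadata file.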
For distribution or archiving, where integrity is important, a Bagged DataCrate is a BagIt bag conforming to the DataCrate BagIt profile, with the payload files in the /data directory. A Bagged DataCrate has a clear separation between metadata and payload, and can be integrity-checked using the checksums in the BagIt manifest.
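The integrity check described above can be sketched in Python as follows. This is a minimal sketch, not a full BagIt validator: it assumes a payload manifest named `manifest-sha256.txt` at the bag root with one `checksum path` pair per line, and the function name `verify_bag_manifest` is hypothetical. It does not check the DataCrate BagIt profile, tag manifests, or percent-encoded paths.

```python
import hashlib
from pathlib import Path


def verify_bag_manifest(bag_dir, algorithm="sha256"):
    """Return the payload paths whose checksum is missing or does not match the manifest."""
    bag_dir = Path(bag_dir)
    # Assumption: the payload manifest is named manifest-<algorithm>.txt.
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        # Each entry is "<hex digest> <path relative to the bag root>".
        expected, rel_path = line.split(maxsplit=1)
        target = bag_dir / rel_path
        if not target.is_file():
            failures.append(rel_path)
            continue
        # Reads the whole file into memory; fine for a sketch, not for huge payloads.
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty list means every file listed in the manifest is present and matches its recorded checksum. Files in /data that are not listed in the manifest are not detected by this sketch.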