OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.
Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.
The PREMIS maintenance activity is responsible for maintaining, supporting, and coordinating future revisions to the PREMIS data dictionary. The Preservation Metadata: Implementation Strategies Working Group, convened by OCLC and RLG, initially developed the PREMIS data dictionary as a specification with the goal of creating an implementable set of "core" preservation metadata elements, with broad applicability within the digital community. A supporting XML schema allows for implementation of the element set and is maintained by the Network Development and MARC Standards Office of the Library of Congress.
Within GOKb, participants can work on creating high-quality data in areas that mesh with their skills and priorities. The data can then be reused by anyone, for any purpose. Potential use cases include knowledge base providers looking to supplement their data, libraries building open source software, and individuals experimenting with open data.
N. Micic, D. Neagu, I. Campean, and E. Habib Zadeh. (2017) Every industry has significant data output as a product of their working process, and with the recent advent of big data mining and integrated data warehousing it is the case for a robust methodology for assessing the quality for sustainable and consistent processing. In this paper a review is conducted on Data Quality (DQ) in multiple domains in order to propose connections between their methodologies. This critical review suggests that within the process of DQ assessment of heterogeneous data sets, not often are they treated as separate types of data in need of an alternate data quality assessment framework. We discuss the need for such a directed DQ framework and the opportunities that are foreseen in this research area and propose to address it through degrees of heterogeneity.
P. Missier, K. Belhajjame, and J. Cheney. Proceedings of the 16th International Conference on Extending Database Technology, pages 773-776. New York, NY, USA, ACM, (2013)