In the context of the contemporary push for “big data” in many fields, we review recent experiences building large databases for turbulence research. We consider data from direct numerical simulations (DNS) of various canonical flows, as well as from experimental studies and related numerical simulations of wall-bounded turbulence, where the storage needs are particularly challenging owing to the very large range of length and time scales present in these flows at high Reynolds numbers. The focus is on the move away from the traditional approach to data handling and analysis, in which datasets are transferred to individual researchers’ computers, toward one in which much of the analysis is performed on the hosting system that stores the data. In this context we summarize a unique open numerical laboratory that archives over 200 terabytes of DNS data, including full spatio-temporal flow fields of several canonical flows. Particular attention is given to the access requirements that must be met for large datasets to become open to the research community, and to the success the system has had in democratizing access to such datasets.