Details about EASYDAB

The scientific outcomes of publicly funded projects should be published in a way that makes the results reusable and comprehensible. Providing documented data to data users ensures that the datasets are assessed more widely and eventually become better known. The data originator also gains visibility by providing the data.

Unfortunately, data users often have difficulty identifying datasets that are useful and available for their purpose. Many datasets that are stored in a repository together with their mandatory metadata are not reusable, e.g. because:

 

  • Detailed discipline-specific metadata are missing.
    → Even a knowledgeable user cannot tell what the dataset contains.
  • The metadata do not state whether and how data and metadata were checked for accuracy and completeness.
    → The data user does not know how reliable the data are.
  • Information on machine-readability and associated software is not given.
    → The data cannot be found by automated searches.
  • Data are saved in proprietary and undocumented file formats.
    → The data are difficult to read.
  • File formats depend on the version of the writing program.
    → The data can only be read with a specific version of the program.
  • The rights for reuse of the data are not specified.
    → It is unclear whether the data user may use the data.


Therefore, it is reasonable to publish the simulation output in a standardized way to foster reuse.

 


With EASYDAB, it is directly visible which data comply with Earth system science discipline-specific standards!

 

Figure: EASYDAB workflow

 


 

Participating datasets or dataset collections are marked with the EASYDAB logo, which is registered with and protected by the German Patent and Trade Mark Office.

 


 

EASYDAB will be assigned to mature¹ datasets in repositories. Datasets with EASYDAB follow research-field-specific standards, which define requirements for:

  • rich metadata with controlled vocabulary,
  • the landing pages,
  • file formats (netCDF),
  • the structure within the files and
  • the license.
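
As a rough illustration of how such requirements could be checked automatically, the following minimal Python sketch tests a set of global attributes for required entries and a file name for a netCDF extension. The attribute names, the helper functions and the example values are assumptions made for this illustration only; they are not taken from any EASYDAB-approved standard, which defines its own requirements.

```python
# Illustrative only: these attribute names are assumptions, not the
# requirement list of any EASYDAB-approved standard.
REQUIRED_ATTRIBUTES = ("title", "institution", "license", "Conventions")


def missing_attributes(global_attrs, required=REQUIRED_ATTRIBUTES):
    """Return the required attribute names that are absent or empty."""
    missing = []
    for name in required:
        value = global_attrs.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def is_netcdf_filename(filename):
    """Rough file-format check based on the file extension only."""
    return filename.lower().endswith((".nc", ".nc4"))


# Hypothetical metadata, as it might be read from a file's global attributes.
example_attrs = {
    "title": "Example model output",
    "institution": "Example Institute",
    "Conventions": "CF-1.8",
}

print(missing_attributes(example_attrs))  # ['license']
print(is_netcdf_filename("output.nc"))    # True
```

In practice, the attributes would be read from the netCDF file itself, and the list of required entries would come from the applicable discipline-specific standard.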

 

These datasets follow the FAIR principles² and should be easy to find, e.g. with search engines. Their reuse should be as easy as possible. The maturity checks will be performed by the repositories. All requirements should be easy to implement.

 

A standard for Atmospheric Model Data has been established within the AtMoDat project. Details about the standard can be found here.

 

Do you have a discipline-specific standard that should be incorporated? Please contact us at contact@easydab.de.

 


References

  1. Maturity describes the degree of formalization and standardization of a data object (data + metadata) with respect to FAIRness and the quality of the (meta)data. Data objects mature as they pass through the different data post-production steps. The higher the maturity, the easier it is to reuse the data.
  2. See e.g. Wilkinson et al. (2016): The FAIR Guiding Principles for scientific data management and stewardship. https://doi.org/10.1038/sdata.2016.18
  3. https://opensource.org/licenses