7. Edmond – open research data repository

Note

The section below assumes you are an employee of the Max Planck Society (MPG).

7.1. Mission statement for Edmond

Edmond is the Max Planck Society’s open research data repository. Its mission statement reads:

  • Edmond – Open Research Data Repository of the Max Planck Society

  • Edmond – Serves the publication of research data from all disciplines

  • Edmond – Offers scientists the ability to create citable research objects

  • Edmond – Enables secure preservation: provision, describing, documentation, linking, publishing and archiving of all kinds of data

  • Edmond – Supports standardized metadata profiles

  • Edmond – Gives an insight into the work of scientists at the Max Planck Institutes

7.2. Typical scenario: publishing data with manuscript

The primary use case of Edmond for scientists is to publish data sets. The anticipated typical scenario is as follows:

Before submission of the manuscript for review

  • A manuscript is prepared for submission, and is based on/makes use of a data set.

  • The data set can be uploaded to Edmond and a Digital Object Identifier (DOI) is created (by Edmond).

  • The manuscript can now reference the data set through the DOI.

After submission and during review

  • Reviewers may use the data set as part of the review. (A private link to the data set can be used for this.)

At publication time

  • The data set becomes publicly accessible. (It can be public or not public during the review process.)

  • Readers of the publication may access the data.

7.3. Frequently asked questions

See Edmond’s help pages at https://edmond.mpdl.mpg.de/guides/help.html for the most recent and complete collection of questions and answers.

1. Is there a size limit for data sets that I can publish through Edmond?

Not as such. If the data set is very large, consider whether users of the data can meaningfully download and process it. Recommendations regarding file numbers and sizes:

  • A data set should not have more than 1000 files (order of magnitude). If your data set has more files, consider putting them in an archive such as a zip or tar file. See also Practical aspects storing data for long-term archival.

  • A single file should not be significantly larger than 1 TB.

  • For larger data sets, it might be advisable to contact Edmond support before depositing the data.
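The recommendations above can be checked locally before uploading. The following is a minimal sketch using only the Python standard library; the function names `check_dataset` and `pack_dataset` are hypothetical helpers introduced for illustration and are not part of Edmond. The thresholds mirror the rough guidance above (about 1000 files, about 1 TB per file, assumed here to mean the decimal 10^12 bytes) and are not hard limits enforced by Edmond.

```python
import tarfile
from pathlib import Path

# Rough guidance from the recommendations above, not hard limits.
MAX_FILES = 1000
MAX_FILE_BYTES = 10**12  # 1 TB, assuming the decimal definition


def check_dataset(root, max_file_bytes=MAX_FILE_BYTES):
    """Return (file_count, oversized_files) for all regular files below root."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    oversized = [p for p in files if p.stat().st_size > max_file_bytes]
    return len(files), oversized


def pack_dataset(root, archive_path):
    """Bundle the directory root into one gzip-compressed tar archive."""
    root = Path(root)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the data set folder.
        tar.add(root, arcname=root.name)
    return Path(archive_path)
```

A possible use: call `check_dataset("my_dataset")`, and if the returned file count exceeds `MAX_FILES`, call `pack_dataset("my_dataset", "my_dataset.tar.gz")` and upload the single archive instead. Whether to use tar or zip, and whether to compress, depends on the data and on who will use it.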

2. If I have notebooks in my Edmond data set, can they be executed through mybinder.org?

Yes, Jupyter notebooks in Edmond data sets can be executed on the mybinder.org service. This is a good way to demonstrate the reproducibility of the code running in the notebook. Example: fangohr/reproducibility-repository-example

Please contact the SSU (Contact) if you would like further details and support.

3. What is the difference between Keeper and Edmond?

The key difference is that Keeper provides archival of data whereas Edmond publishes the data. In more detail:

Edmond:

  • one snapshot of the data is assigned a DOI

  • the data is publicly visible under the DOI, i.e. published.

  • the publication of the data means that the data is also archived.

Keeper:

  • Keeper “libraries” can be archived and a DOI can be assigned

  • the DOI and an associated web page with the metadata will be public, but the data set itself (i.e. the Keeper library) is not published (only archived).

  • Only the owner of the data can access the archived data.

Keeper provides additional functionality, such as cloud-hosted file sharing services (comparable to ownCloud, Nextcloud, Dropbox, OneDrive, …).

7.4. Getting started

Note

Feel free to approach the SSU for advice and to share your experience using Edmond. (Contact)