External repositories

There are disciplinary, generic and institutional repositories. You can search for suitable repositories in the re3data.org portal. Disciplinary repositories provide good-quality datasets, usually in ready-to-use formats. Institutional repositories offer a direct way to publish research outputs that may otherwise be hard to publish, such as technical reports, theses and other forms of grey literature.

Generic repositories (e.g. Figshare, Zenodo or Dryad) are popular because they offer data creators a simple way to store heterogeneous, multidisciplinary datasets. They assign persistent identifiers and provide APIs through which their contents can be harvested automatically by other repositories, which increases the visibility and findability of their data holdings.
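As a small illustration of why persistent identifiers help findability, the sketch below turns a DOI into a link that resolves through the doi.org proxy. The function name `doi_to_url` and the example DOI are hypothetical, and the pattern is a deliberately loose check, not a full validation of the DOI syntax.

```python
import re
from urllib.parse import quote

# Loose pattern for DOIs: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an illustrative assumption, not a complete DOI validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver URL for a DOI string."""
    cleaned = doi.strip()
    # Strip common prefixes so "doi:10.x/y" and full URLs are accepted too.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the "/" separators.
    return "https://doi.org/" + quote(cleaned, safe="/")


# Hypothetical DOI, used only to show the call.
print(doi_to_url("doi:10.1234/example.5678"))
```

Because the link is built from the identifier alone, it stays valid even if the repository later changes the address of the landing page.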

Responsible institution | California Digital Library (CDL) | CERN – OpenAIRE | Digital Science (Macmillan-NPG) | FIZ Karlsruhe
Costs | 120 USD per dataset | Free up to 50 GB | Free up to 20 GB | Contract costs: 595 p.a. or 7.56 per GB (one-time payment)
Hosting of the data | UC3 Merritt (California Digital Library), USA | CERN Data Centre | Amazon Web Services S3 in Dublin (data push to local server possible) | FIZ Karlsruhe

When selecting a repository, pay attention to the following minimum requirements:

  • The repository is supported by an institution that guarantees the long-term availability of the service.
  • It supports persistent identifiers (DOI, Handle or URN).
  • Your deposited data are indexed by various open access directories to ease search and discovery.
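
The three minimum requirements above can be written down as a simple checklist. The following is a minimal sketch only: the `RepositoryProfile` fields, the function name and the example values are hypothetical, and you would have to fill them in by hand from a repository's own documentation or its re3data.org entry.

```python
from dataclasses import dataclass, field

# Identifier schemes named in the checklist above.
ACCEPTED_PID_SCHEMES = {"doi", "handle", "urn"}


@dataclass
class RepositoryProfile:
    """Hand-collected facts about a candidate repository (hypothetical structure)."""
    name: str
    backed_by_institution: bool
    pid_schemes: set[str] = field(default_factory=set)
    indexed_in: list[str] = field(default_factory=list)


def unmet_requirements(repo: RepositoryProfile) -> list[str]:
    """Return the minimum requirements that the repository does not meet."""
    problems = []
    if not repo.backed_by_institution:
        problems.append("no institution guaranteeing long-term availability")
    # Compare case-insensitively so "DOI" and "doi" are treated alike.
    if not ACCEPTED_PID_SCHEMES & {s.lower() for s in repo.pid_schemes}:
        problems.append("no persistent identifiers (DOI, Handle or URN)")
    if not repo.indexed_in:
        problems.append("not indexed by any open access directory")
    return problems


# Made-up example values, for illustration only.
candidate = RepositoryProfile(
    name="Example Repository",
    backed_by_institution=True,
    pid_schemes={"DOI"},
    indexed_in=["Example Directory"],
)
print(unmet_requirements(candidate))
```

An empty list means all three minimum requirements are met; anything else names what to clarify with the repository before depositing.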

The ULB provides advice on finding a suitable repository. If you have specific questions, do not hesitate to contact us at openscience@bibliothek.uni-halle.de. You can also find out whether you are eligible to publish your data in Share_it, our publication and research data repository.

Other important things to consider are:

  • Prefer repositories that hold some form of certification, such as the CoreTrustSeal or the DINI Certificate. These certificates indicate the quality and trustworthiness of a repository.
  • Some data types (e.g. non-anonymized personal or medical datasets) may not be suitable for storage outside the EU or even outside your institution (see table). In this case, your institutional repository may offer a suitable, free alternative for securing your data.
  • Think about the potential future use of your data and your target community: are you choosing the right formats?
  • Consider the requirements of publishers, funding institutions and specialized communities in advance, as you may be required to deposit your results in specific repositories.
  • Generic repositories may not offer data curation or quality control for deposited data. Make sure you distribute your research outputs in stable formats that are suitable for long-term archiving and are thus likely to withstand hardware and software obsolescence.
  • Because some generic repositories are run for profit, some advanced features may be available only to paying customers.