FAIR (meta)data

Good research data management produces FAIR (meta)data: (meta)data that are Findable, Accessible, Interoperable and Reusable. We use the term (meta)data because metadata and data are two separate entities, and both should adhere to the FAIR principles. More information on (the distinction between) data and metadata can be found here.

The following sections explain the basics of FAIR. For more in-depth knowledge, we refer you to the GO FAIR initiative and FAIRsFAIR. For the book lovers among us, we recommend reading the FAIRy tale.

How can you make your (meta)data findable?
How can you make your (meta)data accessible?
How can you make your (meta)data interoperable?
How can you make your (meta)data reusable?
Tools

How can you make your (meta)data findable?

Assign globally unique and persistent identifiers to your dataset(s). Globally unique means that the identifier refers to one specific item, and to that item alone. Persistent means that the identifier will always lead you to the item it has been assigned to, even if the location of the item (e.g. the URL) changes over time. Usually, when you upload a dataset to a repository, that repository assigns a persistent identifier to your dataset automatically.

An example of a persistent identifier is a DOI. If we want to refer to the FAIRy tale mentioned above, which is hosted in the Zenodo repository, we prefer to cite the DOI (https://doi.org/10.5281/zenodo.2248200) instead of the URL (https://zenodo.org/record/2248200#.YMnea2gzZPZ), which is a less stable link.
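As a small illustration of how a DOI becomes a citable link, the sketch below prefixes a bare DOI with the https://doi.org/ resolver. The helper name `doi_to_url` and the simplified syntax pattern are assumptions made for this example; the pattern is not a complete DOI validator.

```python
import re

# Simplified pattern (an assumption for illustration): "10.", a registrant
# code of 4 to 9 digits, a slash, and a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.[0-9]{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the doi.org resolver link for a bare DOI, a 'doi:' string or a doi.org link."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return f"https://doi.org/{doi}"


print(doi_to_url("10.5281/zenodo.2248200"))
```

Citing the resolver link rather than the repository URL means the reference keeps working if the repository reorganises its pages.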

Another example of a persistent identifier is an ORCID iD. It is a unique number that you claim to distinguish yourself from others, especially from those who have the same name (or initials) as you. The authors of the FAIRy tale all have an ORCID iD, e.g. Karsten Kryger Hansen (https://orcid.org/0000-0002-2407-8764).
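The last character of an ORCID iD is a check digit computed with the ISO 7064 MOD 11-2 algorithm, which lets software catch typing errors. The sketch below shows one way to verify it; the function names are chosen for this example only.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for char in base_digits:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the format and the check digit of an ORCID iD such as 0000-0002-2407-8764."""
    match = re.fullmatch(r"([0-9]{4})-([0-9]{4})-([0-9]{4})-([0-9]{3})([0-9X])", orcid)
    if match is None:
        return False
    base_digits = "".join(match.group(1, 2, 3, 4))
    return orcid_check_digit(base_digits) == match.group(5)


print(is_valid_orcid("0000-0002-2407-8764"))  # True
print(is_valid_orcid("0000-0002-2407-8765"))  # False: wrong check digit
```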

As we have mentioned, the metadata and the data are two separate entities, so your metadata should explicitly mention the identifier/reference of the dataset that they describe. That way, the metadata are connected to the data.

To further improve the findability of your data, add rich metadata (e.g. creator(s), publication date, description, related identifier(s), language, keywords, etc.). Try searching for a dataset that would be relevant for your own research: which search facets do you use? That will help you determine which metadata you should assign to your data.
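To make this concrete, the sketch below represents a metadata record as a Python dictionary that names the identifier of the dataset it describes, and reports which descriptive fields are still empty. The field names loosely follow the examples listed above, and the record itself (DOI, creator, title, date) is entirely hypothetical.

```python
# Fields we want every record to fill in (an assumption for this example).
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "title",
    "publication_date",
    "description",
    "keywords",
)


def missing_fields(record: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in required if not record.get(field)]


# Hypothetical metadata record; none of these values describe a real dataset.
record = {
    "identifier": "https://doi.org/10.1234/example-dataset",
    "creators": [{"name": "Doe, Jane"}],
    "title": "Example survey responses",
    "publication_date": "2021-06-16",
    "description": "",
    "language": "en",
    "keywords": [],
}

print(missing_fields(record))  # ['description', 'keywords']
```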

As a UHasselt researcher, you are expected to upload the metadata of the datasets underlying your peer-reviewed publications in the UHasselt metadata repository.

 

How can you make your (meta)data accessible?

Accessible means that humans as well as machines should be able to retrieve the (meta)data by means of a standardized communication protocol. In most cases this is HTTP(S), which in layman's terms means "you click on the link and the (meta)data appear". It might also be a phone number or an e-mail address of the person who needs to be contacted to gain access to the data, as long as this contact procedure is stated clearly in the metadata.
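For machines, retrieval over the web usually means sending an HTTP(S) request to the persistent identifier. The sketch below only builds such a request with Python's standard library and leaves the network call commented out. It assumes that the DOI resolver supports content negotiation for the DataCite JSON media type shown; check the documentation of your DOI registration agency before relying on it.

```python
from urllib.request import Request, urlopen

# Media type assumed to be supported by the resolver for DataCite DOIs.
DATACITE_JSON = "application/vnd.datacite.datacite+json"


def build_metadata_request(doi_url: str) -> Request:
    """Build (but do not send) an HTTP GET request asking for metadata as JSON."""
    return Request(doi_url, headers={"Accept": DATACITE_JSON})


request = build_metadata_request("https://doi.org/10.5281/zenodo.2248200")
print(request.get_method(), request.full_url)

# To actually retrieve the metadata (requires network access):
# with urlopen(request) as response:
#     metadata_json = response.read().decode("utf-8")
```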

Accessible therefore does not mean that the data have to be Open Access. There may be valid reasons not to make your data available to the public (more information). In that case, you can make your data available under restricted access, adding an authentication and authorization procedure to the protocol, or not make them available at all (closed access). However, even when the data are not (or no longer) available, the metadata should remain accessible. UHasselt researchers can deposit their metadata in the metadata repository of the institution, guaranteeing their accessibility for the long term.

 

How can you make your (meta)data interoperable?

Your (meta)data are interoperable when they can be combined and exchanged with other (meta)data by humans as well as computers. To make this happen, you should:

  • Use standardized language in your (meta)data: controlled vocabularies, ontologies and thesauri that are common (in your discipline) and well documented. 
    • E.g. the MeSH thesaurus is a controlled, hierarchically organized vocabulary produced by the U.S. National Library of Medicine.
  • Use a (meta)data model that is common (in your discipline) and well documented.
    • E.g. Dublin Core and DataCite are generic metadata standards.
    • E.g. MIFlowCyt is a metadata standard used in flow cytometry.
    • E.g. FITS is a standard data format used in astronomy.
  • Use standard, open file formats instead of proprietary ones (more information).

If your dataset is built on another dataset or vice versa, or if your dataset underlies a publication, you should make these connections explicit in the metadata, preferably by using qualified references and links (e.g. a persistent identifier).
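A qualified reference states not only which item is related, but also how. The sketch below models this with a small data class; the relation names are a subset of the relationType values defined in the DataCite metadata schema, while the class name, field names and DOIs are hypothetical.

```python
from dataclasses import dataclass

# A few relation types from the DataCite schema; the full list is longer.
KNOWN_RELATION_TYPES = {"IsSupplementTo", "IsDerivedFrom", "IsSourceOf", "IsCitedBy"}


@dataclass(frozen=True)
class RelatedIdentifier:
    """A link from this dataset to another item, qualified by a relation type."""

    identifier: str
    identifier_type: str
    relation_type: str

    def __post_init__(self):
        if self.relation_type not in KNOWN_RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.relation_type!r}")


# Hypothetical DOIs: the dataset underlies an article and builds on an older dataset.
related = [
    RelatedIdentifier("10.1234/example-article", "DOI", "IsSupplementTo"),
    RelatedIdentifier("10.1234/older-dataset", "DOI", "IsDerivedFrom"),
]

for link in related:
    print(f"{link.relation_type}: https://doi.org/{link.identifier}")
```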

 

How can you make your (meta)data reusable?

Fellow researchers can only reuse your data if they understand them. Therefore, you should add metadata that describe the context, content and structure of the data: purpose, provenance, methodology, procedures, limitations, variables, etc. Based on that documentation, an outsider can assess whether the dataset is relevant for their study, interpret the data and, as a result, reuse them. Ideally, these metadata should be structured according to a metadata standard, too (cf. interoperability).

Finally, a dataset can only be reused if a reuse license is attached to it, stating what others can and cannot do with it. The license should also be available in a machine-readable format. Examples of reuse licenses are the Creative Commons licenses. More open licenses can be found on the Open Definition website. If no license accompanies your dataset, default copyright rules apply and others cannot reuse it without asking your explicit permission.
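One common way to make a license machine-readable is to record a standard identifier and the URL of the license text in the metadata. The sketch below uses SPDX identifiers for two Creative Commons licenses; the `rights` field layout and the helper `with_license` are assumptions for this example, not a prescribed format.

```python
# SPDX identifiers mapped to the official Creative Commons license URLs.
LICENSES = {
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
}


def with_license(record: dict, spdx_id: str) -> dict:
    """Return a copy of the metadata record with a machine-readable license entry."""
    if spdx_id not in LICENSES:
        raise KeyError(f"license not in the local list: {spdx_id!r}")
    updated = dict(record)
    updated["rights"] = {"identifier": spdx_id, "url": LICENSES[spdx_id]}
    return updated


# Hypothetical record, used only to show the resulting structure.
licensed = with_license({"title": "Example survey responses"}, "CC-BY-4.0")
print(licensed["rights"]["url"])
```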

 

Tools

Here is the full list of tools and manuals that we recommend.
