The FAIR Principles
The FAIR Principles are a “set of standards that connects researchers, publishers, and data repositories in Earth, space, and environmental sciences to … accelerate scientific discovery and enhance the integrity, transparency, and reproducibility of scientific data on a large scale” (COPDESS). Essentially, scientific data should be Findable, Accessible, Interoperable, and Reusable.
Why FAIR Principles Matter
Understanding why the FAIR principles matter is more important than memorizing what they are. The practice of science is changing, and scientific publication and peer review are changing with it. Key changes include:
Scientific analyses are being encoded in repeatable, shareable workflows.
Publications are moving away from static print documentation to interactive demonstrations online.
These capabilities rely completely on the availability of the data underpinning the research.
The FAIR principles were originally articulated by FORCE11, an organization founded on the belief that “semantically enhanced, media-rich digital publishing will be more powerful than traditional print media or electronic copies of printed works.”
FAIR principles have been adopted by researchers, publishers, and data repositories affiliated with COPDESS, the Coalition on Publishing Data in the Earth and Space Sciences. Partners include:
Publications such as Nature and Science
Funding agencies like NASA, USGS, NOAA, and NIH
Professional groups like AGU
For a full list of FAIR partners, see the COPDESS FAIR Data Project. To view the list of signatories committed to FAIR data, visit the Statement of Commitment.
FAIR Principles
The following descriptions summarize the FAIR principles and are adapted from GO FAIR.
Findable
The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services.
Findable characteristics include:
Data and metadata are assigned globally unique and persistent identifiers.
Data are described by rich metadata that clearly include the identifier of the data they describe.
Data and metadata are registered or indexed in a searchable resource.
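The findable characteristics above can be illustrated with a toy example. The following is only a minimal sketch, not a real repository API: the names `MetadataRecord` and `MetadataIndex` are hypothetical, and the DOIs are placeholders. It shows metadata that carry the identifier of the data they describe, registered in a searchable index that rejects duplicate identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical metadata record; real schemas carry many more fields."""
    identifier: str       # persistent identifier of the described data (placeholder DOIs below)
    title: str
    keywords: tuple = ()


class MetadataIndex:
    """Toy in-memory stand-in for a searchable repository index."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        # Identifiers must be unique, so refuse to register one twice.
        if record.identifier in self._records:
            raise ValueError(f"identifier already registered: {record.identifier}")
        self._records[record.identifier] = record

    def search(self, term):
        # Case-insensitive substring match on title and keywords.
        term = term.lower()
        return [
            r for r in self._records.values()
            if term in r.title.lower() or any(term in k.lower() for k in r.keywords)
        ]


index = MetadataIndex()
index.register(MetadataRecord("doi:10.1234/example-a", "Solar irradiance daily averages", ("irradiance", "sun")))
index.register(MetadataRecord("doi:10.1234/example-b", "Magnetometer calibration data", ("magnetic field",)))
hits = index.search("irradiance")
```

Here `hits` contains only the first record. In practice the index would be a public repository or catalog rather than a dictionary in memory.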
Accessible
Once the user finds the required data, they need to know how to access them, including details about authentication and authorization.
Characteristics of being accessible include:
Data and metadata are retrievable using common protocols.
The protocol is open and free.
Authentication and authorization procedures are applied where necessary.
Metadata remain accessible even when data are no longer available.
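As one illustration of retrieval over a common, open protocol, the sketch below builds an HTTPS request that asks the DOI resolver for machine-readable metadata. It assumes the DOI's registration agency supports content negotiation. The function name `build_metadata_request` is hypothetical and the DOI is a placeholder, so the request is constructed but not sent.

```python
import urllib.request

DOI_RESOLVER = "https://doi.org/"


def build_metadata_request(doi):
    """Build (but do not send) an HTTPS request for a DOI's metadata."""
    return urllib.request.Request(
        DOI_RESOLVER + doi,
        # Ask for citation metadata as JSON instead of the landing page.
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )


request = build_metadata_request("10.1234/example")  # placeholder DOI, not a real dataset

# To retrieve the metadata (requires network access and a real DOI):
# import json
# with urllib.request.urlopen(request) as response:
#     metadata = json.load(response)
```

Because the metadata are requested through the identifier rather than a file location, they can remain retrievable even if the underlying data files are later withdrawn.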
Interoperable
Data usually need to be integrated with other data. Additionally, data must interoperate with applications or workflows for analysis, storage, and processing.
Interoperable data and metadata:
Use a formal, accessible, broadly applicable language for knowledge representation.
Use vocabularies that follow FAIR principles.
Include qualified references to other data and metadata.
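One common way to meet these characteristics is to express metadata as JSON-LD using the schema.org vocabulary. The sketch below is an assumption for illustration, not a format this guide prescribes, and the dataset name and identifiers are placeholders. The `isBasedOn` property acts as a qualified reference: it names another dataset and states how the two are related.

```python
import json

# Placeholder values throughout; only the schema.org terms are real.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example solar irradiance dataset",
    "identifier": "https://doi.org/10.1234/example-a",
    # Qualified reference: this dataset was derived from another one.
    "isBasedOn": "https://doi.org/10.1234/example-raw",
}

jsonld = json.dumps(dataset, indent=2)
print(jsonld)
```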
Reusable
The ultimate goal of FAIR is to optimize data reuse. To achieve this, metadata and data must be described well enough that they can be replicated and/or combined in different settings.
Reusable data and metadata:
Have clear, accessible data usage licenses.
Are associated with detailed provenance.
Meet domain-relevant community standards.
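A license and a provenance trail can be kept alongside the data as structured fields. The following is a minimal sketch under stated assumptions: the class names `ReuseInfo` and `ProvenanceStep` and the recorded activities are hypothetical, and the license is given as an SPDX identifier only as one possible convention.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceStep:
    activity: str    # what was done to the data
    agent: str       # who or what did it
    timestamp: str   # ISO 8601 time in UTC


@dataclass
class ReuseInfo:
    license: str                                  # e.g. an SPDX identifier such as "CC-BY-4.0"
    provenance: list = field(default_factory=list)

    def record(self, activity, agent):
        """Append a provenance step stamped with the current UTC time."""
        stamp = datetime.now(timezone.utc).isoformat()
        self.provenance.append(ProvenanceStep(activity, agent, stamp))


reuse = ReuseInfo(license="CC-BY-4.0")
reuse.record("downloaded raw instrument files", "ingest script (hypothetical)")
reuse.record("applied calibration", "processing pipeline (hypothetical)")
```

Community standards in a given domain usually define their own provenance and license fields, so a structure like this would be mapped onto that standard rather than used directly.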
How to Apply FAIR Principles
Adopt FAIR-Compliant Practices:
Assign persistent identifiers to datasets and metadata.
Use rich metadata that describe datasets thoroughly.
Register Metadata:
Index metadata in searchable repositories to enhance discoverability.
Implement Standards for Access and Interoperability:
Ensure retrieval protocols are open and free.
Use FAIR-aligned vocabularies and knowledge representation languages.
Provide Reuse Guidance:
Include detailed provenance information.
Apply clear licenses for data usage.
Collaborate with FAIR Partners:
Follow practices adopted by FORCE11, COPDESS, and similar organizations.
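The steps above can be turned into a simple self-check. The sketch below is a rough illustration rather than an official FAIR assessment: the function name `fair_gaps` and every field name in the record are assumptions chosen for this example, and passing these checks does not by itself make a dataset FAIR.

```python
def fair_gaps(record):
    """Return the names of the checks that a metadata record (a dict) fails."""
    checks = {
        "findable: persistent identifier": bool(record.get("identifier")),
        "findable: descriptive metadata": bool(record.get("title")) and bool(record.get("description")),
        "accessible: open retrieval protocol": str(record.get("access_url", "")).startswith(("https://", "http://")),
        "interoperable: declared vocabulary": bool(record.get("vocabulary")),
        "reusable: license": bool(record.get("license")),
        "reusable: provenance": bool(record.get("provenance")),
    }
    return [name for name, passed in checks.items() if not passed]


# Placeholder record with no vocabulary or provenance declared.
record = {
    "identifier": "doi:10.1234/example",
    "title": "Example dataset",
    "description": "Daily averages from a hypothetical instrument.",
    "access_url": "https://example.org/data/example",
    "license": "CC-BY-4.0",
}
gaps = fair_gaps(record)
```

For this record, `gaps` lists the vocabulary and provenance checks, which points to the next two things to add before publishing.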
Useful Links
Acronyms
FAIR = Findable, Accessible, Interoperable, Reusable
FORCE11 = The Future of Research Communication and e-Scholarship
COPDESS = Coalition on Publishing Data in the Earth and Space Sciences
AGU = American Geophysical Union
NASA = National Aeronautics and Space Administration
NIH = National Institutes of Health
NOAA = National Oceanic and Atmospheric Administration
USGS = United States Geological Survey
Credit: Content taken from a Confluence guide written by Anne Wilson, and modified by Shawn Polson in 2019