DataStream
Data Policy

The long-term success of DataStream depends on the development of appropriate policies and practices for the management of data. These policies and practices must be informed by national and international best practices and—with the help of local leaders and subject matter experts—adapted for the local and regional context.

Background

The Data Policy for DataStream is in line with national and international best practices in data management. Specifically, this policy draws upon principles laid out by organizations including (but not limited to) the following:

  • Organization for Economic Co-operation and Development (OECD)
  • Polar Knowledge Canada (formerly the Canadian High Arctic Research Station)
  • International Arctic Science Committee
  • Convention on Biological Diversity (Article 8)
  • Canadian Tri-council policy on research
  • First Nations Principles of OCAP®

Use of this Policy

This Data Policy will inform the governance of DataStream and the approach to managing the data contained within it. This Data Policy does not override existing, applicable data sharing agreements.

Data Ownership and Licensing

Data Contributors maintain complete ownership of their data and make it openly accessible through an open data license (see DataStream Terms of Use for more information).

Principles

The principles outlined below set the foundational values and concepts for how data are managed and will guide the governance and operation of DataStream as it evolves. The following four principles are proposed as key components of DataStream's Data Policy:

  1. Ethically open access
  2. Data quality
  3. Interoperability
  4. Security and sustainability

1. Ethically Open Access

Data are made available on an equal basis, fully, freely and openly in a timely way. Exemptions to this open data policy are allowable for ethical reasons.

Making data broadly available without restriction (open access) is a growing movement worldwide. It is particularly relevant for data collected in the public interest, using public funds. Open access involves minimizing or eliminating barriers to data access (e.g. restrictive licensing, proprietary formats or high-cost tools).

The aim of DataStream is to open up access to the data contained within it as much as possible. Because the site contains public data that are not of a sensitive nature, this information will be shared without restriction.

Any exemptions from the open sharing of data (other than ethical exemptions described below) must be justified and requested in a dataset-specific Data Management Plan.

2. Data Quality

Strive for completeness and the absence of errors in datasets. Datasets are accompanied by standard documentation.

Efforts are needed to ensure datasets are complete and quality controlled so that data are reliable and can confidently be used to better understand freshwater quality in Canada.

Quality standards for datasets should be explicitly stated to avoid confusion. Where limitations to data quality against a particular standard exist, these limits should be clearly articulated in accompanying documentation.

Data should be linked with related datasets if any exist in order to show relationships among initiatives and datasets. This enhances the utility of research products and contributes to quality assurance.

Metadata (data that serves to provide context or additional information about other data) that accompany datasets should adhere to existing standards for cataloguing data and evaluating fitness for use in a particular application. This contributes to quality assurance, encourages appropriate use of datasets and helps ensure data are discoverable. As such, metadata accompanying datasets should contain key information required for evaluating, understanding and finding datasets including:

  • full information on the entity responsible for collecting and stewarding data;
  • information on data quality, including limitations, so that users can determine fitness for use (that is, how the data can be used);
  • where possible, a Digital Object Identifier (DOI) so that data can be easily tracked and referenced;
  • data descriptors to ensure data are discoverable; and
  • reference to data collection protocols, instruments or any other relevant information needed to replicate data collection.
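
As an illustration only, the metadata elements listed above could be captured in a simple record and checked for completeness before a dataset is published. The sketch below is written in Python; the class name, field names and helper function are hypothetical and are not part of DataStream's actual schema or of any existing metadata standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative assumptions."""
    steward: str                     # entity responsible for collecting and stewarding data
    quality_notes: str               # data quality information, including limitations
    keywords: List[str]              # descriptors that make the dataset discoverable
    collection_protocols: List[str]  # protocols or instruments needed to replicate collection
    doi: Optional[str] = None        # optional, since a DOI is supplied "where possible"


def missing_fields(record: DatasetMetadata) -> List[str]:
    """Return the names of required fields that are empty."""
    required = ("steward", "quality_notes", "keywords", "collection_protocols")
    return [name for name in required if not getattr(record, name)]


# Usage: a record with no quality information is flagged before publication.
example = DatasetMetadata(
    steward="Example Monitoring Group",
    quality_notes="",
    keywords=["water temperature"],
    collection_protocols=["Hypothetical field protocol v1"],
)
print(missing_fields(example))  # prints ['quality_notes']
```

The DOI is deliberately left out of the required fields, which mirrors the "where possible" wording in the list above.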

3. Interoperability

Strive for technological and semantic interoperability with other initiatives.

The international data ecosystem depends upon the cooperative flow of data across data centres. The movement towards widespread cooperation and exchange has the potential to help address complex and large-scale scientific problems.

A prerequisite for such cooperation is ‘interoperability’. Interoperability allows diverse systems and entities to work together (inter-operate) towards shared objectives (Pulsifer, 2013). As this definition implies, interoperability has both technological and human elements. From a technological standpoint, interoperability involves, among other things, releasing data in open, machine-readable formats and the use of commonly owned, professional and non-proprietary standards. The human and organizational side of interoperability requires consistent communication among participants and engagement at key decision-making junctures to inform the evolution of the system.

Sustaining interoperability requires a high level of flexibility to adapt to the rapid and often unpredictable changes in information technology, the characteristics of various research approaches and the cultural diversity across Canada.

DataStream was designed to be as open and compatible as possible with the majority of today’s web technology. As the system evolves, efforts will be made to maintain and, where possible, further the system’s interoperability with existing and emerging web technologies. This will position DataStream to realize the benefits of cooperation with other programs and systems within the broader international data ecosystem.

4. Security and Sustainability

The integrity and security of the data must be safeguarded against corruption and loss to ensure fitness for use over the short and long term.

Planning over a long-term horizon ensures initial investments in data collection and management have enduring impacts and contribute to baseline datasets.

Data security strategies will be clearly articulated and include a description of the division of responsibility among parties to ensure accountability in data stewardship. Long-term sustainability and preservation of data will be achieved through the development and implementation of a systematic data preservation plan.
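
One widely used safeguard against undetected corruption is to record a checksum when a dataset file is stored and to recompute it later. The Python sketch below illustrates that general technique using the standard library; it is an assumption offered for illustration, not a description of how DataStream protects its data, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_checksum: str) -> bool:
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of_file(path) == recorded_checksum
```

In practice the recorded checksum would be kept separately from the file itself, so that damage to one does not silently affect the other. A checksum only detects change; recovering from corruption or loss still depends on backups and a preservation plan.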

Future Work

Partners involved in the development of DataStream are committed to building a long-lived tool that promotes collaboration among various monitoring efforts and fosters a regionally networked approach to water stewardship.

DataStream launched for the Mackenzie Basin in 2016 after a pilot year. As DataStream expands to new regions, all partners are committed to continuously improving DataStream to meet its growing community's needs.

References