Glossary

Terms Definitions
BAM Binary Alignment Map. A file format used to encode aligned genomic data.
CARE Guiding Principles CARE Principles for Indigenous Data Governance. Designed to complement the FAIR Guiding Principles, these people- and purpose-oriented principles and supporting concepts (Collective benefit, Authority to control, Responsibility, Ethics) reflect the crucial role of data in advancing innovation, governance, and self-determination among Indigenous Peoples (Carroll et al. 2020; 2021).
data life cycle The steps in the research process that specifically pertain to data, from planning through collection and generation, analysis and collaboration, evaluation, storage, dissemination, and access to reuse, which can feed back into planning for new data generation. The data and research life cycles are distinct but interrelated.
data management The processes and practices for documenting, storing, and providing access to data and associated metadata throughout the research life cycle.
DMP Data management plan. A document describing the data that will be generated during a research project, and how it will be used, accessed, and stored during the research life cycle. Also known as a data management and sharing plan, though in the definition of data management used here, data sharing is inherently included in data access.
DSI Digital sequence information.
eResearch The use of digital tools and techniques to advance research.
eResearch and libraries staff A broad group that includes research software engineers, research infrastructure developers, data scientists, data stewards, and other professional services staff that deliver library, IT, bioinformatics, and high-performance computing support.
FAIR Guiding Principles FAIR Guiding Principles for scientific data management and stewardship, aiming to improve the Findability, Accessibility, Interoperability, and Reuse of data.
GPU Graphics processing unit. Often used to accelerate data processing.
HDD Hard disk drive.
HPC High-performance computing.
Indigenous data The tangible and/or intangible cultural materials, belongings, knowledge, digital data, and information about Indigenous Peoples or that to which they relate.
Indigenous data sovereignty The expression of a legitimate right of Indigenous Peoples to control the access, the collection, ownership, application and governance of their own data, knowledge, and/or information that derives from unique cultural histories, expressions, practices, and contexts (https://localcontexts.org/indigenous-data-sovereignty/).
kaitiakitanga Guardianship, protection (te reo Māori).
metadata Data that provides information about other data. For biodiversity genomic data, metadata can describe context (e.g., taxonomic, spatial, temporal, and associated permissions) as well as the technologies and methodologies used.
MIGS Minimum Information about a Genome Sequence.
MIxS Minimum Information about any (X) Sequence.
NCBI National Center for Biotechnology Information. Part of the United States National Library of Medicine, and host of various genomic databases such as GenBank and the Sequence Read Archive.
Open data Data anyone can use and share, typically publicly accessible and with an open licence.
research life cycle The steps in the process of scientific research from inception (research planning, design, and funding) to completion (dissemination of results and real-world impact), which often leads back to development of new related projects. The research and data life cycles are distinct but interrelated.
SRA NCBI's Sequence Read Archive, the largest publicly available repository of high-throughput sequencing data.
SSD Solid state drive.
VM Virtual machine. A software-based emulation of a computer system that runs on a physical host machine. VMs are often used to run an operating system different from that of the host computer.