The National Computational Infrastructure (NCI), Australia’s peak supercomputing and big data facility, and the National Centre for Indigenous Genomics (NCIG) at The Australian National University are working to give Indigenous donors of genomic data full control over how their data is used.
NCI and NCIG are developing a platform to solve one of the foundational challenges of ethical genomic science, one that has until now been too hard to properly implement: donor sovereignty over the use of and access to their genomic data. Many studies have shown people’s desire for careful handling of their genomic data, and yet the link between donors and the databases holding their genetic sequences is frequently lost. This is because legacy databases are researcher-centric and not donor-centric, and it is difficult to retrofit individual donor requirements into existing data management systems.
NCIG is an Indigenous-led research organisation charged with managing over 7,000 human blood samples, the genomic data they contain, and associated archival documents dating from the 1960s to the 1990s when the samples were collected.
NCIG Deputy Director Mrs Azure Hermes says, “We as Indigenous people understand the importance of protecting our data. The sample we give to researchers is our story, our history, it’s a link to our ancestors. But make no mistake, this is not just a problem for Indigenous people. It’s an everybody problem. Every single person should ask the question, where is my sample, who is using it and for what purpose?”
NCI’s technical capacity and NCIG’s robust Indigenous-led governance are working hand-in-hand to make donor control of genomic data a reality. The transparency and accountability that are being built into the data management platform reflect the desires of donors and their communities.
NCI Director Professor Sean Smith says, “First Nations people must be in control of their data. Real and meaningful consent from data owners is a requirement of any Indigenous genomics platform, and NCI is proud to provide solutions that meet the needs of Indigenous Australians and enable directly beneficial research collaborations.”
A key step in creating a donor-centric genomic data platform is to embed donor preferences about the use of their data for research at the repository design level. NCI and NCIG have implemented this using a dedicated software tool called Mediaflux. Importantly, this work aligns with the metadata standards of the European Genome-phenome Archive (EGA) and the Global Alliance for Genomics and Health (GA4GH).
Mediaflux is a mature and rich software platform designed in Australia that NCI uses to curate, manage, protect and disseminate large volumes of structured and unstructured data across its many service offerings. The Mediaflux platform allows metadata-driven, granular control of data access, tracks the entire data transaction history with detailed audit trails, implements data encryption and automated workflows through APIs, and is built to manage petabyte-scale data. This provides the required foundation for implementing NCIG’s use case with industry-standard security, and it leverages a range of NCI storage and compute services for end-to-end data workflows.
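To illustrate what metadata-driven access control with an audit trail means in general terms, here is a minimal Python sketch. It is not the Mediaflux API and does not describe how NCI has configured its platform; the class names, the `approved_projects` metadata key and the log fields are all hypothetical, chosen only to show the idea that the access decision is read from metadata attached to each data asset and that every request is recorded, whether or not it succeeds.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Asset:
    """A stored data item plus the metadata that governs access to it."""
    asset_id: str
    metadata: dict  # hypothetical key: "approved_projects"


@dataclass
class Repository:
    assets: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def add(self, asset: Asset) -> None:
        self.assets[asset.asset_id] = asset

    def request_access(self, user: str, project: str, asset_id: str) -> bool:
        """Decide access from the asset's own metadata and log the request."""
        asset = self.assets[asset_id]
        allowed = project in asset.metadata.get("approved_projects", ())
        # Every request is logged, including refusals.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "project": project,
            "asset": asset_id,
            "allowed": allowed,
        })
        return allowed


# Illustrative usage with made-up identifiers.
example_repo = Repository()
example_repo.add(Asset("sample-001", {"approved_projects": ["project-a"]}))
granted = example_repo.request_access("researcher-1", "project-a", "sample-001")
```

Because the rule lives in the asset's metadata rather than in application code, changing a donor's preference in this sketch means updating one metadata record, and the log keeps a record of who asked for what.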
NCIG and NCI have been collaborating for several years to develop the capability to securely manage genomic information in this way. While the push to ethically handle sensitive genomic information is strong across all genomic research areas, for NCIG and its unique collection, it is an inarguable requirement. Along with the development of the required technologies and protocols, the organisations are exploring appropriate ways of representing the machine-readable consent model using a Data Use Ontology (DUO) framework. True to its values, NCIG’s data collection protocols are developed alongside the Indigenous communities to whom the data belongs.
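As a rough illustration of what a machine-readable consent model can look like, the Python sketch below tags each donor's consent with DUO terms and checks a research request against them. It is an assumption-laden example, not NCIG's consent model: the classes, the `PERMISSION_COVERS` table and the matching rules are hypothetical simplifications, and the DUO identifiers should be checked against the current GA4GH ontology release before any real use.

```python
from dataclasses import dataclass

# DUO term identifiers (verify against the published ontology).
GRU = "DUO:0000042"     # general research use
HMB = "DUO:0000006"     # health or medical or biomedical research
NPUNCU = "DUO:0000018"  # not for profit, non commercial use only
COL = "DUO:0000020"     # collaboration required

# Hypothetical simplification: the request purposes each permission covers.
PERMISSION_COVERS = {
    GRU: {GRU, HMB},
    HMB: {HMB},
}


@dataclass(frozen=True)
class DonorConsent:
    donor_id: str
    permission: str                      # one primary DUO permission term
    modifiers: frozenset = frozenset()   # additional DUO conditions


@dataclass(frozen=True)
class AccessRequest:
    purpose: str                         # DUO term describing the research
    commercial: bool
    has_collaboration_agreement: bool


def is_permitted(consent: DonorConsent, request: AccessRequest) -> bool:
    """Return True only if the request satisfies the donor's recorded terms."""
    if request.purpose not in PERMISSION_COVERS.get(consent.permission, set()):
        return False
    if NPUNCU in consent.modifiers and request.commercial:
        return False
    if COL in consent.modifiers and not request.has_collaboration_agreement:
        return False
    return True


# Illustrative usage with a made-up donor record.
consent = DonorConsent("donor-001", HMB, frozenset({NPUNCU, COL}))
request = AccessRequest(purpose=HMB, commercial=False,
                        has_collaboration_agreement=True)
decision = is_permitted(consent, request)
```

Evaluating each donor's record individually, rather than applying one policy to a whole dataset, is what keeps a check like this donor-centric; a real system would also need community-level governance rules that a simple lookup like this does not capture.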
Together, NCIG and NCI have innovated in this area of national and international significance. Research bodies around the world are working on the challenge of controlled and appropriate access to genomic data collections, and this partnership has provided a key step forward. Future collaborations, including with the NCRIS-funded infrastructure platform Australian BioCommons and, internationally, with the Broad Institute Data Science Platform, will ensure FAIR (Findable, Accessible, Interoperable and Reusable) and CARE (Collective benefit, Authority to control, Responsibility and Ethics) implementations for genomic data.
Read more about how NCIG research at NCI is bringing the benefits of genomic medicine to Indigenous Australians.