Case Studies

See how Overture is tackling diverse challenges across multiple projects.


ICGC-ARGO

The ICGC ARGO Data Platform builds on the legacy of the ICGC 25K Data Portal by harmonizing molecular and high-quality clinical data from global genomics efforts into a collective and unified knowledge base. ICGC ARGO will improve patient outcomes by enabling discovery through the responsible sharing of this curated data set with researchers worldwide.

    • 63,116 committed donors, 26 programs representing 13 countries and 20 tumour types
    • ICGC ARGO aims to analyze specimens from 100,000 cancer patients
    • ICGC DACO governs the responsible sharing of this data for use in research

How was Overture used?

    • Song: Validates all submitted sequence metadata against a custom data model
    • Score: Manages file transfers and object storage, with added SAMtools functionality to help handle large WGS files
    • Maestro: Indexes multiple Song repositories into one Elasticsearch instance
    • Arranger: Facilitates filtering and querying
    • Ego: Provides stateless authentication and authorization
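
To make the indexing step above more concrete, here is a minimal sketch of the general idea of merging records from several repositories into one searchable index. It is not Maestro's implementation or API: the function name, the repository names, and the record fields (`analysis_id`, `tumour_type`) are all hypothetical and chosen only for illustration.

```python
def build_unified_index(repositories):
    """Merge records from several repositories into one index.

    `repositories` maps a (hypothetical) repository name to a list of
    record dicts; each record is assumed to carry an "analysis_id" key.
    """
    index = {}
    for repo_name, records in repositories.items():
        for record in records:
            # Prefix the ID with the repository name so that identical
            # IDs from different repositories cannot collide.
            doc_id = f"{repo_name}:{record['analysis_id']}"
            index[doc_id] = {**record, "repository": repo_name}
    return index


# Illustrative data only; these are not real ICGC ARGO records.
repos = {
    "repo-a": [{"analysis_id": "A1", "tumour_type": "breast"}],
    "repo-b": [
        {"analysis_id": "A1", "tumour_type": "lung"},
        {"analysis_id": "B2", "tumour_type": "breast"},
    ],
}
unified = build_unified_index(repos)
print(sorted(unified))  # ['repo-a:A1', 'repo-b:A1', 'repo-b:B2']
```

Keeping the source repository on each indexed document means a single query can span every repository while results remain traceable to where they were submitted.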

VirusSeq

The Canadian VirusSeq Data Portal is an open-source and open-access data portal for all Canadian SARS-CoV-2 sequences and associated non-personal contextual data. VirusSeq harmonizes, validates, and automates submission to international databases, providing critical information for public health and policy decisions, testing and tracing strategies, virus detection and surveillance methods, vaccine and drug development, and understanding susceptibility, disease severity, and clinical outcomes.

    • Built in 4 weeks with Overture
    • Hosts 474,215 viral genomes, surpassing the projection of 150K
    • Horizontally scaled with replica Score, Song, and Maestro instances

How was Overture used?

    • Score: Managed file transfers and object storage
    • Song: Modified for the validation and tracking of viral sequencing metadata
    • Maestro: Indexed sample data for downstream search
    • Arranger: Responsible for all search capabilities, including faceted search and data tables
    • Ego: Governed the authorization of applications
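
For readers unfamiliar with faceted search, the sketch below illustrates the concept in plain Python: filter records by the selected facet values, then count the remaining records per value of another field. It is an assumption-laden illustration, not Arranger's API; the function names, field names (`region`, `lineage`), and sample values are hypothetical.

```python
from collections import Counter


def filter_records(records, filters):
    """Keep records whose fields match every selected facet value.

    `filters` maps a field name to the set of allowed values.
    """
    return [
        r for r in records
        if all(r.get(field) in allowed for field, allowed in filters.items())
    ]


def facet_counts(records, field):
    """Count how many records fall under each value of `field`."""
    return Counter(r.get(field) for r in records)


# Placeholder records; not real VirusSeq data.
samples = [
    {"region": "east", "lineage": "X"},
    {"region": "east", "lineage": "Y"},
    {"region": "west", "lineage": "X"},
]
selected = filter_records(samples, {"region": {"east"}})
counts = facet_counts(selected, "lineage")
print(dict(counts))  # {'X': 1, 'Y': 1}
```

The counts are what a portal typically shows next to each facet option, and the filtered records are what populate the data table.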

Kids First Data Portal

The Kids First Data Resource Center brings together clinical and genetic data from pediatric cancer and structural birth defect cohorts into a centralized, cloud-based discovery portal. We created a collaborative, community-focused portal that brings together researchers, health professionals, and patients to accelerate discoveries that improve the lives of pediatric patients and their families.

    • Data collected across 32 distributed projects
    • 1.7 petabytes, 30.5k participants, 28k families, 94.9k samples, 187.4k files
    • Query and filter 72 data types with 16 clinical fields

How was Overture used?

    • Song: Validation and tracking of genomic metadata
    • Score: Managed file transfers and object storage
    • Arranger: With its faceted search and customizable data table, Arranger enabled users to filter and query this large dataset efficiently

IHCC

The International HundredK+ Cohorts Consortium (IHCC) is improving clinical care and population health by aggregating large genomic data cohorts to help translational researchers uncover the biological and genetic factors of disease. With the exception of underrepresented cohorts and populations, all hosted member cohorts are disease-agnostic and have available biospecimens and longitudinal follow-up activities. Most notably, hosted member cohorts comprise one hundred thousand participants or more.

    • There are a total of 70 cohorts participating in the study
    • These cohorts come from 39 different countries around the world
    • Each cohort provides data across 39 distinct metadata fields

How was Overture used?

    • Arranger: Lets users filter and query the database through an intuitive UI with a customizable table and faceted search

Human Cancer Models Initiative

The Human Cancer Models Initiative (HCMI) is a catalogue of unique cancer models alongside clinical, biospecimen, and molecular data. It also includes protocols, consent templates, and clinical data forms, making it a comprehensive resource for researchers determining which cancer models to use within their studies. The ultimate goal of the HCMI is to support translational cancer research and improve personalized patient treatment plans.

    • 275 unique cancer models across 25 primary sites
    • All models are annotated with genomic and clinical data
    • Enables researchers to browse and shop for innovative cancer models

How was Overture used?

    • Arranger: Enables search by filtering and querying the database through an intuitive UI