A MICCAI 2021 Tutorial


NIH Cancer Imaging Data Repositories for Biomedical Data Science Research

Date: Sept 27, 2021

Time: 10am-2pm (EDT) / 14:00-18:00 (UTC)

The National Cancer Institute (NCI) of the National Institutes of Health (NIH) has made significant investments in creating and developing public data repositories that enable and promote the sharing and secondary analysis of cancer imaging data through an open science approach. Since its inception in 2011, The Cancer Imaging Archive (TCIA) has provided the imaging research community with a stable and reliable resource for sharing de-identified clinical radiology images (DICOM) and, more recently, digital pathology images for a variety of cancers. A growing number of TCIA collections also contain clinical metadata, annotations, and third-party analysis results.

NCI recently launched the Cancer Research Data Commons (CRDC), an enterprise of cloud-based data repositories and resources dedicated to key information modalities in cancer research, including genomics, proteomics, and imaging. CRDC provides the research community with a virtual, expandable infrastructure for cross-domain data analysis and archival. By adhering to FAIR principles in cancer informatics, CRDC aims to further enable data science innovation in oncology. Imaging Data Commons (IDC), the imaging node of CRDC, was publicly released in October 2020. It connects researchers with image collections from TCIA and beyond through an infrastructure that holds metadata and image-derived data (segmentations, annotations, and image-based analysis results), supports image browsing, and connects to Cloud Resources for image computation, data analysis, and archival of the results. Direct connectivity of IDC data to Google Cloud analytics and machine learning tools is an important feature of the platform.

This tutorial aims to familiarize attendees with TCIA and IDC, and with the similarities and differences in their features and capabilities, including search, browsing, and downloading (through TCIA), viewing and cohort collection (through IDC), and cloud computation through the NCI Cloud Resources. All levels are welcome. Attendees can follow the demonstration sessions during the tutorial or later at their own pace, using the learning materials included in the IDC documentation. To follow all of the exercises and demonstration notebooks, attendees will need a Google Cloud project with billing activated. IDC provides free cloud credits to facilitate exploration of the resource, and attendees are encouraged to request these credits in advance using the IDC cloud credit application form.
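To give a flavor of the cohort collection through IDC mentioned above, here is a minimal Python sketch of how a cohort query might be assembled for BigQuery. It is not taken from the tutorial materials. The table name `bigquery-public-data.idc_current.dicom_all`, the column names, the helper names `build_cohort_query` and `run_cohort_query`, and the example collection identifier are all assumptions for illustration; check the IDC documentation for the current schema. Running the query requires the `google-cloud-bigquery` package and a Google Cloud project with billing activated.

```python
from typing import Dict, List, Tuple

# Assumed name of the public IDC metadata table; verify against the IDC docs.
IDC_TABLE = "bigquery-public-data.idc_current.dicom_all"


def build_cohort_query(collection_id: str, modality: str) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized SQL query listing the series in a cohort.

    The cohort is defined here (hypothetically) by one collection and
    one DICOM modality. Values are passed as named query parameters
    rather than interpolated into the SQL text.
    """
    sql = (
        "SELECT DISTINCT PatientID, StudyInstanceUID, SeriesInstanceUID\n"
        f"FROM `{IDC_TABLE}`\n"
        "WHERE collection_id = @collection_id AND Modality = @modality"
    )
    params = {"collection_id": collection_id, "modality": modality}
    return sql, params


def run_cohort_query(sql: str, params: Dict[str, str], project_id: str) -> List[dict]:
    """Run the query in the given Google Cloud project (billing required)."""
    # Imported lazily so the query builder works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter(name, "STRING", value)
            for name, value in params.items()
        ]
    )
    rows = client.query(sql, job_config=job_config).result()
    return [dict(row.items()) for row in rows]


if __name__ == "__main__":
    # "nsclc_radiomics" is a hypothetical example identifier.
    sql, params = build_cohort_query("nsclc_radiomics", "CT")
    print(sql)
    print(params)
```

Keeping the query builder separate from the execution step means the SQL can be inspected, or pasted into the BigQuery console, before any billable query is run.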


10:00 – 10:50 am Cancer Imaging Data Repositories

11 am – 11:55 am Demonstration session: TCIA Features

12 pm – 12:30 pm Project MONAI and public data from TCIA and IDC – S. Aylward

12:45 pm – 1:35 pm Demonstration session: IDC Features and Cloud Compute

1:30 pm – 2:00 pm Open Discussion


MICCAI 2020 tutorial page