Public Cancer Imaging Data Repositories for Biomedical Data Science Research
Date: Oct 8
Time: 10 am EDT (2 pm UTC)
The National Cancer Institute (NCI), part of the National Institutes of Health (NIH), has invested significantly in the creation and development of public data repositories that enable and promote the sharing and secondary analysis of cancer imaging data through an open science approach. Since its inception in 2011, The Cancer Imaging Archive (TCIA) has provided the imaging research community with a stable and reliable resource for sharing de-identified clinical radiology images (DICOM) and, more recently, digital pathology images of a variety of cancers. A growing number of TCIA collections also contain clinical metadata, annotations, and third-party analysis results.
NCI recently launched the Cancer Research Data Commons (CRDC), an enterprise of cloud-based data repositories and resources dedicated to key information modalities in cancer research, including genomics, proteomics, and imaging. CRDC provides the research community with a virtual, expandable infrastructure for cross-domain data analysis and archival. By adhering to the FAIR principles (findable, accessible, interoperable, reusable) in cancer informatics, CRDC aims to further enable data science innovation in oncology. Imaging Data Commons (IDC), the imaging node of CRDC, is due for public release in October 2020. It will connect researchers with image collections from TCIA and beyond through a robust infrastructure that contains metadata and image-derived data (segmentations, annotations, and image-based analysis results), and that supports image browsing and connectivity to the NCI Cloud Resources for image computation, data analysis, and archival of the results.
This tutorial aims to familiarize attendees with TCIA and IDC, including their similarities and differences in features and capabilities: search, browsing, and downloading (through TCIA); viewing and cohort collection (through IDC); and cloud computation through the NCI Cloud Resources.
The learning objectives of this tutorial include:
- To learn about the NCI public image repositories, their similarities and differences.
- To learn about the capabilities and features offered by each repository.
- To be able to search for and identify imaging collections of interest and prepare them for further analysis.
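As a rough illustration of the last objective, the sketch below filters a small in-memory list of series-level metadata records down to a cohort and counts the distinct patients per collection. It does not call TCIA or IDC: the `SeriesRecord` type, the `select_cohort` and `summarize` helpers, and the sample records are all hypothetical and exist only to show the general shape of metadata-driven cohort selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeriesRecord:
    """One image series described by a few DICOM-style attributes (hypothetical type)."""
    collection: str
    patient_id: str
    modality: str   # e.g. "CT", "MR", "SM" (slide microscopy)
    body_part: str


def select_cohort(records, modality=None, body_part=None):
    """Return the records that match every filter that was supplied."""
    selected = []
    for rec in records:
        if modality is not None and rec.modality != modality:
            continue
        # Body part is compared case-insensitively.
        if body_part is not None and rec.body_part.lower() != body_part.lower():
            continue
        selected.append(rec)
    return selected


def summarize(cohort):
    """Count distinct patients per collection in a cohort."""
    patients = {}
    for rec in cohort:
        patients.setdefault(rec.collection, set()).add(rec.patient_id)
    return {name: len(ids) for name, ids in patients.items()}


# Made-up records for illustration only; not real TCIA or IDC content.
SAMPLE = [
    SeriesRecord("collection-a", "P001", "CT", "LUNG"),
    SeriesRecord("collection-a", "P001", "CT", "LUNG"),
    SeriesRecord("collection-a", "P002", "MR", "BRAIN"),
    SeriesRecord("collection-b", "P101", "CT", "LUNG"),
    SeriesRecord("collection-b", "P102", "SM", "BREAST"),
]

if __name__ == "__main__":
    cohort = select_cohort(SAMPLE, modality="CT", body_part="lung")
    print(len(cohort), "series selected")
    print(summarize(cohort))
```

In practice the records would come from a repository's search interface or metadata export rather than a hard-coded list; the filtering and summarizing steps would stay much the same.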
Overall introduction - Keyvan Farahani | 5 min
Session 1: The Cancer Imaging Archive (TCIA) | 50 min
- Introduction - John Freymann | 5 min
- Publishing data - John Freymann | 5 min
- Browsing Rad/Path Data - Justin Kirby | 10 min
- Searching/filtering/displaying data - Lawrence Tarbox | 15 min
- Additional tools to access the data - Fred Prior | 10 min
- Q&A - Fred Prior | 5 min
Coffee break | 15 min
Session 2: Imaging Data Commons (IDC) | 50 min
- Introduction to NCI Cancer Research Data Commons (CRDC) - Todd Pihl | 5 min [video]
- Introduction to IDC - Andrey Fedorov | 20 min [video]
- IDC hands-on demo: IDC Portal - Andrey Fedorov | 10 min [video]
- IDC hands-on demo: IDC Cohorts - Andrey Fedorov | 10 min [video]
- Q&A - Andrey Fedorov | 5 min
Summary and wrap-up - Keyvan Farahani | 10 min
- Keyvan Farahani, National Cancer Institute
- Andrey Fedorov, Brigham and Women's Hospital / Harvard Medical School
- John Freymann, Frederick National Laboratory for Cancer Research
- Justin Kirby, Frederick National Laboratory for Cancer Research
- Todd Pihl, Frederick National Laboratory for Cancer Research
- Fred Prior, University of Arkansas for Medical Sciences
- Lawrence Tarbox, University of Arkansas for Medical Sciences
- Bill Longabaugh, Institute for Systems Biology
- David Pot, General Dynamics IT
- Ron Kikinis, Brigham and Women's Hospital / Harvard Medical School