Review article

Imaging genomics in cancer research: limitations and promises


Recently, radiogenomics or imaging genomics has emerged as a novel high-throughput method of associating imaging features with genomic data. Radiogenomics has the potential to provide comprehensive intratumour, intertumour and peritumour information non-invasively. This review article summarizes the current state of radiogenomic research in tumour characterization, discusses some of its limitations and promises and projects its future directions. Semi-radiogenomic studies that relate specific gene expressions to imaging features will also be briefly reviewed.


In recent years, a new direction in cancer research has emerged that focuses on high-throughput methods of associating imaging features with genomic data.1–3 This approach is referred to as radiogenomics or imaging genomics. The imaging characteristics of a disease are also called its imaging phenotype or radiophenotype, while the genomic information defines the molecular phenotype or genotype of the disease. Research to uncover the underlying genetic causes of individual variation in sensitivity to radiation using high-throughput genomic methods has also been referred to as “radiogenomics” and is not discussed in this review.4

Much of the discussion of personalized medicine has focused on molecular characterization using genomic and proteomic technologies.5 However, a limitation of these approaches is the need to acquire tissue samples through invasive surgery or biopsy.6 Although some genetic analyses have been incorporated into clinical practice in recent years, large-scale genome-based cancer characterization is not routinely performed owing to its cost, turnaround time and technical complexity required for data analysis and interpretation.7 In addition, samples are often obtained from a small portion of a heterogeneous lesion and may not accurately represent the lesion's anatomic, functional and physiologic properties.8 Even more importantly, it is not feasible to obtain the tissue multiple times during treatment in order to monitor response. Consequently, it is still a challenge to incorporate genomics or proteomics into routine clinical practice.

Imaging has great potential for in vivo tumour characterization because it can provide a more comprehensive view of the entire tumour than biopsy samples alone.9 For example, imaging can provide information on peritumoral regions, which are typically not surgically removed and thus not analysed in the laboratory.1 Human tissues often exhibit a diversity of distinctive traits on radiographic images, many of which currently have no known clinical significance. Furthermore, routine clinical practice often includes follow-up imaging to monitor treatment response and disease progression.10 Advances in imaging technologies now provide better anatomic localization and allow for non-invasive measurements of functional and physiologic tissue- and lesion-specific properties.11 Potentially, one would benefit tremendously from radiogenomic biomarkers that measure gene expression at frequent intervals during therapy.

Oncologic diagnosis is quickly moving from traditional histology-based approaches to molecular stratification.12 Therefore, the traditional radiology–pathology paradigm alone is no longer sufficient for radiologists. Radiogenomics represents the evolution of the radiology–pathology correlation from the histology level to the subcellular level.2 This systematic association between imaging traits and gene expression allows useful inference in both directions: imaging traits can be used to predict gene expression in human cancers; conversely, image features can be predicted from gene signatures.2,3 The predictive capabilities of these signatures not only offer immediate translational potential, but also suggest potential molecular mechanisms that may give rise to imaging phenotypes.13

For personalized medicine to become a reality, biomarkers must accurately reflect the underlying molecular machinery of cancer.14 Given the growing number of genomic, imaging and clinical biomarkers that have been identified in patients with various types of cancers, there is a need to create integrative biomarkers that link multiple types of data and measurements.14 The objective of this study was to provide a comprehensive review of radiogenomic research in tumour characterization.


We searched multiple electronic databases for original research studies that correlated imaging features, assessed manually, semi-automatically or automatically, with whole-genome data. Our search terms included variations of (1) different imaging modalities, including “MR”, “scintigraphy”, “nuclear medicine”, “CT” and “PET”; and (2) molecular signatures such as “genome”, “genomics”, “molecular profiling”, “mutation”, “sequence”, “gene”, “genetic” and “signature”. Studies that contained the term “radiogenomic” or “imaging genomics” were identified separately. We excluded studies that associated imaging features with patient response to radiation therapy, since this belongs to a different field of research called radiation genomics. Radiomics involves the extraction of many quantitative imaging features with computer algorithms, and the extracted features can be related to genomics or proteomics; only the “radiomics approach to radiogenomics” is included in this review. Studies that associated imaging features with specific genes or with the expression of specific gene subsets (e.g. tumour molecular subtype) were grouped under the category “semi-radiogenomic studies”. Studies that correlated imaging with markers measured by immunohistochemistry or fluorescence in situ hybridization [e.g. (R)-2-hydroxyglutarate (2HG) metabolites from isocitrate dehydrogenase 1 mutation, p53 nuclear staining, anaplastic lymphoma kinase + status etc.] were not included under this category. Furthermore, we did not include studies that correlated BRCA1/2 gene mutations or other specific gene expressions/mutations with breast density on mammography.

Overall, 27 studies were included in the final analyses (Table 1).9,15–40 These studies were published between 2007 and 2015. 8 studies used data from The Cancer Genome Atlas (TCGA) and/or The Cancer Imaging Archive (TCIA);1,16,23,27,32,34,38,41 2 studies were multi-institutional;9,19 and the remaining 17 studies used local institutional data.15,17,18,20–22,24–26,28,30,31,33,35–37,39,40 26 out of 27 studies were retrospective in design. The number of patients ranged from 10 to 104, with a median of 38. 8 (30%) studies used a validation data set to verify the association between imaging features and genomic data identified in the initial data set.20,22,23,27,28,34,37,40 Six types of cancers were studied: glioblastoma multiforme (GBM)/high-grade glioma (n = 14, 52%), non-small-cell lung cancer (NSCLC) (n = 3, 11%), hepatocellular carcinoma (HCC) (n = 3, 11%), breast cancer (n = 5, 19%), clear-cell renal cell carcinoma (CCRCC) (n = 1, 4%) and cervical cancer (n = 1, 4%). The imaging modalities used included fluorine 18 fludeoxyglucose positron emission tomography (PET) (n = 4, 15%), MRI [n = 18 (including two perfusion MR), 67%] and CT [n = 5 (including one perfusion CT), 19%].

Table 1. Radiogenomic studies published in the literature

Study | Year | Country | Data source | Number | Validation set | Cancer | Imaging modality | Number of features | Method of feature extraction | Radiologist involvement | Genomic data | Individual vs clusters | Pathway analysis | Wet-lab validation | Histology correlation | Outcome | Clinical outcomes
Gevaert et al15 | 2012 | USA | Single institutional | 26 | No | NSCLC | PET | 180 | Both | Yes | Microarray | Clusters | Yes | No | No | Yes | RFS and OS
Gevaert et al16 | 2014 | USA | TCGA | 55 | No | GBM | MR | 79 | Manual | Yes | Microarray, DNA methylation, array CGH | Clusters | Yes | No | No | Yes | PFS and OS
Aerts et al9 | 2014 | Netherlands | Multi-institutional | 89 | No | NSCLC | CT | 440 | Semi-automated | No | Microarrays | Clusters | Yes | No | Yes | Yes | OS
Jamshidi et al17 | 2014 | USA | Single institutional | 23 | No | GBM | MR | 6 | Manual | Yes | Microarrays, array CGH | Clusters | Yes | No | No | No | NA
Kuo et al18 | 2007 | USA | Single institutional | 30 | No | HCC | CT | 6 | Manual | Yes | Microarrays | Clusters | No | No | Yes | No | NA
Nair et al19 | 2012 | USA | Multi-institutional | 25 | No | NSCLC | PET | 14 | Semi-automated | Yes | Microarrays | Both | Yes | No | No | Yes | OS
Segal et al20 | 2007 | USA | Single institutional | 28 | Yes | HCC | CT | 138 | Manual | Yes | Microarrays | Clusters | Yes | No | Yes | Yes | OS
Yamamoto et al21 | 2012 | USA | Single institutional | 10 | No | Breast cancer | MR | 26 | Manual | Yes | Microarrays | Both | Yes | No | No | No | NA
Yamamoto et al22 | 2015 | USA | Single institutional | 19 | Yes | Breast cancer | MR | 47 | Semi-automated | Yes | RNA sequencing | Individual | Yes | Yes | Yes | Yes | MFS
Zinn et al23 | 2011 | USA | TCGA | 26 | Yes | GBM | MR | 3 | Semi-automated | Yes | mRNA and micro-RNA | Individual | Yes | No | No | Yes | PFS and OS
Barajas et al24 | 2010 | USA | Single institutional | 12 | No | GBM | MR | 9 | Manual | Yes | Microarray | Both | Yes | No | Yes | No | NA
Diehn et al25 | 2008 | USA | Single institutional | 22 | No | GBM | MR | 10 | Manual | Yes | Microarray | Both | Yes | Yes | No | Yes | OS
Pope et al26 | 2008 | USA | Single institutional | 52 | No | GBM | MR | 1 | Manual | Yes | Microarray | Individual | No | Yes | Yes | Yes | OS
Zinn et al27 | 2012 | USA | TCGA | 78 | Yes | GBM | MR | 1 | Manual | NA | Microarray and micro-RNA | Individual | Yes | No | No | Yes | OS
Jamshidi et al28 | 2015 | USA | Single institutional | 70 | Yes | CCRCC | CT | 35 | Manual | Yes | Microarray | Both | No | No | No | Yes | OS
Colen et al29 | 2014 | USA | TCGA | 99 | No | GBM | MRI | 1 | Manual | Yes | Microarray and micro-RNA | Individual | Yes | No | Yes | Yes | OS
Carlson et al30 | 2007 | USA | Single institutional | 71 | No | HGG | MRI | 1 | Manual | NA | Microarray | Individual | No | No | No | Yes | OS
Colen et al31 | 2014 | USA | TCGA | 104 | No | GBM | MRI | 30 | Manual | Yes | Microarray | Both | Yes | No | No | Yes | OS
Jain et al32 | 2012 | USA | TCGA | 18 | No | GBM | CT | 2 | Manual | Yes | Microarray | Individual | Yes | No | No | No | NA
Naeini et al33 | 2013 | USA | Single institutional | 46 | No | GBM | MRI | 3 | Manual | NA | Microarray | Clusters | Yes | No | No | Yes | OS
Nicolasjilwan et al34 | 2015 | USA | TCGA | 68 | Yes | GBM | MRI | 30 | Manual | NA | Microarrays, array CGH | Clusters | Yes | No | No | No | NA
Pope et al35 | 2012 | USA | Single institutional | 38 | No | GBM | MRI | 1 | Manual | Yes | Microarray | Individual | Yes | No | Yes | Yes | OS
Osborne et al36 | 2010 | USA | Single institutional | 20 | No | Breast cancer | PET | 1 | Manual | NA | Microarray | Both | Yes | No | Yes | No | NA
Palaskas et al37 | 2011 | USA | Single institutional | 18 | Yes | Breast cancer | PET | 1 | Manual | NA | Microarray, array CGH | Clusters | Yes | Yes | Yes | No | NA
Zhu et al38 | 2015 | USA | TCGA and TCIA | 91 | No | Breast cancer | MRI | 38 | Semi-automated | NA | Microarray, array CGH, micro-RNA, somatic mutations | Clusters | Yes | No | No | No | NA
Miura et al39 | 2015 | Japan | Single institutional | 77 | No | HCC | MRI | 1 | Manual | Yes | Microarray | Individuals | Yes | No | Yes | Yes | PFS and OS
Halle et al40 | 2012 | Norway | Single institutional | 46 | Yes | Cervical cancer | MRI | 1 | Manual | Yes | Microarray | Clusters | Yes | Yes | Yes | Yes | PFS

CCRCC, clear-cell renal cell carcinoma; CGH, comparative genomic hybridization; GBM, glioblastoma multiforme; HCC, hepatocellular carcinoma; HGG, high-grade glioma; MFS, metastatic-free survival; NA, not available; NSCLC, non-small-cell lung cancer; OS, overall survival; PET, positron emission tomography; PFS, progression-free survival; RFS, recurrence-free survival; TCGA, The Cancer Genome Atlas; TCIA, The Cancer Imaging Archive.

Article by Colen et al31 published in BioMed Central Medical Genomics; article by Colen et al29 published in Radiology.

Imaging features and extraction

The number of imaging features extracted ranged from 1 to 440, with a median of 6. Five (19%) studies used automatic or semi-automatic imaging feature extraction; 21 (78%) studies used manual feature extraction; and 1 (4%) study used a combination of automatic and manual imaging feature extraction. 19 (70%) studies involved board-certified radiologists in the process of imaging feature extraction. In one study, Aerts et al9 defined the region of interest. 7 studies did not provide any information regarding reader qualification. 6 (22%) studies focused on building an association map between genomic data and imaging features, while the other 21 (78%) studies identified significant imaging features that correlated with genomic data.

We tabulated all the manually extracted and computationally derived imaging features from all radiogenomic studies (Table 2).9,15–40 A wide array of imaging features was extracted by radiologists, depending on the imaging modality used and the type of cancer studied. The most common CT features extracted were tumour necrosis and tumour margin. For HCC, enhancement properties on different phases of CT were the commonly studied imaging features.18,20 Internal air bronchogram was a specific feature extracted for NSCLC.15 Most MRI studies focused on GBM and breast cancer. For GBM, three studies used Visually Accessible Rembrandt Images (VASARI), a comprehensive feature set consisting of 24 observations familiar to neuroradiologists to describe the morphology of brain tumours on routine contrast-enhanced MRI.42 The imaging features in VASARI that were most likely to have a significant relationship with genomic data included the enhancement characteristics of the brain tumour and its extent of involvement.16,31,34 This relationship also held in other studies of GBM that did not utilize VASARI. One study of breast cancer found location, lymph node involvement and stromal patterns to be imaging features significantly associated with genomic data,21 while a study of HCC focused on the intensity of the tumour on hepatobiliary-phase MR.39

Table 2. Manually extracted imaging features from radiogenomic studies

Study | Modality | Cancer | Significance definition | Manually extracted feature
Jamshidi et al28 | CT | CCRCC | Association with gene clusters | Pattern of tumour necrosis, tumour transition zone, tumour–parenchyma interaction, tumour–parenchyma interface
Kuo et al18 | CT | HCC | Association with mRNA and gene clusters | Internal arteries, texture heterogeneity, wash-in, washout, necrosis, tumour margin score
Segal et al20 | CT | HCC | Association with mRNA | Necrosis, internal septa, texture heterogeneity (arterial and venous phase), tumour margin score (minimum and maximum), enhancement pattern, internal arteries (density and necrosis edge), hypodense halo, washout, internal arteries (density), tumour–liver difference, corrected imaging area, necrosis density, capsule, wash-in, infiltration, tumour–liver difference, attenuation/heterogeneity score
Carlson et al30 | CT | HGG | Association with mRNA | Oedema
Gevaert et al15 | CT | NSCLC | Association with mRNA | Internal air bronchogram, complex shape, vascular convergence, lobulated margin, oval shape, irregular margin, pleural retraction, solid density, entering airway, right upper lobe apical location
Aerts et al9 | CT | NSCLC | Association with mRNA | None
Jain et al32 | CT | GBM | Association with mRNA | None
Osborne et al36 | PET | Breast cancer | Association with molecular subtypes | None
Palaskas et al37 | PET | Breast cancer | Association with Myc-overexpression | None
Nair et al19 | PET | NSCLC | Association with mRNA and gene clusters | None
Yamamoto et al21 | MRI | Breast cancer | Association with gene clusters | Enhancement pattern, size, shape, margin, location, T2 tumour signal interface between tissue and tumour, satellite lesions, multifocal disease, lymph node involvement, un-coordinated growth, stromal alterations
Yamamoto et al22 | MRI | Breast cancer | Association with lncRNA | None
Zhu et al38 | MRI | Breast cancer | Association with gene clusters | None
Halle et al40 | MRI | Cervical cancer | Association with gene clusters | None
Gevaert et al16 | MRI | GBM | Association with molecular subtypes | VASARI (deep white matter location, enhancement, enhancing margin characteristics, diffusion characteristics)
Colen et al29 | MRI | GBM | Association with mRNA | VASARI (enhancing tumour across midline/corpus callosum, deep white matter tract involvement, ependymal involvement)
Nicolasjilwan et al34 | MRI | GBM | Association with mRNA and CNV | VASARI (proportion of tumour contrast enhancement)
Jamshidi et al17 | MRI | GBM | Association with gene clusters | Contrast enhancement, necrosis, contrast-to-necrosis ratio, infiltrative vs oedematous T2 abnormality, mass effect, subventricular zone involvement
Barajas et al24 | MRI | GBM | Association with mRNA and gene clusters | Lesion location, presence of contrast enhancement, central necrosis, degree of T2 oedema, mass effect
Diehn et al25 | MRI | GBM | Association with gene clusters | Contrast enhancement, necrosis, mass effect, pattern of T2 oedema (infiltrative/oedematous), cortical involvement, SVZ involvement, C : N ratio, contrast/T2 ratio, degree of T2 oedema, T2 heterogeneity
Pope et al26 | MRI | GBM | Association with mRNA | Enhancement extent
Zinn et al23 | MRI | GBM | Association with mRNA and micro-RNA | None
Zinn et al27 | MRI | GBM | Association with mRNA and micro-RNA | None
Colen et al31 | MRI | GBM | Association with mRNA and micro-RNA | None
Naeini et al33 | MRI | GBM | Association with molecular subtypes | None
Pope et al35 | MRI | GBM | Association with mRNA | None
Miura et al39 | MRI | HCC | Association with mRNA | Intensity on hepatobiliary phase

CCRCC, clear-cell renal cell carcinoma; C:N ratio, contrast:necrosis ratio; CNV, copy number variation; GBM, glioblastoma multiforme; HCC, hepatocellular carcinoma; HGG, high-grade glioma; mRNA, messenger RNA; NSCLC, non-small-cell lung cancer; PET, positron emission tomography; SVZ, subventricular zone; VASARI, Visually Accessible Rembrandt Images.

Bold type means significant relationship of imaging feature with genomic data.

Article by Colen et al31 published in BioMed Central Medical Genomics; article by Colen et al29 published in Radiology.

There was relative uniformity among the computationally derived imaging features (Table 3).9,15–40 For CT, tumour intensity, texture and shape were the most commonly extracted features, especially for NSCLC. PET studies were most likely to focus on the standardized uptake value, regardless of tumour type. Cerebral blood volume was the most commonly derived feature on either perfusion CT or MRI.24,32 For MRI studies on GBM, the most common feature extracted to correlate with genomic data was the volume of the tumour.1,16,23,27,33 Several studies divided the tumour into regions with specific imaging characteristics, such as enhancing, necrotic and oedematous regions, and correlated the volume of each region with the patient's genomic data.1,16,23,33 For MRI studies on breast cancer, tumour volume was also a commonly extracted imaging feature. Otherwise, studies focused on signal strength on specific sequences at different time points and on contrast kinetic pattern.21,22,38

Table 3. Computationally extracted imaging features from radiogenomics studies

Study | Modality | Cancer | Computer-extracted features
Jain et al32 | CT | GBM | CBV, PS
Aerts et al9 | CT | NSCLC | Tumour intensity, shape, texture, wavelet features
Jamshidi et al28 | CT | CCRCC | None
Kuo et al18 | CT | HCC | None
Segal et al20 | CT | HCC | None
Carlson et al30 | CT | HGG | None
Osborne et al36 | PET | Breast cancer | SUV
Palaskas et al37 | PET | Breast cancer | SUV
Nair et al19 | PET | NSCLC | SUV intensity metrics, SUV distribution metrics, SUV spatial metrics
Gevaert et al15 | PET/CT | NSCLC | Histogram, texture, edge sharpness, edge shape, ROI size, SUV
Yamamoto et al21 | MRI | Breast cancer | T1 intrinsic signal, T2 intrinsic signal strength, contrast kinetic pattern, median peak signal strength at different times, nadir signal strength at different times
Yamamoto et al22 | MRI | Breast cancer | Largest tumour volume, tumour roundness, entropy, skewness, kurtosis, GLCM contrast, GLCM homogeneity, GLCM energy, Hu’s seven moment invariants, average of wash-in slope, average of washout slope, plateau fraction, persistent fraction, heterogeneity of time intensity, ERF
Zhu et al38 | MRI | Breast cancer | Size phenotypes, shape phenotypes, morphological phenotypes, enhancement texture phenotypes, kinetic curve assessment, enhancement-variance kinetics
Barajas et al24 | MRI | GBM | CBV, PH, PSR, ADC
Gevaert et al16 | MRI | GBM | Necrotic, enhancing, oedema ROIs
Zinn et al23 | MRI | GBM | FLAIR volume
Zinn et al27 | MRI | GBM | Volume
Colen et al29 | MRI | GBM | Necrosis volume
Naeini et al33 | MRI | GBM | Contrast-enhancing volume, necrotic volume, contrast enhancement + necrotic volume, T2 hyperintense volume, the ratio of oedema/(necrosis + contrast)
Pope et al35 | MRI | GBM | ADC
Jamshidi et al17 | MRI | GBM | None
Diehn et al25 | MRI | GBM | None
Pope et al26 | MRI | GBM | None
Colen et al31 | MRI | GBM | None
Nicolasjilwan et al34 | MRI | GBM | None
Halle et al40 | MRI | Cervical cancer | Abrix (enhancement-variance kinetics)
Miura et al39 | MRI | HCC | None

ADC, apparent diffusion coefficient; C : N ratio, contrast:necrosis ratio; CBV, cerebral blood volume; CCRCC, clear-cell renal cell carcinoma; CNV, copy number variation; ERF, enhancing rim fraction; FLAIR, fluid-attenuated inversion recovery; GBM, glioblastoma multiforme; GLCM, gray-level co-occurrence matrix; HCC, hepatocellular carcinoma; HGG, high-grade glioma; NSCLC, non-small-cell lung cancer; PET, positron emission tomography; PH, peak height; PSR, percentage of signal intensity recovery; PS, permeability surface; ROI, region of interest; SUV, standardized uptake value; SVZ, subventricular zone.

Bold type means significant relationship of imaging feature with genomic data.

Article by Colen et al31 published in BioMed Central Medical Genomics, article by Colen et al29 published in Radiology.

Genomic data

26 (96%) studies used data from RNA or complementary DNA microarray. Only one (4%) study used data from RNA sequencing.22 Among the 26 studies that used microarray data to correlate with imaging, 4 studies included micro-RNA,1,23,27,38 5 studies copy number variation,16,17,34,37,38 1 DNA methylation16 and 1 somatic mutation.38 11 (41%) studies grouped gene expression data into gene clusters or modules to associate with imaging features;9,15–18,20,33,34,37,38,40 9 (33%) studies directly associated individual elements with imaging features;1,22,23,26,27,30,32,35,39 and 7 (26%) studies used both approaches.19,21,24,25,28,31,36 23 (85%) studies performed pathway analysis, either in the initial clustering of genes to associate with imaging features (n = 10)9,15,20,21,24,33,34,37,38,40 or in the final analysis of significant genomic markers (n = 12).1,16,17,19,22,23,27,31,32,35,36,39 One study performed pathway analysis for both purposes.25

Outcome, histology and wet lab validation

18 (67%) studies included outcome data.1,9,15,16,19,20,22,23,25–28,30,31,33,35,39,40 12 studies focused on overall survival (OS);1,9,19,20,25–28,30,31,33,35 4 studies included both OS and progression-free survival;15,16,23,39 1 study focused on progression-free survival in cervical cancer;40 and 1 study used metastatic-free survival in breast cancer.22 Three studies stated overall follow-up time.22,26,30 12 (44%) studies correlated with histological data.1,9,18,20,22,24,26,35–37,39,40 The histological parameters that were evaluated ranged from tumour type and tumour stage to specific immunological expression of tumour markers such as oestrogen receptor (ER), progesterone receptor (PR) and HER2 in breast cancer.

Five (19%) studies used lab-based techniques to verify the significant associations that had been identified.22,25,26,37,40 Two studies performed quantitative polymerase chain reaction (PCR) to verify the significant differences in gene expression among imaging phenotypes found in the association analyses.22,26 Two studies performed gene expression analysis in corresponding cancer cell lines.37,40 One study performed immunological staining of epidermal growth factor receptor (EGFR) and found differentially expressed EGFR among different imaging phenotypes.25

Semi-radiogenomic studies

38 semi-radiogenomic studies were identified (Table 4).28,43–79 Studies were published between 2005 and 2015. All studies were retrospective in design. Five studies used data from TCGA/TCIA;45,46,48,49,57 one study used multi-institutional data;50 and one study combined both institutional data and data from TCGA/TCIA.44 The number of patients ranged from 25 to 539, with a median of 75. Only two studies had a validation data set. The type and distribution of cancers in these studies were similar to those for radiogenomic studies, except for two studies that focused on low-grade glioma47,58 and one study that focused on diffuse large B-cell lymphoma.59 The imaging modalities used included CT (n = 9), MRI (n = 21, including five perfusion) and PET (n = 6). Two studies used both CT and MR.46,53 The number of imaging features extracted ranged from 1 to 120, with a median of 5. 11 (29%) studies used semi-automatic image feature extraction.45,49,55,58,61–63,72,73,75,79 All studies (except eight studies which did not provide this information) had radiologist participation. 29 (76%) studies focused on individual genes;28,43,44,46,47,48,50–53,54,56,57,59,60,64,65,66–71,74,76–78 7 (18%) studies used gene clusters or subsets derived from primary genomic data;28,45,50,57,59,61,63 and 2 (5%) studies used a combination of both.48,55 14 (37%) studies included outcome data.28,47,48,50,51,55,57,59,61,63,69–71,79 Only four studies correlated with histology.50,60,71,74 None of the studies verified their results via wet-lab techniques.

Table 4. Semi-radiogenomic studies published in the literature

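Study | Year | Country | Cancer | Imaging modality | Number of features | Method of extraction | Radiologist involvement | Individual genes vs gene clusters | Wet-lab validation | Histology | Outcome | Clinical outcomes | Data source | Number | Validation set
Halpenny et al43 | 2014 | USA | Lung adenocarcinomas | CT | 14 | Manual | Yes | Individual genes | No | No | No | NA | Single | 30 | No
Karlo et al44 | 2014 | USA | CCRCC | CT | 10 | Manual | Yes | Individual genes | No | No | No | NA | Single and TCGA | 233 | No
Mazurowski et al45 | 2014 | USA | Breast cancer | MRI | 23 | Semi-automatic | Yes | Gene clusters | No | No | No | NA | TCGA | 48 | No
Shinagare et al46 | 2015 | USA | CCRCC | CT, MRI | 6 | Manual | Yes | Individual genes | No | No | No | NA | TCGA | 103 | No
Wang et al47 | 2015 | China | LGG | MRI | 1 | Manual | Yes | Individual genes | No | No | Yes | PFS and OS | Single | 146 | No
Gutman et al48 | 2013 | USA | GBM | MRI | 26 | Manual | Yes | Both | No | No | Yes | OS | TCGA | 75 | No
Gutman et al49 | 2015 | USA | GBM | MRI | 11 | Semi-automatic | NA | Individual genes | No | No | No | NA | TCGA | 75 | No
Banerjee et al50 | 2015 | USA | HCC | CT | 3 | Manual | Yes | Gene clusters | No | Yes | Yes | RFS and OS | Multi-institutional | 157 | No
Jamshidi et al28 | 2015 | USA | CCRCC | CT | 28 | Manual | Yes | Gene clusters | No | No | Yes | OS | Single | 70 | Yes
Carrillo et al51 | 2012 | USA | GBM | MRI | 9 | Manual | Yes | Individual genes | No | No | Yes | OS | Single | 202 | No
Drabycz et al52 | 2010 | Canada | GBM | MRI | 4 | Manual | Yes | Individual genes | No | No | No | NA | Single | 103 | No
Moon et al53 | 2012 | Korea | HGG | CT and MRI | 10 | Manual | Yes | Individual genes | No | No | No | NA | Single | 32 | No
Aghi et al54 | 2005 | USA | GBM | MRI | 4 | Manual | NA | Individual genes | No | No | No | NA | Single | 75 | No
Ellingson et al55 | 2013 | USA | GBM | MRI | 2 | Semi-automatic | NA | Both | No | No | Yes | PFS and OS | Single | 507 | No
Gupta et al56 | 2015 | USA | GBM | MRI (physiologic) | 3 | Manual | Yes | Individual genes | No | No | No | NA | Single | 106 | No
Jain et al57 | 2013 | USA | GBM | MRI (physiologic) | 3 | Manual | Yes | Gene clusters | No | No | Yes | OS | TCGA | 98 | No
Kickingereder et al58 | 2015 | Germany | LGG+anaplastic | MRI (physiologic) | 1 | Semi-automatic | Yes | Individual genes | No | No | No | NA | Single | 73 | No
Lanic et al59 | 2012 | France | DLBCL | PET | 1 | Manual | Yes | Gene clusters | No | No | Yes | PFS and OS | Single | 45 | No
Miles et al60 | 2014 | England | CRC | PET/CT | 3 | Manual | Yes | Individual genes | No | Yes | No | NA | Single | 33 | No
Ashraf et al61 | 2014 | USA | Breast cancer | MRI | 31 | Semi-automatic | Yes | Gene clusters | No | No | Yes | PFS | Single | 56 | No
Li et al62 | 2014 | USA | Breast cancer | MRI | 45 | Semi-automatic | NA | Individual genes | No | No | No | NA | Single | 103 | No
Macyszyn63 | 2015 | USA | GBM | MRI | 120 | Semi-automatic | No | Gene clusters | No | No | Yes | OS | Single | 105 | Yes
Rizzo et al64 | 2016 | Italy | NSCLC | CT | 19 | Manual | Yes | Individual genes | No | No | No | NA | Single | 285 | No
Izuishi et al65 | 2012 | Japan | CRC | PET | 1 | Manual | NA | Individual genes | No | No | No | NA | Single | 37 | No
Lee et al66 | 2016 | Korea | CRC | PET | 4 | Manual | Yes | Individual genes | No | No | No | NA | Single | 179 | No
Kawada et al67 | 2012 | Japan | CRC | PET | 2 | Manual | Yes | Individual genes | No | No | No | NA | Single | 51 | No
Tykocinski et al68 | 2012 | USA | GBS | MRI (physiologic) | 1 | Manual | NA | Individual genes | No | No | No | NA | Single | 132 | No
Kong et al69 | 2011 | Korea | GBS | MRI (physiologic) | 1 | Manual | Yes | Individual genes | No | No | Yes | PFS and OS | Single | 73 | No
Romano et al70 | 2013 | Italy | GBS | MRI | 1 | Manual | Yes | Individual genes | No | No | Yes | PFS and OS | Single | 47 | No
Sunwoo et al71 | 2013 | Korea | GBS | MRI | 1 | Manual | NA | Individual genes | No | Yes | Yes | PFS | Single | 65 | No
Ahn et al72 | 2014 | Korea | GBS | MRI | 9 | Semi-automatic | Yes | Individual genes | No | No | No | NA | Single | 43 | No
Sutton et al73 | 2015 | USA | Breast cancer | MRI | 14 | Semi-automatic | Yes | Individual genes | No | No | No | NA | Single | 95 | No
Kitao et al74 | 2010 | Japan | HCC | MRI | 1 | Manual | Yes | Individual genes | No | Yes | No | NA | Single | 38 | No
Lee et al75 | 2013 | Korea | NSCLC | CT | 9 | Semi-automatic | Yes | Individual genes | No | No | No | NA | Single | 153 | No
Glynn et al76 | 2010 | Korea | NSCLC | CT | 5 | Manual | Yes | Individual genes | No | No | No | NA | Single | 64 | No
Plodkowski et al77 | 2015 | USA | NSCLC | CT | 12 | Manual | Yes | Individual genes | No | No | No | NA | Single | 73 | No
Ozkan et al78 | 2015 | USA | NSCLC | CT | 5 | Manual | Yes | Individual genes | No | No | No | NA | Single | 25 | No
Yoon et al79 | 2015 | USA | NSCLC | CT and PET | 51 | Semi-automatic | NA | Individual genes | No | No | Yes | PFS and OS | Single | 539 | No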

CCRCC, clear-cell renal cell carcinoma; CGH, comparative genomic hybridization; GBM, glioblastoma multiforme; HCC, hepatocellular carcinoma; HGG, high-grade glioma; LGG, low-grade glioma; NA, not available; NSCLC, non-small-cell lung cancer; OS, overall survival; PET, positron emission tomography; RFS, recurrence-free survival; TCGA, The Cancer Genome Atlas; TCIA, The Cancer Imaging Archive.


Radiogenomics is an emerging field that links tumour genotype with imaging phenotypes. Since 2007, a number of studies have been published on the radiogenomic characterization of certain cancers.9,15–40 These studies demonstrated the feasibility of this approach and paved the way for future developments in the field. However, our analysis identified a number of issues.

Study design

Only a handful of studies can be considered “real” radiogenomics studies in the sense that they used whole-genome data. The dimensionality of imaging, despite increasing rapidly over time, is still orders of magnitude lower than that of whole-genome sequencing or molecular profiling.1 One of the limitations of current radiogenomic research is the need to reduce the dimensionality of genomic data to match that of imaging. A common approach in analysing these data is to group individual genetic elements into gene modules before performing association analysis with imaging features. Given the tenuous imaging-to-genomics and genomics-to-outcome relationships, such an approach may further undermine the potential of imaging to predict patient outcomes, one of the primary goals of radiogenomic analyses.3
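The module-based approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of any study reviewed here: the gene names, the module assignments, the use of the mean as the module score and the choice of Pearson correlation are all hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant series; report 0
    return sxy / (sxx * syy) ** 0.5


def module_scores(expression, modules):
    """Collapse per-gene expression into one score per module and patient.

    expression: dict mapping gene -> list of values (one per patient)
    modules:    dict mapping module name -> list of genes
    The score is the mean expression of the module's genes (an assumption).
    """
    scores = {}
    for name, genes in modules.items():
        rows = [expression[g] for g in genes]
        scores[name] = [mean(col) for col in zip(*rows)]
    return scores


def associate(scores, imaging_feature):
    """Correlate every module score with a single imaging feature."""
    return {name: pearson(vals, imaging_feature) for name, vals in scores.items()}


# Hypothetical toy data: three genes, four patients, one imaging feature.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 3.0, 4.0, 5.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
}
modules = {"module_1": ["GENE_A", "GENE_B"], "module_2": ["GENE_C"]}
necrosis_volume = [10.0, 20.0, 30.0, 40.0]

result = associate(module_scores(expression, modules), necrosis_volume)
```

Averaging over a module reduces thousands of genes to a handful of scores, which is exactly where information about individual genes can be lost.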

Standardization in imaging analysis

Traditionally, medical imaging has been a subjective or qualitative art. Recent advances in medical imaging acquisition and analysis allow the high-throughput extraction of specific imaging features to quantify the differences that oncologic tissues exhibit in medical imaging.80 Aerts et al9 evaluated a total of 440 CT features of lung and head and neck cancers in four categories: (1) tumour intensity, (2) shape, (3) texture and (4) wavelet features. These imaging features were extracted by an automated algorithm written in MATLAB® (MathWorks®, Natick, MA). Using a predefined vocabulary and analytical algorithm, Gevaert et al15 extracted 153 computational image features, 26 semantic image features and standardized uptake value from PET to characterize NSCLC in 26 patients. Grimm et al81 used computer vision algorithms to extract 56 imaging features from breast cancers, including morphologic, texture and dynamic features. However, automatic extraction of quantitative imaging features, such as tumour morphology, texture and contrast kinetics, is limited to homogeneous tumours. For example, in the studies of GBM, the most common computationally derived imaging feature was tumour volume. Other features were not routinely evaluated, likely because GBM commonly demonstrates significant intratumour heterogeneity.73,82 To overcome this limitation, Gevaert et al16 segmented the tumour into enhancing, oedematous and necrotic regions. Quantitative imaging features were then extracted from each region and correlated with genomic data.
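To make the intensity and texture categories concrete, the following sketch computes a few first-order statistics and gray-level co-occurrence matrix (GLCM) features on a toy 2D region. It is an illustration only and does not reproduce the feature definitions of any cited study: the single pixel offset, the homogeneity variant 1/(1 + |i - j|) and the definition of energy as the sum of squared probabilities are assumptions, and other definitions are in use.

```python
import math
from collections import Counter


def first_order_features(values):
    """Histogram-based (first-order) statistics of the intensities in a region."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    sd = math.sqrt(var)
    # Standardized third and fourth moments; set to 0.0 for a constant region.
    skewness = (sum((v - mu) ** 3 for v in values) / n) / sd ** 3 if sd else 0.0
    kurtosis = (sum((v - mu) ** 4 for v in values) / n) / sd ** 4 if sd else 0.0
    # Shannon entropy (in bits) of the discrete intensity distribution.
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(values).values())
    return {"mean": mu, "skewness": skewness, "kurtosis": kurtosis, "entropy": entropy}


def glcm(image, dx=1, dy=0):
    """Normalized co-occurrence probabilities for one pixel offset (dx, dy).

    image is a list of rows of integer gray levels.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: k / total for pair, k in counts.items()}


def glcm_features(p):
    """Contrast, homogeneity and energy from co-occurrence probabilities."""
    contrast = sum(prob * (i - j) ** 2 for (i, j), prob in p.items())
    homogeneity = sum(prob / (1 + abs(i - j)) for (i, j), prob in p.items())
    energy = sum(prob ** 2 for prob in p.values())
    return {"contrast": contrast, "homogeneity": homogeneity, "energy": energy}


# Toy regions: a checkerboard is maximally heterogeneous at this offset,
# whereas a flat region has zero contrast.
checkerboard = [[0, 1], [1, 0]]
flat = [[1, 1], [1, 1]]
texture_checker = glcm_features(glcm(checkerboard))
texture_flat = glcm_features(glcm(flat))
```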

Unfortunately, automatic imaging feature extraction was implemented in only a minority of studies. In the majority of studies, imaging features were manually assessed by radiologists. Manual analysis of images has certain disadvantages. In particular, manual extraction is subject to interobserver variability and to random errors, for example during manual contour tracing for mass volume. Furthermore, it is labour intensive. A future direction in the field of radiogenomics is the implementation of quantitative image analysis tools to allow comprehensive image feature extraction in a fast and reproducible manner. In addition, creating a lexicon and ontology of reproducible semantic and computed image features will permit images to be mined in a manner similar to genomic data.
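Interobserver variability in manually scored features can be quantified with a chance-corrected agreement statistic. The sketch below uses Cohen's kappa for two readers scoring a categorical feature; the choice of kappa and the example scores are assumptions made for illustration and are not taken from the studies reviewed.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers rating the same cases categorically."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty set of cases")
    n = len(reader_a)
    # Observed agreement: fraction of cases with identical ratings.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance, from each reader's marginal frequencies.
    freq_a, freq_b = Counter(reader_a), Counter(reader_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical scores for "necrosis present?" on four tumours.
reader_1 = ["yes", "yes", "no", "no"]
reader_2 = ["yes", "no", "no", "no"]
kappa = cohen_kappa(reader_1, reader_2)
```

Here the readers agree on three of four cases (75%), but half of that agreement is expected by chance, so kappa is 0.5.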

Segmentation conundrums

A variety of automated or semi-automated image segmentation methods are available. Some are based on the analysis of (often multiparametric) imaging signals in an unsupervised83,84 or supervised way.85 Often, statistical priors that encode normal anatomy are used to find tumours as deviations from it.86 Segmentation methods that explicitly incorporate biophysical models of tumour growth to facilitate imaging-based segmentation have also been proposed.87,88 Although validation of these methods is a challenging and labour-intensive task, some international efforts to create validation platforms have started to emerge. A prime example is the annual Brain Tumor Segmentation challenge, which uses TCIA and other public data sets, along with ground truth, to evaluate a variety of algorithms.41
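Validation against ground truth is commonly summarized with an overlap measure. The sketch below computes the Dice similarity coefficient between a predicted and a reference binary mask; the flat-list mask representation and the convention of returning 1.0 for two empty masks are assumptions of this example.

```python
def dice_coefficient(predicted, reference):
    """Dice similarity coefficient between two binary masks.

    Masks are flat sequences of 0/1 (or False/True) values of equal length.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2 * overlap / total


# Hypothetical 4-voxel masks: the algorithm finds one of the two tumour voxels.
algorithm_mask = [1, 0, 0, 0]
ground_truth = [1, 1, 0, 0]
score = dice_coefficient(algorithm_mask, ground_truth)
```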

Segmentation methods are usually a first step prior to extracting imaging features, which are used in conjunction to build biomarkers of gene expression. Commonly used features include volumetrics of enhancing and non-enhancing parts and of surrounding oedema; textural properties of the tumour, which reflect the spatial heterogeneity of tumours; shape properties of tumour boundaries, which relate to infiltrative/aggressive tumour phenotypes; multiparametric histograms of various imaging measures, which relate to cell density, perfusion dynamics, gadolinium enhancement and water content; and various other properties. Such features have been found to jointly form good predictors of tumour molecular characteristics, especially when integrated via machine learning and other multiparametric analysis models.15,63,89
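The volumetric features mentioned above follow directly from a segmentation label map: count the voxels carrying each label and multiply by the voxel volume. The sketch below assumes a hypothetical label convention (1 = necrosis, 2 = enhancing, 3 = oedema) and an invented two-slice volume; real pipelines would read these from the segmentation tool in use.

```python
from collections import Counter

# Hypothetical label convention, assumed only for this sketch.
LABELS = {1: "necrosis", 2: "enhancing", 3: "oedema"}


def region_volumes_ml(label_volume, voxel_size_mm):
    """Volume in millilitres of each labelled region.

    `label_volume` is a 3D label map as nested lists indexed [slice][row][column];
    `voxel_size_mm` is the (x, y, z) voxel spacing in millimetres.
    """
    dx, dy, dz = voxel_size_mm
    voxel_ml = dx * dy * dz / 1000.0  # 1 ml = 1000 cubic millimetres
    counts = Counter(v for plane in label_volume for row in plane for v in row)
    return {name: counts.get(label, 0) * voxel_ml for label, name in LABELS.items()}


# Invented two-slice label map (0 = background).
label_map = [
    [[0, 3, 3], [3, 2, 2], [3, 2, 1]],
    [[0, 3, 0], [3, 2, 2], [0, 2, 1]],
]

volumes = region_volumes_ml(label_map, voxel_size_mm=(1.0, 1.0, 5.0))
```

Per-region volumes of this kind can then be entered, alongside texture and shape features, into whichever multiparametric model is being used.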

Functional imaging

Currently, automated extraction is limited to CT, which is the most widely used imaging modality in oncology with the ability to assess tissue density. Emerging functional and molecular imaging methods, such as PET/CT and dynamic contrast-enhanced (DCE) or diffusion-weighted MRI, have the potential to assess the in situ tumour's metabolic and proliferative activity with higher accuracy than traditional imaging methods.1 In the only prospective radiogenomic study published to date, Barajas et al correlated physiologic MR parameters with RNA expression patterns in enhancing vs peritumoral non-enhancing GBM biopsy samples.24 The authors found that T2* dynamic susceptibility-weighted contrast-enhanced perfusion-weighted and diffusion-weighted imaging measurements were significantly different between biopsy regions and correlated with GBM histopathological features of aggressiveness. In addition, the upregulated genes were associated with cellular malignant biologic processes similar to those observed to correlate with physiologic-based MRI measurements. In another study of 18 patients with GBM, Jain et al32 correlated CT perfusion parameters with genes that are related to angiogenesis regulation. Of the 92 angiogenesis-associated genes, 19 genes had significant correlation with the permeability surface area product and 9 genes had significant correlation with the cerebral blood volume. Unfortunately, both of these studies were hampered by extremely small sample sizes. In the future, studies with larger cohorts and a wider variety of cancer types are needed to uncover the potential correlation between functional and molecular imaging parameters and genomic data.
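When one imaging parameter is tested against dozens of genes in a small cohort, some correlations will pass a nominal significance threshold by chance alone. The sketch below shows the Benjamini-Hochberg procedure as one standard way to control the false discovery rate. The source does not state which correction, if any, the studies above applied, and the p-values here are invented.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a hypothesis is rejected at FDR `alpha`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Invented p-values for ten hypothetical gene-versus-perfusion correlations.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

nominal_hits = sum(p < 0.05 for p in p_values)
rejected = benjamini_hochberg(p_values, alpha=0.05)
fdr_hits = sum(rejected)
```

With these invented numbers, fewer genes survive the correction than pass the uncorrected 0.05 threshold, which is why the number of tested genes matters when judging reported gene counts.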

Sample size

Studies included in our review are limited by small sample sizes. In addition, these studies often lack complete characterization of the patients and suffer from poor integration of individual data sets. In fact, one of the greatest limitations often cited for these studies is the difficulty in obtaining original cohorts of patients with both appropriate imaging studies and adequate tissue samples for genomic analysis.3 However, it is important to keep in mind that routine imaging data are readily available in large quantities, and many of these studies have corresponding archival tumour tissue available for various molecular analyses. These cases can be collected retrospectively and studied by investigators working at large clinical institutions. Furthermore, the cost of next-generation sequencing and other high-throughput molecular techniques has fallen to a fraction of what it was before.32 These assays can generate large amounts of data that can potentially be harvested.

Molecular genetic analysis

As demonstrated by our review, most studies performed to date are limited to microarray data, since microarrays were the earliest type of genomic analysis available to assess differential changes in genome-wide gene expression levels. Three studies used micro-RNAs.1,23,27 Micro-RNAs are non-protein-coding small RNAs that serve as negative gene regulators by binding to a specific sequence in the 3′ untranslated region (UTR) of a target gene.90 A single micro-RNA can potentially target hundreds of genes.90 Accordingly, micro-RNAs have been found to have important roles as tumour suppressors and oncogenes, as well as regulators of various cancer-specific cellular features, such as proliferation, invasion and metastasis.91,92 In one of the studies on radiogenomics of GBM, Zinn et al23 incorporated micro-RNA data into the analysis of microarray association with imaging features. By correlating quantitative MRI data with microarray data, the authors found periostin (POSTN) as the top upregulated gene. Through additional micro-RNA analysis, they identified miR-219 as the top downregulated micro-RNA. miR-219 is known to have a potential binding site in the 3′ UTR of the POSTN gene. This inverse correlation between POSTN and miR-219 suggests a potential role of miR-219 in downregulating POSTN in GBM mesenchymal transition and cellular invasion. More importantly, this signature can be non-invasively detected by routine MRI. In another study by Yamamoto et al,22 the authors used next-generation RNA sequencing to correlate the expression of long non-coding RNA with MRI phenotypes and the presence of early metastasis in breast cancer. Long non-coding RNAs represent an important class of regulatory RNAs that are longer than 200 nucleotides,93 exhibit exquisite cell and tissue specificity and are critical in maintaining tissue structure and organization.93,94 The above examples illustrate the importance of including multiple genomic data sets to derive maximum benefit from radiogenomic association maps. Bringing new genomic technologies such as next-generation DNA sequencing, single nucleotide polymorphism genotyping, chromatin immunoprecipitation and RNA sequencing into the fold has the potential to open up new frontiers for radiogenomic research.95 Moreover, understanding the molecular pathways that result in these radiogenomically identified imaging features should be one of the primary goals of radiogenomic analyses, as it is a necessary path to demonstrating the clinical significance of radiogenomics.
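An inverse relationship between a micro-RNA and its putative target, such as the one described above for miR-219 and POSTN, can be checked with a rank correlation across samples. The sketch below implements Spearman's coefficient with the standard library only; the expression values are invented for illustration and are not data from Zinn et al, whose statistical method is not described here.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented expression values for six hypothetical samples.
mirna_expression = [8.1, 7.4, 6.9, 5.2, 4.8, 3.9]
target_expression = [2.0, 2.9, 2.7, 4.1, 5.5, 6.3]

rho = spearman(mirna_expression, target_expression)
```

A strongly negative coefficient is consistent with, but does not by itself prove, regulation of the target by the micro-RNA; that step needs the kind of experimental follow-up discussed under result verification.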

Study validation

Validation with prospectively collected independent cohorts is the most robust approach and the gold standard for verifying an identified statistical association.96 However, in the literature we reviewed, a validation data set was used in only eight studies (30%). The decision not to proceed with a validation data set in the other studies may have stemmed from the data availability issue discussed previously. Genomic data are the hardest to obtain because they may require fresh tissue specimens. Gevaert et al15 demonstrated a radiogenomic strategy to rapidly identify prognostically significant image biomarkers. By using specific genomic characteristics as an intermediate, they linked imaging data in the first data set to survival in the second data set. Since long-term clinical follow-up may not be feasible in patients with both genomic and imaging data, the authors argued that their approach was able to leverage imperfect data sets to draw new conclusions. However, this approach requires the existence of large gene expression data sets where survival outcomes are available. TCGA is a publicly available resource that contains multidimensional genomic and clinical data sets for multiple types of adult cancers.97 TCIA is another publicly available resource that contains imaging corresponding to the patients in TCGA.98 However, the usefulness of the radiological data contained in TCIA is limited by the lack of image sample registration (i.e. gene expression profiles cannot be matched to a specific location on imaging). Successful attempts have been made to account for these differences by rigid alignment and registration with proper segmentation.23 As imaging acquisition protocols become increasingly standardized and outcome data become more mature, public databases such as TCGA and TCIA will serve not only as powerful validation tools but, more importantly, as the foundation for further radiogenomic discoveries.5
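The idea of using genomic data as an intermediate between two cohorts can be sketched in a deliberately simplified form. Below, a line relating a gene-expression score to an image feature is fitted in one cohort (imaging plus expression) and then applied to a second cohort (expression plus survival) to obtain a predicted image feature, whose relationship to survival is then inspected. This is not the pipeline of Gevaert et al: the single-score linear model, the median split, the variable names and all numbers are assumptions for illustration, and censoring is ignored.

```python
from statistics import mean, median


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


# Cohort A (invented): gene-expression score and a measured image feature.
score_a = [0.0, 1.0, 2.0, 3.0, 4.0]
feature_a = [1.0, 3.1, 4.9, 7.0, 9.0]
intercept, slope = fit_line(score_a, feature_a)

# Cohort B (invented): gene-expression score and survival in months, no imaging.
score_b = [0.5, 3.5, 1.5, 2.5, 4.5, 0.2]
survival_b = [40, 14, 33, 20, 9, 45]

# Predict the image feature in cohort B from expression, then split at the median.
predicted_feature_b = [intercept + slope * s for s in score_b]
cutoff = median(predicted_feature_b)
survival_high = [t for f, t in zip(predicted_feature_b, survival_b) if f > cutoff]
survival_low = [t for f, t in zip(predicted_feature_b, survival_b) if f <= cutoff]

median_survival_high = median(survival_high)
median_survival_low = median(survival_low)
```

A real analysis would use a proper survival model and would need the large expression data sets with outcome data that the paragraph above identifies as the prerequisite for this strategy.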

Histopathological correlation

Radiologic studies can be correlated with whole-genome mapping, histopathology and specific genes. Most of the studies in our review (15/27, 56%) did not perform histopathology association with imaging data. In one of the studies, Pope et al26 found that the incompletely enhancing imaging phenotype was associated with higher levels of oligodendroglioma marker oligodendrocyte lineage transcription factor 2 and achaete-scute complex-like 1 than the completely enhancing imaging phenotype. The authors confirmed this finding with histopathology, which showed a higher percentage of substantial oligodendroglioma histologic component in the incompletely enhancing group vs the completely enhancing group. In another study by Colen et al,1 the authors found that patients with GBM with low volumes of necrosis had a high prevalence of X-linked genes, while those with high volumes of necrosis had a high prevalence of Y-linked genes. Subsequently, the authors showed that in contrast to male patients, female patients with low volumes of necrosis on MRI had a significant survival advantage. This result was confirmed by a separate validation data set of 368 patients, where the authors were able to demonstrate that in female patients, cell death on histology was associated with a survival advantage. In another study by Pope et al, the authors correlated differential gene expression in GBM with apparent diffusion coefficient (ADC) histogram. They found that 6 of the 13 genes with increased expression in ADC tumours were isoforms of collagen-binding proteins.35 In order to confirm this result, the authors performed immunohistochemistry in both high- and low-ADC tumours to compare the expression of decorin and collagen one, three and six isoforms. There was no significant correlation between ADC values and collagen immunoreactivity scores. However, multiple patterns of immunoreactivity, including perivascular, interstitial and cytoplasmic patterns, were associated with higher ADC.35

Result verification

A limited number of studies (5/27, 19%) used wet-lab techniques to verify significant findings from their radiogenomic analyses.22,25,26,37,40 For example, Diehn et al25 confirmed EGFR overexpression among imaging phenotypes of GBM with immunohistochemistry. Halle et al40 found that in cervical cancer, the most differentially expressed gene sets between tumours with high and low ABrix (ABrix is the amplitude and Kep the transfer rate from tissue to plasma) on DCE-MRI were hypoxia-related features. To verify this result, the authors subjected three cervical cancer cell lines to hypoxia and performed gene expression profiling between the normoxia- and hypoxia-treated cell lines. The authors found that HIF1α protein was upregulated in all three hypoxia-treated cell lines. On the other hand, only minor changes of HIF1α protein regulation were observed in the control. The protein expression of HIF1α was further evaluated by immunohistochemistry and correlated with DCE-MRI in an additional 32 patients. These results demonstrated that tumours with low ABrix were significantly associated with higher HIF1α expression than those with high ABrix. Verifying histopathological and molecular correlations of radiogenomic data significantly improves the quality of a radiogenomic study. Most importantly, multidimensional evaluation of biologic data allows one to gain causative insight into the underlying significance of an initially discovered imaging feature–genomic correlation.

Clinical translation

Given that radiogenomics is still in its infancy, the full potential of clinical translation is yet to be realized. Nevertheless, several studies have demonstrated early promise. One example is in the research of HCC. In HCC, microscopic venous invasion (MVI) is a well-established sign of poor prognosis. However, it is extremely difficult to predict MVI using conventional imaging methods such as MRI.99,100 Currently, MVI can only be reliably diagnosed by the histology of the explanted tissue, when its clinical utility is marginal. In 2002, Chen et al101 identified a 91-gene expression signature via microarray analysis that had significant correlation with the presence of vascular invasion. In 2007, Segal et al20 found that these 91 genes in the “venous invasion signature” were associated with two predominant imaging traits on CT: the presence of “internal arteries” and the absence of “hypodense halos”. In a study of 157 patients with HCC who underwent surgical resection or liver transplant, Banerjee et al again demonstrated that these two imaging biomarkers, along with “tumour–liver difference”, were able to predict histological MVI with high precision. In addition, the radiogenomic biomarker consisting of these three features was associated with early disease recurrence and poor OS.50 Therefore, this marker can be extremely useful in identifying patients who are less likely to benefit from surgical treatment or liver transplant. The example above illustrates the significant impact radiogenomic analysis can have on patient management.

Radiogenomics can have a significant impact on routine radiology practice. Similar to histopathology, the goal of radiogenomics is to provide information on the tumour that can be used to guide treatment and predict survival. Ideally, all of this can be achieved non-invasively with routine imaging studies. For example, in patients with CCRCC, a prognostic multigene signature, termed the radiogenomic risk score, was constructed and shown to predict disease-specific survival, independent of disease stage, disease grade and performance status.102 The radiogenomic risk score consists of four CT imaging features: the pattern of tumour necrosis, tumour transition zone, tumour–parenchyma interaction and tumour–parenchyma interface. If further validated, such a radiogenomic signature can potentially be used in the way that the coronary calcium score is used to improve risk stratification for future cardiovascular events.103 If the radiogenomic risk score incorporates genomic data in addition to radiologic data, the radiologist can issue an addendum to the report once genomic data from pathology become available. This enhances the radiologist's role in patient care by providing the ordering physician with important information beyond what is typically reported for CCRCC (e.g. lymph node involvement, renal vein invasion etc.). Furthermore, radiologists are likely to gain a crucial role in clinical trials that use such a radiogenomic signature to divide patients into different risk groups.


The emerging field of radiogenomics has shown the potential to provide additional insights into tumour biology based on imaging data. Current studies are limited to six types of common cancers: glioma, NSCLC, HCC, CCRCC, breast cancer and cervical cancer. Extension of existing research methods to other tumour types will likely uncover additional associations between molecular properties and imaging characteristics. An ideal design for a radiogenomic study is illustrated in Figure 2. Future studies should strive to incorporate as many of the elements shown as possible. Once a link between an imaging phenotype and a molecular signature is uncovered, imaging studies of previously treated patients (such as those on clinical trials) can be re-examined to assess the clinical significance of this new link. In the future, gene expression profiling by non-invasive imaging may supplement histologic examination for cancer diagnosis and prognosis (Figure 2).

Figure 1. Literature search of published studies on radiogenomics.

Figure 2. Ideal design for a radiogenomic study. CGH, comparative genomic hybridization; CHIP, chromatin immunoprecipitation; OS, overall survival; PET, positron emission tomography; PFS, progression-free survival; SNP, single nucleotide polymorphism.

Another opportunity in radiogenomics is in identifying imaging features that predict region-specific gene expression signatures within the tumour in the proper anatomic context of the patient. Intratumour heterogeneity, in addition to intertumour heterogeneity, has been increasingly recognized as the source of cancer's development of resistance to chemotherapy after initial response.104 Several studies have shown the existence of genomic differences between different regions of the same tumours and correlated them with imaging findings.24,105,106 Further development will require imaging modalities with high resolution for proper spatial registration.107 Targeted tissue specimens from a radiographically diverse region can be studied on a per tumour basis, per patient basis or on a population basis, to allow for additional levels of multiple hypothesis testing. In the near future, it may be possible to detect intertumoural differences in treatment response at the imaging level, thereby guiding personalized and tumour-specific treatment. While public repositories, such as those supported by TCGA and TCIA, continue to grow, it is important to procure, develop and evaluate additional data sets to ensure the depth and breadth of the sample population in each study.5

Given the non-invasive nature of medical imaging and its wide use in clinical practice, radiogenomics has the potential to affect the treatment and prognosis of a wide range of human cancers. Identification of imaging phenotypes that are associated with distinct molecular phenotypes will help advance individualized patient care.


Volume 89, Issue 1061, May 2016

© 2016 The Authors. Published by the British Institute of Radiology


  • Received: December 06, 2015
  • Revised: January 30, 2016
  • Accepted: February 09, 2016
  • Published online: March 09, 2016