Table 1 Prediction of HPV status based on radiomics features of HNSCC tumors

From: Applications of radiomics in precision diagnosis, prognostication and treatment planning of head and neck squamous cell carcinomas

| Authors, year | Sample size, cancer type | Ground truth | Imaging modality | ML classifier | Metric: maximum performance a |
| --- | --- | --- | --- | --- | --- |
| Bogowicz et al. 2017 [50] | Train: 93, HNSCC; Test: 56, HNSCC | p16 | Contrast CT | Logistic regression | Test-AUC: 0.78 |
| Buch et al. 2015 [48] | Total: 40, OPSCC | Not reported | Contrast CT | n/a b | n/a b |
| Fujita et al. 2016 [49] | Total: 46, non-OPSCC | Not reported | Contrast CT | n/a b | n/a b |
| Huang et al. 2019 [51] | Train: 113, HNSCC; Test: 53, HNSCC | Train: HPV RNA c; Test: p16 | Contrast CT | LASSO-regularized logistic regression | Nested CV-AUC: 0.73; Test-AUC: 0.76 |
| Leijenaar et al. 2018 [52] | Train: 628, OPSCC; Test: 150, OPSCC | p16 | Contrast CT | LASSO-regularized logistic regression | Test-AUC: 0.70–0.80 d |
| Mungai et al. 2019 [53] | Total: 50, OPSCC | Not reported | Contrast CT | Logistic regression | n/a e |
| Parmar et al. 2015 [54] | Train: 136, OPSCC and LSCC; Test: 95, OPSCC | Not reported | Contrast CT | Logistic regression | Test-AUC: 0.60 |
| Ranjbar et al. 2018 [55] | Total: 107, OPSCC | HPV DNA-ISH | Contrast CT | Diagonal quadratic discriminant analysis | LOOCV-AUC: 0.80 |
| Yu et al. 2017 [56] | Train: 150, OPSCC; Test: 165, OPSCC | p16 | Contrast CT | Logistic regression | CV-AUC: 0.75; Test-AUC 1 f: 0.87; Test-AUC 2 f: 0.92 |
| Zhu et al. 2018 [57] | Total: 126, HNSCC | Not reported | Contrast CT | Random forest | CV-AUC: 0.71 |

  1. a The reported performance pertains to HPV classification based purely on imaging features (i.e. models that included clinical features were not considered)
  2. b A t-test was used to evaluate differences in texture parameters between HPV-positive and HPV-negative cases
  3. c The VirusSeq software was used to detect strain-specific HPV RNA sequences in whole-transcriptome sequencing data [51, 58]
  4. d This study evaluated the impact of CT artifacts on HPV classification performance. A test-set AUC of 0.80 was achieved after excluding all artifact-affected cases from both the training and test sets. Across all evaluated dataset combinations, including those with artifacts, the test AUC ranged between 0.70 and 0.80 and did not differ significantly among the tested models
  5. e The logistic regression model was trained and tested on the same dataset without feature selection or cross validation, an approach that is prone to overfitting and to overestimating classification accuracy
  6. f The study reports the results of the winning submission to a radiomics competition, in which the 165 test cases were split into two test sets
  7. AUC: Area under the receiver operating characteristics curve; CV: Cross validation (of total set or training dataset); DNA-ISH: DNA in situ hybridization; HNSCC: Head and neck SCC; LOOCV: Leave one out cross validation of total set; LSCC: Laryngeal SCC; OPSCC: Oropharyngeal SCC; Test: Independent test dataset; Total: Only one dataset used; Train: Training dataset
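
Footnote e notes that evaluating a model on the same cases it was trained on tends to overestimate performance, while the table distinguishes CV, LOOCV and independent test AUCs. The following is a minimal illustrative sketch of that difference, not a reproduction of any cited study. It makes several assumptions: the cohort is synthetic noise (the "features" carry no information about the labels), a 1-nearest-neighbour classifier is swapped in for logistic regression because it memorises its training cases and makes the effect easy to see, and all function names (`auc`, `nearest_label`, `make_noise_cohort`, `resubstitution_auc`, `loocv_auc`) are hypothetical helpers written for this example.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def nearest_label(x, train_x, train_y):
    """1-nearest-neighbour prediction: label of the closest training case."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, t)) for t in train_x]
    return train_y[dists.index(min(dists))]


def make_noise_cohort(n_cases=60, n_features=5, seed=0):
    """Synthetic cohort (assumption): random features unrelated to the labels."""
    rng = random.Random(seed)
    x = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(n_cases)]
    y = [i % 2 for i in range(n_cases)]  # balanced labels, independent of x
    return x, y


def resubstitution_auc(x, y):
    """Score every case with a model that has already seen that case."""
    return auc(y, [nearest_label(xi, x, y) for xi in x])


def loocv_auc(x, y):
    """Leave-one-out CV: each case is scored by a model fitted without it."""
    scores = []
    for i, xi in enumerate(x):
        rest_x = x[:i] + x[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        scores.append(nearest_label(xi, rest_x, rest_y))
    return auc(y, scores)


if __name__ == "__main__":
    x, y = make_noise_cohort()
    print("resubstitution AUC:", resubstitution_auc(x, y))
    print("LOOCV AUC:", loocv_auc(x, y))
```

Because each case is its own nearest neighbour, the resubstitution AUC is perfect even though the features are pure noise. The leave-one-out estimate removes that advantage and should fall well below it, typically near chance level. This is why the AUC values in the table are best compared only within the same validation scheme.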