Radiomics is a promising approach for identifying patients at high risk of pulmonary dysfunction caused by radiotherapy. This study aims to identify optimal radiomic input features for predicting pulmonary function. Forced expiratory volume in the first second (FEV1) and forced vital capacity (FVC) were measured for 257 patients between 3 months prior to and 1 week after the first radiotherapy. The FEV1/FVC ratio, dichotomized at 70%, was used as the target variable. Each patient had a radiotherapy planning CT and associated contours of the gross tumor volume and the left and right lungs. A total of 2,658 radiomic features were extracted and categorized into five levels: shape (S), first-order (L1), second-order (L2), and higher-order (L3) local texture, and global texture (G) features. These levels were also combined into four multilevel groups: S+L1, S+L1+L2, S+L1+L2+L3, and S+L1+L2+L3+G. Nested cross-validation (NCV) was used to identify optimal input features. Cross-validated glmnet models optimized with unilevel or multilevel features were used to assess predictive performance on the outer CV test sets. In the unilevel analysis, the highest test AUC, 0.743±0.067, was obtained from NCV models optimized with L1 features. Overall, the best performance was achieved by NCV models optimized with S+L1+L2 features, with an AUC of 0.752±0.063. Paired Wilcoxon signed-rank tests showed that the AUC values of NCV models optimized with S, L2, L3, G, or S+L1+L2+L3 features differed significantly from those of models optimized with S+L1+L2 features (P<0.05). The multilevel analysis strategy can help to handle and optimize radiomic input features.
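The grouping and resampling scheme described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the study's implementation: the function names, the use of 5 outer and 5 inner folds, the random seed, and the choice to label a ratio strictly below 70% as the positive class are all hypothetical, since the abstract does not specify them. Model fitting (glmnet in the study) is deliberately left out; the sketch only shows how the multilevel groups are built cumulatively and how inner folds are drawn from the outer training set alone.

```python
import random

# Unilevel feature groups named in the abstract.
UNILEVEL = ["S", "L1", "L2", "L3", "G"]


def multilevel_groups(levels):
    """Build cumulative groups: S+L1, S+L1+L2, and so on."""
    return ["+".join(levels[:k]) for k in range(2, len(levels) + 1)]


def dichotomize(fev1, fvc, threshold=0.70):
    """Return 1 if FEV1/FVC is below the threshold, else 0.

    Assumption: which side of the cut is the positive class is not
    stated in the abstract.
    """
    return int(fev1 / fvc < threshold)


def k_fold_indices(indices, k, rng):
    """Shuffle the indices and split them into k nearly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def merge_except(folds, held_out):
    """Concatenate every fold except the one at position held_out."""
    return [j for i, fold in enumerate(folds) if i != held_out for j in fold]


def nested_cv_splits(n_samples, outer_k=5, inner_k=5, seed=0):
    """Yield (inner_splits, outer_train, outer_test) for each outer fold.

    inner_splits is a list of (inner_train, inner_val) pairs drawn only
    from outer_train, so the outer test set is never used for tuning.
    Fold counts and seed are assumptions made for illustration.
    """
    rng = random.Random(seed)
    outer_folds = k_fold_indices(range(n_samples), outer_k, rng)
    for i, outer_test in enumerate(outer_folds):
        outer_train = merge_except(outer_folds, i)
        inner_folds = k_fold_indices(outer_train, inner_k, rng)
        inner_splits = [
            (merge_except(inner_folds, v), inner_val)
            for v, inner_val in enumerate(inner_folds)
        ]
        yield inner_splits, outer_train, outer_test
```

In use, each candidate feature group (unilevel or multilevel) would be tuned on the inner splits and scored once on the corresponding outer test set, giving one AUC per outer fold that can then be compared across groups with a paired test.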