Identification of a ferroptosis-related gene signature predicting recurrence in stage II/III colorectal cancer based on machine learning algorithms
Background: Colorectal cancer (CRC) is one of the most prevalent cancer types globally. A survival paradox exists in stage II/III CRC owing to the inherent heterogeneity of tumor biology. Ferroptosis is closely related to tumor progression, and ferroptosis-related genes can serve as novel biomarkers for predicting cancer prognosis. Methods: Ferroptosis-related genes were retrieved from the FerrDb and KEGG databases. A total of 1,397 samples from nine independent datasets were enrolled in our study; four datasets were integrated as the training dataset to construct the model, which was then validated in the remaining datasets. We developed a machine learning framework with 83 combinations of 10 algorithms, based on 10-fold cross-validation (CV) or bootstrap resampling, to identify the most robust and stable model. The C-index and ROC analysis were used to gauge its predictive accuracy and discrimination capabilities. Survival analysis was conducted, followed by univariate and multivariate Cox regression analyses, to evaluate the performance of the identified signature. Results: The ferroptosis-related gene (FRG) signature, composed of 23 genes, was identified by the combination of Lasso and plsRcox. The FRG signature performed better than common clinicopathological features (e.g., age and stage), molecular characteristics (e.g., BRAF mutation and microsatellite instability) and several published signatures in predicting the prognosis of CRC. Patients were further stratified by the signature into high-risk and low-risk groups, and a high FRG signature indicated poor prognosis in all collected datasets. Sensitivity analysis showed that the FRG signature remained a significant prognostic factor. Finally, we developed a nomogram and a decision tree to enhance prognosis evaluation.
Conclusion: The FRG signature enabled accurate identification of high-risk patients with stage II/III CRC and helped optimize precision treatment to improve their clinical outcomes.
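
For readers unfamiliar with the C-index used in the Methods to gauge predictive accuracy, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration only, not the implementation used in the study: the function name `harrell_c_index`, its arguments and the toy data are assumptions introduced here, and pairs with tied follow-up times are skipped for simplicity.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       : follow-up time per patient
    events      : 1 if recurrence was observed, 0 if censored
    risk_scores : model output, higher = higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # 'a' is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event has the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.1]
toy_c_index = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 1.0 means the risk score orders every comparable pair correctly, 0.5 corresponds to random ordering, and values below 0.5 indicate an inverted score. In the toy data above, every comparable pair is ordered correctly.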