Abstract
Objectives:
1) Recognize that data mining can be used to analyze multiple patient and tumor factors and to predict outcomes in patients with head and neck cancer.
Methods:
Retrospective study of 124 patients with oropharyngeal squamous cell carcinoma treated at a tertiary care institution from 2000 to 2009 for whom 2-year outcome was known. Data were collected from the medical charts and analyzed with PASW Modeler. The outcome measure was overall 2-year survival. For each outcome predictor model, receiver operating characteristic (ROC) values, reported as false positive rate (FPR) and true positive rate (TPR), were calculated.
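As an illustration of the ROC values described above, the following minimal sketch computes an (FPR, TPR) pair from binary outcomes and binary predictions. The function name `roc_point`, the choice of survival as the positive class, and the toy labels are assumptions made for this example; they are not taken from the study or from PASW Modeler.

```python
def roc_point(y_true, y_pred):
    """Return (FPR, TPR) for binary labels, where 1 marks the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    # Guard against an empty class so the ratios stay defined.
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    return fpr, tpr


# Hypothetical toy data (not from the study): 1 = alive at 2 years, 0 = deceased.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 1, 0, 1, 0]
fpr, tpr = roc_point(y_true, y_pred)
print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

A single-rule predictor such as p16 status or Stage IV status yields one such point per data set, which is why each model is summarized by a pair of values rather than a full curve.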
Results:
For the entire group, overall 2-year survival was 82%. Three models were used to predict survival: p16 status, Stage IV status, and PASW Modeler Decision Tree analysis. ROC values (FPR, TPR) for each model were as follows: p16 training 0.56, 0.74; p16 validation 0.25, 0.83; Stage IV training 0.83, 0.73; Stage IV validation 0.75, 0.61; Decision Tree training 0.00, 0.88; Decision Tree validation 0.75, 0.78.
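One way to compare these (FPR, TPR) pairs on a single scale is Youden's J statistic, J = TPR - FPR, where 1 indicates perfect discrimination and 0 indicates chance-level performance. The sketch below applies it to the values reported above. Note that the use of Youden's J is an assumption of this example; the abstract does not state which summary the authors used to rank the models.

```python
# (FPR, TPR) pairs as reported in the Results.
roc_points = {
    ("p16", "training"): (0.56, 0.74),
    ("p16", "validation"): (0.25, 0.83),
    ("Stage IV", "training"): (0.83, 0.73),
    ("Stage IV", "validation"): (0.75, 0.61),
    ("Decision Tree", "training"): (0.00, 0.88),
    ("Decision Tree", "validation"): (0.75, 0.78),
}

# Youden's J = TPR - FPR (a summary chosen for this example, not by the authors).
youden_j = {
    key: round(tpr - fpr, 2) for key, (fpr, tpr) in roc_points.items()
}

for (model, split), j in youden_j.items():
    print(f"{model:13s} {split:10s} J = {j:+.2f}")
```

Under this summary, the validation sets give J = 0.58 for p16, 0.03 for the Decision Tree, and -0.14 for Stage IV.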
Conclusions:
Data mining shows promise for predicting outcomes in patients with head and neck cancer. This study shows that data mining performed better than outcome prediction using a stage-based model. Although the ROC values for the Decision Tree training set were excellent, its validation set results fell below those of the p16 validation set. We suspect that this is due to the small size of our data set. To address this limitation, we will discuss the results from a k-fold cross-validation process, in which each portion of the data serves in turn as the validation set.
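The k-fold cross-validation idea can be sketched as follows: the patients are split into k folds, and each fold is held out once for validation while the remaining folds are used for training. This is a minimal illustration only; the helper name `k_fold_indices`, the value k = 5, and the random seed are assumptions, since the abstract does not specify them, and no model is fitted here.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 124 patients as in the study; k = 5 is a hypothetical choice.
splits = list(k_fold_indices(n_samples=124, k=5))
for fold_number, (train, validation) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train={len(train)}, validation={len(validation)}")
```

Because every patient appears in exactly one validation fold, performance can be averaged over all 124 patients rather than estimated from a single small hold-out set.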
