DIMACS Workshop on Computational Methods for Predicting Outcome in Cancer

May 29 - 30, 2007
DIMACS Center, CoRE Building, Rutgers University

Dechang Chen, Uniformed Services University of the Health Sciences, dchen@usuhs.mil
Xue-Wen Chen, University of Kansas, xwchen@ku.edu
Donald Henson, The George Washington University Cancer Institute, patdeh@gwumc.edu
Bill Shannon, Washington University in St. Louis, wshannon@wustl.edu
Li Sheng, Drexel University, lsheng@math.drexel.edu
Presented under the auspices of the DIMACS/BioMaPS/MB Center Special Focus on Information Processing in Biology.

This special focus is jointly sponsored by the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS), the Biological, Mathematical, and Physical Sciences Interfaces Institute for Quantitative Biology (BioMaPS), and the Rutgers Center for Molecular Biophysics and Biophysical Chemistry (MB Center).


N. David Charkes, Section of Nuclear Medicine, Temple University Hospital, and Lynn Ries, Division of Cancer Statistics, National Cancer Institute

Title: Exponential Decomposition and Computer Simulation of Colorectal Cancer Survival Data

Cancer survival data, as currently presented, is a composite of those cured of the disease and those with residual disease, both overt and subclinical, less those who have died either of the disease or of intercurrent illness. We have decomposed the survival curves of patients with colorectal cancer using compartmental analysis and SEER data for the years 1982-1992, prior to the current era of novel forms of chemotherapy (viz., monoclonal antibodies, growth factor inhibitors, etc.). There were 38,876 patients with colon cancer in the SEER database available for analysis, and 16,186 with rectal cancer, divided into four stages each, as described in the Manual for Staging of Cancer of the American Joint Committee on Cancer of the American College of Surgeons.

METHODS: Each group contained four distinct sets: (1) no disease (i.e., cured); (2) subclinical disease; (3) overt, active disease ('relapse'); and (4) remission. In preliminary studies the mortality rate in the 'no disease' group was best described by a negative exponential; we assumed similar first-order intercompartmental transfer rates between certain other groups (see appended model). In keeping with clinical experience, and to reduce uncertainties, we further assumed that patients in relapse (clinical recurrence) could either be cured or go into remission once, since the latter is uniformly fatal. Disease-specific mortality rates were obtained by subtraction of the non-disease mortality from the observed rate for each set. The simulations were run using SAAM-30 software, which fits the data using the Levenberg-Marquardt algorithm, as well as TUTSIM, which uses the simplex algorithm.

RESULTS: Excellent fits to the data were obtained with both approaches, an example of which is shown below. We found that after initial surgery, residual subclinical disease was greater with each more-advanced stage (II-IV), and always greater with rectal cancer than with colon cancer. A small fraction of patients was cured at each stage. However, once metastatic disease had occurred, the fraction of patients entering remission as a result of treatment was independent of whether the primary was in the colon or rectum, as were the location- and stage-specific mortality rates.

VALIDATION: To validate the model we selected a publication dealing with the effect of adjuvant treatment of colorectal cancer, subsequent to the time frame of our study. The relapse rate data from the paper were then fitted to the model prediction; an excellent fit was obtained, thus supporting the model methodology.

CONCLUSION: We have defined and validated a model for the study of colorectal cancer which gives numerical projections for the progression of disease by stage and initial location, by means of exponential decomposition of the survival curves. The model provides numerical information which is not readily available to clinicians and patients.
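
ILLUSTRATION: The compartmental approach described above can be sketched in a few lines of Python. This is only a minimal sketch, not the authors' SAAM-30 or TUTSIM setup: the compartment names follow the four sets listed under METHODS, but the transfer topology, the initial fractions, and every rate constant below are hypothetical placeholders, since the appended model is not reproduced here. The sketch integrates first-order transfers with a simple forward-Euler step and reports overall survival as one minus the fraction dead.

```python
def simulate(fractions, rates, years=10.0, dt=0.01):
    """Forward-Euler integration of a linear (first-order) compartmental model.

    fractions: initial share of patients in each living compartment.
    rates: dict mapping (source, destination) -> rate constant per year.
    Returns a list of (time, state) pairs; state always includes 'dead'.
    """
    state = dict(fractions)
    state.setdefault("dead", 0.0)
    history = [(0.0, dict(state))]
    steps = int(round(years / dt))
    for i in range(1, steps + 1):
        flows = {name: 0.0 for name in state}
        for (src, dst), k in rates.items():
            moved = k * state[src] * dt  # first-order transfer during one step
            flows[src] -= moved
            flows[dst] += moved
        for name in state:
            state[name] += flows[name]
        history.append((i * dt, dict(state)))
    return history


def survival_curve(history):
    """Overall survival at each time point: everyone who is not dead."""
    return [(t, 1.0 - s["dead"]) for t, s in history]


# Hypothetical starting fractions after initial surgery (illustrative only).
FRACTIONS = {"cured": 0.6, "subclinical": 0.4, "relapse": 0.0, "remission": 0.0}

# Hypothetical rate constants per year (illustrative only).
RATES = {
    ("cured", "dead"): 0.03,           # non-disease mortality
    ("subclinical", "dead"): 0.03,     # non-disease mortality
    ("subclinical", "relapse"): 0.40,  # subclinical disease becomes overt
    ("relapse", "cured"): 0.05,        # cured after clinical recurrence
    ("relapse", "remission"): 0.30,    # enters remission
    ("relapse", "dead"): 0.50,         # dies during relapse
    ("remission", "dead"): 0.60,       # remission treated as ultimately fatal
}

HISTORY = simulate(FRACTIONS, RATES)
CURVE = survival_curve(HISTORY)
```

With only the 'cured' compartment populated, the simulated survival reduces to a single negative exponential, which is the behavior the abstract reports for the 'no disease' group. Fitting the rate constants to SEER curves would need an optimizer, which is outside this sketch.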

Dechang Chen, The Uniformed Services University of the Health Sciences
Kai Xing, The George Washington University
Donald Henson, The George Washington University Cancer Institute
Xiuzhen Cheng, The George Washington University

Title: New Methods for Predicting Outcome in Cancer Patients

Based on the availability of large cancer patient datasets, there is now the opportunity to expand the TNM (Tumor, Lymph Node, Metastasis) staging system in order to provide more accurate outcome prediction for cancer patients. In this presentation, we describe two methods for predicting outcome, which are based on the concepts of group testing and clustering. Our approaches focus on grouping patients into subgroups such that the survival experiences of two patients from the same subgroup are close to each other, while the survival experiences of two patients from different subgroups differ significantly. Method I involves a series of steps, where at each step the cancer data is partitioned using levels of one individual predictive factor and then similar groups (groups with similar survival functions) are merged. Method II starts with the partition of data using the combinations of levels of factors and then the two most similar groups are merged at each subsequent step. Two demonstrations are given: lung cancer for Method I and breast cancer for Method II.
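
The merging step of Method II can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the authors' procedure: the abstract does not say how similarity between groups is measured, so the code uses the absolute difference in mean survival time as a stand-in, ignores censoring, and runs on made-up data. A real analysis would likely use a censoring-aware comparison of survival functions.

```python
from itertools import combinations
from statistics import mean


def merge_most_similar(groups, target):
    """Repeatedly merge the two most similar groups until `target` remain.

    groups: dict mapping a factor-level combination to a list of survival
            times in months (assumed fully observed, i.e. no censoring).
    Returns a dict mapping frozensets of original labels to pooled times.
    """
    current = {frozenset([label]): list(times) for label, times in groups.items()}

    def dissimilarity(pair):
        # Stand-in measure (an assumption): gap between mean survival times.
        a, b = pair
        return abs(mean(current[a]) - mean(current[b]))

    while len(current) > target:
        a, b = min(combinations(current, 2), key=dissimilarity)
        current[a | b] = current.pop(a) + current.pop(b)
    return current


# Hypothetical survival times (months) for four T/N combinations.
EXAMPLE_GROUPS = {
    ("T1", "N0"): [60, 72, 66],
    ("T1", "N1"): [58, 64, 70],
    ("T2", "N0"): [30, 36, 28],
    ("T2", "N1"): [12, 10, 15],
}

MERGED = merge_most_similar(EXAMPLE_GROUPS, target=2)
```

On this toy data the two T1 groups are merged first because their mean survival times are closest, and the two T2 groups are merged next.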

Donald E Henson, George Washington University

Title: The Ideal Staging Systems

Medicine entails the science of prediction. A classification of the severity of cancer is called staging. Staging is needed primarily to predict survival and guide treatment. Classification depends on predictive factors that are clinical or laboratory values associated with survival. There are three types of predictive factors: 1) natural history, which is the course of the disease without treatment, 2) treatment related, and 3) response to therapy. Prognostic factors are different for each oncological specialty, because they are related to treatment. For example, the TNM (Tumor, Lymph Node, Metastasis) system is used by surgeons. It is an ideal system of staging because the predictive factors and extent of treatment are the same. An anatomic-based system of therapy implies anatomic-based predictive factors. A molecular-based system of therapy implies molecular-based predictive factors. A predictive factor not related to therapy will have minimal use in therapeutic decisions. For this reason, the histological grade has not been formally incorporated into the TNM system.

Aiguo Li, Qin Su, Yuri Kotliarov, Jennifer Walling, Jean Claude Zenklusen, and Howard A. Fine, National Cancer Institute, National Institute of Neurological Disorders and Stroke, National Institutes of Health

Title: Refining Glioma Subtypes for Identifying Efficient Prediction Signatures Using Gene Expression Profiling Data

Glioma is the most common type of primary brain tumor and has a high mortality. Therapeutic decisions and prognosis have historically been based on histological diagnosis, which is subjective, shows intraobserver variability, and is not biologically based. The goals of our project are to develop a comprehensive and objective classification schema for glioma based on gene expression profiling data and to identify classifiers for each group. Two unsupervised methods, K-means and non-negative matrix factorization, are used to classify about 200 primary brain tumors. The glioma samples are separated into two main types (Oligoc and Gbmc), each subdivided into 2 subtypes. Significant differences in survival days were found between the two main types and between the two Oligoc subtypes. The classifiers for each type and subtype are identified using PAM. Furthermore, the classifiers are used to predict an independent data set of an additional 200 primary brain tumors; the prediction accuracy is greater than 90% for the main types and about 80% for the Oligoc subtypes. Further validation of the classifiers and the functional and biological annotation of these novel classes are underway.

Jay Piccirillo, Washington University School of Medicine and Siteman Cancer Center

Title: The Inclusion of Comorbidity in Cancer Statistics

For nearly 50 years, cancer patients have been staged based on the size of their tumor while the clinical condition of the patient has been ignored. The clinical condition of the patient is represented by the tumor-related symptoms and the general health of the patient, defined as the number and pathophysiological severity of coexisting diseases, illnesses, or conditions. These coexisting diseases, which exist before cancer diagnosis and are not adverse effects of cancer treatment, are generally referred to as comorbidities. While comorbidity is a routine consideration by the physician in estimating prognosis and selecting treatment, it is generally not included in cancer registries or observational research.

This presentation will describe methods for comorbidity assessment and successful inclusion in cancer registries. The presentation will also demonstrate the prognostic importance of comorbidity and methods for inclusion of comorbidity as an additional data element in a cancer staging system. Upon completion of this presentation, the audience should appreciate that the continued exclusion of cogent comorbidity information from cancer statistics is a major omission. Significant improvement in the description of the cancer patient, estimates of prognosis, assessment of therapeutic effectiveness from observational research, and assessment of quality of care can result from the inclusion of comorbidity.

Gunter Schemmann, Princeton University and the University of Medicine and Dentistry of New Jersey

Title: Correlating Microarray and Clinical Data with Outcome for Colon Cancer Patients

A simple method for correlating both microarray and clinical data with outcome is presented. The patients are grouped into two classes based on the follow-up status and interval: a good outcome class and a poor outcome class. These classes are then used to perform a t test for numerical data and an extension of Fisher's exact test for categorical variables. The method is applied to a colorectal cancer data set that includes RNA microarray data as well as extensive clinical data for 176 patients. The effect of the clinical attributes on the analysis will receive special attention. In particular, it has been found that when patient staging is taken into account the results from the RNA microarray are very different from those obtained when the stages are ignored.
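
The two tests named above can be sketched with the Python standard library. This is an illustrative sketch, not the author's implementation: the t statistic shown is the Welch (unequal-variance) form, which is an assumption, and the Fisher routine handles only a plain 2x2 table rather than the extension mentioned in the abstract. All function names are hypothetical.

```python
from math import comb, sqrt
from statistics import mean, variance


def welch_t(good, poor):
    """Welch t statistic comparing a numeric variable (for example one
    gene's expression level) between the good and poor outcome classes."""
    se = sqrt(variance(good) / len(good) + variance(poor) / len(poor))
    return (mean(good) - mean(poor)) / se


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Rows could be the outcome classes and columns the two levels of a
    categorical clinical attribute. The p-value sums the hypergeometric
    probabilities of all tables no more likely than the observed one.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Probability that the top-left cell equals x with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
```

Running both functions separately within each stage, rather than on the pooled patients, is one simple way to take staging into account, although the abstract does not say how the author did so.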

William Shannon, Washington University in St. Louis

Title: Cluster Analysis in Tumor Staging

This talk presents an overview of the purpose and implementation of hierarchical cluster analysis. The assumptions, calculations and output obtained are presented in a tutorial format. An extension of this to tumor staging based on ultrametrics is considered, and a contrast between clustering (not using patient level outcomes) and classification/prediction (using patient level outcomes in the model fitting) is presented.
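
The link between hierarchical clustering and ultrametrics can be shown with a small Python sketch. It assumes single-linkage merging and a made-up distance table; the talk does not specify a linkage rule or data, and the function names are hypothetical. The merge height at which two points first share a cluster (the cophenetic distance) satisfies the ultrametric inequality d(x, z) <= max(d(x, y), d(y, z)), even when the input distances do not.

```python
from itertools import combinations, permutations


def single_linkage_cophenetic(dist):
    """Agglomerative single-linkage clustering on a pairwise distance table.

    dist: dict mapping frozenset({i, j}) -> distance for every pair.
    Returns the cophenetic distances: for each pair, the height of the
    merge at which the two points first fall into the same cluster.
    """
    points = sorted({p for pair in dist for p in pair})
    clusters = [frozenset([p]) for p in points]
    cophenetic = {}

    def link(a, b):
        # Single linkage: distance between the closest members.
        return min(dist[frozenset((i, j))] for i in a for j in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: link(*pair))
        height = link(a, b)
        for i in a:
            for j in b:
                cophenetic[frozenset((i, j))] = height
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return cophenetic


def is_ultrametric(d, points, tol=1e-12):
    """Check d(x, z) <= max(d(x, y), d(y, z)) for all ordered triples."""
    return all(
        d[frozenset((x, z))] <= max(d[frozenset((x, y))], d[frozenset((y, z))]) + tol
        for x, y, z in permutations(points, 3)
    )


# Hypothetical distances between four patients (illustrative only).
DIST = {
    frozenset("AB"): 1.0, frozenset("AC"): 4.0, frozenset("AD"): 5.0,
    frozenset("BC"): 3.0, frozenset("BD"): 6.0, frozenset("CD"): 2.0,
}

COPHENETIC = single_linkage_cophenetic(DIST)
```

Note that no patient-level outcome enters this computation, which matches the contrast drawn above between clustering and classification/prediction.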

Martin R. Weiser, Weill Medical College of Cornell University and Memorial Sloan-Kettering Cancer Center

Title: Improving Colon Cancer Staging with Nomograms

Staging and estimating long-term outcome are critical components of caring for the cancer patient. Physicians rely on recurrence and survival estimates when recommending treatment strategies, and patients utilize these estimates when deciding whether to pursue a therapeutic course. Current cancer staging is based on the anatomic TNM system developed in 1959 by the American Joint Committee on Cancer (AJCC). Despite various iterations, the scheme has not significantly changed and is very similar to Dukes' original staging system of 1932. Although attractive because of its simplicity, the AJCC TNM system has significant limitations, including inadequate predictive accuracy. In particular, patients are placed in discrete groups and assumed to have homogeneous outcome when, in fact, survival is quite heterogeneous. A more flexible, expandable, and accurate system is needed to estimate recurrence and survival for cancer patients. Nomograms are graphical representations of prediction models based on multivariate regression and attempt to provide a more individualized outcome calculation. Utilizing categorical and continuous variables, nomograms can incorporate a variety of epidemiologic, clinicopathologic, and treatment-related factors into the predictive model. We have developed a series of nomograms using both population and institutional databases in an effort to enhance colon cancer staging. Increasingly complex models are explored to determine the level of sophistication necessary to improve on the current AJCC system while remaining accurate, practical, and generalizable.
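
The arithmetic behind a nomogram can be sketched in Python. This is a minimal illustration, not one of the nomograms described above: it assumes a logistic regression model, and the predictors, coefficients, ranges, and intercept are all hypothetical. Each predictor is given a point scale proportional to its coefficient, scaled so that the predictor with the largest effect over its range spans 100 points; the total is then mapped back to a predicted probability. The sketch also assumes positive coefficients, so points are measured from the low end of each range.

```python
from math import exp

# Hypothetical model (illustrative only; not fitted to any data).
COEFS = {"age_decades": 0.30, "positive_nodes": 0.25, "t_stage": 0.60}
RANGES = {"age_decades": (3, 9), "positive_nodes": (0, 10), "t_stage": (1, 4)}
INTERCEPT = -4.0


def point_scale(coefs, ranges):
    """Points per unit of linear predictor, chosen so that the predictor
    with the largest effect over its range spans exactly 100 points."""
    spans = [abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs]
    return 100.0 / max(spans)


def total_points(patient, coefs, ranges):
    """Sum of per-predictor points, measured from each range minimum."""
    scale = point_scale(coefs, ranges)
    return sum(coefs[k] * scale * (patient[k] - ranges[k][0]) for k in coefs)


def probability_from_points(points, coefs, ranges, intercept):
    """Map total points back to a probability through the logistic link."""
    scale = point_scale(coefs, ranges)
    baseline = intercept + sum(coefs[k] * ranges[k][0] for k in coefs)
    linear_predictor = baseline + points / scale
    return 1.0 / (1.0 + exp(-linear_predictor))


PATIENT = {"age_decades": 6, "positive_nodes": 4, "t_stage": 3}
POINTS = total_points(PATIENT, COEFS, RANGES)
RISK = probability_from_points(POINTS, COEFS, RANGES, INTERCEPT)
```

Because the points are a rescaling of the regression's linear predictor, reading the nomogram gives the same probability as evaluating the model directly. A survival nomogram would replace the logistic link with a survival model's prediction at a fixed time point.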

Document last modified on May 17, 2007.