DIMACS/DyDAn Workshop: Investigation of Disease Clusters:
Transitioning to the 21st Century and Beyond*

May 6 - 8, 2008
DIMACS/DyDAn Center, CoRE Building, Rutgers University

Andrew Lawson, University of South Carolina, alawson at gwm.sc.edu
Daniel Wartenberg, Robert Wood Johnson Medical School, dew at eohsi.rutgers.edu
Presented under the auspices of the Special Focus on Computational and Mathematical Epidemiology and the Center for Dynamic Data Analysis (DyDAn).

*Funded by DIMACS and the UMDNJ Academic Partnership for Environmental Public Health Tracking (1 U19 EH000 CDC grant).


Geoffrey M. Jacquez, BioMedware

Title: Methodological and Logistical Problems in Disease Clustering

This presentation considers several problems that pose considerable obstacles to the accurate detection of disease clusters: pareidolia, invalid assumptions, and technological determinism. Pareidolia is the perception of a pattern where none exists, and emerges from the tendency of the human eye and mind to impute meaning when it is not warranted. Invalid assumptions commonly employed in disease cluster analyses include that (1) clusters have a specific shape; (2) people are immobile; and (3) disease has no latency. These strong assumptions rarely hold and can lead to invalid conclusions. Technological determinism describes the recognition and description of problems in terms of the technology available to address them: "when one has a hammer, everything starts to look like a nail". GIS have been described as based on a "static world view", and this affects how we both represent and analyze disease clusters.

Geoffrey M. Jacquez, BioMedware

Title: Are Disease Cluster Investigations Biased Towards False Positives? The Shape of Things to Come

To date, two of the major deficiencies of geographic studies of disease clusters are that they often assume clusters have a specific shape (e.g. circle or ellipse) and do not evaluate statistical power using the geography, at-risk population, demographics, covariates and numbers of observed cases of the cancer under investigation. This presentation describes a new approach (called Cluster Morphology Analysis, CMA) designed to overcome these limitations. Power analyses are conducted for 11 clustering techniques using a suite of plausible clusters of different sizes, relative risks and shapes. The results are then ranked by statistical power and by the proportion of false positives, under the rationale that the objective of cluster-based disease surveillance should be to (1) find true clusters while (2) avoiding false clusters. CMA then synthesizes the results of those clustering methods found to have the best statistical performance. This approach is applied to pancreatic cancer mortality for white males in Michigan, and identifies a significant cluster in the Detroit metropolitan area that persists and grows from 1950 to the present day. The existence of this cluster is corroborated by SEER data, with white male pancreatic cancer in this cluster the highest out of the 17 areas comprising the SEER registries. CMA is a significant advance over clustering approaches that assume just one shape and rely on only one clustering method.

Lan Huang, NCI

Title: Developments in Scan Statistics

In spatial epidemiology and disease surveillance, methods have been developed for disease mapping, for global clustering evaluation, and for cluster detection. All of these methods are intended to help in understanding the spatial distribution of disease risk. Scan statistics are classified as methods for cluster detection. We will briefly describe the basic concepts in constructing a spatial scan statistic, the properties and possible applications of spatial scan statistics, and comparisons of scan statistics with other methods for cluster detection. We will examine the path of development of spatial scan statistics: from one dimension to multiple dimensions, from the simple Bernoulli and Poisson models to more complicated models, and from circular scan windows to flexible scan windows, along with other possible new directions. Motivations and application examples for the methods discussed are also presented. Finally, we will discuss expanding the applications of scan statistics in disease monitoring and epidemiological studies, and possible applications beyond disease surveillance.
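
As an illustration of the basic construction mentioned above, the following is a minimal sketch of a circular spatial scan statistic under a Poisson model. It is not taken from the talk: the function names (`poisson_llr`, `circular_scan`), the `(x, y, cases, population)` input layout, and the 50% population cap on window size are assumptions made for this example. The log likelihood ratio is the commonly used Poisson form, scored only when a window has more cases than expected.

```python
import math

def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected cases,
    out of `total` cases overall. Windows without an excess score 0."""
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # When every case falls inside the window, the outside term vanishes.
    outside = (total - c) * math.log((total - c) / (total - e)) if total > c else 0.0
    return inside + outside

def circular_scan(regions, max_fraction=0.5):
    """regions: list of (x, y, cases, population) tuples (hypothetical layout).
    Grows a circular window around each region centroid and returns
    (best_llr, member_indices) for the highest-scoring window."""
    total_cases = sum(r[2] for r in regions)
    total_pop = sum(r[3] for r in regions)
    best_llr, best_members = 0.0, []
    for cx, cy, _, _ in regions:
        # Visit regions in order of distance from the current window centre.
        order = sorted(
            range(len(regions)),
            key=lambda i: math.hypot(regions[i][0] - cx, regions[i][1] - cy),
        )
        cases = pop = 0
        members = []
        for i in order:
            cases += regions[i][2]
            pop += regions[i][3]
            members.append(i)
            if pop > max_fraction * total_pop:
                break  # window too large; stop growing
            expected = total_cases * pop / total_pop
            llr = poisson_llr(cases, expected, total_cases)
            if llr > best_llr:
                best_llr, best_members = llr, list(members)
    return best_llr, best_members
```

In practice the significance of the highest-scoring window is usually assessed by Monte Carlo replication under the null hypothesis, a step this sketch omits.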

Andrew Lawson, University of South Carolina

Title: Disease Cluster Detection Methods: an overview

Methods for the detection of disease clusters have developed considerably over the last two decades. The methods fall largely into two broad categories: testing methods and modeling methods. In this talk I will review the pros and cons of each category and highlight some newer approaches that might be useful. I will stress the flexibility of modeling, in particular Bayesian modeling, and the possibility of using some simple posterior summaries to get a range of information out of models: relative risk estimates, hot spot clustering and residual diagnostics.

Richard J.Q. McNally, Newcastle University, UK

Title: Cluster Detection by Active Assessment of Regional or National Incidence Data

Anecdotal reports from the UK had indicated that there might be localized excesses in the incidence of certain childhood cancers (most notably childhood leukaemia). Subsequently, a number of studies have systematically investigated clustering. Both spatial and space-time clustering have been analysed using regional and national data from the UK. These studies have found evidence of overall clustering effects that were limited to specific diagnostic groups. We interpret these findings together with other epidemiological evidence. For childhood leukaemia the clustering patterns are consistent with the involvement of transient exposures (such as infections) in aetiology.

Richard J.Q. McNally, Newcastle University, UK

Title: Methods for Analysing Global Clustering of Disease

The recent use of two methods for studying overall clustering of disease is described: (1) the Potthoff-Whittinghill method for analysing spatial clustering; (2) a method based on K-functions for analysing space-time clustering.
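
For readers unfamiliar with the first method, here is a minimal sketch of the Potthoff-Whittinghill statistic in its usual textbook form, together with its expected value under the null hypothesis of homogeneous risk. It is an illustration only, not the implementation used in the studies described; the function names and the example counts are hypothetical.

```python
def potthoff_whittinghill(observed, expected):
    """Statistic T = E+ * sum_i O_i * (O_i - 1) / E_i, where O_i and E_i are
    the observed and expected case counts in area i and E+ = sum_i E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    e_total = sum(expected)
    return e_total * sum(o * (o - 1) / e for o, e in zip(observed, expected))

def null_mean(observed):
    """Expected value of T when cases fall in areas in proportion to E_i:
    O+ * (O+ - 1), with O+ the total number of observed cases."""
    o_total = sum(observed)
    return o_total * (o_total - 1)

# Hypothetical counts for three areas.
obs = [2, 3, 5]
exp = [3.0, 3.0, 4.0]
t_stat = potthoff_whittinghill(obs, exp)
t_null = null_mean(obs)
```

Values of the statistic well above its null mean suggest extra-Poisson variation, that is, a general tendency of cases to cluster; a formal test compares the statistic with its null distribution, which this sketch does not compute.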

Allison Shevock, Division of Public Health, Delaware Health and Social Services

Title: Recent Cancer Clusters Discovered in Delaware

A recent report published by the Delaware Division of Public Health (DPH) computed all-site, breast, colorectal, lung, and prostate cancer incidence rates at the sub-county, or Census County Division (CCD), level. Average annual age-adjusted incidence rates were calculated using five years of data (2000-2004). Results showed that eight of the 27 CCDs in Delaware had significantly elevated incidence rates for one or more cancer types compared to the state as a whole. This presentation focuses on report findings, Delawareans' reactions to the report, and how DPH is addressing citizens' concerns and desire for more cancer-related information.
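
The abstract does not say how the rates were adjusted. As a general illustration, the sketch below computes an age-adjusted rate by direct standardization, which is the most common approach. The function name, the two age groups, the counts, and the weights are all invented for the example and are not DPH data.

```python
def age_adjusted_rate(cases, person_years, std_weights, per=100_000):
    """Directly standardized rate: the weighted sum of age-specific rates,
    with weights taken from a standard population.

    cases, person_years, std_weights: one entry per age group.
    Weights are normalized here, so they need not sum to 1.
    """
    if not (len(cases) == len(person_years) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(std_weights)
    rate = sum(
        (w / total_weight) * (c / py)
        for c, py, w in zip(cases, person_years, std_weights)
    )
    return per * rate

# Invented example: two age groups, cases pooled over a multi-year period.
rate = age_adjusted_rate(
    cases=[10, 40],
    person_years=[50_000, 20_000],
    std_weights=[0.7, 0.3],
)
```

Using person-years pooled over the whole period as the denominator yields an average annual rate, which is how multi-year data such as a 2000-2004 window are typically summarized.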

Dan Wartenberg, UMDNJ

Title: Parsing Cluster Activities: Response, Surveillance and Etiology

One of the major challenges of responding to concerns about possible disease clusters is the development of strategies of investigation. Quataert et al. suggest that rather than viewing such cluster concerns as reports necessitating "crisis management," one ought to consider three complementary strategies: cluster response, proactive monitoring of clustering, and etiologic cluster research. These correspond to different public health contexts: public health action, public health surveillance and public health research. This presentation will review some of the history of strategies for responding to cluster concerns and provide some ideas for managing the concerns.

Dan Wartenberg, UMDNJ

Title: Using P-Values for Making Cluster Investigation Decisions: Appropriate Prioritization or Misrepresentation?

One of the more challenging issues in evaluating data on reported disease clusters is assessing the likelihood that the observation is indicative of variation greater than would be expected from random disease processes. This issue is often framed as a concern about false positive reports and leads some to suggest the need to specify a more extreme probability value than the traditional p<0.05, sometimes including adjustments for multiple comparisons. Implicit in this view is the assumption that many potential cluster situations are evaluated implicitly by cases not being observed, and that this is what necessitates the adjustments for multiple comparisons. An alternative view is that cluster observation and reporting are biased, that some clusters, with known etiology, do not get reported as clusters, and thus result in false negatives. This presentation will provide examples and suggest how surveillance and monitoring might be used to address some of these concerns.
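
To make the multiple-comparisons concern concrete, the following sketch shows how the chance of at least one false positive grows with the number of evaluations, and the Bonferroni threshold that is one common adjustment. The talk does not specify which adjustment it has in mind; the function names, the choice of Bonferroni, and the assumption of independent tests are made here for illustration only.

```python
def familywise_error(alpha, m):
    """Probability of at least one false positive among m independent tests,
    each carried out at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** m

def bonferroni_threshold(alpha, m):
    """Per-test significance level that keeps the family-wise error rate
    at or below alpha across m tests."""
    return alpha / m

# With 20 implicit evaluations at p < 0.05, a false positive is more likely
# than not; the Bonferroni-adjusted per-test threshold is correspondingly stricter.
fwer = familywise_error(0.05, 20)
threshold = bonferroni_threshold(0.05, 20)
```

The difficulty raised in the abstract is that, for reported clusters, the number of implicit evaluations `m` is generally unknown, so any such adjustment rests on an assumed value.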

Document last modified on May 1, 2008.