DIMACS Working Group on Adverse Event/Disease Reporting, Surveillance, and Analysis

October 16 - 18, 2002
DIMACS Center, CoRE Building, Rutgers University

Donald Hoover, Rutgers, Statistics, drhoover@stat.rutgers.edu
David Madigan, Rutgers, Statistics, madigan@stat.rutgers.edu
Henry Rolka, CDC, hrr2@cdc.gov
Presented under the auspices of the Special Focus on Computational and Mathematical Epidemiology. Co-sponsored by the American Statistical Association, Section on Statistics in Epidemiology.

DIMACS Subgroup on Adverse Event/Disease Reporting, Surveillance, and Analysis

DIMACS Working Group on Adverse Event/Disease Reporting, Surveillance, and Analysis II


Dan Sosin, CDC

Overview of Uses for Public Health Surveillance

Abstract: Public health surveillance is the ongoing, systematic collection, analysis, and interpretation of health-related data essential to the planning, implementation, and evaluation of public health practice. This presentation will address how public health surveillance supports assessment by estimating the magnitude of public health problems, describing the natural history of a disease, determining the distribution and spread of illness, detecting outbreaks, stimulating research, evaluating public health practice, monitoring changes in disease agents, detecting changes in health practices, and facilitating planning. With increased interest for public health surveillance to detect outbreaks, non-traditional data sources are being explored and performance priorities for surveillance systems are being modified. This presentation will highlight changing demands on public health surveillance and current research needs.

Bio: Dr. Daniel M. Sosin is the Director of the Division of Public Health Surveillance and Informatics in the Epidemiology Program Office (EPO), at the Centers for Disease Control and Prevention (CDC) in Atlanta, Georgia.

He began his career at CDC in 1986 as an Epidemic Intelligence Service (EIS) Officer in the U.S. Public Health Service assigned to the Kentucky Department for Health Services in Frankfort, Kentucky. He served as a CDC Preventive Medicine Resident in the Division of Injury Epidemiology and Control in 1988-89. He supervised state-based EIS Officers as a Section Chief in EPO from 1989-1994 and then returned to the National Center for Injury Prevention and Control (NCIPC) to study traumatic brain injury (TBI) and develop longitudinal surveillance of TBI. Dr. Sosin's scientific investigations include numerous epidemiologic investigations of TBI, including those resulting from motorcycle and bicycle crashes, adolescent risk behaviors, and a range of outbreak investigations. He later served as the Associate Director for Science in NCIPC where he coordinated national injury surveillance and extramural research activities, spearheading their new research agenda and a variety of research policies.

Dr. Sosin is board certified in preventive medicine and internal medicine. He received his B.S. in biology from the University of Michigan; his M.D. from Yale University School of Medicine; and his M.P.H. in Epidemiology from the University of Washington School of Public Health.

As Director, Dr. Sosin has responsibilities in planning, managing and evaluating Division programs of National Electronic Telecommunications System for Surveillance, the National Notifiable Disease Surveillance System, 122 Cities Mortality Reporting System, the Public Health Informatics Fellowship Program, the Assessment Initiative, CDC WONDER and web-access to CDC data sets, Epi Info, and the Medical Examiner and Coroner Information Sharing Program. He also serves as a senior advisor for surveillance policy, research, and program direction. Dr. Sosin currently serves as a Clinical Assistant Professor of Medicine at Emory University. He serves as a scientific reviewer for various medical journals. Professional memberships include the American College of Physicians (Fellow), the American College of Preventive Medicine, the American Medical Association, and the Commissioned Officers Association - U.S. Public Health Service.

David Walker, CDC

Overview of the Information Systems Currently Available to Public Health Researchers

Abstract: There are a large number of data sources available for public health surveillance and research. Active and passive health surveillance systems allow researchers to monitor the incidence of specific diseases or preventative measures on a nationwide scale, while health-related administrative databases from health providers or commercial entities can provide monitoring of a broad range of health issues within a specific population. Information from these varying data systems may be integrated to provide researchers with better signal detection, disease monitoring, and analytical capabilities than single data systems can provide. This presentation will provide a broad overview of the various types of reporting systems available to public health researchers, and discuss generally how information from differing systems may be integrated for research purposes.

Bio: David Walker holds an MPH in Biostatistics from the University of Oklahoma. He is a Public Health Analyst with the Centers for Disease Control and Prevention, currently serving as the Deputy Branch Chief of the Statistical Analysis Branch of the National Immunization Program. His primary area of focus is the data management of large, national data systems employed to monitor vaccine preventable disease and vaccine safety issues. He has served as the Acting Activity Chief of the Systems Operations and Design Activity of NIP, and has been involved in the NEDSS Operational Workgroup for three years.

Jana Asher, CMU

An Introduction to Multiple Systems Estimation for Estimating a Count of Adverse Events

Abstract: This talk will provide an overview of multiple systems estimation (i.e., multiple capture-recapture) by presenting the basic capture-recapture model, outlining the assumptions of this model, and discussing different modeling techniques for data for which these assumptions are inappropriate. A portion of Exhibit 67 in the trial of Slobodan Milosevic, mainly the analysis of ethnic Albanian deaths in Kosovo between March and June of 1999, will be used as an example of these modeling techniques.
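
For readers unfamiliar with the basic capture-recapture model mentioned in the abstract, the following is a minimal sketch of the textbook two-list estimator (Lincoln-Petersen) and Chapman's bias-corrected variant. It is not taken from the talk or from the Kosovo analysis. It assumes a closed population, two independent lists, equal chance of being listed, and perfect record matching; the function names and the counts are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(n1, n2, m):
    """Two-list estimate of the total number of events.

    n1: records on list 1, n2: records on list 2,
    m:  records matched on both lists.
    Assumes independent lists and perfect matching (illustrative only).
    """
    if m <= 0:
        raise ValueError("need at least one matched record")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's bias-corrected variant; defined even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts, not data from any study cited in this program.
n1, n2, m = 200, 150, 60
estimate = lincoln_petersen(n1, n2, m)
corrected = chapman(n1, n2, m)
print(f"Lincoln-Petersen: {estimate:.1f}, Chapman: {corrected:.1f}")
```

When the independence or equal-catchability assumptions fail, which is the situation the abstract says the talk addresses, these simple estimators can be badly biased and models over three or more lists are typically used instead.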

Bio: Jana Asher received a Master's Degree in Statistics from Carnegie Mellon University in 1999 and worked at the U.S. Census Bureau for the next 18 months in the Small Area Income and Poverty Estimates Program and the Planning, Research, and Evaluation Division. She returned to Carnegie Mellon in August of 2000 to pursue a Ph.D. in Statistics under the guidance of Steve Fienberg. Since then, she has worked on several projects for the Science and Human Rights Program of the American Association for the Advancement of Science, and she was recently a co-author of a study that was presented as evidence by the prosecution in the trial of Slobodan Milosevic at The Hague. Jana was honored over the summer with a Special Achievement Award from the Social Statistics Section of the American Statistical Association for her contributions to the field, and she received the Edward C. Bryant Scholarship in August of this year for outstanding work as a survey statistics graduate student.

Miles Braun, FDA

Periodic Reporting of Post-licensure Safety Data to FDA by Pharmaceutical Firms: Relation to "Datamining"

Bio: Dr. Miles Braun is the Director of the Division of Epidemiology at the Center for Biologics Evaluation and Research. He has been with CBER for seven years. The mission of the Division of Epidemiology is to rapidly detect and rigorously research safety problems for licensed biologic products and to facilitate regulatory, risk communication and risk management strategies to mitigate these problems. Before coming to FDA, Dr. Braun was Senior Research Investigator in the Epidemiology and Biostatistics [intramural] Program at the National Cancer Institute, NIH. He is a graduate of the Masters in Public Health Program at Johns Hopkins and of the Preventive Medicine Residency Program at the Centers for Disease Control, where he also served as an Epidemic Intelligence Service Officer. He has worked in Public Health at the local, state, federal and international levels. He has authored and co-authored more than 60 scientific publications. He currently represents FDA on post-marketing issues within the International Conference on Harmonization (ICH) and on the MedDRA Management Board. He has served as a member of subcommittees of the Advisory Committee on Immunization Practices of the Centers for Disease Control, and has presented to that body on the safety of the anthrax vaccine. He is also a member of the Board of Directors of the International Society of Pharmacoepidemiology and serves as FDA liaison and steering committee member for the Agency for Healthcare Research and Quality's Centers for Education and Research on Therapeutics.

Meade Morgan, CDC

Some Problems and Challenges with Our Current Reporting Systems, Data Sources and Approaches: Lessons from the HIV/AIDS Epidemic

Bio: Dr. Meade Morgan began his career in 1979 working for the CDC Hospital Infections Program as a statistician for the national Study of the Efficacy of Nosocomial Infection Control (SENIC) project. Several years later he became involved with AIDS research as a consultant on the statistical analyses of data related to the epidemiologic investigations of the initial cluster of reported cases. In 1984 he was selected as the Chief of the newly formed Statistics and Data Management Branch in the AIDS Program within the CDC Division of Viral Diseases. During his 16-year tenure with the AIDS Program, which eventually became the Division of HIV/AIDS Prevention, he developed the computerized information system that is still in use by state and local health departments for surveillance of HIV and AIDS infection in the United States. In 1986 he published one of the first statistical models projecting the future course of the AIDS epidemic.

Dr. Morgan has supported international efforts to develop robust and reliable information systems for HIV and AIDS surveillance and research for over a decade. Most recently he has been working with the GAP staff in India to create a computerized information system for the care of HIV and TB patients in the Government Hospital of Thoracic Medicine, near the city of Chennai.

Dr. Morgan received his bachelor of science degree in mathematics and his doctorate in biometry and statistics from Emory University in Atlanta.

Farzad Mostashari, Assistant Commissioner, Division of Epidemiology, NYC DOHMH

Syndromic Surveillance - The New York City Experience, with some Lessons from the National Syndromic Surveillance Conference

Bio: Dr. Mostashari's area of expertise is non-traditional disease surveillance and outbreak detection. He did his graduate training at the Harvard School of Public Health and Yale Medical School, internal medicine residency at Mass General Hospital, and completed the CDC's Epidemic Intelligence Service. He was a lead investigator in the outbreaks of West Nile Virus and anthrax in NYC. He is a fellow of the New York Academy of Medicine's Center for Urban Epidemiologic Studies, a Clinical Assistant Professor at Weill Cornell Medical College, and an Assistant Commissioner at the NYC Department of Health. He served as Chair of the 2002 National Syndromic Surveillance Conference.

Luis Kun, Consultant

The "Homeland Security - Public Health" Challenge: Connecting and Integrating the Sources and the Players with the Applications

Abstract: "All phases of counterterrorism efforts require that large amounts of information from many sources be acquired, integrated, and interpreted. Given the range of data sources and data types, the volume of information each source provides, and the difficulty of analyzing partial information from single sources, the timely and insightful use of these inputs is very difficult. Thus, information fusion and management techniques promise to play a central role in the future prevention, detection, and remediation of terrorist acts. Unlike some other sectors of national importance, information technology is a sector in which the federal government has little leverage." [From: Making the Nation Safer: The Role of Science and Technology in Countering Terrorism (Summer 2002), Committee on Science and Technology for Countering Terrorism, National Research Council]

Complex and Interdependent Systems: A systems approach is especially necessary for understanding the potential impacts of multiple attacks occurring simultaneously, such as a chemical attack combined with a cyberattack on first responder communications and designed to increase confusion and interfere with the response. The required range of expertise is very broad. Information about threats must come from communities knowledgeable about chemical, biological, nuclear weapons, and information warfare, while vulnerability analysis will depend on information about critical infrastructures such as the electric-power grid, telecommunications, gas and oil, banking and finance, transportation, water supply, public health services, emergency services, and other major systems. Currently, there is a large volume of information collected and analyzed by the U.S. intelligence community and in industry that is relevant to assessing terrorist threats and system vulnerabilities. However, to maximize the usefulness of these data and increase the ability to cross-reference and analyze them efficiently, counterterrorism-related databases will have to be identified and metadata standards for integrating diverse sets of data established.

Bio: Dr. Kun is an information technology consultant in the healthcare, public health and scientific computing arenas. He graduated from the Merchant Marine Academy in Uruguay, and holds BSEE, MSEE and Ph.D. in Biomedical Engineering degrees, all from UCLA. He is the IEEE-USA MTPC, Chairman of the Bioterrorism WG currently engaged in advising (on IT Infrastructure for Terrorism, on E-Government, and Homeland / Cybersecurity) the US Congress and the Executive Branch. He has an extensive background in Medical Informatics, which includes 14 years with IBM; Director of Medical Systems Technology and Strategic Planning at Cedars-Sinai Medical Center in LA; Senior Information Technology (IT) Advisor for the Agency for Health Care Policy and Research (AHCPR) and a Distinguished Fellow at the CDC (Senior Computer Scientist for the Health Alert Network and later the Acting Chief Information Technology Officer (CITO) for the National Immunization Program (NIP)). As an adjunct faculty member in the Department of Biostatistics at the Rollins School of Public Health at Emory University he wrote the syllabus and taught in the new curricula of Public Health Informatics the following courses: Database Management Systems (Fall 2001) and Artificial Intelligence (Spring 2002).

From 1991 he was a Senior Scientist and an Adjunct Professor of Internal Medicine at UTMB Galveston, School of Medicine, and since 1997 a Research Professor of Medical Informatics and Information Technology at CIMIC / Rutgers University in NJ where he is also a member of the Advisory Board. He is on the Advisory Board for Children's Hospital at Harvard/MIT on their Bioterrorism efforts and is a member of AMIA's Bioterrorism Response Team and Public Policy Committee, and Chair of the Telehealth SIG. In the past 20 years Dr. Kun has written a large number of articles and has lectured on medical and public health informatics, IT and biomedical engineering in over 50 countries. Dr. Kun is an elected Fellow of the American Institute for Medical and Biological Engineering (AIMBE) and has been an invited speaker for the World Bank, the Pan-American Health Organization, the World Health Organization, the European Investment Bank and the Inter-American Development Bank.

Some highlights include: Formulated the IT vision for AHCPR (1996-97-98); Lead staff for HPCC program and Telehealth (Chair Security, Privacy and Confidentiality WG). DHHS Security of Health Data/Communications team for HIPAA 1996. Co-author of the Reports to the Congress on Telemedicine and on HIPAA Security. July of 1997 invited speaker to the White House. Represented DHHS Secretary Shalala at a Pan American Forum of Health Care Ministers in Mexico 1997. As Acting CITO, formulated the future IT vision for the NIP at the CDC (10/2000). 1987-1993: IEEE Health Care Engineering Policy Committee (HCEPC) Chairman of the Electronic Medical Record (EMR) and High Performance Computers and Communications (HPCC) Subcommittee. Chosen in 1988 to be an expert witness to Congress on the area of HPCC.

Latanya Sweeney, CMU

Overview: use of non-reported/administrative data for surveillance (ER data, pharmacy data, 911 call data, Harvard Pilgrim, etc.)

Sean Hennessy, University of Pennsylvania

Pharmacoepidemiology: Goals and Methods

Abstract: Pharmacoepidemiology represents the dynamic interface between clinical pharmacology, pharmacotherapeutics, epidemiology and statistics. It is the primary scientific discipline underlying postmarketing drug surveillance (PMS). PMS is an essential enterprise in ensuring the safety of the public. This presentation will review the principal research methods used in pharmacoepidemiologic studies in the context of a unifying conceptual framework.

Bio: Sean Hennessy, PharmD, PhD is an Assistant Professor of Epidemiology and Pharmacology at the University of Pennsylvania School of Medicine. Dr. Hennessy's primary field of interest is pharmacoepidemiology, which is the study of the use and effects of medications in populations. Dr. Hennessy's research in this area has been funded by the Agency for Healthcare Research and Quality, the National Institutes of Health, pharmaceutical companies, and private foundations. Examples of recently completed studies include venous thromboembolism associated with 3rd generation oral contraceptives, cardiac arrest associated with QT-prolonging antipsychotic drugs, and the effectiveness of drug utilization review programs. In addition to his research, Dr. Hennessy teaches clinical epidemiology to medical and graduate students, and directs a clinical program at Penn designed to improve outpatient medication use.

Robert O'Neill, FDA

FDA's Adverse Event Reporting System and the Use of Quantitative Methods for Screening

Abstract: One of the major sources of information on adverse events associated with marketed medical products is FDA's adverse event reporting system. This system collects several hundred thousand reports per year according to a structured format, though the quality of the data often cannot be enforced as well as one would like. This talk will describe the system, and various quantitative approaches to evaluating data, associations, signals, suspicions, etc. that were developed or considered over the years. Recently, data mining strategies have been applied to the database and some of these applications will be discussed as well as a perspective on future enhancements [Anello and O'Neill, Chapter 'Postmarketing Surveillance of New Drugs and Assessment of Risk', Encyclopedia of Biostatistics, p 353-361, John Wiley, 2000].

Bio: Dr. O'Neill is the Director of the Office of Biostatistics (OB) in the Center for Drug Evaluation and Research (CDER), Food and Drug Administration. The OB comprises approximately 90 staff members, and consists of three Divisions of Biometrics and the Quantitative Methods and Research Staff. The Office provides biostatistical and scientific computational support to all programs of CDER, including the pre-market application review of clinical trials, of pre-clinical animal carcinogenicity studies, of chemistry stability /expiration setting studies, of experimental and observational studies in post marketing safety assessment, of generic drug bioequivalence/bioavailability studies; and IND advice on the design and methodological issues associated with analysis of clinical trials, and mathematical modeling in pharmacokinetics and pharmacodynamic studies. Prior to October, 1998 he was Director of the Office of Epidemiology and Biostatistics responsible for the post-market safety surveillance of new drugs which deals with the receipt and analysis of adverse event reports from health providers and the epidemiology program which evaluates industry safety data and studies, and reviews and designs observational studies to follow-up and evaluate drug safety concerns.

Dr. O'Neill holds an A.B. degree in mathematics from the College of the Holy Cross, and a Ph.D. in mathematical statistics and biometry from Catholic University of America and began his FDA career in the Division of Biometrics in 1971 as a statistical reviewer of New Drug Applications in the Bureau of Drugs. He has held successively more responsible positions in the former Division of Biometrics, including Group Leader, Branch Chief, Deputy Director, and Director, a position he held for ten years before assuming his role as Office Director.

In 1989-1990, Dr. O'Neill held a visiting professorship at the Department of Research, University Medical School, Basel, Switzerland where he developed and presented numerous lectures and created a course series titled Topics in Therapy Evaluation and Review (TITER) for European pharmaceutical scientists. This course became the basis for the degree granting program European Course in Pharmaceutical Medicine, a joint consortium between the University of Basel, University of Freiburg, Germany and University of Strasbourg, France.

He has published articles in the biostatistical, epidemiology and medical literature, is a fellow of the American Statistical Association, a member of several professional societies and a past Member of the Board of Directors of the Society for Clinical Trials.

Coleen Boyle, CDC

Surveillance methods for birth defects and developmental disabilities: understanding the limitations of different approaches

Bio: Coleen A. Boyle, Ph.D., M.S. - Associate Director for Science and Public Health, National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA

Dr. Boyle received her MS in biostatistics and PhD in epidemiology from the University of Pittsburgh School of Public Health and completed postdoctoral training in epidemiologic methods at Yale University. Dr. Boyle joined the Division of Birth Defects and Developmental Disabilities in 1988, first as Section Chief and later as Branch Chief and Division Director. Her interests and expertise are in the epidemiology and prevention of developmental disabilities, including mental retardation, cerebral palsy, sensory impairments and autism. Dr. Boyle has recently been appointed as the Associate Director for Science and Public Health for the newly formed National Center on Birth Defects and Developmental Disabilities. She is the recipient of the CDC Charles C. Shepard Award for scientific excellence and has authored or coauthored more than 70 scientific publications.

Joseph Reid, CDC

NEDSS in detail

Abstract: The CDC's National Electronic Disease Surveillance System (NEDSS) is a nation-wide, open-systems technology, standards-based, distributed architecture for the collection, analysis and management of disease surveillance data. NEDSS has a critical role to play in the nation's public health monitoring process in general and in the monitoring for potential bioterrorism events. NEDSS will be described from the perspectives of mission objectives, strategic information architecture and tactical software engineering.

Bio: Dr. Joseph A. Reid is the Associate Director for Science in the Information Resources Management Office of the CDC. He has broad responsibilities in the design and implementation of agency-level IT solutions at the CDC. Dr. Reid is currently leading the technical implementation of the NEDSS system. His other responsibilities include enterprise information security and IT contract management.

Gil Delgado, Emergint

Lab data? Clinical data? Over-the-counter data?

Abstract: Gil will discuss the processes and methods, as well as the challenges, barriers and possibilities for clinical data collection, integration, and normalization for use as a tool for disease surveillance, bioterrorism detection, and adverse events reporting. Following the discussion, Gil and Tim Ellis will provide a live demonstration of the Health Data System designed specifically for clinical data collection, normalization and de-identification.

Bio: Gil Delgado, Emergint's founder and CEO, has led teams of professionals that have developed nationally recognized disease surveillance and research systems for universities, the U.S. Department of Defense and the Centers for Disease Control and Prevention. He founded Emergint specifically to provide the tools, technologies and services to support better clinical research and disease surveillance. Prior to founding Emergint, Gil was instrumental in building several companies and divisions, all supporting healthcare research.

Thomas Balzer, Quintiles/Verispan

Using Electronic Healthcare Reimbursement Claims to Detect Local, Regional, and National Infectious Disease Outbreaks

(Joel R. Greenspan, MD; Thomas Balzer, PhD)


Background: Quintiles Transnational, Inc., collects a daily electronic stream of standardized, real-time, de-identified, patient-centric healthcare information from all parts of the U.S. Accumulating this data flow since 1998 has yielded the world's largest and most complete data warehouse of electronic healthcare information on >150 million unique patients.

Methods: We used this data warehouse to detect the "footprints" of three known infectious disease outbreaks investigated by CDC during 2001. The outbreaks included a shigellosis outbreak in a large Midwestern city, a sustained bacterial meningitis outbreak in a multi-county region, and a national outbreak of histoplasmosis among American college students returning from Mexico. We used both traditional approaches (specific ICD-9 codes) and non-traditional approaches (ICD-9 codes for syndromes, CPT codes for medical procedures, and NDC codes for prescriptions) to assess the signals generated in the healthcare system by these outbreaks.

Results: All three outbreaks were detectable using the Quintiles data warehouse.

The first data warehouse signal of unusual shigella activity in the target city occurred on July 11, 2001, and the local health department could have been notified as early as July 12. Using surveillance of enteric disease syndromes (enteritis, infectious diarrhea, and diarrhea) among children <10 years old yielded an initial signal that was both larger and detectable earlier (by approximately 1 week) than signals from shigella ICD-9 codes alone. Not only could syndromic surveillance have identified the known shigella outbreak earlier, but this approach also detected two additional large outbreaks of enteric illness in children in this community during 2000 and 2001 that were previously unknown to the local health department.

Quintiles detected a small cluster of meningococcal disease among persons <30 years old in NE Ohio that occurred during the time of the known outbreak. We also found a large unexpected signal generated by meningococcal vaccinations among middle and high school students in the same area. This suggests that surveillance of medical procedures can complement the search for rare events such as meningococcal meningitis.

The Quintiles data yielded a bimodal epidemic curve of histoplasmosis cases among persons ages 15 to 24 years during Spring 2001 in states considered non-endemic for histoplasmosis. This pattern matched the bimodal pattern of the known outbreak, and the location of cases matched the location of known outbreak-related cases. We also found a much larger and earlier signal generated by prescriptions for ketoconazole (a common treatment for histoplasmosis) among persons 18 to 24 years in histoplasmosis non-endemic states. This suggests that surveillance of selected prescription drugs can augment traditional case finding and outbreak detection methods, especially for multi-state outbreaks.

Conclusion: Very large convenience samples of standardized electronic healthcare claims data can augment traditional case finding and outbreak detection methods for state and local health departments. Health departments can use these data to greatly expand the scope of their surveillance efforts for bioterrorism, other infectious disease outbreaks, and other public health threats and emergencies. Further evaluation of this approach at the local health department level is necessary to document the usefulness of electronic healthcare claims for public health purposes.

Bio: Dr. Balzer is the Senior Vice President and Chief Scientific Officer of Verispan, LLC, a new informatics joint venture of Quintiles Transnational and McKesson Corporation. He came to that role from Quintiles when the Scott-Levin, SMG, Synergy, and Amaxis business units were merged with McKesson's Kelly-Waldron unit to form Verispan. At Quintiles Informatics, he had responsibility for Sales, Marketing, and Custom Solutions for clients in the pharmaceutical and medical/surgical segments. Prior to joining Quintiles in 2001, Dr. Balzer was the Senior VP, Pharmaceutical Services of NDC Health, a leading provider of healthcare information to the pharmaceutical industry. From 1992 to 1998 he was a Manager and Principal with ZS Associates, a global consulting firm specializing in pharmaceutical sales and marketing solutions. He led strategic projects for customers in over 25 countries. Prior to his ZS experience, he was a partner in a Salt Lake City-based consulting firm serving the airline and public utility industries. Dr. Balzer served for over 20 years in the US Army in a variety of command and staff positions throughout the world, including command of a field artillery unit in Vietnam. His final assignment was as Chief of conventional analyses and wargaming for the Department of Defense at the Pentagon and project manager for the simulations used to evaluate scenarios throughout the world. Dr. Balzer has a Ph.D. in Operations Research from the University of New South Wales (Australia), a Master of Science from the University of Southern California, an MBA from Central Michigan University, and a BA with distinction from Park College.

Lori Hutwagner, CDC

The Bioterrorism Preparedness and Response Program, Early Aberration Reporting System (EARS)

Bio: Lori Hutwagner received her master's degree from the Georgia Institute of Technology in 1989. She joined the CDC in 1990 with the National Center for Infectious Diseases where she worked on aberration detection methods for Salmonella isolates. She has recently completed work with the Epidemiology Program Office where she applied aberration detection methods to the Nationally Notifiable Disease Surveillance System. In 1999 she began working with the Bioterrorism Preparedness and Response Program on developing aberration detection methods for their national "drop in surveillance" system and has started implementing these methods in various local sites throughout the US.

Owen Devine, CDC

Bayesian Methods for Monitoring Public Health Surveillance Data

Abstract: Using Bayesian methods to interpret maps of observed measures of disease risk is becoming common, both as an approach to smooth these maps and to develop etiologic hypotheses. Bayesian methods, however, are less frequently used as a means of monitoring temporally referenced health surveillance data. In this talk, I will review the applicability of these types of approaches as tools for identification of aberrant events in spatially and/or temporally referenced health surveillance data. In particular, I will focus on the practicality of using this type of complex approach in ongoing monitoring activities, identifying appropriate loss functions, and a comparison with less complex frequentist approaches.

Bio: Owen Devine is Chief of the Statistics and Data Management Branch in CDC's Division of STD Prevention. In this capacity, he has worked extensively on the development and practical use of Bayesian and frequentist tools for monitoring health surveillance data.

Martin Kulldorff, University of Connecticut

A Tree-Based Scan Statistic for Database Disease Surveillance

Authors: Martin Kulldorff, Zixing Fang, Stephen J. Walsh

Abstract: Many databases exist by which it is possible to study the relationship between health events and various potential risk factors. Among these databases, some have variables that naturally form a hierarchical tree structure, such as pharmaceutical drugs and occupations. It is of great interest to use such databases for surveillance purposes in order to detect unsuspected relationships to disease risk. We propose a tree-based scan statistic, by which the surveillance can be conducted with a minimum of prior assumptions about the group of occupations/drugs that increase risk, and which adjusts for the multiple testing inherent in the many potential combinations. The method is illustrated using data from the National Center for Health Statistics Multiple Cause of Death Database, looking at the relationship between occupation and death from silicosis.
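
The idea in this abstract can be sketched in a few lines of Python: score every branch of the hierarchy, take the maximum score as the test statistic, and calibrate that maximum by simulation so that the many candidate branches are accounted for. This is only a simplified illustration, not the authors' implementation. The Poisson-type likelihood ratio, the Monte Carlo calibration, the restriction to single-branch cuts, and the tiny occupation tree with its counts are all assumptions made here for illustration.

```python
import math
import random


def llr(c, n, total):
    """Log likelihood ratio for a cut with c observed and n expected cases
    out of `total`; zero unless the cut shows an excess of cases."""
    if c <= n:
        return 0.0
    inside = c * math.log(c / n)
    outside = (total - c) * math.log((total - c) / (total - n)) if c < total else 0.0
    return inside + outside


def node_sums(tree, leaf_values, node):
    """Return {node: sum of leaf values beneath it} for the subtree at `node`."""
    children = tree.get(node, [])
    if not children:
        return {node: leaf_values[node]}
    sums, subtotal = {}, 0.0
    for child in children:
        sums.update(node_sums(tree, leaf_values, child))
        subtotal += sums[child]
    sums[node] = subtotal
    return sums


def max_llr(tree, root, observed, expected):
    """Find the non-root node (cut) with the largest log likelihood ratio."""
    obs = node_sums(tree, observed, root)
    exp = node_sums(tree, expected, root)
    scores = {g: llr(obs[g], exp[g], obs[root]) for g in obs if g != root}
    best = max(scores, key=scores.get)
    return best, scores[best]


def scan(tree, root, observed, expected, replicates=999, seed=0):
    """Return (best cut, its statistic, Monte Carlo p-value)."""
    rng = random.Random(seed)
    best, t_obs = max_llr(tree, root, observed, expected)
    leaves = list(observed)
    weights = [expected[leaf] for leaf in leaves]
    total = sum(observed.values())
    exceed = 0
    for _ in range(replicates):
        # Under the null, cases fall on leaves in proportion to expectation.
        sim = {leaf: 0 for leaf in leaves}
        for leaf in rng.choices(leaves, weights=weights, k=total):
            sim[leaf] += 1
        if max_llr(tree, root, sim, expected)[1] >= t_obs:
            exceed += 1
    return best, t_obs, (exceed + 1) / (replicates + 1)


# Hypothetical hierarchy and counts (expected counts sum to the observed total).
tree = {"all": ["mining", "office"],
        "mining": ["driller", "blaster"],
        "office": ["clerk", "manager"]}
observed = {"driller": 12, "blaster": 9, "clerk": 5, "manager": 4}
expected = {"driller": 5.0, "blaster": 5.0, "clerk": 10.0, "manager": 10.0}

best_cut, statistic, p_value = scan(tree, "all", observed, expected)
print(best_cut, round(statistic, 2), p_value)
```

Because the p-value is based on the maximum over all cuts, a single test covers individual occupations and their parent groups alike, which is the multiple-testing adjustment the abstract refers to.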

Phil Smith, CDC

Methods for Capture-Recapture Estimates of the Number of Pertussis Cases in New York State When Individuals Are Not Uniquely Identifiable

Authors: Philip J. Smith, PhD; Betsy Cadwell, MS; Andrew L. Baughman, PhD; Kristine M. Bisgard, MD


Introduction: With the introduction and widespread use of a whole-cell pertussis-containing vaccine, pertussis cases declined to a historic low in 1976. In the early 1980s the number of reported pertussis cases began to increase. This increase raised questions about the effectiveness of disease control programs. This paper describes statistical methods for estimating the number of pertussis hospitalizations in New York State (NYS) during the years 1992-1995 using special capture-recapture methodology. These methods accounted for the nonuniqueness of personal identifiers on administrative lists available for the analysis.

Methods: Data obtained between 1992 and 1995 from the National Electronic Telecommunications System for Surveillance (NETSS) and the Health Care Information Association (HCIA) database were used to identify individuals who had been hospitalized with pertussis. Individuals in each database were identified and matched to the other database according to their gender, state, year of illness, and birth month/birth year pair. A mathematical formula was developed for the capture-recapture estimator of the number of pertussis hospitalizations that accounted for potential nonuniqueness of individual identifiers. Evaluating this estimator directly was found to require more computational power than is currently available on a modern high-speed personal computer, so a two-stage bootstrap procedure was developed to simulate the estimator and its precision.

Results: There were 310 individuals on the NETSS database and 633 individuals on the HCIA database. On the NETSS database, there were 242 individuals who could have been matched to more than one individual on the HCIA database. On the HCIA database, there were 480 individuals who could potentially be matched to at least one person on the NETSS database. The two-stage bootstrap procedure that accounts for nonuniqueness of these matches yielded an estimate of 1,518 pertussis hospitalizations (95% percentile interval: [1,414, 1,634]) in NYS.

Conclusion: Epidemiologic and demographic applications of the capture-recapture method customarily devote considerable resources to ascertaining exact matches among cases in the available administrative lists. Often, matches in these lists are putative and are made deliberately to obtain a conservative underestimate of the size of the population of interest. Results from our study show that these ad hoc procedures are unnecessary and that reasonable estimates can be obtained that account both for uncertainty attributable to sampling variation and for uncertainty resulting from nonunique identifiers.
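
For orientation, the classical two-list calculation that underlies capture-recapture estimation can be sketched as follows. This is the textbook Lincoln-Petersen estimator together with Chapman's small-sample correction, both of which assume that every match between the two lists is known exactly. It is not the authors' estimator, which additionally handles nonunique identifiers through a two-stage bootstrap. The function names and the counts in the usage lines are hypothetical and are not taken from the study.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Classical two-list estimate of population size: N = n1 * n2 / m.

    n1, n2: number of individuals on each list
    m: number of individuals found on both lists (exact matches assumed)
    """
    if m <= 0:
        raise ValueError("at least one matched individual is required")
    return n1 * n2 / m


def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's bias-corrected variant: (n1 + 1)(n2 + 1) / (m + 1) - 1."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts for illustration only (not the NETSS/HCIA figures).
n1, n2, m = 200, 300, 50
print(lincoln_petersen(n1, n2, m))   # 1200.0
print(round(chapman(n1, n2, m), 1))  # 1185.3
```

When identifiers are not unique, m is itself uncertain, which is the gap the abstract's method is designed to address.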

Bio: Philip J. Smith is a mathematical statistician in the National Immunization Program at the Centers for Disease Control and Prevention in Atlanta, Georgia.

John Stultz, SAS

Data mining methods: Applications, problems and opportunities in the public sector

Abstract: Data mining methods continue to evolve, and the ways in which they are applied and understood vary widely. The presenter will briefly cover some public sector data access, transformation, distribution, and mining applications in the areas of disease and adverse event reporting and surveillance. Brief attention will be given to problems encountered, creative problem solving, and future opportunities, as well as efforts to standardize data mining model deployment with Predictive Model Markup Language (PMML).

Bio: John Stultz is part of the Public Sector Group of SAS, working as a Systems Engineer specializing in data mining. John has a Master of Public Health from Tulane University School of Public Health and Tropical Medicine, specializing in epidemiology. John has consulted with various health care companies, universities, and public sector entities including the Centers for Disease Control, the Centers for Medicare & Medicaid Services, the United States Army Research Institute of Environmental Medicine, the Veterans Administration, the Indian Health Service, and the University of Colorado Health Sciences Center.

Robert Ball, FDA

Datamining in the Vaccine Adverse Event Reporting System (VAERS) to Enhance Vaccine Safety Monitoring at the Food and Drug Administration (FDA)

(Robert Ball, Dale Burwen, M. Miles Braun, Center for Biologics Evaluation and Research, FDA, Rockville, MD)


Intro: VAERS is operated collaboratively by the FDA and the CDC to monitor the safety of vaccines after licensure and receives 10,000 to 14,000 reports per year. Passive surveillance systems such as VAERS are subject to many limitations, notably under-reporting and the lack of adequate denominator data to determine incidence rates. Because of these limitations, it is usually not possible to determine causal associations between vaccines and adverse events from VAERS reports. The traditional approach to signal detection involves initial manual screening of reports, followed by more in-depth review to identify unexpected patterns in age, gender, dose number, and time to onset, or substantial numbers of "positive rechallenge" reports. To improve the efficiency of signal detection, the FDA VAERS group has been exploring two datamining techniques: a Bayesian method (DuMouchel) and Proportional Reporting Ratios (PRR) (Evans).

Methods: The first stage of this exploration demonstrated that retrospective application of the Bayesian method identified intussusception as a notable adverse event after the introduction of Rotavirus vaccine. Subsequently, we have tested a user-friendly implementation of the Bayesian method and used PRR in routine surveillance work.

Results: Several key concepts have been demonstrated through exploratory analyses with these methods. Comparison of the observed patterns of vaccine-event pairs for pneumococcal, meningococcal, and influenza vaccines illustrates the ability of these methods to capture in a graphical "snapshot" the differences in reported adverse events after these vaccines. The influence of age and gender, and potentially other variables, on the magnitude of the association and ranking of vaccine-event pairs was demonstrated using anthrax vaccine. Multi-dimensional associations such as the occurrence of two or three particular symptoms following receipt of a vaccine were explored. Challenges in interpretation and operationalization were encountered.

Summary: Both methodological and practical issues have been encountered that bear on the potential usefulness of datamining. Methodological issues include proper application of each method, understanding differences between the methods and their advantages and disadvantages, and the use of datamining as an analytic tool vs. an automated screening tool. Practical issues include the need for personnel training, substantial computing resources, and integration into the usual work process. Datamining methods hold promise to improve the efficiency of vaccine safety surveillance, but numerous methodological and practical issues need to be resolved before datamining can fulfill that promise.

DuMouchel W. Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system. American Statistician 1999;53:177-190.

Evans SJW, Waller PC, Davis S. Use of proportional reporting ratios for signal generation from spontaneous adverse drug reaction reports. Pharmacoepidemiol Drug Saf 2001;10:483-486.
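
As a concrete illustration of the Proportional Reporting Ratio named in the abstract above, the sketch below computes it from a 2x2 table of report counts. It uses the standard definition, PRR = [a / (a + b)] / [c / (c + d)], with an approximate interval on the log scale. It is not the FDA group's code. The function names and the counts are hypothetical, and the interval formula is the usual one for a ratio of proportions rather than anything stated in the abstract.

```python
import math


def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio for a 2x2 table of report counts.

    a: reports of the event of interest for the vaccine of interest
    b: reports of all other events for the vaccine of interest
    c: reports of the event of interest for all other vaccines
    d: reports of all other events for all other vaccines
    """
    return (a / (a + b)) / (c / (c + d))


def prr_interval(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Approximate interval computed on the log scale (z = 1.96 gives ~95%)."""
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    point = prr(a, b, c, d)
    return point * math.exp(-z * se), point * math.exp(z * se)


# Hypothetical counts for illustration only (not VAERS data).
a, b, c, d = 20, 980, 100, 19900
low, high = prr_interval(a, b, c, d)
print(round(prr(a, b, c, d), 2))  # 4.0
print(round(low, 2), round(high, 2))
```

A PRR well above 1 indicates that the event makes up a larger share of reports for the vaccine of interest than for the comparison vaccines. Because it is built only from report counts, it inherits the under-reporting and missing-denominator limitations noted in the abstract and is a screening signal, not evidence of causation.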

Bio: Robert Ball is Chief of the Vaccine Safety Branch in the Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, FDA. Dr. Ball received his BS degree in Mathematics and MD degree from Georgetown University. He completed his Internal Medicine internship at the US Naval Hospital Bethesda and his MPH and residency in Occupational and Environmental Medicine at the Uniformed Services University of the Health Sciences. In addition, he received the ScM degree in Infectious Disease Epidemiology and Vaccine Science and Policy from Johns Hopkins School of Public Health. He is Board Certified in Public Health and General Preventive Medicine and Occupational Medicine.

Document last modified on October 11, 2002.