DIMACS Working Group on Data Mining and Epidemiology

Dates: First meeting May 22, 2003
DIMACS Center, CoRE Building, Rutgers University

Ilya Muchnik, Rutgers University, muchnik@dimacs.rutgers.edu
S. Muthukrishnan, AT&T Labs - Research and Rutgers University, muthu@cs.rutgers.edu
David Ozonoff, Boston University, dozonoff@bu.edu
Presented under the auspices of the Special Focus on Computational and Mathematical Epidemiology.


Don Hoover, Rutgers University

Title: Issues in epidemiological analysis of complex data

For over 40 years, multivariate linear, logistic, and survival regression models have identified causal epidemiological associations, with notable successes. Yet such models have also produced conflicting findings, as well as findings that were not confirmed in clinical trials. It is well known that multivariate model building is vulnerable to unmeasured variables, collinearity, multiple comparisons, and other issues. Still, with today's computational resources, as data sets become larger and the association patterns studied more complex, the potential for such problems increases. This talk illustrates three recent examples where the data suggested that: 1) the true association pattern was too complex to be modeled with available data, 2) a major unanticipated effect was identified in an ancillary analysis, and 3) multivariate adjustment distorted rather than resolved a causal association. While analytical methods may or may not exist to deal with settings such as these, as analyses become more complex, the chances increase that such issues will go unidentified.

David Madigan, Rutgers University

Title: Analysis of hospital discharge data

The availability of large-scale hospital inpatient data has prompted the development of websites and reports that include "league tables" of hospitals and healthcare providers. These league tables consider issues such as hospital mortality, readmission rates, and length of stay. The usefulness of these rankings depends critically on appropriate "risk adjustment" that accounts for pre-existing patient risk factors. This talk will look at one such website, for the State of Pennsylvania (http://www.phc4.org), and examine its analytic methods.
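
As a rough illustration of why risk adjustment matters for league tables, the sketch below compares a crude mortality rate with an observed-to-expected (O/E) ratio, one common form of indirect standardization. It is not taken from the talk and is not a description of the methods used by the website above; the hospitals, outcomes, predicted risks, and function names are all hypothetical. In practice the predicted risks would come from a model fitted on pre-existing patient risk factors.

```python
# Hypothetical sketch: crude mortality versus an observed-to-expected (O/E) ratio.
# Each patient is a pair (died, predicted_risk); predicted_risk is assumed to come
# from some risk model based on pre-existing patient factors (not shown here).

def crude_rate(patients):
    """Fraction of patients who died, ignoring patient risk."""
    return sum(died for died, _ in patients) / len(patients)

def observed_to_expected(patients):
    """Observed deaths divided by the deaths expected from the predicted risks."""
    observed = sum(died for died, _ in patients)
    expected = sum(risk for _, risk in patients)
    return observed / expected

# Hospital A treats high-risk patients; hospital B treats low-risk patients.
hospital_a = [(1, 0.5), (1, 0.5), (0, 0.5), (0, 0.5)]
hospital_b = [(1, 0.125), (0, 0.125), (0, 0.125), (0, 0.125)]

for name, patients in [("A", hospital_a), ("B", hospital_b)]:
    print(name, crude_rate(patients), observed_to_expected(patients))
```

With these made-up numbers, hospital A has the higher crude mortality but an O/E ratio of 1, while hospital B has the lower crude mortality but twice as many deaths as expected, so the ranking reverses once patient risk is taken into account.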

Alex Pogel, New Mexico State University

Title: Lattice representation of data sets

The talk will focus on the expressivity of an exploratory data analysis method based on lattice theory, called Formal Concept Analysis. First, the basic notions surrounding the method will be presented. Then we will discuss some additional enrichments and insights we have found upon applying this method, along with points requiring further development.
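
For readers unfamiliar with the basic notions of Formal Concept Analysis, the following minimal sketch enumerates the formal concepts of a tiny binary data set. A formal concept is a pair (extent, intent) in which the extent is exactly the set of objects sharing every attribute in the intent, and the intent is exactly the set of attributes shared by every object in the extent. The toy context (patients and symptoms) and the function names are hypothetical and are not taken from the talk; the brute-force enumeration over attribute subsets is only suitable for very small examples.

```python
from itertools import combinations

# Hypothetical toy context: each object (a patient) maps to the attributes it has.
context = {
    "p1": {"fever", "cough"},
    "p2": {"fever", "rash"},
    "p3": {"fever", "cough", "rash"},
}

def extent(attrs, context):
    """Objects that have every attribute in attrs."""
    return frozenset(obj for obj, has in context.items() if attrs <= has)

def intent(objs, context, all_attrs):
    """Attributes shared by every object in objs."""
    shared = set(all_attrs)
    for obj in objs:
        shared &= context[obj]
    return frozenset(shared)

def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every attribute subset."""
    all_attrs = frozenset().union(*context.values())
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), size):
            objs = extent(frozenset(subset), context)
            concepts.add((objs, intent(objs, context, all_attrs)))
    return concepts

concepts = formal_concepts(context)
# List concepts from the largest extent (top of the lattice) to the smallest.
for objs, attrs in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(objs), sorted(attrs))
```

Ordering the concepts by inclusion of their extents gives the concept lattice; in this toy context the top concept contains all three patients with the single shared attribute "fever".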

Document last modified on May 19, 2003.