### DIMACS Working Group on Order-theoretic Aspects of Epidemiology

#### March 7 - 9, 2005, DIMACS Center, CoRE Building, Rutgers University

Organizers:

- David Ozonoff, Boston University, School of Public Health, dozonoff@bu.edu
- Melvin Janowitz, Rutgers University, melj@dimacs.rutgers.edu
- Fred Roberts, Rutgers University, froberts@dimacs.rutgers.edu

Presented under the auspices of the Special Focus on Computational and Mathematical Epidemiology.

Many practical epidemiological problems involve the comparison of two or more quantities. Most often the quantities are rates or proportions leading to a measure of effect or association, but they may also involve distances, exposure categories, job titles, etc. Often the actual values in question are not important, only whether one value is smaller than or larger than a second, i.e., their order. This working group will study how fundamental order-theoretic concepts of theoretical computer science (TCS) and discrete mathematics (DM), such as semiorders, interval orders, general partial orders, and lattices [Fishburn (1985), Trotter (1992)], can be used to improve the results of epidemiological investigations. We will give epidemiological concepts a careful definition in the language of partial orders and explore the use of visualization of order-theoretic concepts in epidemiologic studies. The latter will involve issues such as how best to visualize a poset through clever presentation of its Hasse diagram, an issue of great interest in the area of TCS known as graph drawing.

One application of these ideas arises in the problem of determining cutoffs or boundaries so as to determine exposure categories in epidemiology. This can be modeled by finding n attributes (age, inverse of distance to a pollution source, etc.); for each subject x, finding a number f_i(x) representing a measure of the i-th attribute; and saying that x has a higher exposure than y if f_i(x) > f_i(y) for all i. This defines a partial order that is well studied in dimension theory. Finding the exposure set, the set of all subjects x whose exposure levels f_i(x) all exceed some threshold, is a common construction in dimension theory. An interesting variant is when only a given percentage of these levels need to exceed the threshold. We shall seek algorithms for fitting this model to large data sets when the partial order and the exposure set are given, but the attributes and the number of them are not.
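
The exposure model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subject names, the attribute values, and the helper names `higher_exposure` and `exposure_set` are hypothetical, and the sketch covers the forward direction (attributes known), not the fitting problem in which the attributes must be recovered.

```python
from typing import Dict, List, Sequence

# Hypothetical scores f_1(x), ..., f_n(x) for three subjects (made-up values).
scores: Dict[str, Sequence[float]] = {
    "a": (3.0, 2.0),
    "b": (1.0, 1.0),
    "c": (2.0, 3.0),
}


def higher_exposure(fx: Sequence[float], fy: Sequence[float]) -> bool:
    """x has higher exposure than y if f_i(x) > f_i(y) for every attribute i."""
    return all(a > b for a, b in zip(fx, fy))


def exposure_set(
    scores: Dict[str, Sequence[float]],
    threshold: float,
    fraction: float = 1.0,
) -> List[str]:
    """Subjects for which at least `fraction` of the levels exceed `threshold`.

    fraction=1.0 gives the exposure set (all levels exceed the threshold);
    a smaller fraction gives the percentage variant.
    """
    selected = []
    for subject, fx in scores.items():
        exceeding = sum(1 for value in fx if value > threshold)
        if exceeding >= fraction * len(fx):
            selected.append(subject)
    return sorted(selected)


if __name__ == "__main__":
    print(higher_exposure(scores["a"], scores["b"]))  # a dominates b
    print(higher_exposure(scores["a"], scores["c"]))  # a and c are incomparable
    print(exposure_set(scores, threshold=1.5))
    print(exposure_set(scores, threshold=2.5, fraction=0.5))
```

Subjects "a" and "c" are incomparable in this toy data, which is what makes the relation a partial order rather than a ranking.
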
One promising approach to this fitting problem is to use algorithms developed by the DIMACS working group on multidimensional scaling (http://dimacs.rutgers.edu/SpecialYears/2001_Data/Algorithms/).

As another example, point lattices may be regarded as a type of order-theoretic lattice. The point lattice construction has found uses in epidemiology through visualizing the relationships of all possible contingency tables to various statistics, effect measures, and cut-off choices (see, e.g., [Ozonoff and Webster (1997)]) and has also been used in statistics (see [Narayana (1979)]). Challenges to our group in extending these ideas include generalizing the concepts to higher-dimensional tables, where there are additional attributes (ordinal, numerical, or nominal) besides case status and exposure, and applying lattice-theoretic approaches to measurement error. The point lattices formed by 2 x 2 contingency tables can be represented as n-element strings from a two-letter alphabet {x,y}. Measurement errors can be thought of as being caused by a transposition from a substring xy to a substring yx. Similar transpositions have been studied from a more general viewpoint by lattice theorists (see [Bennett and Birkhoff (1994)]) as a special case of a Newman commutativity lattice. Many references can be found in [Bennett and Birkhoff (1994)], which also mentions connections with weak Bruhat orders of Coxeter groups. We will examine how higher-dimensional contingency tables relate to what are called multinomial lattices in [Bennett and Birkhoff (1994)] and study how combinatorial aspects of the Bruhat orders relate to probabilistic questions in epidemiology.
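
The transposition view of measurement error can be made concrete with a small Python sketch. It works only on strings over {x,y} and does not assume any particular encoding of a 2 x 2 table as a string; the function names `transpose_once` and `inversions` are hypothetical. The inversion count (number of pairs with a y somewhere before an x) goes up by exactly one with each xy to yx transposition, so it can serve as a rank function on the resulting order.

```python
from typing import List


def transpose_once(s: str) -> List[str]:
    """All strings obtained from s by rewriting one substring 'xy' as 'yx'."""
    return [
        s[:i] + "yx" + s[i + 2:]
        for i in range(len(s) - 1)
        if s[i:i + 2] == "xy"
    ]


def inversions(s: str) -> int:
    """Number of pairs (i, j) with i < j, s[i] == 'y' and s[j] == 'x'."""
    count = 0
    ys_seen = 0
    for ch in s:
        if ch == "y":
            ys_seen += 1
        elif ch == "x":
            count += ys_seen
    return count


if __name__ == "__main__":
    start = "xxyy"
    for successor in transpose_once(start):
        # Each single transposition raises the inversion count by one.
        print(start, "->", successor, inversions(start), inversions(successor))
```

Repeatedly applying `transpose_once` from "xxyy" eventually reaches "yyxx", where no further transposition is possible.
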
Among other things, we hope that these lattice-theoretic considerations will give us guidance on how to decide when observed data tables can be explained by chance alone.

The group will also consider the issue of what kinds of statistical tests are legitimately applied to data where only order matters [Marcus-Roberts and Roberts (1987), Roberts (1994)]; this issue is somewhat recognized by epidemiologists, but its order-theoretic subtleties are usually not. Mathematical and computational methods dealing with ordered algebraic systems form the foundations of the modern theory of measurement [Krantz, Luce, Suppes and Tversky (1971), Luce, Krantz, Suppes and Tversky (1990), Roberts (1979), Suppes, Krantz, Luce and Tversky (1989)] and can be used to analyze this and important related issues such as what conclusions using scales of measurement are "meaningful" [Roberts (1994), Roberts (1999)]. Measurement theory (a term which has a different connotation in epidemiology) does not seem to be known to practicing epidemiologists, and we shall try to remedy that, keeping recent applications to software measurement [Fenton and Pfleeger (1997)] in mind. We plan to analyze epidemiological studies from a measurement theory point of view.

This working group is our most speculative. We will build upon a large literature in TCS dealing with order relations, computing them, approximating them, visualizing them, and assigning measures to them. We will not, however, be building upon a large body of work connecting these ideas to epidemiology; rather, we build mostly upon the view of several active epidemiologists that these ideas are relevant.
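
The question of which statistics are "meaningful" when only order matters can be illustrated with a small Python sketch. The two groups below are made-up ordinal scores, and the square root stands in for an arbitrary strictly increasing rescaling, which leaves the order of the data unchanged. In this toy example the comparison of means reverses under the rescaling, while the comparison of medians does not.

```python
import math
from statistics import mean, median
from typing import Callable, Sequence

# Hypothetical ordinal scores for two groups (illustrative values only).
group_a = [1, 2, 9]
group_b = [3, 4, 4]


def a_exceeds_b(
    stat: Callable[[Sequence[float]], float],
    a: Sequence[float],
    b: Sequence[float],
) -> bool:
    """True if the chosen summary statistic is larger for a than for b."""
    return stat(a) > stat(b)


# A strictly increasing rescaling: it preserves order but not differences.
rescaled_a = [math.sqrt(v) for v in group_a]
rescaled_b = [math.sqrt(v) for v in group_b]

if __name__ == "__main__":
    print("mean, original scale:  ", a_exceeds_b(mean, group_a, group_b))
    print("mean, rescaled:        ", a_exceeds_b(mean, rescaled_a, rescaled_b))
    print("median, original scale:", a_exceeds_b(median, group_a, group_b))
    print("median, rescaled:      ", a_exceeds_b(median, rescaled_a, rescaled_b))
```

With an odd number of observations the median is itself one of the data values, which is why its comparison survives any order-preserving rescaling here.
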

### References:

Bennett, M.K., and Birkhoff, G. (1994), "Two families of Newman Lattices," Algebra Universalis, 32, 115-144.

Fenton, N.A., and Pfleeger, S.L. (1997), Software Metrics, 2nd ed., PWS Publishing Co., Boston.

Fishburn, P.C. (1985), Interval Orders and Interval Graphs, Wiley, New York.

Krantz, D.H., Luce, R.D., Suppes, P., and Tversky, A. (1971), Foundations of Measurement, Vol. I, Academic Press, New York.

Luce, R.D., Krantz, D.H., Suppes, P., and Tversky, A. (1990), Foundations of Measurement, Vol. III, Academic Press, New York.

Marcus-Roberts, H., and Roberts, F.S. (1987), "Meaningless statistics," J. Educ. Stat., 12, 383-394.

Narayana, T. (1979), "Lattice path combinatorics with statistical applications," Mathematical Expositions, 23, U. of Toronto Press, Toronto.

Ozonoff, D., and Webster, T. (1997), "The Lattice diagram and 2x2 tables," in Johnson, B.L., Xintaras, C., and Andrews, J.S. (eds.), Hazardous Waste: Impacts on Human and Ecological Health (Proceedings of the 2nd International Congress on Hazardous Waste), Princeton Scientific, 441-458.

Roberts, F.S. (1979), Measurement Theory, with Applications to Decisionmaking, Utility, and the Social Sciences, Addison-Wesley, Reading, MA.

Roberts, F.S. (1994), "Limitations on conclusions using scales of measurement," in Barnett, A., Pollock, S.M., and Rothkopf, M.H. (eds.), Operations Research and the Public Sector, Elsevier, Amsterdam, 621-671.

Roberts, F.S. (1999), "Meaningless statements," in Graham, R.L., Kratochvil, J., Nesetril, J., and Roberts, F.S. (eds.), Contemporary Trends in Discrete Mathematics, DIMACS Series, 49, American Mathematical Society, Providence, RI, 257-274.

Suppes, P., Krantz, D.H., Luce, R.D., and Tversky, A. (1989), Foundations of Measurement, Vol. II, Academic Press, New York.

Trotter, W.T. (1992), Combinatorics and Partially Ordered Sets: Dimension Theory, The Johns Hopkins University Press, Baltimore, MD.
