Workshop on Information Processing in the Biological Organism (A Systems Biology Approach)

November 4-5, 2003
Four Points Sheraton, Bethesda, Maryland (5 minutes walk from NIH)

Organizers:

SCIENTIFIC CONTACTS:
Fred S. Roberts, Chair, DIMACS / Rutgers University, froberts@dimacs.rutgers.edu
Eduardo Sontag, Co-chair, Rutgers University, http://www.math.rutgers.edu/~sontag

CONTACT FOR ORGANIZATIONAL MATTERS:
Christine Spassione, DIMACS / Rutgers University, spassion@dimacs.rutgers.edu

The workshop will be a "satellite" meeting prior to the NIH Biomedical Information Science and Technology Initiative (BISTI) Symposium, "Digital Biology: the Emerging Paradigm" to be held November 6 and 7, 2003 in Natcher Auditorium on the main NIH campus. See URL: http://www.bisti.nih.gov/2003meeting/.

Presented under the auspices of the Special Focus on Computational Molecular Biology and sponsored by the National Science Foundation.

Abstracts:

Bonnie Bassler, Princeton University

Title: Small Talk: Cell-to-Cell Communication in Bacteria

Bacteria communicate with one another using chemical signalling molecules as words. Specifically, they release, detect and respond to the accumulation of these molecules, which are called autoinducers. Detection of autoinducers allows bacteria to distinguish between low and high cell population density, and to control gene expression in response to changes in cell number. This process, termed quorum sensing, allows a population of bacteria to coordinate the gene expression of the entire community. Quorum sensing confuses the distinction between prokaryotes and eukaryotes because it allows bacteria to behave as multi-cellular organisms, and to reap benefits that would be unattainable to them as individuals. In the bioluminescent marine bacterium V. harveyi, two parallel quorum sensing systems function to control light production. System 1 is composed of Sensor 1 and Autoinducer-1 (AI-1), and System 2 is composed of Sensor 2 and Autoinducer-2 (AI-2). Our results suggest that V. harveyi uses AI-1 for intra-species communication and AI-2 for inter-species cell-cell signalling. Many species of bacteria produce an AI-2-like activity. To investigate the mechanism of AI-2 signalling, we cloned the gene responsible for AI-2 production from several bacteria. The gene we identified in every case is highly homologous, and we named it luxS. Conserved luxS homologues exist in over 40 species of Gram-negative and Gram-positive bacteria, suggesting that communication via an AI-2 signal response system could be a common mechanism that bacteria employ for inter-species interaction in natural environments. We determined the biosynthetic pathway for and structure of AI-2. AI-2 is a furanosyl borate diester with no resemblance to any previously characterized autoinducer. We suggest that addition of naturally occurring borate to an AI-2 precursor generates active AI-2. Our results indicate a novel biological role for boron, an element required by many organisms but for unknown reasons.
Biotechnological research is now focused on the development of molecules that are structurally related to AI-2. Such molecules have potential use as anti-microbial drugs aimed at bacteria that use AI-2 quorum sensing to control virulence. Similarly, the biosynthetic enzymes involved in AI-2 production and the AI-2 detection apparatuses are viewed as potential targets for novel anti-microbial drug design.

Dennis Bray, Cambridge University

Title: Signaling in a Molecular Jungle

The set of biochemical reactions by which an E. coli bacterium detects and responds to distant sources of attractant or repellent molecules is probably the simplest and best understood example of a cell signaling pathway. The pathway has been saturated genetically and all of its protein components have been isolated, measured biochemically, and their atomic structures determined. There is consequently a uniquely complete set of data on which to base detailed computer simulations, and which can be used to ask (i) whether the system in fact functions as we think it does, and (ii) whether interactions in the network depend on as-yet-undiscovered mechanisms or processes.

We have used a variety of computational methods to model E. coli chemotaxis and are so far able to account for most of the impulse and adaptational responses to the attractant aspartate, and for the resting phenotype of over 60 mutants. Certain discrepancies between theory and experiment still remain, however, and these have led us to an increasingly detailed examination of the spatial locations of the proteins in the pathway. In particular, we have become interested in the cluster of chemotactic receptors on the bacterial surface. Recent evidence supports the view that this cluster of several thousand receptors and their associated proteins functions as a solid-state device that integrates, amplifies and differentiates the input fluctuations in chemical species. The dynamic complexity of this small macromolecular assembly is enormous and far greater than one could measure experimentally. But communication between different parts of the lattice, for example through conformational interactions between neighbouring receptors, may generate spatial and temporal patterns of activity. These in turn may increase the sensitivity and range of response above that of the individual molecular components and might allow the bacterium to discriminate between competing signals and to prioritize certain combinations of environmental input.

For further information and recent publications please see: www.zoo.cam.ac.uk/comp-cell

Tom Deisboeck, Harvard University

Title: Modeling Tumors As Complex Dynamic BioSystems

We propose that malignant brain tumors behave as self-organizing and adaptive multicellular systems rather than as unorganized cell masses. To investigate this paradigm-shifting hypothesis we employ an interdisciplinary approach combining cancer research, biomedical imaging, statistical physics, mathematical biology, materials science, computational and complex systems science. Findings from novel experimental settings are presented in the context of innovative computational models, using techniques from cellular automata to agent-based modeling. Implications of this work for future experimental and clinical cancer research will be discussed.

Joseph Duffy, Indiana University

Title: Oocytes on "cell" phones - EGFR signaling in Drosophila

Through a temporal progression of feedback loops involving juxtacrine and autocrine signals, activation of the Epidermal Growth Factor Receptor imparts axial and morphological pattern to the Drosophila egg chamber. One of the most prominent results of this signaling activity is the presence of two dorsal respiratory appendages on the Drosophila chorion (eggshell). Although intraspecific variation in this phenotypic feature is lacking, extensive interspecific variation has evolved, resulting in species with a range of appendage numbers (0 to >5). We are using this simple epithelial system to investigate the complexity and evolution of the EGFR signaling network and its role in morphological patterning. By combining genetic, molecular, and evolutionary approaches, we have identified and characterized molecules functioning at distinct levels in this network. One of these elements, Kekkon 1, is a direct transmembrane inhibitor of EGFR, while two additional elements, CBP and Bullwinkle, regulate receptor output as components of the transcriptional machinery. Our work has also led to the identification of a family of Kek1-related molecules. We propose that these molecules function analogously to Kek1 to regulate the activity of additional signaling networks.

David Gifford, MIT Laboratory for Computer Science and the MIT Artificial Intelligence Laboratory

Title: Learning Predictive Models of Cellular Systems

Our fundamental goal is to discover predictive models of cellular systems that explain patterns of combinatorial regulation and how the activities of genes involved in related biological processes are coordinated and interconnected. We present methods for efficiently combining complementary large-scale expression and transcription factor protein-DNA binding data to discover co-regulated modules of genes and associated regulatory networks. A module is a foundational building block that describes the interaction of transcription factors and the genes they regulate. Our analysis is based upon high-throughput data sources including expression data, in vivo protein-DNA binding data, and sequence data. Because each high-throughput data source is limited in scope and accuracy, an important computational goal is to integrate information from multiple data sources into principled models. We validated the quality of the results obtained with our module discovery algorithm and used time-course expression data to recover the temporal relationships between key regulatory events in the Saccharomyces cerevisiae cell cycle. We discuss how to validate predictive models with knock-out data, the MIPS database, chromatin-IP experimental results, transcription factor-gene interactions identified in the literature, and DNA sequence motif information.

Mark A. Gluck, Neuroscience, Rutgers-Newark

Title: Computational models of cortico-striatal-hippocampal interaction during learning: Implications for neurological disorders of memory

We seek to understand the neural and behavioral bases of learning and memory, using computational neural-network modeling, behavioral analyses of animals with experimentally-induced brain lesions, and neuropsychological studies of memory-impaired human clinical populations. Our work focuses on interactions between the medial temporal lobe (including the hippocampus and entorhinal cortex) and the basal ganglia/striatum. I will describe studies of associative learning in three populations based on predictions from our computational modeling: (1) elderly at risk for Alzheimer's Disease who show structural imaging evidence of hippocampal atrophy, (2) individuals with global anterograde amnesia with damage to the hippocampus, and (3) Parkinson's disease patients with striatal dysfunction due to the loss of dopamine-producing cells. The results from these studies, along with analogous studies of animal conditioning, provide converging evidence for a view of the hippocampus as an essential gateway to memory in which stimulus representations are adaptively modified to reflect significant stimulus-stimulus and stimulus-outcome regularities, and a view of the striatum as a system for developing mappings between stimulus representations and actions, based on external feedback.

Selected References:

Gluck, M. A., Meeter, M., & Myers, C. E. (2003). Computational models of the hippocampal region: Linking incremental learning and episodic memory. Trends in Cognitive Sciences. In press.

Myers, C., Shohamy, D., Gluck, M., Grossman, S., Kluger, A., Ferris, S., Golomb, J., Schnirman, G., & Schwartz, R. (2003). Dissociating hippocampal versus basal ganglia contributions to learning and transfer. Journal of Cognitive Neuroscience. 15(2). 185-193.

Myers, C., Kluger, A., Golomb, J., Ferris, S., de Leon, M., Schnirman, G., & Gluck, M. (2002). Hippocampal atrophy disrupts transfer generalization in non-demented elderly. Journal of Geriatric Psychiatry and Neurology, 15, 82-90.

Gluck, M. A. & Myers, C. E. (2001). Gateway to Memory: An Introduction to Neural Network Models of the Hippocampus and Learning. Cambridge, MA: MIT Press.

Ary L. Goldberger, NIH/NCRR Research Resource for Complex Physiologic Signals and Beth Israel Deaconess Medical Center, Harvard Medical School

Title: Multiscale Complexity in Health and Information Loss with Aging and Disease

According to classical concepts of physiologic control, healthy systems are self-regulated to reduce variability and maintain physiologic constancy. Contrary to the predictions of homeostasis, however, the output of a wide variety of systems, such as the healthy human heartbeat, fluctuates in a complex, nonstationary manner, even under resting conditions. Scaling techniques adapted from statistical physics reveal the presence of long-range, power-law correlations, as part of multifractal cascades operating over a wide range of time scales. These scaling properties suggest that the nonlinear regulatory systems are operating far from equilibrium, and that maintaining constancy is not the goal of physiologic control. In contrast, for subjects at high risk of sudden death, fractal organization, along with certain types of nonlinear interactions, breaks down, associated with a loss of complexity. Application of fractal and nonlinear analysis may provide new approaches to assessing cardiovascular risk and forecasting sudden cardiac arrest, as well as to modeling and monitoring the aging process. Similar approaches show promise in assessing other regulatory systems, such as human gait control in health and disease. Elucidating the nonlinear mechanisms involved in complex physiologic signaling networks is emerging as a major challenge in the post-genomics era.

References
1. Costa M, Goldberger AL, Peng CK. Multiscale entropy analysis of complex physiologic time series. Phys Rev Lett 2002;89:068102-1-4.
2. Goldberger AL, Amaral LAN, Hausdorff JM, Ivanov PCh, Peng CK, Stanley HE. Fractal dynamics in physiology: alterations with disease and aging. Proc Natl Acad Sci USA 2002;99[suppl 1]:2466-2472. (http://www.pnas.org/cgi/content/full/99/suppl_1/2466)
3. Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng CK, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 2000;101:e215-e220.
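
Illustrative sketch (not part of the original abstract): reference 1 above concerns multiscale entropy, one way of quantifying the "loss of complexity" the abstract mentions. The Python below is a minimal sketch of the general idea only: coarse-grain the series at several scales and compute a sample entropy at each. The function names, the template length m = 2, and the tolerance of 0.15 standard deviations of the original series are assumptions chosen for illustration, not values taken from the talk.

```python
import math
import statistics

def coarse_grain(series, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def sample_entropy(series, m, tol):
    """Return -ln(A/B), where B counts pairs of length-m templates whose
    Chebyshev distance is within `tol`, and A does the same for length m+1."""
    n = len(series)

    def count(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        hits = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches excluded
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    hits += 1
        return hits

    b, a = count(m), count(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # too few matches to estimate the ratio
    return -math.log(a / b)

def multiscale_entropy(series, scales, m=2, r_frac=0.15):
    """Sample entropy of the coarse-grained series at each scale.
    The tolerance is fixed from the original series (an assumption)."""
    tol = r_frac * statistics.pstdev(series)
    return [sample_entropy(coarse_grain(series, s), m, tol) for s in scales]

# Hypothetical test signal: a perfectly regular alternating series.
regular = [0.0, 1.0] * 10
mse_regular = multiscale_entropy(regular, scales=[1, 2])
```

A perfectly regular signal such as `regular` yields zero entropy at every scale; an irregular physiologic series would be expected to give positive values whose profile across scales is the quantity of interest.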

Jason Haugh, North Carolina State University

Title: Quantitative Experiments and Modeling of PDGF Receptor Signaling

Manipulation of signal transduction networks offers an approach to influence cell behavior at the molecular level. As a model system, we have studied signaling events mediated by platelet-derived growth factor (PDGF) receptors in mouse fibroblasts, events which control cell proliferation, survival, and cell migration in homeostasis and wound healing. Our general approach is the use of both biochemical and fluorescence microscopy techniques to quantify signaling in stimulated cells at many different levels, in conjunction with pharmacological and genetic interventions that perturb them. Equally important are mathematical models, formulated with only enough detail to explain all of our measurements, which we use to interpret our data, assess candidate mechanisms, and derive new hypotheses. This approach is demonstrated through both kinetic and spatial analyses of a PDGF-stimulated signaling network comprised of the PI 3-kinase/Akt and Ras/Erk pathways.

Leroy Hood, Institute for Systems Biology, Seattle, Washington

Title: Systems Biology: Deciphering Life

With the completion of the human genome, the emergence of cross-disciplinary biology, and the availability of the internet and world wide web, systems biology is emerging. Two additional aspects of systems biology include the idea that biology is an informational science and the emergence of high throughput biological technologies for acquiring global sets of data. Systems biology encompasses the idea that all the elements in a biological system can be studied simultaneously in response to genetic and environmental perturbations. I will discuss new global genomics and proteomics technologies, as well as novel and integrative computational tools. Then I will discuss how we have applied the systems approach to individual biological systems in bacteria, yeast, and sea urchins. And I will conclude by pointing out how systems biology inevitably leads to predictive, preventive, and personalized medicine.

Ravi Iyengar, Mount Sinai School of Medicine

Title: Analysis of a Cell Signaling Network (with Avi Maayan)

An initial representation of a hippocampal neuron as a network of cellular components and interactions has been developed. The network consists of 411 components and 790 interactions and includes the major signaling pathways that constitute the central regulatory network and cellular machines involved in plasticity functions. These include the postsynaptic electrical response machinery, the secretory machinery, the actin-cytoskeleton-based motility machinery, and the transcriptional and translational machinery. The components may represent around 1 to 2% of the estimated number of different types of cellular components. In its maximally connected state the characteristic path length of the network is 4.7 and the clustering coefficient is 0.1, indicative of a "small world" configuration. The network is enriched in positive feedback loops containing three to five components and regulatory motifs that can function as coincidence detectors and gates, indicative of a preponderance of short-range regulation within the network. Connectivity within the network is dependent on multiplicity, intensity, and duration of signals, and hence we hypothesize that the network is capable of existing in a series of configurations. Such variability of connectivity is estimated by computing the characteristic path length of various configurations ranging from 4.7 to 5.8 or greater. The ensemble of configurations may be considered a meta-network. The ability to function as a meta-network may allow the cell to balance the functions of plasticity and homeostasis.
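
Illustrative sketch (not part of the original abstract): the two statistics quoted above, characteristic path length and clustering coefficient, can be computed for any small graph as shown below. This is a minimal sketch using common textbook definitions; the undirected treatment of interactions, the function names, and the five-node toy network are assumptions for illustration and do not reproduce the 411-component network.

```python
from collections import deque
from itertools import combinations

def build_undirected(edges):
    """Adjacency sets for an undirected graph given as (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def characteristic_path_length(adj):
    """Mean shortest-path length over all pairs of distinct, reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def clustering_coefficient(adj):
    """Average, over nodes, of the fraction of neighbor pairs that are linked.
    Nodes with fewer than two neighbors contribute 0."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)

# Hypothetical toy network: a triangle A-B-C with a tail C-D-E.
toy = build_undirected([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")])
cpl = characteristic_path_length(toy)
cc = clustering_coefficient(toy)
```

For the toy graph, `cpl` is 1.7 and `cc` is 7/15 (about 0.47). Recomputing the path length after removing or adding edges gives a simple way to compare alternative configurations of the same network.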

Thomas Kepler, Duke University

Title: The Fluid Organs of Immunity: Phase transitions in bacterial pathogenesis

The vertebrate immune system consists of a great diversity of motile cells whose activities become coordinated during infection. This orchestration is mediated by signaling molecules either secreted (cytokines) or engaged by direct cell-cell contact. Pathogenic microorganisms (and other stimuli) induce internal changes in the responding immune cells which, in turn, lead to spatial reorganization of these cells in a process arguably akin to a phase transition.

We illustrate this novel perspective with our ongoing studies of bacterial pathogenesis using PDE-embedded agent (PDEEA) models. In these models, the microorganisms and host leukocytes are represented computationally as agents with non-trivial internal structure, while the soluble factors are treated as continua described by reaction-diffusion equations.

Boris Kholodenko, Thomas Jefferson University

Title: Inferring Dynamic Architecture of Cellular Networks by Perturbation Response Analysis

   Boris N. Kholodenko, Department of Pathology, Anatomy and Cell Biology, Thomas Jefferson University
   Eduardo Sontag, Department of Mathematics, Rutgers University
   Anatoly Kiyatkin, Department of Pathology, Anatomy and Cell Biology, Thomas Jefferson University

High-throughput technologies have facilitated the acquisition of large genomics and proteomics data sets. However, these data provide snapshots of cellular behavior, rather than help us reveal causal relations. Here we propose how high-throughput technologies can be utilized to infer connections among genes, proteins, and metabolites by monitoring stationary and time-dependent responses of cellular networks to experimental interventions [1,2,3]. Our experimental strategy is justified by a mathematical proof showing how the topology and strengths of connections leading to a given node, e.g., to a particular gene or gene product, are deduced from responses to perturbations none of which can directly influence that node. We demonstrate that to infer all interactions from stationary data, each node should be perturbed separately or in combination with other nodes. The use of time series is crucial for a plethora of intrinsically transient biological processes, such as the cell cycle and apoptosis. Measurements of time series allow for inferring the dynamics of the interaction strengths and do not require perturbations to all nodes. This is important when some network components are difficult to perturb, whereas two or more independent perturbations can be applied to other components. Our approach is scalable, since in contrast with other methods an increase in the network complexity does not result in a combinatorial increase in the number of perturbation experiments and/or computations. Overall, the methods we propose are capable of deducing and quantifying functional interactions within and across cellular gene, signaling, and metabolic networks.

1. Kholodenko B.N, Kiyatkin A, Bruggeman FJ, Sontag E, Westerhoff HV, Hoek JB. Untangling the Wires: A strategy to Trace Functional Interactions in Signaling and Gene Networks. Proc Natl Acad Sci U S A. (2002) 20, 12841-12846.

2. Kholodenko, B. N. and Sontag, E. D. (2002) Determination of Functional Network Structure from Local Parameter Dependence Data. arXiv: physics/0205003.

3. Sontag, E., Kiyatkin A., Kholodenko, B. N. (2003, submitted).
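
Illustrative sketch (not part of the original abstract): for stationary data, the idea of recovering direct connection strengths from measured responses to perturbations can be shown on a toy linear system. The sketch assumes the simplest case, in which each node is perturbed once by an intervention that acts directly on that node only. Under that assumption, the matrix R of measured global responses and the matrix r of direct (local) connection strengths, normalized so that every diagonal entry of r is -1, satisfy r = -[diag(R^-1)]^-1 R^-1. The formula as written here, the function names, and the three-node Jacobian are assumptions supplied for illustration; they are not quoted from the abstract.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]

def local_responses(global_response):
    """Direct connection strengths r = -[diag(R^-1)]^-1 R^-1 (diagonal fixed at -1)."""
    r_inv = invert(global_response)
    n = len(r_inv)
    return [[-r_inv[i][j] / r_inv[i][i] for j in range(n)] for i in range(n)]

# Hypothetical toy system J x + p = 0, where p[k] perturbs node k only.
jacobian = [
    [-1.0, 0.0, -0.5],  # node 2 inhibits node 0
    [0.8, -2.0, 0.0],   # node 0 activates node 1
    [0.0, 0.6, -1.5],   # node 1 activates node 2
]
# Simulated "measurements": steady-state responses dx/dp = -J^-1.
global_response = [[-v for v in row] for row in invert(jacobian)]
r = local_responses(global_response)
```

In this toy case the recovered entry r[i][j] equals -J[i][j] / J[i][i], so zeros in r mark absent direct connections even though every entry of the measured response matrix is nonzero.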

Galit Lahav and Uri Alon, Weizmann Institute of Science

Title: Dynamics of the p53-mdm2 feedback loop in living cells, and the design-principles of biological feedback

A major goal of systems biology is to understand complex protein networks in cells. Great simplification would occur if networks could be broken down into basic recurring circuits, such as the recently defined 'network motifs'. We focused on a common network motif, where a transcription factor is negatively regulated at the protein level by one of its downstream gene products. We employ the well studied p53-mdm2 feedback loop to experimentally study the dynamics of this motif in single living cells. We constructed human cell lines expressing functional p53-CFP and mdm2-YFP fusions. Accurate measurements of protein levels, localizations and interaction were obtained at high temporal resolution by fluorescence microscopy. Detailed oscillatory kinetics following DNA damage were found. We also studied the variability between the dynamics of individual cells, which cannot be seen in assays on cell populations. The results allowed the construction of a mathematical model that captures the behavior of this regulatory module. We discuss the design-principles of biological feedback that were found in this and other systems.

Andre Levchenko, Johns Hopkins University

Title: Information transfer in signaling pathways: Gradient sensing

Signal transduction pathways transfer and process information in the form of chemical activity of receptors and second messengers. The quantification of the information transferred is a highly non-trivial mathematical and biological question. Here I will present an attempt to quantify signaling information based on simple arguments related to the organization of underlying chemical reactions. I will apply this analysis to the case of gradient sensing by single cells. I will also discuss the inherent paradox relating the "noisiness" of the chemical reactions to the capacity of information transfer through the biochemical channels, and how this paradox can shed light on organization and behavior of multiple signaling pathways.

Benno Schwikowski, Institute for Systems Biology

Title: Challenges for computer science and math as a part of Systems Biology

Systems Biology aims to integrate experimental measurements at various hierarchical levels of biological organization into single conceptual frameworks that offer new lines of attack for many biomedical problems. One way mathematicians and computer scientists can contribute to this is by helping to solve problems in existing formal frameworks. Another area of opportunity lies in developing these formal frameworks themselves.

For many formally trained scientists, the familiar four-letter model for DNA sequences provides an easy entry point into biological research. However, models that incorporate large-scale data on higher levels of cellular organization are yet to be worked out through experimental observation and analysis. While the development of semi-formal or formal models is a traditional activity in biology, the shift from small-scale to large-scale data sets requires substantial input from mathematics and computer science. New challenges for these disciplines arise from the new context in which these data sets are generated: the complexity of biological systems, specific experimental technologies, and the nature of available data. This talk will outline a few of these challenges.

Anirvan Sengupta, Rutgers University (Physics)

Title: The Challenge of Understanding Bio-molecular Specificity

The traditional models in biology often depend on lock-and-key-like specificity of the interaction of certain biomolecules and their targets. It is becoming abundantly clear that this is not the only strategy employed in biological systems to keep different "information channels" from interfering with each other. However, the strategies for building functioning information processing systems with relatively non-specific components impose certain constraints on the design of the biological systems. Understanding these constraints requires combining physical models, system level analysis and evolutionary theory. Such understanding has an impact on devising appropriate bioinformatics methods for making sense of biological circuits. Illustrations will range from transcriptional control by pleiotropic factors to cross-talk in signaling systems with paralogous components.

Eduardo Sontag, Rutgers University (Mathematics and BIOMAPS Institute)

Title: Systems Biology as a Generator of New Mathematics

The sciences and engineering have always provided the driving force for innovative mathematics. New fields, from geometry to calculus to discrete mathematics, have arisen in response to the need to understand the physical world, and the urge to build devices that help society cope with the world.

Systems biology, rapidly emerging as the source of some of the most challenging questions facing the sciences, will undoubtedly provide the inspiration for the creation of entirely new areas of mathematics. New conceptual frameworks are urgently needed in order to deal with the analysis of systems that are robust to large parameter variations, multiscale phenomena, and the seamless interface between analog (chemical concentrations) and digital (genetic information) phenomena. This session aims to generate discussion of such issues in general; as an example, this talk will briefly mention some recent work on dynamical systems and control theory directly motivated by systems biology.

Joel Stiles, Carnegie Mellon University

Title: Counter-Intuitive Insights from Spatially Realistic Simulations of Synaptic Transmission

Physiological function depends on the spatial and temporal dynamics of specific genes, proteins, signaling molecules, and/or metabolites within and between cells. Realistic physiological simulations present an enormous challenge because of the wide range of underlying space and time scales, as well as the widely disparate organization and properties of different systems. In short, a major challenge is to develop modeling and simulation methods that allow integration of mechanisms, kinetics, and stochastic behaviors at the molecular level with structural organization and function at the cellular level.

Synaptic transmission exemplifies 3-D reaction-diffusion systems in which stochastic behaviors and spatial complexity can be very important. Using examples from current research, I will describe our methods for Monte Carlo simulation of synaptic microphysiology, and illustrate counter-intuitive results obtained from spatially realistic models of the vertebrate neuromuscular junction. Particular examples will focus on postsynaptic membrane topology and acetylcholine receptor distribution, as well as synaptic acetylcholinesterase density and distribution.

Gustavo Stolovitzky, IBM

Title: Reconstructing synthetic biological networks using pair-wise correlation analysis
(with J. Jeremy Rice and Yuhai Tu)

We use a pair-wise correlation method to reconstruct synthetic biologically-inspired networks. We use synthetic networks of known topology as surrogates for real biological networks to test the limitations of our network reconstruction algorithms. The networks consist of nodes (genes), directed edges (gene-gene interactions), and a dynamics of the genes' mRNA concentrations in terms of the gene-gene interactions. We interrogate the network by downregulating each gene and probing the system in its stationary state. We measure the correlation between the perturbed gene and the other genes in replicate experiments. If the measured correlation is above (below) a given threshold, we postulate the existence of an excitatory (inhibitory) connection. We test the algorithm on various network topologies, including topologies based on the transcriptional regulatory network in E. coli. We investigate how error rate is affected by noise, network size, number of connections, and dynamic parameters. We find two main sources of reconstruction error. False positives arise from correlation between nodes connected through intermediate nodes. False negatives occur when the correlation between two directly connected nodes is obscured by noise, non-linearity, or multiple inputs to the target node. We propose a novel method to choose an optimal threshold for predicting connections and an algorithm to reduce the errors arising from indirect connections. With these improvements, we can reconstruct networks with the topology of the transcriptional regulatory network in E. coli with an error rate of less than 1%. Our methods can be used as an aid in the design of network reconstruction assays.
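
Illustrative sketch (not part of the original abstract): the perturb, measure, correlate, and threshold loop described above can be mimicked on a very small synthetic network. The sketch below uses a toy linear, acyclic model with Gaussian noise; the model, the symmetric threshold of 0.5 (excitatory above +0.5, inhibitory below -0.5), the knockdown range, and all names are assumptions for illustration, not the authors' algorithm or parameters.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def steady_state(weights, basal, clamped, level, rng, noise_sd):
    """Stationary levels of a toy linear network with one gene held at `level`.
    Assumes regulators always have a lower index than their targets."""
    x = [0.0] * len(basal)
    for i in range(len(basal)):
        if i == clamped:
            x[i] = level
        else:
            drive = sum(weights[i][j] * x[j] for j in range(i))
            x[i] = basal[i] + drive + rng.gauss(0.0, noise_sd)
    return x

def reconstruct(weights, basal, threshold=0.5, replicates=200, noise_sd=0.05, seed=0):
    """Downregulate each gene in turn and threshold its correlation with the others."""
    rng = random.Random(seed)
    n = len(basal)
    edges = {}
    for p in range(n):
        levels, responses = [], [[] for _ in range(n)]
        for _ in range(replicates):
            level = basal[p] * rng.uniform(0.2, 1.0)  # knockdown to 20-100% of basal
            x = steady_state(weights, basal, p, level, rng, noise_sd)
            levels.append(level)
            for i in range(n):
                responses[i].append(x[i])
        for i in range(n):
            if i == p:
                continue
            r = pearson(levels, responses[i])
            if r > threshold:
                edges[(p, i)] = +1   # predicted excitatory connection p -> i
            elif r < -threshold:
                edges[(p, i)] = -1   # predicted inhibitory connection p -> i
    return edges

# Hypothetical network: weights[i][j] is the effect of gene j on gene i.
weights = [
    [0.0, 0.0, 0.0, 0.0],
    [0.8, 0.0, 0.0, 0.0],   # gene 0 activates gene 1
    [-0.5, 0.0, 0.0, 0.0],  # gene 0 inhibits gene 2
    [0.0, 0.0, 0.0, 0.0],   # gene 3 is unconnected
]
basal = [1.0, 1.0, 2.0, 1.0]
predicted = reconstruct(weights, basal)
```

With these settings the sketch recovers one excitatory edge from gene 0 to gene 1 and one inhibitory edge from gene 0 to gene 2. Raising `noise_sd` or lowering `replicates` is a quick way to see the noise-driven false negatives the abstract describes.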

H. Steven Wiley, Pacific Northwest National Laboratory

Title: Interrogative Cell Signaling: How cells perceive their context

The traditional view of signal transduction in eukaryotic cells has been that cells passively receive information from their environment and then respond in an appropriate fashion. For this scheme to work, cells must have a complete complement of receptors that can bind any information-containing molecules that they encounter. Recent research results, however, suggest that cells can actively interrogate their extracellular environment by a variety of mechanisms, such as the proteolytic shedding of growth factors and receptors. Thus, a primary stimulus can induce the release of growth factors that produce a context-dependent secondary response. In addition, productive signaling depends on access of receptors to substrates that are frequently spatially restricted. Because the extracellular environment controls the pattern of ligand availability, signaling pathways will reflect the immediate extracellular context. Such subtleties in signal transduction cannot be observed using the traditional experimental approach of adding a large bolus of soluble ligands to cells. Instead, normal physiological responses of cells are likely to be driven by recursive signaling and numerous positive and negative feedback processes. Understanding complex cell signaling will thus require computational models that contain the spatial and temporal aspects of cells as well as the structure of their signaling networks.

Raimond Winslow, The Johns Hopkins University School of Medicine & Whiting School of Engineering

Title: Information Flow at the Systems Level: The Organization and Modeling of Experimental Data Across Multiple Scales of Biological Analysis

The impact of both high throughput measurement technologies and the emerging importance of informatics and computational sciences on basic biomedical research has been profound. Perhaps even more importantly, these new approaches are now impacting on the study of the cause and treatment of human disease. In the future, large-scale clinical studies will aim at collecting a broad range of data, including gene sequence, genomic, proteomic, imaging and clinical data, from populations of patients sharing a specific disease diagnosis. Such studies will enhance our fundamental understanding of disease mechanisms across hierarchical levels of biological analysis, contribute to the identification of biological markers which correlate with and can be used to diagnose different disease states, and provide insights into novel therapeutics. However, the technical challenges posed by these clinical studies are considerable due to: a) the volume of multi-modal data that must be acquired per patient; b) the large number of patients required for such studies; c) the need to assure strict confidentiality of patient data; d) the need to access and analyze multiple data sets in heterogeneous databases; e) the need for new "robust" statistical inference and pattern classification algorithms that are appropriate for situations in which only a small number of measurements can be made on a very large number of genes or proteins; f) the need for a powerful computing infrastructure and algorithms that can support such databases and data analysis methods; and g) the need to provide clinical researchers with simple, easy to use software tools that enable them to solve the problems at hand.

Such a large-scale study is underway in the D. W. Reynolds Cardiovascular Clinical Center of the Johns Hopkins University School of Medicine. This study seeks to identify the factors which correlate with risk for Sudden Cardiac Death. In this talk, we will present a) our approach to the representation and organization of heterogeneous experimental data including imaging, proteomic, genomic, genetic and clinical data; and b) the ways in which we develop and apply a hierarchy of computational models, each addressing a specific level of heart function, to extract new knowledge from these data. We believe these approaches are "generic" and can be of value in applications beyond that of heart function.

Document last modified on November 1, 2003.