DIMACS Mini-Workshop on System Based Modeling in Informatics
February 19 - 20, 2001
DIMACS Center, Rutgers University, Piscataway, New Jersey
- Organizers:
- Michael Liebman, Abramson Family Cancer Center, University of Pennsylvania, School of Medicine, liebmanm@mail.med.upenn.edu
- Richard L.X. Ho, R.W. Johnson Pharmaceutical Research Institute, RHo@prius.jnj.com
Presented under the auspices of the Special Year on Computational Molecular Biology.
Abstracts:
1.
NIGMS Funding Opportunities for Quantitative Approaches to
Biomedical Research
James J. Anderson, NIGMS
The National Institute of General Medical Sciences (NIGMS: a component of
the U.S. National Institutes of Health), in conjunction with other
Institutes and the National Science Foundation, has issued a number of
program announcements (PA) and requests for applications (RFA) that provide
for the support of cross-disciplinary research, education, and training
involving quantitative approaches to complex biological and biomedical
problems. The suite of initiatives has as its foci:
1. The understanding of system principles and dynamics in processes
involving large numbers of interacting components, at all levels of
biological organization.
2. The development of analytical methodologies to discover the genetic
architecture of complex genetic traits.
3. The study of the evolutionary dynamics of pathogens and their hosts with
their environments.
4. The development of enabling technologies useful for the study of
metabolic processes and metabolic engineering.
5. The development of basic mathematical concepts and algorithms that have
the potential for significantly advancing the state of the art in
biomedical research.
The initiatives comprise mechanisms to fund research projects (traditional
research project grants (R01) and program project grants (P01)), to fund
establishment of integrative research efforts ("glue grants," R24), to fund
extensive programs of research related activities (Center grants (P50)), to
provide support for short courses and workshops (R25 education grants) for
both biologists and non-biologists, and to provide training at both pre-
and postdoctoral levels (T32 and T33 Training Grants).
Detailed information on these programs will be available, and can also be
found at the following URL: http://www.nih.gov/nigms/funding
2.
Analysis of the Global Role of IHF in Transcriptional Regulation
Craig J. Benham
Department of Biomathematical Sciences
Mount Sinai School of Medicine
New York, NY
In a collaboration with Dr. Wes Hatfield (UC-Irvine) we are investigating
the IHF-regulated ilvPG promoter. We have shown that it is activated by a
mechanism involving the transmission of stress-induced
destabilization. Specifically, IHF binding causes a four-fold increase in
transcriptional initiation rates, but only when the substrate DNA is
negatively supercoiled. The mechanism for this activation was elucidated
by a collaboration between computational analysis of stress-induced
destabilization and experimental investigations. When negatively
supercoiled, a region containing the IHF binding site experiences a
destabilization of its duplex structure. IHF binding causes this region to
reform the B-DNA structure, which causes the destabilization to be
transferred to the -10 region of the promoter, thereby activating transcription.
This talk will briefly describe the computation of stress-induced
destabilization. Then the collaboration will be described by which the
mechanism of the IHF-regulated ilvPG promoter was elucidated. Then we
indicate how this work is being extended to elucidate the global roles of
high-affinity IHF binding in transcriptional regulation throughout the E.
coli genome. The theoretical analysis of all other known IHF-regulated
genes finds that they all have destabilization properties that suggest that
they also could be regulated by the same transmission-type
mechanism. Finally, the complete analysis of the entire genome finds a
total of 125 ORFs with the catenation of properties needed for this
mechanism. These results are currently being experimentally tested in Dr.
Hatfield's lab using expression arrays. This is the first collaborative
investigation of a global mechanism of regulation.
This and other work shows that stress-induced strand separation plays
central roles in the initiation of gene expression. Indeed, recent results
suggest that it may be among the most archaic regulators of this essential
biological process. Speculations will be presented regarding the reasons
for this importance.
3.
An application framework for modelling biological processes
Carolyn Cho, Physiome Sciences, Princeton NJ USA
The recent increase in the generation and variety of biological data has
stimulated demand for modeling and simulation to complement experimental
investigation. The mathematical expertise required for model building,
however, has limited its widespread application.
Physiome Sciences is developing software and applications for modeling
signal transduction and other biochemical pathways, intracellular and
extracellular physiological processes, organs and systems. These tools
have expert-user capabilities but are also designed for use by researchers
with no mathematical modeling expertise. In addition to providing
hypothesis testing and predictive capabilities, the resulting models become
the entry point for accessing relevant data. Physiome software provides
researchers a unified environment to filter information, analyze data,
develop hypotheses and create a shared knowledge base.
Physiome's modeling framework is designed around four core themes:
- Transparent - Researchers can choose to build and simulate a model
without encountering the underlying mathematical formalism, or to edit the
mathematics and data directly.
- Customizable - Physiome provides the tools for the researcher to
build an individual model from a general framework. The researcher may
also import custom mathematics at any stage of the model construction or
simulation.
- Expandable - Models can be re-used as components to build more
realistic, larger-scale models as new data are generated.
- Flexible - The models interface with other applications and build on
the user's existing computational infrastructure.
This presentation focuses on the use of Physiome Sciences' In Silico Cell(TM)
modeling environment to construct, analyze and interpret biochemical
pathway data. In Silico Cell supports the hierarchical modeling of
biological systems and the creation of detailed models from simple ones.
This process is enabled by the use of the CellML(TM) modeling language, an
application of XML for describing biological processes at the cellular and
sub-cellular level.
In Silico Cell's Pathway Editor function allows the researcher to build
and edit pathway diagrams and provides network analysis tools to identify
possible drug targets at critical points in a biochemical pathway. This is
done using a graphical user interface that automatically generates
simulations and mathematics from pathway maps. The pathway and its
individual components are linked to a database that can be referenced and
updated by the researcher.
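As a rough illustration of how mathematics can be generated from a pathway map, the sketch below turns a list of reactions into mass-action rate equations and integrates them. It is not Physiome's implementation and does not use CellML; the reaction list, rate constants and function names are all hypothetical, chosen only to show the idea.

```python
# Hypothetical pathway: each reaction is (reactants, products, rate constant).
REACTIONS = [
    (("L", "R"), ("LR",), 1.0),   # ligand + receptor -> complex
    (("LR",), ("L", "R"), 0.5),   # complex dissociates
]

def derivatives(conc, reactions):
    """Mass-action rates of change for every species in `conc`."""
    dcdt = {name: 0.0 for name in conc}
    for reactants, products, k in reactions:
        rate = k
        for name in reactants:
            rate *= conc[name]
        for name in reactants:
            dcdt[name] -= rate
        for name in products:
            dcdt[name] += rate
    return dcdt

def simulate(conc, reactions, dt=0.001, steps=20000):
    """Forward-Euler integration; adequate for a sketch, not for stiff systems."""
    conc = dict(conc)
    for _ in range(steps):
        d = derivatives(conc, reactions)
        for name in conc:
            conc[name] += dt * d[name]
    return conc

final = simulate({"L": 1.0, "R": 1.0, "LR": 0.0}, REACTIONS)
```

Adding a reaction to the list changes the equations without any hand-written mathematics, which is the property the abstract attributes to a pathway editor. A production tool would use a proper ODE solver rather than fixed-step Euler.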
4.
Innovative information management software for genetic analysis.
Richard Groves, Genomica Corporation, Boulder, Colorado
Genomica provides a unified database structure in which genetic,
phenotypic, molecular and population-derived information can be stored in a
single, purpose-built database. Stored data may be visualized, manipulated
and mined with simple and intuitive data management tools specifically
designed for genetic and molecular data. All stored information can be
rapidly transformed and exported in the rich formats required for further
research and analysis. Benefits to users include rapid and sophisticated
data stratification and mining, more time available for data analysis, less
time spent on data formatting, and storage of all data in a single location,
resulting in superior data integrity and security.
5.
Applying In Silico, Animal, and Mental Models of Human Weight Control
Richard L.X. Ho, R.W. Johnson Pharmaceutical Research Institute
There are many ways to study the control of body weight, but all depend on
having a model with which to construct hypotheses, design experiments, and
integrate the information for new understanding. To refine their mental
model of a disease process, researchers traditionally utilize data from
clinical studies and in vivo animal models. Recently there has been growing
interest in using in silico models or computer simulations to enhance
understanding of disease pathophysiology. A few top-down computer models
with nonlinear dynamics are now available commercially which allow creation
of simulated patients and virtual interventions on those patients. We are
now using such computer models with data mining software to integrate
information from low and high throughput methods as well as to help us
formulate novel hypotheses and experimental designs in the field of human
weight control.
6.
Pharmacosemiotics: An Emerging Physics/Chemistry/Linguistics Paradigm in
Pharmacology. Sungchul Ji, Department of Pharmacology and Toxicology,
Rutgers University, Piscataway, N.J. 08855.
Semiotics is the scientific study of signs and symptoms that
was developed originally in Ancient Greece as a means of medical
diagnosis and prognosis. Signs can be macroscopic (e.g., written words
and sentences) or microscopic (e.g., hormones and second messengers)
in size. Therefore, `pharmacosemiotics,' a term coined by F. E. Yates
in 1999, designates the field of study of drugs viewed as `molecular
signs' with which biomedical scientists can communicate with living
cells in patients to effectuate pharmacotherapy.
There are two distinct approaches to biomedical research:
i) the PC paradigm based on the assumption that physics (P) and
chemistry (C) are necessary and sufficient to solve all biomedical
problems, and ii) the PCL paradigm postulating that P and C are
necessary but not sufficient and hence that a new approach,
linguistics (L), must be added to the traditional paradigm in order to
completely describe and understand living phenomena. The PCL paradigm
is synonymous with the semiotics paradigm, since the study of
molecular signs entails integrating physics, chemistry, and
linguistics. The semiotics paradigm appears to be strongly supported
by the recent uncovering of the isomorphism between the molecule-based
cell language and the word-based human language [BioSystems 44:17-39
(1997); Ann. N. Y. Acad. Sci. 870:411-417 (1999)].
The main objective of this contribution is to discuss a
theoretical model of the living cell known as the Bhopalator that has been
developed over the past two decades based on the PCL or semiotics paradigm
[J. Theoret. Biol. 116:399-426 (1985); Comments Toxicol. 5(6): 571-585
(1997)]. Three groups of concepts, derived respectively from physics (i),
chemistry (ii), and linguistics (iii), played crucial roles in the development
of the Bhopalator model of the cell: i) the notion of solitons or
soliton-like energetic entities entrapped in biopolymers, ii) the principle
of self-organizing chemical reaction-diffusion systems, and iii) the
concept of words and sentences and the associated principle of `double
articulation.' The Bhopalator embodying these concepts predicted i) that
biopolymers contain sequence-specific conformational strains, called
`conformons,' that drive all biopolymeric functions, ii) that the final
form of expression of structural genes is not polypeptides as is widely
believed but patterns of concentration and mechanical stress gradients in
the cytosol and the nucleus, collectively known as `intracellular
dissipative structures' (IDS's), iiia) that noncoding regions of DNA
carry `spatiotemporal genes' that control the space- and time-dependent
evolution of gene expression, and iiib) that IDS's act as molecular analogs of
`sentences' in the cell, which are essential for cells to execute
molecular analogs of `propositions,' `arguments,' and `computations.'
Prediction i) was supported by the discovery by C. Benham of
`stress-induced duplex destabilizations,' or SIDD's, in DNA
[PNAS 90:2999-3003 (1993)]. Prediction ii) is validated by the finding
reported by Sawyer et al. that intracellular calcium waves drive chemotaxis
in neutrophils [Science 230:663-666 (1985)]. Prediction iiia) is substantiated by the
finding of N. Amano et al. that the number of noncoding bases per genome
increases with the increasing number of transcription factors per
structural gene in multicellular, but not in unicellular, organisms [Biol.
Chem. 378:1397-1404 (1997)]. Finally, Prediction iiib) is consistent
with the notion of "hyperstructures" proposed by Norris et al. as
nonequilibrium, transient complexes of biopolymers, metabolites and ions,
intermediate in size between individual macromolecules and the cell itself,
that regulate bacterial structure and cell cycle [Biochimie 81:915-920
(1999)].
7.
Biosimulation: Systems-Based Modeling of Human Physiology in Health and
Disease
Robert J. Leipold, Ph.D.
Modeler - Professional Services
Entelos, Inc.
Compared to other high-tech industries (automotive, aeronautics,
electronics, chemical processing), the pharmaceutical industry has been slow
to adopt systems-based modeling to improve the efficiency and effectiveness
of its operations. The current process of drug discovery and development is
characterized by long development times, great expense, and a high failure
rate. Recent technological advances such as expression profiling and
sequence databases have increased the rate at which prospective drug targets
can be identified, but they have done nothing to improve the odds for
success in the subsequent development process. Entelos offers comprehensive
models of human physiology (PhysioLabs(tm)) within which one can test
prospective targets for effects on clinically-relevant endpoints.
PhysioLabs can be used at every step of the drug discovery and development
process from target selection and prioritization through design and
evaluation of clinical trials. Examples of these applications will be
illustrated with case studies.
8.
Effective Representation of Gene Families.
Hugh B. Nicholas Jr., David W. Deerfield II, Alexander J. Ropelewski,
and Jaclyn Schwizer.
Pittsburgh Supercomputing Center
440 Fifth Avenue
Pittsburgh, PA 15213
The explosion of sequence data has greatly increased the amount of
information needed to identify distant homologues to a query sequence
against the large, constantly increasing, background of essentially
random, unrelated sequences. One effective solution to this problem
would be a non-redundant database of aligned gene family protein
products which preserves the knowledge of the sequence variation to
which a query sequence must be compared.
We have prepared alignments of gene families' protein products using
different practical unguided, automatic methods as well as refining
these alignments by pattern-based methods. We have represented the
alignments by three different profile methods (average, evolutionary,
and Bayesian) and hidden Markov models. We will present the results
from a comparison of these different alignment and representation
strategies.
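To make the idea of representing an alignment by a profile concrete, here is a minimal sketch of a simple frequency-based profile with pseudocounts, scored as log-odds against a uniform background. It is an assumption-laden toy, not the authors' average, evolutionary or Bayesian profile methods; the alignment, the pseudocount and the function names are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(alignment, pseudocount=1.0):
    """Per-column log-odds scores versus a uniform background; gaps are ignored."""
    length = len(alignment[0])
    assert all(len(seq) == length for seq in alignment)
    background = 1.0 / len(ALPHABET)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        profile.append({
            aa: math.log((counts[aa] + pseudocount) / total / background)
            for aa in ALPHABET
        })
    return profile

def score(profile, query):
    """Sum of column scores for an ungapped query of the same length."""
    return sum(column[aa] for column, aa in zip(profile, query))

family = ["MKVLA", "MKILA", "MRVLG", "MKVLA"]  # toy alignment (hypothetical)
profile = build_profile(family)
member_score = score(profile, "MKVLA")
random_score = score(profile, "WWWWW")
```

A query resembling the family scores above zero and an unrelated one below, because the profile keeps the per-column variation that a single representative sequence would discard.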
9.
KBTool: An approach to managing diverse biological information that
provides simple, coherent, extensible storage and retrieval.
Stephen Shaw, Experimental Immunology Branch, National Cancer Institute,
Bethesda, MD
Information overload is pervasive in modern biology; finding satisfactory
ways to efficiently capture and manage available information is essential
both for human comprehension and for automated analysis. A small fraction
of that biological knowledge is stored in structured databases that are
optimal for automated retrieval, analysis and computation. In contrast, a
large fraction of that information is present in free-text documents and
as implicit knowledge of biological experts. We have been experimenting
with an intermediate approach (evolving in a software application we call
"KBTool") that encodes biological information in a fashion more structured
than free text, but more flexible and extensible than conventional
relational databases. It is closest in concept to an entity-relationship
approach. The current implementation has about 100 categories of entities
(genes, proteins, peptides, binding sites, phosphorylation sites,
pathways, molecular assemblies, references, drugs, types of malignancies,
etc.), about 100 kinds of allowed relationships, and more than 50,000
specific entities. Utilization is sufficiently simple that we use it to
encode and retrieve information related to very diverse tasks: conduct of
biological research, administration of an electronic journal, managing
references, personal contacts, bookmarks etc. Because it provides linkage
of public data, workgroup-specific data, and private data, KBTool maintains
a coherence that we have been unable to achieve in any other way. In our
experience, accumulation of biological expertise into a machine-readable
form will occur not when experts are required to log in information, but
rather when they find it beneficial to do so. KBTool is making progress
towards that goal of users storing information based on "enlightened
self-interest".
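The entity-relationship idea described above can be sketched in a few lines. This is not KBTool's design or data model, which the abstract does not specify; the class names, category names and relation names below are hypothetical and serve only to show categorized entities linked by a restricted set of allowed relationships.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entity:
    category: str   # e.g. "protein", "pathway"
    name: str

@dataclass
class KnowledgeBase:
    # Allowed (subject category, relation, object category) triples.
    allowed: set = field(default_factory=set)
    facts: set = field(default_factory=set)

    def allow(self, subj_cat, relation, obj_cat):
        self.allowed.add((subj_cat, relation, obj_cat))

    def add(self, subj, relation, obj):
        key = (subj.category, relation, obj.category)
        if key not in self.allowed:
            raise ValueError(f"relation not allowed: {key}")
        self.facts.add((subj, relation, obj))

    def related(self, subj, relation):
        return sorted(o.name for s, r, o in self.facts
                      if s == subj and r == relation)

kb = KnowledgeBase()
kb.allow("protein", "participates_in", "pathway")
lck = Entity("protein", "Lck")
kb.add(lck, "participates_in", Entity("pathway", "T cell receptor signaling"))
```

New categories and relation kinds are added as data rather than as schema changes, which is one way to get more structure than free text while staying more extensible than a fixed relational schema.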
10.
A Systems Approach to Post-Genomic Therapeutic Research
Roland Somogyi, Alan Ableson, Max Kotlyar, Ross Dickson, Evan Steeg
Molecular Mining Corporation, Canada
Integrated analysis of genetic, phenotypic and chemical data will be
required to gain insight into complex diseases, discover novel drugs,
and integrate diagnostics with treatments for individualized therapies.
Recent developments in molecular biology, measurement technologies and
computational performance are making it possible to approach these
challenges from a systems perspective. We must now develop flexible
computational methods for discovering predictive connections between
genes, phenotypes and drugs. These relationships involve complex
non-linear and combinatorial interactions, which cannot be found using
traditional linear inference. We are currently applying our advanced
data mining and modeling procedures to a) predictions in the areas of
drug efficacy, genetic predispositions, diagnostics and toxicology, b)
systematic experimental design to provide the right set of facts that
permit valid analysis, and c) network reverse engineering and in silico
experimentation.
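A small worked example shows why combinatorial interactions can escape linear inference. The toy data below are hypothetical: a binary phenotype equal to the XOR of two binary gene states. The sketch uses Pearson correlation and mutual information as stand-ins for linear and nonlinear measures; it does not represent Molecular Mining's methods.

```python
import math
from itertools import product
from collections import Counter

# Toy data: phenotype is the XOR of two binary gene states (hypothetical).
samples = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)] * 25

def pearson(xs, ys):
    """Pearson correlation coefficient of two numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

gene_a = [s[0] for s in samples]
gene_b = [s[1] for s in samples]
phenotype = [s[2] for s in samples]

r_a = pearson(gene_a, phenotype)                  # one gene, linear measure
mi_a = mutual_information(gene_a, phenotype)      # one gene, any dependence
mi_ab = mutual_information(list(zip(gene_a, gene_b)), phenotype)  # gene pair
```

Each gene on its own carries no information about the phenotype by either measure, while the pair determines it completely, so a search over single variables or linear terms would find nothing here.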
11.
Transcriptome annotation using Gene Ontology nomenclature
Han Xie and Liat Mintz
Bioinformatics, Compugen, Jamesburg, NJ
Advances in genomic sequencing and computational biology have presented a
unique challenge to annotate the transcriptome of different species. Here we
report our efforts in systematically examining the human and rodent
transcriptome. EST, mRNA and genomic sequences are clustered using Compugen's
LEADS platform technology (www.labonweb.com, www.cgen.com) to predict gene
clusters. The standardized gene ontology (GO) nomenclature
(www.geneontology.org) was utilized to designate the functions, cellular
localization and pathway involvement of transcripts. Annotation
procedures were centered on the sequence and motif homology to GO-annotated
genes. In addition, text-mining techniques and multi-parameter cellular
localization modeling were used to increase the annotation accuracy, and
predict novel annotations. The majority of gene clusters containing mRNA
sequences have been assigned GO terms. This systematic annotation of the transcriptome
will help discover new gene functionality as well as facilitate higher-order
analysis of biological systems.
12.
Recurrence Quantification as a Tool for Protein Bioinformatics
Joseph P. Zbilut, Professor, Molecular Biophysics and Physiology,
Rush Medical College, Chicago, IL
Recurrence quantification analysis (RQA), a methodology
related to contact maps which extends analyses to higher
dimensions, has been used to understand proteins in a variety of
contexts. Unlike FFTs, its utility redounds from its ability to
quantify variables without first filtering them through
a mathematical transform. Examples of its use will be
given in the context of prion singularities, structure/activity
relationships and protein properties.
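For readers unfamiliar with RQA, the sketch below computes a recurrence matrix and one standard RQA variable, the recurrence rate, from a numeric series. The embedding dimension, radius and input values are hypothetical; for a protein the series might be a per-residue property such as hydrophobicity, which is an assumption rather than something the abstract states.

```python
import math

def embed(series, dim, delay=1):
    """Time-delay embedding: overlapping windows of `dim` values."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + j * delay] for j in range(dim)) for i in range(n)]

def recurrence_matrix(series, dim=2, radius=0.5):
    """R[i][j] is 1 when embedded points i and j lie within `radius`."""
    points = embed(series, dim)
    return [[1 if math.dist(p, q) <= radius else 0 for q in points]
            for p in points]

def recurrence_rate(matrix):
    """Fraction of recurrent points, excluding the trivial main diagonal."""
    n = len(matrix)
    hits = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return hits / (n * (n - 1))

periodic = [0.0, 1.0, 2.0, 1.0] * 5   # hypothetical per-residue values
matrix = recurrence_matrix(periodic)
rr = recurrence_rate(matrix)
```

The matrix is built directly from distances between windows of the raw series, with no transform applied first, which is the contrast with FFT-based analysis drawn in the abstract.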
Document last modified on February 7, 2001.