
Project title: Proteininduced DNA Looping
Many genetic processes are mediated by proteins that bind at separate, often widely spaced, sites on the double helix, tethering the intervening DNA into a loop [1, 2]. Examples of these processes include gene expression and its control, DNA replication, genetic rearrangements, multisite cutting by restriction enzymes, and DNA packaging. A DNA loop stabilized by such longrange proteinprotein contacts constitutes a discrete topological unit. As long as the ends of the DNA stay in place and the duplex remains unbroken, the linking number, Lk, or number of times the two strands of the double helix wrap around one another, is conserved. This constraint in Lk underlies the well known supercoiling of DNA: the stress induced by positioning the ends of the polymer in locations other than the natural (relaxed) state perturbs the overall coiling of the chain axis and/or the twisting of successive basepairs in the intervening parts of the chain [3]. As a first step in understanding the effects of specific proteins and drugs on DNA looping, we propose to study the imposed bending and twisting of neighboring base pairs [4] in known complexes of proteins and drugs with double helical DNA stored in the Nucleic Acid Database [5]. By subjecting a DNA segment of the same chain length as that found in a given complex to the observed orientation, displacement, and superhelical stress and setting the elastic moduli to sufficiently high values, we can use existing programs, e.g., [6], to simulate the presence of a rigidly bound molecule at arbitrary positions on circular DNA molecules or to model specific systems in which DNA looping plays an important role, e.g., the lac repressoroperator assembly in EscherichiaA0coli [7]. One student could devote a summer project to the collection of geometric information. Other students could apply this information to simulations of DNA loops and folds in subsequent years. Prerequisites: Students should have an interest (but not necessarily formal training) in molecular biology, familiarity with linear algebra in order to understand the parameters used to describe DNA structure, and knowledge of basic chemistry and physics to understand the nature of the DNA molecules and the elastic treatment of the double helix.
[1] Halford, S. E., Gowers, D. M., and Sessions, R. B., ``Two are Better than One,'' Nature Struct. Biol., 7, (2000), 705707.
[2] Schleif, R., ``DNA Looping,'' Annu. Rev. Biochem., 61, (1992), 199223.
[3] Bauer, W. R. ``Structure and Reactions of Closed Duplex DNA,'' Annu. Rev. Biophys. Bioeng., 7, (1978), 287313.
[4] Olson, W. K., Gorin, A. A., Lu, X.J., Hock, L. M., and Zhurkin, V. B., ``DNA Sequencedependent Deformability Deduced from ProteinDNA Crystal Complexes,'' Proc. Natl. Acad. Sci. USA, 95, (1998), 1116311168.
[5] Berman, H. M., Olson, W. K., Beveridge, D. L., Westbrook, J., Gelbin, A., Demeny, T., Hsieh, S.H., Srinivasan, A. R., and Schneider, B., ``The Nucleic Acid Database: A Comprehensive Relational Database of Threedimensional Structures of Nucleic Acids,'' Biophys. J., 63, (1992), 751759.
[6] Westcott, T. P., Tobias, I., and Olson, W. K., ``Modeling Selfcontact Forces in the Elastic Theory of DNA Supercoiling,'' J. Chem. Phys., 107, (1997), 39673980.
[7] MullerHill, B., The Lac Operon, Walter
de Gruyter, Berlin, 1996.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentors: Ming Xie, mxie at stat.rutgers.edu, Department of Statistics and E.A. Elsayed, elsayed at rci.rutgers.edu, Department of Electrical Engineering
Project title: Optimization of Sequencing and Threshold Levels of Detection Systems
The candidate will provide assistance to an ongoing project that
considers a problem of container inspection at the portofentry.
Containers are inspected at inspection stations, and data generated from
different analytical methods, xrays detectors, gammarays detectors and
others sensors used for the detection of chemical, biological,
radiological, nuclear, and explosive (CBRNE) agents are often relied upon
to make critical decisions with regard to the nature of the containers and
the appropriate response mechanism. The process of designing an efficient
and accurate detection system must be deliberate and well thoughtout.
The threshold levels of sensors at the inspection stations might result in
accepting undesired containers or subjecting "good" containers to
unnecessary additional inspections. In this project we investigate
approaches for determining the optimum arrangements of sensors and their
corresponding threshold levels while considering potential measurement
errors and cost and other constraints. Efficient approaches for
investigating inspection systems with a large number of sensors will be
investigated. The theoretical contributions of this research will be
transferred via computational algorithms for practical applications.
* You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: Nina Fefferman, feferman_at_Math.Princeton.edu , Tufts University, DIMACS
Project title: Duration of infectivity and disease in dynamic networks
This project will focus on examining the effect of relative time scale in shifting
network structure and disease transmission patterns. Starting from
empirical, computational experimentation, we will try to build a
theoretical understanding of how relative durations of social and
disease processes interact to shape epidemics. Very basic knowledge of
graph theory and/or any programming language would be useful, but aren't
strictly necessary. (Ideally, this project will be done in parallel with
project DDD200804.)
*You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: Nina Fefferman, feferman_at_Math.Princeton.edu , Tufts University, DIMACS
Project title: Long vs. Shortterm friendships and the spread of disease
This
project will focus on examining the effect of varying the percentages of
long vs. shortterm social contacts on patterns of disease spread
in a population over time. Starting from empirical, computational
experimentation, we will try to build a theoretical understanding of how
varying the percentage of each duration of friendship among social
contacts over time will affect disease dynamics. Very basic knowledge of
graph theory and/or any programming language would be useful, but aren't
strictly necessary. (Ideally, this project will be done
in parallel with project DDD200803.)
*You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: Nina Fefferman, feferman_at_Math.Princeton.edu , Tufts University, DIMACS
Project title: Entropy and DNA
DNA is encoded information. Entropy is a
mathematical measure from information theory that quantifies the amount
of information in a given signal. Some papers from the 90's looked at
the relative entropy of coding vs. noncoding regions of the genome. In
this project, we would explore a few related alternative hypotheses.
This project will rely heavily on computational analysis of data, rather
than mathematical theory. For this project, applicants should have at
least a basic knowledge of computer programming.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: William Pottenger, billp at dimacs.rutgers.edu , DIMACS
Project title:Traditional machine learning approaches make the assumption that instances are independent and identically distributed (IID). We term models constructed under the IID assumption firstorder because in general they only leverage relationships between attributes within instances (e.g., cooccurrence relationships). Thus classification of a single instance (of previously unseen data) is possible because no additional context is needed to infer class membership. Such a contextfree approach, however, does not exploit valuable information about relationships between instances in the dataset.
In our research we are developing a novel Bayesian framework for text classification named Higherorder Naive Bayes. Unlike approaches that assume instances are IID, Higherorder Naïve Bayes leverages implicit cooccurrence relationships between attributes in different instances. We term these implicit cooccurrence relationships higherorder paths. Attributes ( e.g., words in documents in text collections) are richly connected by such higherorder paths, and the generative model built by Higherorder Naïve Bayes exploits this rich connectivity pattern.
In order to demonstrate the utility of this approach we vary the
sparsity of the input ( i.e., labeled training data) in our experiments.
Sparsity is defined in two ways: first, in terms of the word dictionary
size, which is ordered by increasing document frequency and second in
terms of the size of the input training set. Our results on several
textual datasets show that when training data is sparse ( e.g., small
training set and/or words with poor parameter estimates), Higherorder
Naïve Bayes significantly reduces the generalization error by leveraging
higherorder paths. Additionally we argue that higherorder path
statistics have potential for use in assessing the quality of the
training set; i.e., whether the training set is representative of the
concept to be learned. Our ongoing research builds on these results.
* You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: Daniel Cranston, dcransto@dimacs.rutgers.edu , DIMACS
Project title: Efficient Algorithms for Graph Coloring
Graph coloring problems model partitioning a set of objects into as few
subsets as possible subject to various constraints. These problems have
direct applications to many problems in scheduling, timetabling, and
sequencing. In general, coloring problems are computationally
intractable. So in practice, we look for approximation algorithms and
for classes of graphs on which these coloring problems can be solved
efficiently. This project focuses on finding new classes of graphs which
we can color efficiently and optimally or nearoptimally. Prequisites: a
course in graph theory.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Aaron Jaggard, adj@dimacs.rutgers.edu , DIMACS
Project title: Pattern Avoidance and Parallels Between Families of Permutations
The broad questions motivating this project involve parallels
between different classes of permutations: Given two different
classes of permutationse.g., all permutations and the set of
involutions (permutations whose square is the identity
permutation)which combinatorial questions have (approximately)
the same answer when asked of both classes? (For example, the
probability that a permutation in the symmetric group
S_{n} contains a
specified subsequence of k distinct integers is exactly
1/k!,
while the probability that an involution in S_{n}
contains the
subsequence approaches 1/k! as n approaches infinity
[3].)
In this project, we'll focus on questions related to
pattern avoidance by permutations; a permutation
p:{1,2,...,n}>{1,2,...,3} (which we write as its list
of values
p(1)p(2)...p(n)) avoids the
pattern 231 if there is no triple
(i,j,k) of indices such that
i<j<k and
p(k)<p(i)<p(j)
(i.e., the
order relationships between p(i), p(j), and
p(k) are exactly the same
as those between 2, 3, and 1); avoidance of other patterns is defined
analogously. Depending on the interests of the student, the questions
that we study may range from concrete (as in [1], answering questions about
specific patterns) to more abstract (as in [2], answering questions about
families of patterns or, even more generally, about parallels
between different families of patterns). We'll use Mathematica or Maple
to help suggest/test conjectures, but extensive programming experience
is not required.
2. A.D. Jaggard, "Prefix exchanging and pattern avoidance by involutions," Elec. J. Comb., 9 (2): Research paper #16, 2003 (link).
3. B.D. McKay, J. Morse, and H.S. Wilf, "The Distributions of the Entries
of Young Tableaux," J. Comb. Th. A, 97 (1):117128, 2002 (link)..
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Paola VeraLicona, mveralic@rci.rutgers.edu, DIMACS/BioMaPs/Mathematics Department
Project title: On the Reverseengineering of Biological SystemsMathematical and computational methods have provided powerful tools for the study and understanding of biological systems [1]; some of the methods used in my research range from combinatorics and computational algebra to evolutionary computation tools [2]. Typically, the modeling approach employed is that of reverseengineering, which utilizes data to infer the structure of a given a system; the type of networks of interest include: gene response networks [3] and brain response networks [4], reconstructed from microarray and fMRI data, respectively.
In this project, we will be introducing some of the techniques afore mentioned and how they can be applied to construct the structure and dynamics of biological systems of interest. Theoretical or applied aspects on these methods will be studied according to the student’s background and interest.
[1] B.O. Palsson. 2006. Systems Biology: Properties of Reconstructed Networks. Cambridge University Press, New York.
[2] P. VeraLicona. 2007. Algorithms for modeling and simulation of biological systems; applications to gene regulatory networks. Electronic Dissertation. University Libraries, Virginia Polytechnic Institute and State University.
[3] R. Albert and H. G. Othmer. 2003. The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster. J Theor Biol, vol. 223, pp. 118.
[4] P. VeraLicona, B. Komisaruk, WC. Liu, M. Stillman, R. Laubenbacher. A Reverse Engineering Method of Brain Response Networks. Journal of Neuroimaging. Preprint.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Endre Boros, boros@rutcor.rutgers.edu , Rutcor and Paul Kantor, kantor@scils.rutgers.edu , School of Communication, Information and Library Studies
Project title: Interactive Tools for Resource Allocation in Threat Detection
We are working on the problem of allocating scarce human and technical resources to provide better screening within the United States, particularly for nuclear threats. We work with models representing realistic (but not actual, which is classified) information on sensors and their effectiveness. We seek one or two students who will work on the problem of interfacing complex linear and nonlinear optimization programs to the unsophisticated user and visualizing the effect of policy choices on the overall costeffectiveness of complex screening policies.
* You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: Debbie Yuster, yuster@math.rutgers.edu , DIMACS, Mathematics Department
Project title: Minimal Tropical Bases for MatroidsA matroid is a combinatorial object consisting of a set of elements, along with information about which subsets of those elements form “dependent sets”. We can think of this as generalizing a set of vectors in a vector space and information about which of the vectors form linearly dependent sets [1]. Tropical geometry is a growing area of math that turns complicated geometric objects into piecewiselinear objects that are much easier to understand [2]. Certain “tropical linear spaces” can be related to matroids, and questions about these tropical objects turn into purely combinatorial questions about dependent sets of matroids [3]. We will investigate different classes of matroids and try to characterize their “minimal tropical bases”. Computer programming experience may be helpful but is not necessary.
References:
[1] Sandra Kingan’s Matroid Website: http://www.matroids.googlepages.com/index.html
[2] D. Speyer and B. Sturmfels, “Tropical mathematics”
[3] J. Yu and D. Yuster.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Debbie Yuster, yuster@math.rutgers.edu , DIMACS, Mathematics Department
Project title: Computing Shift EquivalenceTwo square, integer matrices A and B (not necessarily the same size) are shift equivalent if there exist integer matrices R and S, and a whole number m, such that AR=RB, SA=BS, A^{m}=RS, and B^{m}=SR. It is an open problem to decide whether or not two given matrices are shift equivalent. Even shift equivalence of 2x2 matrices is not very well understood. I am trying to understand shift equivalence for a project involving building a database of dynamical systems to be used for biologists. Our summer research project would consist of trying to classify and understand different shift equivalence classes of matrices, and possibly shift equivalence classes of group homomorphisms (which are defined analogously). Prerequisite: a strong background in linear algebra. Abstract algebra would be helpful but is not necessary.
Reference:
[1] D. Lind and B. Marcus, An Introduction to Symbolic Dynamics and
Coding [Chapter 7], Cambridge University Press.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Wanpracha (Art) Chaovalitwongse, wchaoval at rci.rutgers.edu, Department of Industrial Engineering
Project title: Brain Disorder
Epilepsy is the second most common brain disorder after stroke and currently
afflicts at least 2 million Americans. There are 4 stages in the seizure
process: normal, preseizure, seizure onset, and post seizure, all of which
can be captured by an EEG. There has been recent research indicating
promising signs of seizure predictability. The objective of this study is to
determine whether the correlations/synchronizations among different brain
areas during normal and preseizure periods are significantly different. We
will investigate and apply advances in signal processing, statistical
analysis, and optimization.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Wanpracha (Art) Chaovalitwongse, wchaoval at rci.rutgers.edu, Department of Industrial Engineering
Project title: Cooperative Sensors in Battlespace
Wide Area Search Munitions (WASMs) are unmanned aerial vehicles with an
array of onboard sensors, a warhead or other kill mechanism, and autonomous
flight capabilities. WASMs have played many important roles in the modern
battle field including reconnaissance, search, battle damage assessment, or
communications relay. The ATR (automatic target recognition) system in WASM
is used to identify potential targets and broadcast this information to
other WASMs. However, target identification is subject to errors. For
example, two WASMs might detect the same target, but associate with that
target two separate target tracks. Conversely, two WASMs might detect
separate targets, but assign the same target ID to both targets. We call
this problem as the object misidentification (OMI) problem. To solve the OMI
problem for a set of points in 4dimensional space (time, longitude,
latitude and altitude), we will investigate data mining techniques like
clustering to identify the individual route of each target (detected point).
The overall goal is to reconstruct the battle space using statistical,
optimization, and data mining approaches.
* You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: Danfeng Yao, danfeng@cs.rutgers.edu , Computer Science Department
Project title: Privacy and Visualization of Local SharingWe are working on the problem of flexible and assured information sharing in crisis situations. We have developed quantitative models for authorizing access based on contextual and digital credential evidence. The project will examine the privacy aspects of the model by analyzing potential attacks and prevention. We will also develop a simple visualization tool for sharing location information. Some knowledge of computer security is useful, but not required.
* You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: Danfeng Yao, danfeng@cs.rutgers.edu , Computer Science Department
Project title: Trust Inference in Web 2.0 EnvironmentsWeb 2.0 refers to webbased services that facilitate collaboration and sharing among users, such as Wikipedia and Sensorpedia. The aim of this project is to quantitatively determine the trustworthiness of data integrated from multiple sources. The specific task is to develop a trust model and evaluate algorithms to compute trust scores of integrated data based on the trustworthiness of data sources.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Neeraj Kayal, kayaln@dimacs.rutgers.edu , DIMACS
Project title: Decomposing a Group Into a Direct Product of Two GroupsIn this project, we will investigate an algorithmic problem arising out of groups. The problem is to devise an efficient algorithm to split a group into the direct product of two of its subgroups, if such a decomposition exists. See the wikipedia entry for the definition and basic properties of a group and the notion of direct product of two groups. Besides being an interesting problem by itself, a solution to this problem might play a significant role in any eventual solution to the group isomorphism problem. The group isomorphism problem involves deciding if two given finite groups are isomorphic or not. One possible line of approach is to observe that if a group decomposes then the subgroup of commuting elements corresponding to any group element also decomposes. The challenge then is to stitch together the information obtained from decomposing these kind of proper subgroups in order to obtain a decomposition of the original group.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Neeraj Kayal, kayaln@dimacs.rutgers.edu , DIMACS
Project title: Recognizing Polynomials Which Have a Root Modulo Every Prime
Given a polynomial f(x) with integer coefficients, can we efficiently determine if it has the following property:
the polynomial f(x) has a zero modulo p for every prime p.
For example, the polynomial
f(x) = x^{2}  1. It has a root modulo every prime p because the integer 1 is a root of this polynomial. The polynomial f(x) = x^{2} + 1 does not have this property. A somewhat more nontrivial example of a polynomial with this property is the polynomial:
f(x) = (x^{3}  2)(x^{2} + x + 1).
The project will involve investigating such polynomials and identifying their properties and trying to come up with a characterization that is hopefully strong enough that we can turn this characterization into an efficient algorithm.
Besides being an interesting problem in itself because of its elementary description, if shown to be in polynomial time, it would be one of the first computational problems which involve two alternating quantifiers of the form
(for all x, there exists y so that ...).
It is known that a polynomial f(x) has this property if and only if the roots of the polynomial have a certain kind of symmetry and this problem can be boiled down to a group theoretic problem. More specifically, f(x) has this property if and only if every automorphism in the Galois group of the polynomial f(x) fixes at least one of the roots of the polynomial. It is hoped that groups which have this property must be fairly small and if this conjecture is true, then we would get an efficient algorithm. Thus, if we take this approach, then the project would involve trying to determine how large can a group be if all its members fix at least one root.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Vinod Ganapathy, vinodg@cs.rutgers.edu , Computer Science Department
Project title: Leveraging Transactional Memory for Computer SecurityAs multicore chips become ubiquitous, programs of the future will have to be multithreaded to harness the parallel processing power afforded by these chips. Researchers and practitioners are therefore seeking programming models and tools to make multithreaded programs easier to write, verify and debug. One such promising model that is the subject of much current research is Transactional Memory (TM). TM systems aim to ease multithreaded programming by providing system support (at the hardware and software level) for transactions.
This project will seek to leverage TM systems to improve the security of both multithreaded and singlethreaded applications. TM systems provides several key features such as precise access to runtime information, isolation, and rollback, that make them an attractive target to implement security services. The goal of this project will be to design and implement a security service, such as a malware detector, or an information flow tracker, within a TM system.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: James Abello, abello@dimacs.rutgers.edu , DIMACS
Project title: DIMACS Graph MiningIn this project we will build a variety of multigraphs that can be extracted from this data set. We will then use our current set of graph analysis tools to parse, navigate, visualize and synthesize the findings. One central challenge is to devise methods that are privacy preserving.
Requirements: Some Knowledge of SQL, fundamental graph algorithms and C/C++ programming is a plus. However, there are several analysis tasks that can be performed using our plug in graph visualization system and that require only minimal amount of programming.
Reference:
[AvHK06] J. Abello, F. van Ham, and N. Krishnan, "AskGraphView:
A Large Scale Graph Visualization System", IEEE Transactions in
Visualization and Computer Graphics, 12(5): 669676 (2006).
* You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: James Abello, abello@dimacs.rutgers.edu , DIMACS
Project title: Name That ClusterGiven a user query, search engines generally return a very sizeable collection of possible answers. Clustering has been proposed as a tool to partition the possible answer set into more manageable subsets of related results. There is no current agreement on the preferred mode of presentation of these clusters. Currently, most search engines display the set of results on almost a pure textual form. However, relatively recently we have witnessed some timid attempts to use some graphical representations. This study is a first step to elucidate when and why text appears to outperform graphics for certain fundamental clustering related tasks. To this end we designed three interfaces to display flat clusters of user queries. The interfaces are enhanced with mechanisms by which users provide feedback about the relevance of a cluster for a prespecified input query. Subsequently, users are asked to provide a name for a given cluster that best describes the cluster contents. In this project, we will analyze the results obtained from a web user study that is planned to run from Feb to May 2008.
Requirements:Basic Statistics, reading clustering related papers and preparing summary of findings.
Reference:
[ASGT07] J. Abello, J, Schulz, H, Gaudin, B, and Tominski, C (2007).
Name That Cluster  Text vs. Graphics, IEEE InfoVis Conference, Sacramento, November 2007.
* You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: James Abello, abello@dimacs.rutgers.edu , DIMACS
Project title: Point Configurations, The Weak Bruhat Order of Sn and The Majority RuleMaximal Chains in the Weak Bruhat Order of the Symmetric Group give rise to a special class of graphs called Persistent Graphs. These graphs are directly related to visibility graphs associated with point configurations on the plane and they encode transitivity of the majority rule. This project will study the combinatorial properties of this new class of graphs and their geometric implications. Requirements: A good discrete math course, basics of graph theory , linear algebra and a strong will to face interesting combinatorial geometry questions.
Reference:
[AEK95] J. Abello, O. Egecioglu and K. Kumar, "Visibility Graphs of
Staircase Polygons and the Weak Bruhat Order", Discrete and
Computational Geometry, Volume 14, 1995.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Graham Cormode, cormode@lucent.com, AT&T Labs  Research
Project title: Solving Problems on Uncertain DataThere is growing interest in handling uncertain data, which arises from sensor networks, record linkage, and confidences attached to the results of machine learning. This prompts the need to design and analyze algorithms for solving various problems on uncertain data, such as clustering. This project is to design and implement algorithms for solving new problems on uncertain data with provable guarantees.
* You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: Alantha Newman, alantha.newman@gmail.com, DIMACS
Project title: Finding Isomorphic Substructures, and Its Applications to Analyzing Interaction GraphsGraph or subgraph isomorphism problems are that of determining if two graphs have the same structure or substructures. We can think of the following graph optimization problem: Given two graphs G and H, we would like to find the largest subgraph that is isomorphic to some subgraph of both G and H. A recent technique that has been applied to graph optimization problems is semidefinite programming. This project involves (a) developing semidefinite programming based methods and code for finding isomorphic subgraphs, and (b) exploring applications in specific "interaction graphs". For example, if we let the nodes of the graph represent people or objects and we let the edges represent the relationships between pairs of people or between a person and an object, then the problem of finding isomorphic substructures could be used to detect similar behavior patterns.
* You must be a enrolled at a U.S. university to be eligible for this project.
Mentor: Gyan Bhanot, gbhanot@rci.rutgers.edu , BioMaPS Institute and BME, Rutgers University
Project title: Finding the Sequence of "mtDNAEve"Mitochondria (mtDNA) are structures found in the cytoplasm of eukaryotic cells responsible for vital cellular functions such as oxidative phosphorylation, release of Cytochrome C to initiate apoptosis etc. The current understanding of their origin derives from a proposal by Lynn Margulis in the 1970s, who argued for an extracellular origin due to an endosymbiosis event between bacteria and early prokaryotic cells. In complex organisms such as humans, mtDNA are maternally inherited without recombination and have a high rate of mutation (10X the mutation rate of nuclear DNA). These facts make a population analysis of their sequence an ideal method of tracing the maternal ancestry of geographically isolated populations to understand the origins and migratory history of humanity. Such analysis using clustering and phylogenetic techniques have shown that modern humans emerged from Africa in two or more migrations approximately 5070 kYBP. The tree describing these events has its root in a sequence that represents “mitochondrial Eve”, who can be thought of as the ancestral mother of all of modern humans. Mapping the tree to world geography suggests that “mtDNA Eve” lived in Africa approximately 200 kYBP.
The project we propose is to use data on approximately 2000 mtDNA
sequences from worldwide sources (available at www.mitomap.com
The undergrad selected for this project must be able to program in C, C++, Java or Matlab, must learn to use the tree building software Phylip and must have a basic understanding of the underlying biology (e.g. as described in the books: “Adam’s Curse” and “The Seven Daughters of Eve” by Brian Sykes and “Acquiring Genomes” by Lynn Margulis and Dorion Sagan).
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Vijay Ramachandran, vijayr@cs.colgate.edu, Colgate University
Project title: Modeling and Simulation for Better Internet RoutingInterdomain routing is the task of establishing connectivity among the autonomous subnetworks of the Internet, and is currently handled by the Border Gateway Protocol (BGP). BGP is unique among routing protocols because a subnetwork can independently configure its routers based on complex policies. Certain combinations of these local policies can cause global instability. Therefore, understanding how interdomainrouting policies are used is essential for a properly functioning Internet.
Recent work, taking a theoretical approach, has provided guidelines on how to mitigate some of these problems. However, there are potential obstacles to adopting these guidelines. First, their efficiency and effectiveness in realworld networks have not been studied. Second, the ability to detect circumventions of these guidelines (for selfinterest or malice) is unknown, as is the effect of partial deployment or compliance. Third, the theoretical assumptions may not always correspond with the details of policy implemention in today's Internet.
In this project, we will study these issues through modeling, simulation, collection and analysis of realworld routingpolicy data, and other methods as appropriate. The specific focus of and techniques used in the project will reflect the interests and background of the student. Students should be comfortable with programming and learning new software tools, although no specific language background is required.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Regina Liu, rliu@stat.rutgers.edu , Statistics Department
Project title: Extreme Value Theory and Applications in Simultaneous Monitoring of Multiple Risk IndicatorsRisk assessments in reality often encounter extreme settings with few or no occurrences of the event of interest. Inferences about risk indicators in such settings face the problem of insufficient or no data. The extreme value theory is particularly well suited for addressing this type of problems. Among other usefulness, we plan to apply the multivariate extreme value theory to establish thresholds for signaling different risk levels in the context of simultaneous monitoring of multiple risk indicators. The proposed threshold system is to be justified in terms of multivariate extreme quantiles. As an illustrative example, the proposed approach is to be applied to the monitoring of airline performance measures collected by the FAA over a period of 5 years. We hope to use this threshold system to assign different risk levels to the observed airline performance measures. In particular, we expect to divide the sample space into regions with increasing levels of risk. Moreover, in the univariate case, such a thresholding technique should be useful for determining a suitable cutoff point on a runway for holding short of landing aircrafts. This cutoff point needs to be chosen to meet a certain required level of safety when allowing simultaneous operations on two intersecting runways, which can help ease the congestion of air traffic.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
Mentor: Alex Borgida, borgida@cs.rutgers.edu , Computer Science Department
Project title: SourcetoSource Translation Using OntologiesReal world SQL database management systems (as well as programming language implementations) have a surprising (and bewildering) number of variants which make it very hard to port software written for one vendor to another, especially if one wants to do this "intelligently". The proposed project is to consider the use of formal ontologies to express the properties of data definition and data manipulation languages, and use this towards automatic sourcetosource translation of database applications. It is expected that the research will involve not just ontological modeling but also studying extensions to Description Logics and their reasoners (such as OWL) to better represent the semantics of programming language constructs.
Requirements: familiarity with description logics, conceptual modeling, DBMS.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.