This special focus is jointly sponsored by the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS), the Biological, Mathematical, and Physical Sciences Interfaces Institute for Quantitative Biology (BioMaPS), and the Rutgers Center for Molecular Biophysics and Biophysical Chemistry (MB Center).
Over the last decade, biology has been transformed into a data-driven science. Through innovations in sequencing, high-throughput microscopy, mRNA expression arrays, protein-protein and protein-DNA binding assays, and numerous other high-throughput methods, it is now possible to query simultaneously the activities of thousands of genes and their products under a wide variety of experimental conditions.
The resulting data pose an exciting challenge for the field of machine learning. Many of the model organisms (most notably S. cerevisiae) are of sufficient complexity to render detailed mathematical modeling intractable. However, it is still possible to learn quantitative models that are rich enough to fit data, yet simple enough to generalize and to be interpretable. Work by numerous groups suggests that this approach also holds promise for more complex eukaryotes (e.g., C. elegans, S. pombe, or D. melanogaster).
Qualitatively new challenges to the machine learning community include the integration of heterogeneous datasets, such as sequence, binding, and expression data; the creation of models which are interpretable even to those not trained in probabilistic reasoning or statistical learning theory; and the presentation of the resulting models in a way useful to bench biologists as well as computational biologists.
This three-day workshop is designed to encourage interaction among innovators in computational biology and innovators in machine learning; to illuminate recent successes as well as pressing challenges; and to inspire the development of novel, biologically relevant, and biologically interpretable machine learning approaches to the current problems in biology.