DIMACS Workshop on Computers in Scientific Discovery III

February 6 - 9, 2006
Ghent University, Ghent, Belgium

Organizers:
Gunnar Brinkmann, Ghent University, Gunnar.Brinkmann@UGent.be
Patrick W. Fowler, University of Sheffield, P.W.Fowler@sheffield.ac.uk

Organizing Committee:
Arnout Ceulemans, Mel Janowitz, Martine Labbé, Hadrien Mélot

Local Organizing Committee:
Kris Coolsaet, Veerle Fack, Adriaan Peeters,
Ghent University

Presented under the auspices of the Special Focus on Data Analysis and Mining and the Research Foundation Flanders (FWO).


Ali Reza Ashrafi, University of Kashan, Iran

Title: On Symmetry and Topological Indices of Fullerenes

It is well known to associate an Euclidean graph to a molecule. Balasubramanian computed the Euclidean graphs and their automorphism groups for benzene, eclipsed and staggered forms of ethane and eclipsed and staggered forms of ferrocene [see Chem. Phys. Lett. 232 (1995), 415]. In this paper, we present a simple algorithm for computing symmetry of molecules. We apply this algorithm to calculate the symmetry of some big fullerenes. The Wiener, Schultz and also Padmakar-Ivan(PI) indices of some big fullerenes are computed. Here the PI Index is a Szeged-like topological index developed very recently and defined as PI(G) = S[neu(e|G)+nev(e|G)], where neu(e|G) is the number of edges of G lying closer to u than to v, nev(e|G) is the number of edges of G lying closer to v than to u and summation goes over all edges of G. Finally, we pose some problems related to the symmetry and these topological indices for fullerenes.


Jon Borwein

Title: Computationally Discovered and Proved Identities ζ(n)

This lecture will describe older and very recent work in which Bailey, Bradley and I hunted for various desired generating functions for zeta functions and then were able to methodically prove our results. One example is

The constant term in (1) recovers the well known identity

As I hope to show, this led to one of the most satisfying experimental mathematics experiences we have had.

(Download extended abstract)


Eric Breimer*

Title: Clustering the Protein Databank

Bioinformatics applications have significantly changed the nature and scope of the graph clustering and graph partitioning problem. Predicting the structure of proteins is one such application where data clustering is an important step in understanding the relationships between the sequence of amino acids and the structure of a protein. The Protein Databank (www.rcsb.org) forms a graph with over 33,400 vertices and over 500 million edges, where each vertex is a protein and the value of each edge represents the structural similarity between two proteins. The clusters that arise from structurally similar proteins may significantly overlap. Conventional clustering or graph partitioning algorithms are not suited for this type of problem. This talk will describe the characteristics of graphs that arise from protein similarity, techniques for handling very large, complete graphs, and algorithms for discovering locally optimal, dense sub-graphs. If time permits, I will also address how the emergence of bioinformatics has changed my approach in teaching traditional undergraduate courses such as Analysis of Algorithm and Discrete Math.

*Contributions by Darren Lim, Mark Goldberg, Malik Magdon-Ismail and Jeff Baumes.


Gilles Caporossi

Title: The new developments of the AutoGraphiX system

The AutoGraphiX (AGX) system for computer assisted or, for some of its functions, fully automated graph theory was developed at GERAD, Montreal since 1997. We report here on the last improvements of a new version (AGX2) of that system. It contains many enhancements, as well as a new function for automated proof of simple propositions. Among other results, AGX2 led to several hundred new conjectures, ranking from easy ones, proved automatically, to others requiring longer unassisted or partially assisted proofs, to open ones. Many examples are given, illustrating AGX2's functions and the results obtained.


Mircea Diudea

Title: Supra-Faces in Nanostructures

Tiling modification in nanostructure modeling can be achieved by sequences of classical map operations, or single generalized operations. The operations for obtaining corannulenic flowers and corannulene-like azulenic patterns were established. The first "corazulenic" tessellation is reported.

The aromaticity of some cages tessellated by the above supra-faces is discussed in terms of several criteria. The covering was given as p-electron partition within some Kekulé valence structures.

The HOMA (harmonic oscillator model of aromaticity) geometric index of aromaticity enabled the evaluation of local aromaticity of the discussed supra-faces and brought evidence for several dominant Kekulé valence structures. The most interesting is the "Kekulé-Dewar" valence structure of the cage C192 which shows a generalized Clar, disjoint corazulenic tessellation. The original software package program CageVersatile enabled us to embed various supra-faces in surfaces of any genus.


Andreas Dress, MPI Leipzig/Shanghai

Title: Visualization Software for Scientific Discovery in Toponome Research

The aim of proteomics is to elucidate the complicated and highly specific relations between the distribution of proteins at any given moment in a given cell and that cell's functional state. Most existing methods rely on first homogenizing any sample under consideration, thus destroying any information regarding the actual spatial distribution of proteins within an individual cell which, however, is crucial for understanding how a cell's functional state is determined by the spatial organization of its "proteom". In the lecture, I will present a new technology developed by Walter Schubert that allows biologists to locate up to more than one hundred distinct proteins within the same cell and, thus, to study the spatial organization of cellular protein networks, including cell-state specific colocalization and "anti-colocalization" events. I will present samples of data generated by this technology as well as various data analysis methods that have been developed to investigate such "multi-spectral image" data and to separate signal from noise.


Dieter Gernert

Title: A Knowledge-based System for the Support of Graphtheoretical Proofs

The interactive knowledge-based system is based upon a collection of about 1500 theorems from graph theory (conditional/unconditional equation/inequalities between graph invariants). A program run starts from a set of restrictions characterizing a class of graphs, particularly the class of all (hypothetical) counter-examples to a graphtheoretical conjecture. The system supplies advanced knowledge about the class considered, amomg other things the exact values for some graph invariants and better inclusions for most of the numerical invariants. In this way, proofs for subcases are obtained, and generally a complete proof becomes easier. The lecture is to present the principal modes of internal derivation and some examples.


Jack Graver

Title: Fullerenes and the Problems of Thomson and Tammes

To each solution to the problems of Thomson (minimize the potential of n unit charged particles on the sphere) and Tammes (maximize the minimum distance between n points on the sphere - pores on a grain of pollen) we may associate the planar map given by the Dirichlet-Voronoi cells of the solution points. For small n, these maps include triangular and square faces. For n in the range 25 to 125, almost all such maps are fullerenes (only pentagonal and hexagonal faces). For n > 125, heptagonal faces begin to appear and are almost always present when n exceeds 300. However, in all large solutions the pentagonal and heptagonal faces occur in 12 distinct clusters. And each such solution has a unique underlying fullerene. Using what we know about fullerenes to build solutions to these classic problems from nature is the topic of this talk.


Harald Gropp

Title: Conjectures on Configurations

Configurations are linear regular uniform hypergraphs. In the language of designs they are "Steiner systems where two different points may be not connected". A configuration (vr, bk) is a finite incidence structure of v points and b lines such that each line consists of k points, there are r lines through each point, and two different points are connected by at most one line. If v=b and r=k the configuration is called symmetric and denoted by vk. Configurations were defined already in 1876, and the research on configurations v4 was started in Gent (Belgium) nearly 100 years ago. In this talk the focus will be on conjectures on configurations, i.e. those which were solved in the past and those which are still open. For a given set of parameters v,r,b,k the question is: Is there a configuration (vr, bk) ? If yes, how many non-isomorphic are there ? Probably the possible applications of configurations as special hypergraphs in chemistry and related sciences are by far not yet exploited.


Ivan Gutman, Faculty of Science, University of Kragujevac, Kragujevac, Serbia & Montenegro

Title:Graph Invariants of Importance in Chemistry

The structure of a molecule can be represented by means of certain graphs, the so-called "molecular graphs". Various (single-number) invariants of these graphs found applications in chemistry, because they reflect relevant structural features of the underlying molecule. Such invariants are traditionally called "topological indices" (TIs) although "molecular--graph--based structure descriptors" would be a much better name for them. Initially (from the 1940s to the 1980s) a reasonably small number of TIs was put forward, say one dozen. In more recent times we are faced with a deluge of TIs, whose count far exceeds one thousand and continues of rapidly increase. In the first half of the lecture the main types of molecular graphs will be presented, followed by a survey of the most significant TIs. We focus our attention to the following TIs: Wiener index, Randi'c index, Hosoya index, greatest eigenvalue of the adjacency matrix, greatest eigenvalue of the Laplacian matrix, graph energy. In the second half of the lecture we point out various (exact or approximate) relations between the mentioned TIs, as well as some of their general mathematical properties. Computer-based studies revealed numerous general properties of these TIs that are still waiting to be mathematically proved (or disproved). We conclude the lecture by stating a few unsolved problems in this area.


Pierre Hansen, Montreal, Canada

Title: Automated Comparison of Graph Invariants

A graph invariant is a function of a graph G which does not depend on labelling of G's vertices or edges. An algebraic expression of one or several graph invariants is itself an invariant. Graph theory is replete with theo- rems about graph invariants, which are either algebraic, i.e., equalities or inequalities involving such invariants, or structural, i.e., characterizations of which families of graphs are extremal for a given invariant, that is give it maximum or minimum value. Both types of results can be conjectured by the system AutoGraphiX 2 (AGX 2), in an automated way, or, in some carefully distinguished cases, in an assisted way. We report here on a systematic comparison of 20 graph invariants: for each pair of them, AGX 2 considers the four operations +,-,×,/ and derives best possible lower and upper bounding functions of the number of vertices, as well as extremal graphs. Out of 1520 cases, AGX 2 recognizes 33 known results, derives automatically algebraic and corresponding structural conjectures in 1340 cases (839 of which are proved automatically), and structural con- jectures only in 84 more cases, from which algebraic conjectures could be derived by hand in 42 cases. No results were obtained in 63 cases. Manual or assisted proofs have been obtained in 301 cases, 10 conjectures were re- futed and 274 conjectures remain open. Many examples are given. AGX 2 is also compared to the three operational systems GRAPH, GRAFFITI and HR.

(Joint work with Mustapha Aouchiche and Gilles Caporossi).


Peter E. John and Horst Sachs, Ilmenau Technical University

Title: On Goldberg Transformations and their Applications to Simple Polyhedra

M. Goldberg was the first to consider a certain 2-parametric class of transformations of regular hexagonal tessellations. These can also be applied to simple (i.e., trivalent) polyhedra, especially to (5,6)-cages (fullerenes). The simplest is the "leapfrog" which, in a chemical context, has much been used by P.W. Fowler and his co-workers. We investigate the algebraic background and some of the combinatorial properties of the transformed polyhedra with possible implications for carbon chemistry. The particular role the leapfrog plays among all Goldberg transformations is elucidated.


Adalbert Kerber, University of Bayreuth

Title: History and Progress of Structure Enumeration in Chemistry

The history of molecular enumeration theory will be briefly sketched. The software package MOLGEN for the generation of connectivity isomers corresponding to a given chemical formula will be described, together with several extensions for QSAR, for molecular structure elucidation and for patents in chemistry. Afterwards, a recent extension for the generation of stereo isomers will be sketched.


Mikhail Klin, Ben-Gurion University of the Negev, Israel

Title: Computer Algebra Experimentation in Algebraic Combinatorics

We give a brief survey of a few lines in our investigation of coherent configurations, association schemes and related structures with the aid of computer algebra packages. In particular, we will touch such topics as aglebraic and combinatorial automorphisms of coherent configurations, search for new strongly regular graphs, generalized quadrangles and their various axiomatic relaxations, strategies for constructive enumeration of certain color graphs. Our presentation will be correlated with current attempts to initiate a development of computer package COCO-II, as a share package of GAP. This presentation is based on a collaboration with M.Muzychuk, Ch.Pech, S.Reichard, A.Rosa, A.Woldar, P.-H.Zieschang and other colleagues.


Anthony Labarre

Title: Graphs, Permutations and Sets in Genome Rearrangement

Genome rearrangement is the field in bioinformatics that studies and models how a collection of genomes evolve and is generally expressed in the following way: given two (or more) genomes, find a shortest sequence of mutations that transform one into the other. Several models have been proposed, that differ either by the kind(s) of mutations taken into account or by the way genomes are represented, depending on the different biological assumptions that can be made. We review some of these models and known results, explain how computers have already helped in this area and suggest some further possible uses for them.


Brenda Latka, DIMACS

Title: Using LINK for Research on Tournaments: A User's Experience

This is a case study of the strengths and weaknesses of using computer software in doing research. I will talk about three related experiences using LINK to study antichains of tournaments. LINK is a system for the creation, manipulation, and drawing of graphs and hypergraphs. It was created with the intention that it could easily be used in research and education projects. A tournament is a complete directed graph. We say that a tournament S embeds in a tournament T if S is isomorphic to a subtournament of T. Tournament embedding is an order relation on the class of tournaments. An antichain of tournaments is a set of tournaments which are pairwise incomparable. First, Jon Berry, a developer of LINK, and I used LINK to study an infinite family of infinite antichains of tournaments. Second, I worked with Jon Berry as he served as an REU mentor to Chris Burrows, comparing Latka tournaments to Paley tournaments. Third, I worked with Brian Bagenstose, an undergraduate at Lafayette College, on the same infinite family of infinite antichains of tournaments.


Erwin Lijnen and Arnout Ceulemans, University of Leuven, Department of Chemistry, Heverlee-Leuven, Belgium

Title: Topology-Aided Molecular Design: The Platonic Molecules of Genera 0 to 3

Since ancient times, scientists have been intrigued by highly symmetrical models and structures. In recent years, this has led to the successful synthesis of a multitude of molecular structures exhibiting the highest possible point group symmetries. However, as the list of point group symmetries is currently fully realized in molecules, molecular scientists must explore novel strategies. One of the possibilities is to turn to the field of topological graph theory which describes networks on surfaces with more intricate topologies than the sphere, like the torus (doughnut), double torus (pretzel), triple torus, etc. Within mathematics these networks are known as maps or embeddings and classified according to the genus (number of handles) of their underlying surface. In the present talk we discuss a procedure to find the most symmetrical spatial realization of a given map, where we are especially interested in those maps which realize the highest possible symmetry for a given surface. They are known as 'regular maps' and can be seen as extensions of the Platonic solids as their symmetry groups act transitively on the sets of vertices, edges and faces. We also analyze the structure of the groups involved, which can viewed as 'multiples' of the classical point groups. To exemplify our procedures, we derive the highest symmetrical spatial realizations of all regular maps of genera ranging from the sphere to the triple torus.


Brendan D. McKay, Department of Computer Science, Australian National University

Title: Isomorphism and Automorphisms of Combinatorial Objects

We will discuss the problem of computing the automorphism group of a combinatorial object, and the related problem of testing several objects for isomorphism. Emphasis will be strongly on the practical aspects of the problem rather than the theoretical niceties. Objects considered include various types of graphs, matrices, codes, and so forth.


Hadrien Mélot

Title: The Evolution of GraPHedron

In the previous workshop (Montreal 2004), we presented for the first time a new conjecture-making system. This software, called GraPHedron, uses a polyhedral approach to find conjectures in Graph Theory. We will present the new features that have been added to GraPHedron. Among them is a procedure which derives conjectures automatically (without user intervention) in some cases. Other options have been added to help the user to extract more information about a given problem.


Vincent Moulton, School of Computing Sciences, University of East Anglia, Norwich, UK

Title: Phylogenetic Networks from Multi-labeled Trees

It is now quite well accepted that the evolutionary past of certain species is best represented by phylogenetic networks as opposed to trees. For example, polyploids are typically thought to have resulted through hybridization and duplication, processes which are not best represented as bifuricating speciation events. Based on the knowledge of a multi-labeled tree relating collection of polyploids, we present a canonical construction of a phylogenetic network that exhibits the tree, which is in some well defined sense a minimal network having this property.


Wendy Myrvold

Title: Fuigui: A Graphical User Interface for Investigating Conjectures About Fullerenes

Fullerenes are all-carbon molecules whose molecular structures correspond to 3-regular planar graphs that have face sizes equal to five or six. Fuigui (Fullerene Interactive Graphical User Interface) is a java program under development whose goal is to aid the exploration of fullerenes and their parameters. Some of the design challenges covered in this talk are:

  1. Making the system fully interactive, fast and fun to play with.
  2. Plug and play parameters- a user can add his or her own fullerene parameters without changing the code for fuigui.
  3. Drawing pictures which are pleasing to the chemists.
  4. Facilitating animated algorithms.

A demonstration will be given of the current state of the system. The talk will conclude with ideas for future enhancements.

This is joint work with Patrick Fowler, Marsha Minchenko, Jennifer Woodcock, Sameer Girn, and possibly several other students (this list will be updated in December).


Sanja Narancic

Title: From CID Threshold Measurements to Kinetic and Thermodynamic Information

This project is a development of fast, reliable methods to extract kinetic and thermodynamic information from energy resolved dissociation reactions in a mass spectrometer. For the extraction of activation energies from CID threshold measurements, as well as relative thermochemistry from branching ratios in mass spectrometric dissociation reactions, one has to model energy-dependent reactive cross-section s(E)=s0[1-exp(-k(E))t] for an ion-molecule reaction in the tandem spectrometer, and therefore encounters repeatedly the need for an efficient method for the calculation of density-of-states function, ?(E). Furthermore, the simulation is optimized using the genetic algorithm approach, providing a method, which rapidly deconvolutes measured CID thresholds.


Patric Östergĺrd

Title: On the Steiner Quadruple Systems of Order 16

The Steiner quadruple systems of order 16 have recently been classified; there are 1,054,163 such designs. This classification - obtained in a computer search starting from the derived Steiner triple systems of order 15 - is briefly discussed. The catalogue of classified objects makes it possible to address conjectures and open questions regarding Steiner systems and related mathematical structures; a few such examples are considered.

This is joint work with Petteri Kaski and Olli Pottonen.


Mark Pagel and Andrew Meade, Reading University, England

Title: Detecting Conflicting Phylogenetic Signals in Gene-Sequence Data

Gene-sequence or other phylogenetic data often exhibit conflicting phylogenetic signals, taking the form of different sites in the alignment favouring different phylogenetic trees. Conflicting phylogenetic signals can arise from mechanisms of drift, natural selection, and lateral gene transfer. We present a Bayesian 'mixture-model' approach to identifying conflicting signals in multiple sequence alignments. In contrast to conventional models of phylogenetic inference, we allow each site in the alignment to infer more than one phylogenetic tree, with weights determined empirically from the data. The likelihood of the data at each site is summed over topologies as well as the usual model of sequence evolution. The approach can identify when more than one topology describes the data, and identify sites that support one topology strongly over another, without requiring the user to partition the data a priori. We illustrate the method by applying it to simulated data sets, to several 'known' phylogenies, and to a data set on the prokaryotes.


Adriaan Peeters, Ghent University

Title: Grinvin: The Graph Investigation Framework

The Graph Investigation Framework (Grinvin) provides support for generating, drawing and investigating graphs and their properties. The investigation of these properties is achieved by computing conjectures on (groups of) graphs. The Grinvin framework has been designed to be open, with extendability as the main goal, thus allowing the integration of existing and/or external tools trough simple plug-ins. A researcher can for example easily add a conjecture making engine, a graph invariant computing routine or an algorithm to compute a useful embedding for a graph. A lot of effort has been put in the software design of the framework. In this talk we will explain the design decisions we had to make and difficulties we had creating Grinvin. We will also show how the different kinds of plug-ins can be created for and used in the framework.


Neil J. A. Sloane, AT&T Shannon Labs, Florham Park, New Jersey, USA

Title: The influence of the On-Line Encyclopedia of Integer Sequences on research

The On-Line Encyclopedia of Integer Sequences (OEIS) is a database of over 100000 number sequences. It is freely available on the Web and is widely used. There are several ways in which it benefits research.

  1. 1. It serves as a dictionary, to tell the user what is known about a particular sequence. There are hundreds of papers which thank the OEIS for assistance in this way.
  2. The associated Sequence Fans mailing list is a worldwide network which has evolved into a powerful machine for tackling new problems.
  3. As a direct source of new theorems, when a sequence arises in two different contexts.
  4. As a source of new research, when one sees a sequence in the OEIS that cries out to be analyzed. The talk will be illustrated with numerous examples, emphasizing new sequences that have arrived in the past few months. Many open problems will be mentioned.

Tibor Tarnai and Patrick Fowler

Title: New Results in Constrained Circle Packings

Many situations in the natural sciences and technological applications can be modelled as variants of the packing problem - the determination of efficient packings of objects in an appropriate space. In one version of the packing problem, the task is to arrange n equal circles (spherical caps) without overlap on a sphere, so that their angular radius r is a maximum. In applications it can be more useful to deal with a constrained version of this spherical circle packing problem. For example, the arrangement of packed circles may be required to exhibit a particular overall point-group symmetry; or a given number of circles form a morphological unit, and identical copies of these units should be packed on a sphere. A survey of families of multisymmetric packings, in tetrahedral, octahedral and icosahedral groups will be presented, and the polymorphism of these packings will be analysed. Additionally, three problems will be studied here: How must kN non-overlapping equal circles forming N units be packed on a sphere so that the angular diameter of the circles will be as large as possible under the constraint that, within each unit, the k circle centres lie (1) for k = 2, at the end points of a diameter; (2) for k = 3, at the vertices of an equilateral triangle inscribed in a great circle; (3) for k = 4, at the vertices of a regular tetrahedron. Computer-generated putative solutions have been calculated and will be presented for different values of N. Finally, for k = 2, the problem of packing of twin circles is investigated, where a twin is defined as two circles that are constrained to touch. Here solutions to the constrained problem are mainly expected as perfect matchings in the graphs of the solutions to the respective unconstrained problem.


Previous: Participation
Workshop Index
DIMACS Homepage
Contacting the Center
Document last modified on December 1, 2005.