### Working Group on Computer-Generated Conjectures from Graph Theoretic and Chemical Databases I

Working Group Meeting: November 12 -16, 2001

Public Workshop: Graph Theory Day, Saturday, November 10, 2001

Location: DIMACS Center, CoRE Building, Rutgers University

Organizers:
Patrick Fowler, University of Exeter, P.W.Fowler@exeter.ac.uk
Pierre Hansen, GERAD - Ecole des Hautes Etudes Commerciales, pierreh@crt.umontreal.ca
Presented under the auspices of the Special Focus on Data Analysis and Mining.

#### Abstracts:


1.

Speaker: Hernan Abeledo, George Washington University

Title: Polyhedral properties and duals of benzenoid indices.

Abstract:

The Clar and Fries numbers of a benzenoid system are
optimization-based indices that have straightforward
formulations as integer programming problems. It has been been
shown that these integer programs  have the  integrality property,
though the cause is unusual in the literature: their constraint
matrices are unimodular, but are not always totally unimodular.
In this paper we derive totally unimodular formulations for both
indices and use them to obtain simpler proofs of the unimodularity
of the original models. We also use these results to establish
strong combinatorial duals for both the Clar and Fries numbers.

2.

SEAL: System for enhancing algorithms through learning
Eric Breimer and Darren Lim
Department of Computer Science
Rensselaer Polytechnic Institute
Troy, New York 12181

We introduce two experimental systems which use learning to automatically
enhance the performance of practical algorithms.  The systems demonstrate
the feasibility and potential advantages of applying machine learning
techniques to automatically generate algorithmic heuristics.  Both
systems are currently in development.

Part I: Supervised Learning In Dynamic-programming Environments

We introduce an experimental approach for enhancing the runtime of
dynamic-programming algorithms by using both supervised and unsupervised
learning. We have designed a prototype learning system for solving the
local alignment problem for DNA sequences.  The system starts with a
traditional dynamic programming algorithm for computing the optimal
alignment.  Over time, the system exploits relationship between the input
features and algorithm behavior to evolve an approximate algorithm.
Experiments show that the evolved algorithm is near optimal and
significantly faster for a wide variety of input types.

Part II: Supervised Learning In Restricted Backtracking Environments

We present an approach to DNA shotgun sequencing which uses restricted
backtracking in order to reduce search space computation.  Algorithms are
generated from previously sequenced samples of DNA, and then are used for
various input domains. Our experiments demonstrate that considerable gains
in speed can be achieved  without significant loss in accuracy. Further,
the resulting algorithms learned  on different input domains are distinct,
illustrating the value of developing  domain specific algorithms. We also
present experimental results on the applicability of the algorithms
learned on one domain to the other domains.

3.

Graffiti.pc Red Burton Style
Barbara Chervenka, University of Houston

Implementations of the program Graffiti.pc, subject to the new Red Burton
style of generating conditional conjectures, led to a partial
characterization of the alpha-core number of a graph in terms of other
graph theoretical concepts.  The alpha-core number of a graph is defined as
the cardinality of the intersection of all maximum independent sets of a
graph.  In short, the Red Burton style dictates that a smallest
counterexample is found for false conjectures, and that in the case of true
statements, a complete characterization is given of the graphs for which
equality holds.  Before the program generates the next list of conjectures,
these graphs are removed from consideration.  The focus of this
presentation will be the Red Burton style as well as my observations as a
student on this approach.

4.

Computer Generated Conjectures In Mathematical Chemistry

Siemion Fajtlowicz, University of Houston.

I will discuss selected conjectures of Graffiti which have or may have
some chemical significance. One  of the first conjectures of the program
seem to indicate that the boiling point of single-bonded acyclic
hydrocarbons is closely related to  the sum of the positive eigenvalues of
their graph. The latter  is considered  a predictor of energy  and thus
also of stability of molecules. One  of  the most   recent conjectures of
the program suggests a  new,  possibly better predictor of stability of
molecules, which should be valid at least for the fullerenes.

Most of the discussed conjectures will deal with  stability sorting
patterns, the independence number, and the expanding properties of
fullerenes. The first of these conjectures led to a new representation and
characterization of icosahedral C60 and it may link its structure to
the Golay code.

5.

Designing Learning Algorithms for Practical Computing
Mark K. Goldberg
Department of Computer Science
Rensselaer Polytechnic Institute
Troy, New York 12181

We present ideas for designing algorithms that construct algorithms for
combinatorial optimization problems. The design of these meta-algorithms''
follows closely the PAC-learning paradigm for supervised learning.

The ingredients for the design of meta-algorithms are a supervisor
algorithm, termed an oracle'' and an algorithm which generates
inputs to the optimization problem in question according to a
desirable probability distribution, e.g. from a specified input
domain. The goal of the meta-algorithm is to run the oracle on the
solutions obtained, and after a number of training sessions of this
type, to come up with a new optimization algorithm for the same
problem, which (hopefully) is significantly more efficient on the
input domain for which it was trained, although it may loose the
optimality of the solutions. It is also expected that the constructed
algorithm can be applied to arbitrary inputs to the optimization
problem.

We show how to design learning algorithms (meta-algorithms) for the
case of oracles that implement two basic algorithmic strategies:
backtracking and dynamic programming.

6.

Encoding Fullerenes and Geodesic Domes
Jack Graver, Syracuse University

Coxeter's classification of the highly symmetric geodesic domes
(and, by duality, the highly symmetric fullerenes) is extended to a
classification scheme for all geodesic domes and fullerenes. Each geodesic
dome is characterized by its signature: a plane graph on twelve vertices
with labeled angles and edges. In the case of the Coxeter geodesic domes,
the plane graph is the icosahedron, all angles are labeled one and all
edges are labeled by the same pair of integers (p,q). Edges with these
Coxeter coordinates" correspond to straight line segments joining two
vertices of T, the triangular tessellation of the plane, and the faces of
the icosahedron are filled in with equilateral triangles from T whose sides
have coordinates (p,q).

We describe the construction of the signature for any geodesic dome. In
turn, we describe how each geodesic dome may be reconstructed from its
signature: the angle and edge labels around each face of the signature
identify that face with a polygonal region of T and, when the faces are
filled by the corresponding regions, the geodesic dome is reconstituted.
The signature of a fullerene is the signature of its dual.  For each
fullerene, the separation of its pentagons, the numbers of its vertices,
faces and edges and its symmetry structure are easily computed directly
from its signature. Also it is easy to identify nanotubes by their
signatures.

7.

An Application Framework for Combinatorial Algorithms
Thang Bui, Robert Kingan, Sandra Kingan

In this paper, we discuss an application framework we are developing
for manipulating combinatorial objects.  A combinatorial object is
represented as an interface in Java (an abstract base class in C++).
A particular representation of a combinatorial object is modeled as
a class implementing the interface.   Not only are objects such as
subsets, vectors over GF(p), matroids, graphs etc. represented by
interfaces, but also the algorithms that perform various tasks.
At runtime, a simple expert system delegates tasks to the appropriate
algorithm.  This allows developers to add new algorithms to
the system without changing the existing code.  This application
framework will be part of a larger combinatorial object repository,
which will consist of a database of matroids together with their
relations.

8.

Craig Larson, University of Houston
On Progress in Automated Conjecture-Making

Several programs have attempted to automate mathematical
conjecture-making including AM, Graffiti, HR and AGX. These programs have
implemented a variety of heuristics and have had varying degrees of
success.  Of use to researchers in any endeavor is an evaluation of past
art: What ideas have proved successful and which have not?  What has been
learned? What can be built on?

Research on automated conjecture-making is relatively new and
researchers have been not been motivated by the same aims and have even
adopted conflicting vocabularies. Some research has aimed at "simulating
mathematical intelligence" while some has simply aimed at producing
conjectures. Some would say that a program which produces Goldbach's
Conjecture has made a conjecture -- while others would not. These
differences are to be expected in a young field and some attempt will be

9.

Constraint graph generation for mathematical chemistry

T. Gruener and R. Laue
University of Bayreuth

While graph generators usually are optimized to solve
the isomorphism problem as fast as possible for as many
graps as possible structure elucidation requires a different
paradigm. Only a few structures have been subject to an
experiment and these should be determined. So, a collection of
constraints have to be fulfilled by the generated structures,
constraints that often are formulated in an interactive
response to previous proposals of solutions, using a graphical
user interface.

MOLGEN 4.0 is based on a new generator that allows
to formulate a large number of different constraints like a range
of brutto formulae, neighbourhoods, hybridization, substructures,
ring structures, bondings, ionization, and symmetries.
The constraints are checked early during the generation process
to prune the backtrack tree. The construction strategy may depend
on the actual constraints given. Instead of orderly generation or
related methods here canonical forms are used to solve the isomorphism
problem for the - hopefully - few resulting structures. The concept
allows to integrate new tests of constraints with modest effort.
The system is already in use in chemical industry in various
countries.

MOLGEN 4.0 is a tool for chemical structure elucidation.
It allows to interactively formulate several types of
constraints resulting from different kinds of spectra.
It contains several components, notably a flexible graph
generator, a graphical graph editor and a 3D energy optimized
drawing. Some further special tools for mass spectra, for
combinatorial chemistry, or for interactive teaching are
based on MOLGEN.

A description is available on the internet at
http://www.mathe2.uni-bayreuth.de/molgen4/

10.

Chemical Evolution

Robert B. Nachbar

Applied Computational Science and Mathematics, Merck Research Laboratories,
126 E. Lincoln Ave., Rahway, NJ 07065

A simple hierarchical data structure (Mathematica expression) has been
developed to represent the constitution (topology) of organic chemical
structures. This representation also supports the notions of chemical
valence and stereochemistry. Mathematica expression patterns and built-in
functions can be used to perform chemically meaningful transformations on
instances of this data structure. The important aspects of the
implementation will be presented, and as well as its strengths and
weaknesses. Application of this methodology to chemical constitution
optimization (inverse QSAR and average molecular structure generation) and
chemical reactivity will be described or demonstrated.

11.

Educational Aspects of Graffiti
Ryan Pepper, University of Houston

This presentation is intended to demonstrate some educational aspects of the
conjecture making program Graffiti.  It will discuss the first classroom
experiences with the semi-automated process of the program generating graph-
theoretical conjectures, and the student supplying counter-examples or proofs.
There will also be brief discussion on how this learning process encourages
the independence of thought required for students to make conjectures of
their own.

12.

Representation of Chemical Graphs, Maps and Polyhedra
Tomaz Pisanski, University of Ljubljana, Slovenia

Research in discrete mathematics can be conducted more efficiently
if one can visualize discrete structures such as graphs, groups,
polyhedra, maps, posets, lattices, tilings, incidence structures,
molecules, etc. At present we are not aware of any coherent theory
that would tackle the general problem of visualization. For graphs
visualization is sometimes called graph realization, graph drawing,
or graph representation. During the talk we will present some of our
research results obtained with the help of our computer package.
We will also review and demonstrate several drawing methods and
propose a direction in which a general theory could be developed.
Namely, for a graph, one can define a representation to be a function
that maps each vertex to a vector from some vector space. One of the
problems we have to face is how to extend such a function to
edges and faces of maps and polyhedra.

13.

Puzzles, Graphs, and Graph Generators

Dennis Shasha, Courant Institute, New York University
(puzzle columnist for Dr. Dobb's Journal and Scientific American)

Solving puzzles often involves the construction of a graph
having a certain property or combination of properties.
The impressive graph generation packages available today ought
to help solve such puzzles.

In this talk, I describe a few graph puzzle problems as challenges
to graph generation. I am prepared to make these puzzles
available to anyone before the talk.

Topics: extremal graph theory, graphs on a lattice, graphs in Euclidean
space, color swapping, graph cuts, and mazes.

14.

Dragan Stevanovic, University of Nis, Yugoslavia

TITLE:
CAR: Computer-Aided Research of Integral Graphs

ABSTRACT:
We discuss various ways in which
computers were used as a research aid
in investigation of integral graphs in last 30 years.
Graph is said to be {\em integral} if
each eigenvalue of its adjacency matrix is integer.

The investigation of integral graphs
began with Harary and Schwenk in 1974,
and one of the first results was
a determination of cubic integral graphs by Cvetkovi\'c.
There are $13$ such graphs and he managed to
construct $12$ by purely theoretical case analysis (by hand),
but the last one was beyond human capabilities.
The construction of the remaining graph by Bussemaker and Cvetkovi\'c
was the first use of computer in this fashion.

After that,
we will discuss how Bali\'nska et al.
used genetic algorithms to search for all integral graphs
up to $11$~vertices.
This collection also initiated some further research.
Namely, it contains numerous graphs with special structure,
and some of them inspired Hansen et al.
to define new graph classes,
and for each of them
generate all graphs with sufficiently large number of vertices,
isolate integral ones, pose a conjecture and
then prove characterization of integral graphs in that class.

In previous cases
computer was used only once during research:
human prepared program based on known theoretical results,
run it on computer and
analyzed its output in order to get new theoretical results.
In our discussion
we will also touch upon the cases
where more human-computer interaction was needed:
use of system GRAPH by Simi\'c et al. to construct
nonregular integral graphs with largest degree~$4$,
and by Stevanovi\'c to construct
$4$-regular integral graphs avoiding $\pm 3$ in spectrum.
The later research also gave clues
how computer can be used to construct
all regular bipartite graphs with given spectrum
consisting of $5$ or $6$ distinct eigenvalues.

At the end,
based on experience from exposed cases
we will also discuss a mathematician's dream
of features a system for computer-aided research should have.

15.

Minimum Total Distance d-Trees

Pierre Hansen and Maolin Zheng

Abstract
The AutoGraphiX system, created by G. Caporossi, computes various graph
indices, finds extreme or near extreme graphs for such indices, and makes
conjectures and predictions. One index it computes is the total distances
(the sum of distances between all vertices) of trees. What are the trees
with maximum degree d and having the minimum total distance? The system
suggests that they are the trees of minimum diameter. In this paper, we
will prove this is true.



Previous: Participation
Next: Registration
Workshop Index
DIMACS Homepage
Contacting the Center