Working Group on Computer-Generated Conjectures from Graph Theoretic and Chemical Databases I

Working Group Meeting: November 12 -16, 2001

Public Workshop: Graph Theory Day, Saturday, November 10, 2001

Location: DIMACS Center, CoRE Building, Rutgers University

Organizers:
Patrick Fowler, University of Exeter, P.W.Fowler@exeter.ac.uk
Pierre Hansen, GERAD - Ecole des Hautes Etudes Commerciales, pierreh@crt.umontreal.ca
Presented under the auspices of the Special Focus on Data Analysis and Mining.

Abstracts:


1.

Speaker: Hernan Abeledo, George Washington University

Title: Polyhedral properties and duals of benzenoid indices.


Abstract:

The Clar and Fries numbers of a benzenoid system are 
optimization-based indices that have straightforward 
formulations as integer programming problems. It has been been 
shown that these integer programs  have the  integrality property,
though the cause is unusual in the literature: their constraint
matrices are unimodular, but are not always totally unimodular.
In this paper we derive totally unimodular formulations for both 
indices and use them to obtain simpler proofs of the unimodularity
of the original models. We also use these results to establish 
strong combinatorial duals for both the Clar and Fries numbers.

2. SEAL: System for enhancing algorithms through learning Eric Breimer and Darren Lim Department of Computer Science Rensselaer Polytechnic Institute Troy, New York 12181 We introduce two experimental systems which use learning to automatically enhance the performance of practical algorithms. The systems demonstrate the feasibility and potential advantages of applying machine learning techniques to automatically generate algorithmic heuristics. Both systems are currently in development. Part I: Supervised Learning In Dynamic-programming Environments We introduce an experimental approach for enhancing the runtime of dynamic-programming algorithms by using both supervised and unsupervised learning. We have designed a prototype learning system for solving the local alignment problem for DNA sequences. The system starts with a traditional dynamic programming algorithm for computing the optimal alignment. Over time, the system exploits relationship between the input features and algorithm behavior to evolve an approximate algorithm. Experiments show that the evolved algorithm is near optimal and significantly faster for a wide variety of input types. Part II: Supervised Learning In Restricted Backtracking Environments We present an approach to DNA shotgun sequencing which uses restricted backtracking in order to reduce search space computation. Algorithms are generated from previously sequenced samples of DNA, and then are used for various input domains. Our experiments demonstrate that considerable gains in speed can be achieved without significant loss in accuracy. Further, the resulting algorithms learned on different input domains are distinct, illustrating the value of developing domain specific algorithms. We also present experimental results on the applicability of the algorithms learned on one domain to the other domains.
3. Graffiti.pc Red Burton Style Barbara Chervenka, University of Houston Implementations of the program Graffiti.pc, subject to the new Red Burton style of generating conditional conjectures, led to a partial characterization of the alpha-core number of a graph in terms of other graph theoretical concepts. The alpha-core number of a graph is defined as the cardinality of the intersection of all maximum independent sets of a graph. In short, the Red Burton style dictates that a smallest counterexample is found for false conjectures, and that in the case of true statements, a complete characterization is given of the graphs for which equality holds. Before the program generates the next list of conjectures, these graphs are removed from consideration. The focus of this presentation will be the Red Burton style as well as my observations as a student on this approach.
4. Computer Generated Conjectures In Mathematical Chemistry Siemion Fajtlowicz, University of Houston. I will discuss selected conjectures of Graffiti which have or may have some chemical significance. One of the first conjectures of the program seem to indicate that the boiling point of single-bonded acyclic hydrocarbons is closely related to the sum of the positive eigenvalues of their graph. The latter is considered a predictor of energy and thus also of stability of molecules. One of the most recent conjectures of the program suggests a new, possibly better predictor of stability of molecules, which should be valid at least for the fullerenes. Most of the discussed conjectures will deal with stability sorting patterns, the independence number, and the expanding properties of fullerenes. The first of these conjectures led to a new representation and characterization of icosahedral C60 and it may link its structure to the Golay code.
5. Designing Learning Algorithms for Practical Computing Mark K. Goldberg Department of Computer Science Rensselaer Polytechnic Institute Troy, New York 12181 We present ideas for designing algorithms that construct algorithms for combinatorial optimization problems. The design of these ``meta-algorithms'' follows closely the PAC-learning paradigm for supervised learning. The ingredients for the design of meta-algorithms are a supervisor algorithm, termed an ``oracle'' and an algorithm which generates inputs to the optimization problem in question according to a desirable probability distribution, e.g. from a specified input domain. The goal of the meta-algorithm is to run the oracle on the generated inputs, to "ask" from the oracle some information about the solutions obtained, and after a number of training sessions of this type, to come up with a new optimization algorithm for the same problem, which (hopefully) is significantly more efficient on the input domain for which it was trained, although it may loose the optimality of the solutions. It is also expected that the constructed algorithm can be applied to arbitrary inputs to the optimization problem. We show how to design learning algorithms (meta-algorithms) for the case of oracles that implement two basic algorithmic strategies: backtracking and dynamic programming.
6. Encoding Fullerenes and Geodesic Domes Jack Graver, Syracuse University Coxeter's classification of the highly symmetric geodesic domes (and, by duality, the highly symmetric fullerenes) is extended to a classification scheme for all geodesic domes and fullerenes. Each geodesic dome is characterized by its signature: a plane graph on twelve vertices with labeled angles and edges. In the case of the Coxeter geodesic domes, the plane graph is the icosahedron, all angles are labeled one and all edges are labeled by the same pair of integers (p,q). Edges with these ``Coxeter coordinates" correspond to straight line segments joining two vertices of T, the triangular tessellation of the plane, and the faces of the icosahedron are filled in with equilateral triangles from T whose sides have coordinates (p,q). We describe the construction of the signature for any geodesic dome. In turn, we describe how each geodesic dome may be reconstructed from its signature: the angle and edge labels around each face of the signature identify that face with a polygonal region of T and, when the faces are filled by the corresponding regions, the geodesic dome is reconstituted. The signature of a fullerene is the signature of its dual. For each fullerene, the separation of its pentagons, the numbers of its vertices, faces and edges and its symmetry structure are easily computed directly from its signature. Also it is easy to identify nanotubes by their signatures.
7. An Application Framework for Combinatorial Algorithms Thang Bui, Robert Kingan, Sandra Kingan In this paper, we discuss an application framework we are developing for manipulating combinatorial objects. A combinatorial object is represented as an interface in Java (an abstract base class in C++). A particular representation of a combinatorial object is modeled as a class implementing the interface. Not only are objects such as subsets, vectors over GF(p), matroids, graphs etc. represented by interfaces, but also the algorithms that perform various tasks. At runtime, a simple expert system delegates tasks to the appropriate algorithm. This allows developers to add new algorithms to the system without changing the existing code. This application framework will be part of a larger combinatorial object repository, which will consist of a database of matroids together with their relations.
8. Craig Larson, University of Houston On Progress in Automated Conjecture-Making Several programs have attempted to automate mathematical conjecture-making including AM, Graffiti, HR and AGX. These programs have implemented a variety of heuristics and have had varying degrees of success. Of use to researchers in any endeavor is an evaluation of past art: What ideas have proved successful and which have not? What has been learned? What can be built on? Research on automated conjecture-making is relatively new and researchers have been not been motivated by the same aims and have even adopted conflicting vocabularies. Some research has aimed at "simulating mathematical intelligence" while some has simply aimed at producing conjectures. Some would say that a program which produces Goldbach's Conjecture has made a conjecture -- while others would not. These differences are to be expected in a young field and some attempt will be made to address them.
9. Constraint graph generation for mathematical chemistry T. Gruener and R. Laue University of Bayreuth While graph generators usually are optimized to solve the isomorphism problem as fast as possible for as many graps as possible structure elucidation requires a different paradigm. Only a few structures have been subject to an experiment and these should be determined. So, a collection of constraints have to be fulfilled by the generated structures, constraints that often are formulated in an interactive response to previous proposals of solutions, using a graphical user interface. MOLGEN 4.0 is based on a new generator that allows to formulate a large number of different constraints like a range of brutto formulae, neighbourhoods, hybridization, substructures, ring structures, bondings, ionization, and symmetries. The constraints are checked early during the generation process to prune the backtrack tree. The construction strategy may depend on the actual constraints given. Instead of orderly generation or related methods here canonical forms are used to solve the isomorphism problem for the - hopefully - few resulting structures. The concept allows to integrate new tests of constraints with modest effort. The system is already in use in chemical industry in various countries. MOLGEN 4.0 is a tool for chemical structure elucidation. It allows to interactively formulate several types of constraints resulting from different kinds of spectra. It contains several components, notably a flexible graph generator, a graphical graph editor and a 3D energy optimized drawing. Some further special tools for mass spectra, for combinatorial chemistry, or for interactive teaching are based on MOLGEN. A description is available on the internet at http://www.mathe2.uni-bayreuth.de/molgen4/
10. Chemical Evolution Robert B. Nachbar Applied Computational Science and Mathematics, Merck Research Laboratories, 126 E. Lincoln Ave., Rahway, NJ 07065 A simple hierarchical data structure (Mathematica expression) has been developed to represent the constitution (topology) of organic chemical structures. This representation also supports the notions of chemical valence and stereochemistry. Mathematica expression patterns and built-in functions can be used to perform chemically meaningful transformations on instances of this data structure. The important aspects of the implementation will be presented, and as well as its strengths and weaknesses. Application of this methodology to chemical constitution optimization (inverse QSAR and average molecular structure generation) and chemical reactivity will be described or demonstrated.
11. Educational Aspects of Graffiti Ryan Pepper, University of Houston This presentation is intended to demonstrate some educational aspects of the conjecture making program Graffiti. It will discuss the first classroom experiences with the semi-automated process of the program generating graph- theoretical conjectures, and the student supplying counter-examples or proofs. There will also be brief discussion on how this learning process encourages the independence of thought required for students to make conjectures of their own.
12. Representation of Chemical Graphs, Maps and Polyhedra Tomaz Pisanski, University of Ljubljana, Slovenia Research in discrete mathematics can be conducted more efficiently if one can visualize discrete structures such as graphs, groups, polyhedra, maps, posets, lattices, tilings, incidence structures, molecules, etc. At present we are not aware of any coherent theory that would tackle the general problem of visualization. For graphs visualization is sometimes called graph realization, graph drawing, or graph representation. During the talk we will present some of our research results obtained with the help of our computer package. We will also review and demonstrate several drawing methods and propose a direction in which a general theory could be developed. Namely, for a graph, one can define a representation to be a function that maps each vertex to a vector from some vector space. One of the problems we have to face is how to extend such a function to edges and faces of maps and polyhedra.
13. Puzzles, Graphs, and Graph Generators Dennis Shasha, Courant Institute, New York University (puzzle columnist for Dr. Dobb's Journal and Scientific American) Solving puzzles often involves the construction of a graph having a certain property or combination of properties. The impressive graph generation packages available today ought to help solve such puzzles. In this talk, I describe a few graph puzzle problems as challenges to graph generation. I am prepared to make these puzzles available to anyone before the talk. Topics: extremal graph theory, graphs on a lattice, graphs in Euclidean space, color swapping, graph cuts, and mazes.
14. Dragan Stevanovic, University of Nis, Yugoslavia TITLE: CAR: Computer-Aided Research of Integral Graphs ABSTRACT: We discuss various ways in which computers were used as a research aid in investigation of integral graphs in last 30 years. Graph is said to be {\em integral} if each eigenvalue of its adjacency matrix is integer. The investigation of integral graphs began with Harary and Schwenk in 1974, and one of the first results was a determination of cubic integral graphs by Cvetkovi\'c. There are $13$ such graphs and he managed to construct $12$ by purely theoretical case analysis (by hand), but the last one was beyond human capabilities. The construction of the remaining graph by Bussemaker and Cvetkovi\'c was the first use of computer in this fashion. After that, we will discuss how Bali\'nska et al. used genetic algorithms to search for all integral graphs up to $11$~vertices. This collection also initiated some further research. Namely, it contains numerous graphs with special structure, and some of them inspired Hansen et al. to define new graph classes, and for each of them generate all graphs with sufficiently large number of vertices, isolate integral ones, pose a conjecture and then prove characterization of integral graphs in that class. In previous cases computer was used only once during research: human prepared program based on known theoretical results, run it on computer and analyzed its output in order to get new theoretical results. In our discussion we will also touch upon the cases where more human-computer interaction was needed: use of system GRAPH by Simi\'c et al. to construct nonregular integral graphs with largest degree~$4$, and by Stevanovi\'c to construct $4$-regular integral graphs avoiding $\pm 3$ in spectrum. The later research also gave clues how computer can be used to construct all regular bipartite graphs with given spectrum consisting of $5$ or $6$ distinct eigenvalues. At the end, based on experience from exposed cases we will also discuss a mathematician's dream of features a system for computer-aided research should have.
15. Minimum Total Distance d-Trees Pierre Hansen and Maolin Zheng Abstract The AutoGraphiX system, created by G. Caporossi, computes various graph indices, finds extreme or near extreme graphs for such indices, and makes conjectures and predictions. One index it computes is the total distances (the sum of distances between all vertices) of trees. What are the trees with maximum degree d and having the minimum total distance? The system suggests that they are the trees of minimum diameter. In this paper, we will prove this is true.

Previous: Participation
Next: Registration
Workshop Index
DIMACS Homepage
Contacting the Center

Document last modified on November 6, 2001.