A New Algorithm for Learning From Examples

Summer project of Rodolfo Edward Pérez with advisor Prof Edre Boros


What is learning from example?

Humans have a great ability to learn new concepts from a set of examples. For instance, if shown a number of examples of New York style and Chicago style pizzas, you will quickly learn to distinguish between the two varieties, perhaps noticing the differenct in the crust thickness or the way the toppings are positioned either within or on top of the cheese. One current topic of research in machine learning is the development of algorithms which give computers some of the power of generalization that humans naturally have. Essentially, an algorithm for learning from example will tell a computer how to construct a mapping from a set of observable variables (such as crust thickness, which toppings are present, and where they are located) to a classification (New York or Chicago style) when given a set of examples (a few pizzas of each type). Many such algorithms already exist, and some algorithms have advantages over others in terms of the number of examples necessary to learn the concept, time taken for the algorithm to run, memory requirements, accuracy of learning to classify given examples, and accuracy of genearalization. Given the variaty of standards by which the effectiveness of an alogrithm is measured, and the relative youth of this field, the search for new algorithms for learning by example is still very much a current area of research.

Description of the problem

This project treats only the case of learning functions from boolean variables to a boolean classification. I have been implementing and testing algorithms developed by my advisor, professor Boros. The goal for this project is to develop an algorithm which runs in a reasonable amount of time and has a high accuracy in generalization.

Description of the algorithm

The basic idea of the algorithm is to build a function represented as a logical circuit by creating one layer of gates at a time. The new gates are designed to satisfy optimize a heuristic which balances accuracy on the training examples with its ability to aid the other gates in the same layer to distinguish the positive and negative examples. We are still experimenting with the specifics of the algorithm, and more details will be posted later when the algorithm is finalized.