### DIMACS Theoretical Computer Science Seminar

Title: Learning decision trees and DNF formulas in the average case

Speaker: ** Rocco Servedio**

Date: February 8, 2005 2:00-3:00pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ

Abstract:

The problem of average-case learning for decision trees is as
follows: a decision tree T (over Boolean features x1,...,xn) is chosen at
random from some natural distribution over decision trees. The learning
algorithm is given access to uniform random examples which are labeled
according to T. The job of the learning algorithm is to efficiently
construct a hypothesis function f which closely agrees with the unknown
randomly chosen tree T.

Decision trees are an expressive representation for Boolean functions
which have proven hard to learn in worst-case learning models. An even
more challenging problem is that of learning an unknown DNF (Disjunctive
Normal Form formula, i.e. an AND of ORs of Boolean variables) or CNF in
the worst case.

In this work we give the first efficient algorithm for the average-case
versions of these learning problems. Our algorithm for decision trees
runs in polynomial time for random trees of logarithmic depth. Our
algorithm for DNF and CNF formulas runs in polynomial time for random
formulas with a subquadratic number of AND gates.

The talk will be self-contained and assumes no prior background in
computational learning theory.

This is joint work with Jeff Jackson.