DIMACS Theoretical Computer Science Seminar


Title: Information and Learning

Speaker: Joel Ratsaby, Ben Gurion University

Date: September 27, 2005 2:00-3:00pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ


Abstract:

In every learning problem that we encounter there is always some background assumption or side information that helps us structure the family of models which we believe can do well in approximating the optimal solution. Different types of side information can lead to large differences in the success of the resulting learned model. The most ubiquitous type of information takes the form of data, i.e., samples taken from the underlying unknown process or 'target function'. Other types are hybrids of descriptive and sample-based information.

Let t be any fixed target in the domain and suppose that a binary input string x can express any finite amount of information about t. How valuable is x for learning t? Can this value be quantified? Is there a universal law relating the information about t in x to the complexity of x?

In this talk I report on some preliminary results of ongoing work, begun in the Computer Science Department at University College London, on quantifying information for the problem of learning.

My approach is combinatorial in nature. I consider a simple setting with a finite domain [n] = {1, ..., n} and a binary target function t: [n] -> {0,1}. In this setting the class of possible hypotheses amounts to a finite set of binary sequences. I define a simple but fundamental aim of learning which allows me to quantify the partial information about t in x in a non-stochastic way. I then compute it for various types of sample and hybrid information.
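
As a rough illustration of this combinatorial setting (a hypothetical sketch, not the speaker's actual definition), one non-stochastic way to gauge the value of side information x is to count how many candidate targets in {0,1}^n remain consistent with x, and record the log-reduction in that count. The Python sketch below assumes side information is given as predicates on candidate targets; the function names are illustrative only.

    from itertools import product
    from math import log2

    def consistent_targets(n, constraints):
        # Enumerate all binary targets t: [n] -> {0,1}, represented as
        # length-n tuples, that satisfy every given constraint (predicate).
        return [t for t in product((0, 1), repeat=n)
                if all(c(t) for c in constraints)]

    def information_gain_bits(n, constraints):
        # Hypothetical non-stochastic measure: log2 reduction in the number
        # of candidate targets once the side information is taken into account.
        before = 2 ** n                              # all binary functions on [n]
        after = len(consistent_targets(n, constraints))
        return log2(before) - log2(after)

    # Example: sample-based side information -- we observe the labels
    # t(1) = 0 and t(3) = 1 (0-based indices below).
    n = 5
    sample = [lambda t: t[0] == 0, lambda t: t[2] == 1]
    print(information_gain_bits(n, sample))          # 2.0 bits: each labeled point halves the class

Under this toy measure, each labeled sample point contributes one bit, while descriptive or hybrid side information can shrink the hypothesis class by different amounts, which is the kind of comparison the talk's quantity is meant to support.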