### DIMACS Theoretical Computer Science Seminar

Title: Information and Learning

Speaker: **Joel Ratsaby**, Ben Gurion University

Date: September 27, 2005 2:00-3:00pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ

Abstract:

In every learning problem that we encounter there is always some
background assumption or side information that helps us structure
the family of models which we believe can do well in approximating
the optimal solution. Different types of side information can
amount to a large difference in the level of success of the
resulting learned model. The most ubiquitous type of information is
represented by data, i.e., samples drawn from the underlying unknown
process or 'target function'. Other types are a hybrid between
descriptional and sample-based information.

Let t be any fixed target in the domain and suppose that a binary
input string x can express any finite amount of information about t.
How valuable is x for learning t?
Can this value be quantified?
Is there a universal law relating the information about t
in x to the complexity of x?

In this talk I report on some preliminary results of ongoing work,
begun in the Computer Science Department at University College London,
on quantifying information for the problem of learning.

My approach is combinatorial in nature. I consider a simple setting
with a finite domain [n] = {1, ..., n} and a binary target function
t: [n] -> {0,1}. In this setting, a class of possible hypotheses
amounts to a finite set of binary sequences. I define a simple but
fundamental aim of learning which allows me to quantify the partial
information about t in x in a non-stochastic way. I then compute it
for various types of sample and hybrid information.
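As a purely hypothetical illustration of this combinatorial setting (the particular non-stochastic measure discussed in the talk is not reproduced here), one can view each hypothesis as a binary sequence of length n and ask how much a piece of sample information narrows the class of hypotheses consistent with the target. The names, the toy target, and the log-ratio counting below are all illustrative assumptions, not the speaker's definitions:

```python
from itertools import product
from math import log2

def consistent(hypotheses, sample):
    """Hypotheses agreeing with a labeled sample {i: t(i)} (illustrative helper)."""
    return [h for h in hypotheses if all(h[i] == y for i, y in sample.items())]

n = 4
# The full class: every binary function on the domain (zero-indexed here,
# {0, ..., n-1}, rather than [n] = {1, ..., n}), viewed as a length-n sequence.
H = list(product([0, 1], repeat=n))

# A toy target t and sample information revealing t on two domain points.
t = (1, 0, 1, 1)
sample = {0: t[0], 2: t[2]}

H_after = consistent(H, sample)
# Each revealed label halves the class: 16 -> 4 hypotheses, so by this
# naive count the sample carries log2(16/4) = 2 bits about t.
bits = log2(len(H) / len(H_after))
print(len(H), len(H_after), bits)  # 16 4 2.0
```

This counting view matches the sample-based case; hybrid descriptional/sample information would constrain the class in other ways than pointwise evaluation.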