« search calendars« Theoretical Computer Science Seminar

« Efficient PAC Learning from the Crowd

Efficient PAC Learning from the Crowd

September 13, 2017, 11:00 AM - 12:00 PM

Location:

Conference Room 301

Rutgers University

CoRE Building

96 Frelinghuysen Road

Piscataway, NJ 08854

Pranjal Awasthi, Rutgers University

In recent years crowdsourcing has become the method of choice for gathering labeled training data for learning algorithms. Standard approaches to crowdsourcing view the process of acquiring labeled data separately from the process of learning a classifier from the gathered data. This can give rise to computational and statistical challenges. For example, in most cases there are no known computationally efficient learning algorithms that are robust to the high level of noise that exists in crowdsourced data. Furthermore, efforts to eliminate noise through majority voting often require a large number of queries per example.

In this talk I will describe a model of crowdsourcing and algorithms that interleave the process of labeling and learning to get good classifiers with minimal overhead. In particular, I will show that any function class learnable in the traditional PAC model is also learnable in the crowdsourced model where only a noticeable fraction of the labelers are correct. Furthermore, each labeler is required to label only a constant number of examples and each example is only queried a constant number of times on average.

Joint work with Avrim Blum, Nika Haghtalab and Yishay Mansour.