Scaling Human Feedback Using Foundation Models

October 20, 2023, 10:45 AM - 11:05 AM



Rutgers University

CoRE Building

96 Frelinghuysen Road

Piscataway, NJ 08854

Minae Kwon, Stanford University

The way we learn from humans is changing as models become more capable. Previously, we relied heavily on human demonstrations, handcrafted rewards, and preference labels to train models. However, human feedback does not scale well for several reasons, including the substantial human effort required to specify objectives and preferences. I explore two ways in which we can scale the amount of human feedback by using foundation models to reduce the human specification burden.