Learning with Drifting Input Distributions
February 19, 2025, 11:00 AM - 12:00 PM
Location:
Conference Room 301
CoRE Building
Rutgers University
96 Frelinghuysen Road
Piscataway, NJ 08854
Speaker: Alessio Mazzetto, Brown University
We develop and analyze a general technique for learning under unknown distribution drift. Given a sequence of independent observations drawn over the last T steps from a distribution that evolves over time, our algorithm solves a learning problem with respect to the distribution at the final time step. In this non-stationary setting, older data may carry outdated information due to drift, so selecting an appropriate window of recent data is crucial. Unlike previous work, which relies on a priori assumptions about the magnitude of the drift, our algorithm adapts its window size dynamically based on the input. In particular, at each step it selects a window of past observations that balances two sources of error: the increased variance of learning from fewer samples, and the greater drift-induced error of including older, less relevant data. The challenge is that the drift cannot be estimated directly, as we may have only a single sample from each distribution. Nonetheless, we show that, without explicitly estimating the drift, our method solves the learning problem with nearly the same error as an algorithm that knows the magnitude of the drift in advance. We demonstrate applications of this technique and establish matching lower bounds for problems including binary classification, discrete distribution estimation, and vector quantization.
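
For readers who want a concrete picture of the adaptive window idea, below is a minimal sketch in Python of how a doubling-based window selection can work in the simplest instance: estimating the current mean of a drifting univariate distribution. The function adaptive_window_mean, the stopping constant c, and the threshold c / sqrt(n) are hypothetical illustrative choices, not the speaker's actual algorithm or its guarantees.

import numpy as np

def adaptive_window_mean(samples, c=2.0):
    """Estimate the mean at the final time step of a drifting sequence.

    Illustrative doubling heuristic (hypothetical constants, not the
    speaker's algorithm): grow the window of recent observations while
    successive window estimates agree to within the statistical
    deviation ~ c / sqrt(n); stop once they diverge, which signals
    that drift error has begun to dominate the variance reduction
    gained from using more samples.
    """
    recent = np.asarray(samples)[::-1]  # recent[0] is the newest sample
    T = len(recent)
    n = 1
    est = recent[0]
    while 2 * n <= T:
        wider = recent[: 2 * n].mean()  # estimate over the doubled window
        # If the two estimates differ by more than the sampling noise of
        # the smaller window, the older half is likely outdated: stop.
        if abs(wider - est) > c / np.sqrt(n):
            break
        n *= 2
        est = wider
    return est, n  # final estimate and the window size actually used

# Example: an abrupt drift 256 steps before the end; the selected window
# should typically stay within the post-drift regime.
rng = np.random.default_rng(0)
T = 4096
means = np.where(np.arange(T) < T - 256, 0.0, 1.0)
x = means + rng.standard_normal(T)
print(adaptive_window_mean(x))

Note the key design point, echoing the abstract: no explicit drift estimate is ever computed. Comparing estimates over nested windows implicitly detects when drift-induced error outweighs the variance gain from including more data.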