« Beyond Trace Reconstruction: Population Recovery from the Deletion Channel
March 27, 2019, 11:00 AM - 12:00 PM
Location:
Conference Room 301
Rutgers University
CoRE Building
96 Frelinghuysen Road
Piscataway, NJ 08854
Xi Chen, Columbia University
Population recovery is the problem of learning an unknown distribution over an unknown set of n-bit strings, given access to independent draws from the distribution that have been corrupted according to some noise channel. We study the problem under the deletion channel, in which each bit is independently deleted and the surviving bits are concatenated. This is a more challenging noise model than bit-flip or erasure noise; indeed, even the simplest case in which the population is of size 1 corresponds to the trace reconstruction problem, for which an exponential gap remains in the sample complexity.
In this talk we give upper and lower bounds for population recovery under the deletion channel. We show that a population of constant size can be learned using 2^{O(sqrt{n})} samples, while n^{Omega(k)} samples are required for population size k.
Our upper bounds are obtained via a robust multivariate generalization of a polynomial based analysis, due to Krasikov and Roddity, of how the k-deck of a bit-string uniquely identifies the string; this is a very different approach from recent algorithms for trace reconstruction (the case of k=1). Our lower bounds build on moment-matching results of Roos and Daskalakis and Papadimitriou.
Joint work with Frank Ban (Berkeley), Adam Freilich, Rocco A. Servedio and Sandip Sinha (Columbia).