### DIMACS Theoretical Computer Science Seminar

Title: Optimal lower bounds for locality sensitive hashing (except
when q is tiny)

Speaker: **Ryan O'Donnell**, CMU

Date: Wednesday, October 13, 2010 11:00-12:00pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ

Abstract:

Locality Sensitive Hashing (LSH) is a widely used algorithmic tool,
combining hashing with geometry, with applications to nearest-neighbor
search. For points in {0,1}^d, one wants a family H of functions such
that if dist(x,y) <= r then Pr[h(x) = h(y)] >= p (when h is chosen
randomly from H), whereas if dist(x,y) >= cr then Pr[h(x) = h(y)] <=
q. For a fixed c, the quality of the LSH family is determined by the
smallness of its "rho parameter", namely rho = ln(1/p)/ln(1/q). In
their seminal 1998 work, Indyk and Motwani gave a simple LSH family
with rho <= 1/c. The only known lower bound,
[Motwani-Naor-Panigrahy'07], was that rho must be at least roughly
0.5/c.

In this work, we give a simple proof of the optimal lower bound, rho
>= 1/c. In one sentence, it follows from the fact that the
noise-stability of a boolean function at "time" t is a log-convex
function of t.

We will also discuss the fact that all results mentioned here rely
on the underpublicized assumption that q is not too small.

This is joint work with Yi Wu of IBM Research and Yuan Zhou of Carnegie Mellon.