« search calendars« DIMACS Workshop on Efficient Algorithms for High Dimensional Metrics: New Tools

« Locally Consistent Decomposition of Strings with Applications to Edit Distance Sketching

Locally Consistent Decomposition of Strings with Applications to Edit Distance Sketching

May 08, 2024, 4:30 PM - 5:00 PM

Location:

DIMACS Center

Rutgers University

CoRE Building

96 Frelinghuysen Road

Piscataway, NJ 08854

Click here for map.

Sudatta Bhattacharya, Charles University

In this paper we provide a new locally consistent decomposition of strings. Each string x is decomposed into blocks that can be described by grammars of size O(k) (using some amount of randomness). If we take two strings x and y of edit distance at most k then their block decomposition uses the same number of grammars and the i-th grammar of x is the same as the i-th grammar of y except for at most k indexes i. The edit distance of x and y equals to the sum of edit distances of pairs of blocks where x and y differ. Our decomposition can be used to design a sketch of size O(k^2) for edit distance, and also a rolling sketch for edit distance of size O(k^2). The rolling sketch allows to update the sketched string by appending a symbol or removing a symbol from the beginning of the string.

[Video]