G. Cormode and D. Ting. Federated data distribution shift estimation. In Proceedings of the VLDB Endowment, volume 18, pages 2399-2412, 2025.

As data is increasingly held at the edge of the network, new methods are needed to perform analysis over distributed inputs. This has led to the emergence of the federated model of distributed computation, which emphasizes privacy and scalability. A central problem is analyzing data distributions when the data is spread across a large number of distributed clients, which supports a number of tasks within federated learning and federated analytics. We present techniques to measure the similarity of data distributions in the federated model. We define sketches that allow efficient estimation of the difference between two distributions under the total variation distance (L1) metric. These sketches have accuracy and privacy guarantees and can be computed incrementally over dynamic data. Our experimental study shows that they are practical to implement and provide accurate estimates.
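For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the exact total variation distance between two empirical distributions, each pooled from per-client counts. It is not the sketch construction from the paper: it has no privacy protection and needs the full histograms, which is the cost the paper's sketches aim to avoid. The function names `merge_clients` and `total_variation` and the toy inputs are assumptions made for illustration only.

```python
from collections import Counter


def merge_clients(client_counts):
    """Pool per-client item counts into one histogram (hypothetical helper)."""
    total = Counter()
    for counts in client_counts:
        total.update(counts)  # Counter.update adds counts rather than replacing them
    return total


def total_variation(counts_p, counts_q):
    """Exact total variation distance: half the L1 distance between the
    two normalized histograms. Illustrative baseline, not the paper's method."""
    n_p = sum(counts_p.values())
    n_q = sum(counts_q.values())
    if n_p == 0 or n_q == 0:
        raise ValueError("both inputs must contain at least one item")
    keys = set(counts_p) | set(counts_q)
    l1 = sum(abs(counts_p.get(k, 0) / n_p - counts_q.get(k, 0) / n_q) for k in keys)
    return 0.5 * l1


# Toy data (made up): two client populations over the items "x" and "y".
population_a = merge_clients([Counter({"x": 2}), Counter({"y": 2})])
population_b = merge_clients([Counter({"x": 3, "y": 1})])
tv = total_variation(population_a, population_b)
```

A value of 0 means the two pooled distributions are identical, and a value of 1 means they share no items.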
