## DIMACS Mixer Series, September 19, 2002

DIMACS, Room 401 CoRE Building

Abstracts:

Title: Datascope - a new tool for Logical Analysis of Data (LAD)

Logical Analysis of Data (LAD) is a method for data analysis based on combinatorics, optimization, and logic. The input of LAD is a dataset consisting of positive and negative observations described by numerical and/or categorical attributes. A central role in LAD is played by "patterns", which are significant rules associated with subsets of observations in the same class. LAD classification models are constructed by applying discriminant analysis in the pattern space (or "knowledge space"). LAD provides highly accurate and reproducible models, is insensitive to noise, and handles missing data well. The LAD method will be presented by applying the Datascope software - an implementation of LAD - to a benchmark of datasets publicly available at the Machine Learning Repository of the University of California, Irvine.
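The notions of a "pattern" and of a discriminant in pattern space can be illustrated with a minimal sketch. The data, attribute names, thresholds, and weights below are invented for illustration; this is not Datascope, only the general idea of covering one class while excluding the other:

```python
# Toy illustration of LAD-style patterns (invented data; not Datascope).
# A positive pattern is a conjunction of attribute conditions that covers
# some positive observations and no negative ones (and symmetrically for
# negative patterns).

positives = [{"a": 5.0, "b": 1.2}, {"a": 6.1, "b": 0.9}]
negatives = [{"a": 2.0, "b": 1.1}, {"a": 5.5, "b": 3.0}]

# Represent a pattern as a list of (attribute, low, high) interval conditions.
def covers(pattern, obs):
    return all(lo <= obs[attr] <= hi for attr, lo, hi in pattern)

# Hypothetical positive pattern: a >= 4 AND b <= 2.
p1 = [("a", 4.0, float("inf")), ("b", float("-inf"), 2.0)]
assert all(covers(p1, o) for o in positives)
assert not any(covers(p1, o) for o in negatives)

pos_patterns = [p1]
neg_patterns = [[("a", float("-inf"), 3.0)]]  # hypothetical negative pattern

# Classification model: a discriminant computed in the pattern space.
def discriminant(obs):
    pos = sum(covers(p, obs) for p in pos_patterns)
    neg = sum(covers(p, obs) for p in neg_patterns)
    return pos - neg  # > 0: classify positive; < 0: classify negative

print(discriminant({"a": 5.8, "b": 1.0}))  # → 1 (classified positive)
```

In practice LAD generates many patterns per class and weights them; the single-pattern version above only shows the covering condition and the sign of the discriminant.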

Title: Concurrent Zero Knowledge with Logarithmic Round-Complexity

We show that every language in NP has a (black-box) concurrent zero-knowledge proof system using $\tilde{O}(\log n)$ rounds of interaction. The number of rounds in our protocol is optimal, in the sense that any language outside BPP requires at least $\tilde{\Omega}(\log n)$ rounds of interaction in order to be proven in black-box concurrent zero-knowledge. The (standard) cryptographic assumption involved is that there exists a collection of claw-free functions. (Joint work with Manoj Prabhakaran, Alon Rosen, Amit Sahai)

Title: What's new in primality testing

I describe the new deterministic, polynomial-time primality test of Agrawal, Kayal, and Saxena.
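A direct, didactic transcription of the AKS test (as commonly presented: a perfect-power check, a search for a suitable modulus $r$, a small-factor check, and then the polynomial congruence $(x+a)^n \equiv x^n + a \pmod{x^r - 1,\, n}$) might look as follows. This is a sketch for small inputs, not an efficient implementation:

```python
# Didactic sketch of the AKS primality test; not an efficient implementation.
from math import gcd, isqrt, log2

def is_perfect_power(n):
    """True if n = a^b for some integers a > 1, b > 1."""
    for b in range(2, n.bit_length() + 1):
        a = round(n ** (1.0 / b))
        if any(c > 1 and c ** b == n for c in (a - 1, a, a + 1)):
            return True
    return False

def poly_mul(p, q, r, n):
    """Multiply polynomials (coefficient lists) mod (x^r - 1, n)."""
    res = [0] * r
    for i, pi in enumerate(p):
        if pi:
            for j, qj in enumerate(q):
                res[(i + j) % r] = (res[(i + j) % r] + pi * qj) % n
    return res

def poly_pow(p, e, r, n):
    """Raise polynomial p to the power e mod (x^r - 1, n) by squaring."""
    result = [1] + [0] * (r - 1)
    while e:
        if e & 1:
            result = poly_mul(result, p, r, n)
        p = poly_mul(p, p, r, n)
        e >>= 1
    return result

def aks_is_prime(n):
    if n < 2:
        return False
    if is_perfect_power(n):                          # step 1
        return False
    limit = log2(n) ** 2
    r = 2
    while True:                                      # step 2: smallest r with
        if gcd(r, n) == 1:                           # ord_r(n) > (log2 n)^2
            order, k = 1, n % r
            while k != 1:
                k = (k * n) % r
                order += 1
            if order > limit:
                break
        r += 1
    for a in range(2, min(r, n) + 1):                # step 3: small factors
        if 1 < gcd(a, n) < n:
            return False
    if n <= r:                                       # step 4
        return True
    phi = sum(1 for k in range(1, r) if gcd(k, r) == 1)
    bound = int(phi ** 0.5 * log2(n)) + 1
    for a in range(1, bound + 1):                    # step 5: (x+a)^n test
        lhs = poly_pow([a % n, 1] + [0] * (r - 2), n, r, n)
        rhs = [0] * r
        rhs[n % r] = 1
        rhs[0] = (rhs[0] + a % n) % n                # target: x^(n mod r) + a
        if lhs != rhs:
            return False
    return True                                      # step 6

print([n for n in range(2, 30) if aks_is_prime(n)])
```

For a prime $n$ the congruence in step 5 holds for every $a$ (Frobenius plus Fermat's little theorem); the content of the AKS theorem is that checking it for the stated small range of $a$ already suffices for compositeness detection.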

Title: Randomness-Optimal Characterizations of Two NP Proof Systems

While randomness seems crucial in making polynomial-time algorithms more powerful, a large body of research has been devoted to derandomizing such algorithms. When studying higher complexity classes, an analogous derandomization problem naturally arises, especially since randomness seems even more crucial there for achieving certain desirable goals. We investigate quantitative aspects of randomness in two types of proof systems for NP, motivated by cryptographic applications: two-round public-coin witness-indistinguishable proof systems and non-interactive zero-knowledge proof systems. Our main results are the following:

(1) if NP has a 2-round public-coin witness-indistinguishable proof system then it has one using $\Theta(n^{\epsilon}+\log(1/s))$ random bits,

(2) if NP has a non-interactive zero-knowledge proof system then it has one using $\Theta(n^{\epsilon}+\log(1/s))$ random bits, where $s$ is the soundness error, $n$ the length of the input, and $\epsilon$ can be any constant greater than $0$.

Both results are essentially optimal for the most typical values of the parameters and only assume that NP $\neq$ average-BPP. As a consequence, assuming the existence of one-way functions, both classes of proof systems are characterized by the same randomness complexity as BPP algorithms. In order to achieve these results, we formulate and investigate the problem of randomness-efficient error reduction for two-round public-coin witness-indistinguishable proofs, non-interactive witness-indistinguishable proofs, and non-interactive zero-knowledge proofs, which generalizes the analogous and well-studied problem for BPP computation. (This talk combines papers presented at ICALP 1999 and RANDOM 2002, co-authored with Alfredo De Santis and Giuseppe Persiano, from the University of Salerno, Italy)
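To see where a $\log(1/s)$ term comes from, consider naive error reduction for any one-sided-error randomized check: repeating a test whose error per run is a constant drives the error below $s$ after roughly $\log(1/s)$ independent runs, at the cost of fresh random bits each time. The toy example below (mine, not from the talk) uses randomized polynomial-identity checking by evaluation at a random point; randomness-efficient error reduction, the subject of the talk, is precisely about achieving the same amplification with far fewer random bits:

```python
# Toy illustration (not from the talk): one-sided-error identity testing
# for integer polynomials, given as coefficient lists, by evaluating at a
# random point modulo a prime p. One run wrongly accepts unequal
# polynomials with probability at most deg/p; k independent runs push the
# error below (deg/p)^k, so soundness error s costs on the order of
# log(1/s) runs -- each consuming fresh random bits.
import random

P = 1_000_003  # a prime modulus

def evaluate(coeffs, x):
    acc = 0
    for c in reversed(coeffs):      # Horner's rule mod P
        acc = (acc * x + c) % P
    return acc

def probably_equal(f, g, runs):
    for _ in range(runs):
        x = random.randrange(P)     # fresh randomness on every run
        if evaluate(f, x) != evaluate(g, x):
            return False            # a disagreement is conclusive
    return True                     # equal with high probability

f = [1, 2, 1]                       # 1 + 2x + x^2
g = [2, 2, 1]                       # differs in the constant term
assert probably_equal(f, f, 10)
assert not probably_equal(f, g, 10)
```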

Title: Revealing Information While Preserving Privacy

Envision a hospital database containing the medical history of some large population. On the one hand, the hospital is interested in protecting the privacy of its patients, leaking no medical information that could be linked to a specific patient. On the other hand, the hospital would like to advance scientific research, which relies (among other things) on statistics of the information in the database.

We propose a mathematical model for this problem, and present (tight) lower bounds limiting the amount of information that can be revealed without violation of privacy. (Based on joint work with Cynthia Dwork and Kobbi Nissim)