## DIMACS Mixer Series, September 19, 2002

DIMACS, Room 401 CoRE Building
**Abstracts:**

**Sorin Alexe**, RUTCOR, Rutgers University

Title: Datascope - a new tool for Logical Analysis of Data (LAD)

Logical Analysis of Data (LAD) is a method for data analysis based on
combinatorics, optimization, and logic. LAD's input is a dataset
consisting of positive and negative observations described by numerical
and/or categorical attributes. A central role in LAD is played by
"patterns", which are significant rules associated with subsets of
observations in the same class. LAD classification models are
constructed by applying discriminant analysis in the pattern space (or
"knowledge space"). LAD provides highly accurate and reproducible models,
is insensitive to noise, and handles missing data well. The LAD
method will be presented by applying the Datascope software - an
implementation of LAD - to a benchmark of datasets publicly available in
the UCI Machine Learning Repository at the University of California, Irvine.
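The notion of a "pattern" described in the abstract can be illustrated with a toy sketch: a conjunction of attribute conditions that covers some observations of one class and none of the other. The dataset, attribute names, and thresholds below are hypothetical, purely for illustration, and are not part of the Datascope software:

```python
# Toy sketch of a LAD-style positive "pattern": a conjunction of
# attribute conditions covering at least one positive observation
# and no negative observations.  (Hypothetical data and conditions.)

def covers(pattern, observation):
    """True if the observation satisfies every condition in the pattern."""
    return all(cond(observation) for cond in pattern)

def is_positive_pattern(pattern, positives, negatives):
    """A positive pattern covers some positives and no negatives."""
    return (any(covers(pattern, o) for o in positives)
            and not any(covers(pattern, o) for o in negatives))

# Hypothetical observations with numerical and categorical attributes.
positives = [{"age": 63, "smoker": True}, {"age": 58, "smoker": True}]
negatives = [{"age": 35, "smoker": False}, {"age": 60, "smoker": False}]

# Candidate rule: age above 50 AND smoker.
pattern = [lambda o: o["age"] > 50, lambda o: o["smoker"]]

print(is_positive_pattern(pattern, positives, negatives))  # True
```

In the full method, many such patterns are enumerated and an observation's coverage by each pattern becomes its coordinate in the "pattern space", where discriminant analysis is then applied.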

**Manoj M. Prabhakaran**, Graduate Student,
Computer Science, Princeton University

Title: Concurrent Zero Knowledge with Logarithmic Round-Complexity

We show that every language in NP has a (black-box)
concurrent zero-knowledge proof system using $\tilde{O}(\log n)$
rounds of interaction. The number of rounds in our protocol is
optimal, in the sense that any language outside BPP requires at least
$\tilde{\Omega}(\log n)$ rounds of interaction in order to be proven
in black-box concurrent zero-knowledge. The (standard) cryptographic
assumption involved is that there exists a collection of claw-free
functions. (Joint work with Alon Rosen and Amit Sahai)

**Carl Pomerance**, DIMACS member, Bell Labs

Title: What's new in primality testing

I describe the new deterministic, polynomial time primality test of
Agrawal, Kayal, and Saxena.
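The test builds on the classical characterization that an integer $n > 1$ is prime if and only if $(X+1)^n \equiv X^n + 1 \pmod{n}$, i.e. every binomial coefficient $\binom{n}{k}$ with $0 < k < n$ is divisible by $n$. A brute-force check of this identity, shown below as an illustration-only sketch, takes exponential time in the bit length of $n$; the AKS contribution is showing that checking the congruence modulo $X^r - 1$ for a suitable small $r$ suffices and yields a deterministic polynomial-time test:

```python
# Illustration only: n > 1 is prime iff (X + 1)^n == X^n + 1 (mod n),
# i.e. iff n divides C(n, k) for all 0 < k < n.  This brute-force check
# is exponential in the bit length of n; the actual AKS test instead
# verifies the congruence modulo X^r - 1 for a small, carefully chosen r.

def is_prime_aks_identity(n):
    if n < 2:
        return False
    c = 1  # C(n, 0)
    for k in range(1, n):
        c = c * (n - k + 1) // k  # C(n, k), computed incrementally
        if c % n != 0:
            return False
    return True

print([m for m in range(2, 20) if is_prime_aks_identity(m)])
# [2, 3, 5, 7, 11, 13, 17, 19]
```

Note that this check correctly rejects Carmichael numbers such as 561, on which the plain Fermat test fails, since the identity on binomial coefficients characterizes primes exactly.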

**Giovanni Di Crescenzo**, DIMACS member,
Telcordia

Title: Randomness-Optimal Characterizations of Two NP Proof Systems

While randomness seems crucial in helping polynomial-time algorithms,
a large body of research has been devoted to derandomizing such
algorithms. When studying higher complexity classes, an analogous
derandomization problem naturally arises, especially since randomness
seems even more crucial there for achieving certain desirable goals. We investigate
quantitative aspects of randomness in two types of proof systems for NP,
motivated by cryptographic applications: two-round public-coin
witness-indistinguishable proof systems and non-interactive zero-knowledge
proof systems. Our main results are the following:

(1) if NP has a 2-round public-coin witness-indistinguishable proof system
then it has one using $\Theta(n^{\epsilon}+\log(1/s))$ random bits,

(2) if NP has a non-interactive zero-knowledge proof system then it has
one using $\Theta(n^{\epsilon}+\log(1/s))$ random bits, where $s$ is the
soundness error, $n$ the length of the input, and $\epsilon$ can be any
constant greater than $0$.

Both results are essentially optimal for the
most typical values of the parameters and only assume that NP $\neq$
average-BPP. As a consequence, assuming the existence of one-way
functions, both classes of proof systems are characterized by the same
randomness complexity as BPP algorithms. In order to achieve these
results, we formulate and investigate the problem of randomness-efficient
error reduction for two-round public-coin witness-indistinguishable
proofs, non-interactive witness-indistinguishable proofs, and
non-interactive zero-knowledge proofs, which generalizes the analogous and
well-studied problem for BPP computation. (This talk combines papers
presented at ICALP 1999 and RANDOM 2002, co-authored with Alfredo De
Santis and Giuseppe Persiano, from University of Salerno, Italy)

**Irit Dinur**, Postdoc at NEC

Title: Revealing Information While Preserving Privacy

Envision a hospital database containing the medical history of some
large population. On one hand, the hospital is interested in preserving the
privacy of its patients, leaking no medical information that could
be linked to a specific patient. On the other hand, the hospital would
like to advance scientific research that is based (among other things)
on statistics of the information in the database.

We propose a mathematical model for this problem, and present (tight)
lower bounds limiting the amount of information that can be revealed
without violation of privacy. (Based on joint work with Cynthia Dwork
and Kobbi Nissim)