DIMACS Center, CoRE Building, Room 401, Rutgers University

**Abstracts:**

Title: Clustering with qualitative information

We consider the problem of clustering a collection of elements based on pairwise judgments of similarity and dissimilarity. Bansal, Blum and Chawla cast the problem thus: given a graph G whose edges are labeled "+" (similar) or "-" (dissimilar), partition the vertices into clusters so that the number of pairs correctly (resp incorrectly) classified with respect to the input labeling is maximized (resp minimized). Complete graphs, where the classifier labels every edge, and general graphs, where some edges are not labeled, are both worth studying. We answer several questions left open in the paper of Bansal et al.

In this presentation I will show a factor 4 approximation algorithm for minimization on complete graphs, and a factor O(log n) approximation for general graphs. For the maximization version, a PTAS (polynomial time approximation scheme) for complete graphs is shown by Bansal et al.; I will demonstrate a factor 0.7664 approximation for general graphs, using semi-definite programming. If time permits, I will outline the reductions used in showing that some of these problems are APX-hard.

This is joint work with Venkatesan Guruswami and Tony Wirth.

Title: Questions regarding harmonic analysis of Boolean functions

Harmonic analysis has, in recent years, proved to be a powerful tool in the study of Boolean functions, and has forwarded research in the study of influences, learning, computational complexity, and more. In this talk we will go over some open questions concerning Boolean functions, the solution of which may require harmonic analysis. No prior knowledge is assumed expect for the very basics of linear algebra and normed spaces.

Title: "Learning DNF Formulas: the current status, and how you can win $1000"

The most central problem in computational learning theory is that of learning functions with polynomial-sized DNF formulas. Despite the optimism of the originator of the problem ([Valiant84]), it appears that a polynomial time algorithm in the most general model is out of reach. In this talk we survey the fastest known algorithms in a variety of learning models: PAC learning, uniform distribution learning, membership query learning, and the random walk model. We also include a discussion of why the problem seems hard in the uniform distribution model, and how solving a certain simple-to-state special case -- that of learning "juntas" -- can win you up to $1000.

This talk includes joint work with Nader Bshouty (Technion), Elchanan Mossel (Berkeley), and Rocco Servedio (Columbia).

Title: Private matching and information retrieval via homomorphic encryption

A homomorphic encryption system enables anyone, even parties who do not know the decryption key, to apply arithmetic operations to encrypted messages. We describe how this property can be used to design secure, privacy-preserving protocols for the following two tasks.

Private matching: Imagine two parties, Alice and Bob. Each party has an input which is a private list of K items. The output is the intersection of the two lists, and must be computed without revealing to the parties any other information. We describe protocols for computing private matching with O(K) communication and essentially K computation, for either malicious or semi-honest parties. (Interestingly, the asymptotic overhead of the computation is larger than O(K) but for any reasonable choice of parameters the actual overhead is K times a very small constant).

Private information retrieval: Alice's input is a list of N pairs, each consisting of a keyword and a payload. The input of Bob is a keyword. If Bob's keyword appears in Alice's list then Bob's output is the corresponding payload. The protocol has o(N) communication, and hides Bob's keyword from Alice and Alice's list from Bob. This is a generalization of the well investigated PIR problem (in which the keywords are the indices 1..N, and each payload is a single bit).

Title: Areas of Polygons from Heron of Alexandria to Dave Robbins

In 1994 Dave Robbins found a formula for the area of inscribed pentagons and hexagons, in terms of the side-lengths, thereby continuing earlier work of Heron (c. 60 BC) and Brahmagupta (c. 650 AD) for the triangle and quadrilateral. Now Dave is thinking about the septagon and beyond. I will outline his work, and some preliminary attempts on my part at simplifying it.