DIMACS TR: 98-09

Convexity and Logical Analysis of Data

Authors: Oya Ekin, Peter L. Hammer and Alexander Kogan


A Boolean function is called $k$-convex if for any pair $x,y$ of its true points at Hamming distance at most $k$, every point ``between'' $x$ and $y$ is also true. Given a set of true points and a set of false points, the central question of Logical Analysis of Data is the study of those Boolean functions whose values agree with those of the given points. In this paper we examine data sets which admit $k$-convex Boolean extensions. We provide polynomial algorithms for finding a $k$-convex extension, if any, and for finding the maximum $k$ for which a $k$-convex extension exists. We study the problem of uniqueness, and provide a polynomial algorithm for checking whether all $k$-convex extensions agree in a point outside the given data set. We estimate the number of $k$-convex Boolean functions, and show that for small $k$ this number is doubly exponential. On the other hand, we also show that for large $k$ the class of $k$-convex Boolean functions is PAC-learnable.

Paper Available at: ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-09.ps.gz
DIMACS Home Page