DIMACS TR: 2000-37

An inequality limiting the number of maximal frequent sets



Authors: E. Boros,V. Gurvich, L. Khachiyan and K. Makino

ABSTRACT

Given an arbitrary $m\times n$ binary matrix $A$ and a threshold $t \in \{1, \ldots, m\}$, a subset $C$ of the columns is called $t$-{\em frequent} if there are at least $t$ rows having a $1$ in each of these columns, and otherwise $C$ is called $t$-{\em infrequent}. Denoting by $\alpha$ the number of maximal $t$-frequent, and by $\beta$ the number of minimal $t$-infrequent column sets in the given matrix $A$, we prove that the inequality $\alpha \leq (m-t+1)\beta$ holds. This inequality is sharp, and allows for an incremental quasi-polynomial algorithm for generating all minimal $t$-infrequent sets, while the analogous problem for maximal $t$-frequent sets is NP-hard. Our proof is based on an inequality from extremal set theory, which may be of independent interest.

Paper Available at: ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2000/2000-37.ps.gz
DIMACS Home Page