DIMACS TR: 96-56

Vapnik-Chervonenkis Dimension of Recurrent Neural Networks



Authors: Pascal Koiran, Eduardo D. Sontag

ABSTRACT

Most of the work on the Vapnik-Chervonenkis dimension of neural networks has focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimension of such networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number $w$ of weights in the network and the length $k$ of the input sequence. In contrast, for feedforward networks, VC dimension bounds can be expressed as a function of $w$ only. An important difference between recurrent and feedforward nets is that a fixed recurrent net can receive inputs of arbitrary length. Therefore we are particularly interested in the case $k \gg w$. Ignoring multiplicative constants, the main results say roughly the following:

- For architectures with activation $\sigma$ = any fixed nonlinear polynomial, the VC dimension is $\approx wk$.
- For architectures with activation $\sigma$ = any fixed piecewise polynomial, the VC dimension is between $wk$ and $w^2 k$.
- For architectures with activation $\sigma = \mathcal{H}$ (threshold nets), the VC dimension is between $w\log(k/w)$ and $\min\{wk\log(wk),\, w^2 + w\log(wk)\}$.
- For the standard sigmoid $\sigma(x) = 1/(1+e^{-x})$, the VC dimension is between $wk$ and $w^4 k^2$.
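To make the two parameters concrete, the following is a minimal Python sketch of a recurrent binary classifier, assuming a standard first-order architecture with state update x_{t+1} = sigma(W x_t + U u_t + b) and a thresholded linear readout of the final state; the precise network model analyzed in the paper may differ, and all names below are illustrative. The point is that the weight count $w$ is fixed by the architecture while the same net accepts input sequences of any length $k$.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def recurrent_classifier(inputs, W, U, b, c, activation=sigmoid):
    # Run a single-layer recurrent net over a length-k input sequence and
    # return a binary label (sign of a linear readout of the final state).
    #   inputs : shape (k,)   input sequence u_1, ..., u_k
    #   W      : shape (n, n) recurrence matrix
    #   U, b, c: shape (n,)   input weights, state bias, readout weights
    # Total number of weights is w = n^2 + 3n; the same fixed net
    # classifies sequences of arbitrary length k.
    n = W.shape[0]
    x = np.zeros(n)                        # initial state
    for u_t in inputs:                     # one state update per input symbol
        x = activation(W @ x + U * u_t + b)
    return 1 if c @ x >= 0 else 0          # thresholded readout = binary label

# Example: one fixed net (fixed w) applied to sequences of two different lengths k.
rng = np.random.default_rng(0)
n = 3
W = rng.normal(size=(n, n))
U, b, c = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
print(recurrent_classifier(rng.normal(size=5), W, U, b, c))    # k = 5
print(recurrent_classifier(rng.normal(size=50), W, U, b, c))   # k = 50

Replacing sigmoid with a threshold, polynomial, or piecewise-polynomial activation gives the other architecture classes whose VC dimension bounds are listed above.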

Paper Available at: ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-56.ps.gz