DIMACS TR: 96-56
Vapnik-Chervonenkis Dimension of
Recurrent Neural Networks
Authors: Pascal Koiran, Eduardo D. Sontag
ABSTRACT
Most of the work on the Vapnik-Chervonenkis (VC) dimension of neural
networks has focused on feedforward networks. However, recurrent
networks are also widely used in learning applications, in particular
when time is a relevant parameter.
This paper provides lower and upper bounds for the VC dimension
of such networks. Several types of activation functions are discussed,
including threshold, polynomial, piecewise-polynomial and sigmoidal functions.
The bounds depend on two independent parameters: the number $w$ of weights in
the network, and the length $k$ of the input sequence. In contrast,
for feedforward networks, VC dimension bounds can be expressed as a
function of $w$ only. An important difference between recurrent and
feedforward nets is that a fixed recurrent net can receive inputs
of arbitrary length. Therefore we are particularly interested in the
case~$k \gg w$.
Ignoring multiplicative constants,
the main results say roughly the following:
\begin{itemize}
\item
For architectures with activation $\sigma =$ any fixed nonlinear polynomial,
the VC dimension is $\approx wk$.
\item
For architectures with activation $\sigma =$ any fixed {\it piecewise\/}
polynomial,
the VC dimension is between $wk$ and $w^2k$.
\item
For architectures with activation $\sigma = {\cal H}$ (threshold nets),
the VC dimension is between $w\log(k/w)$ and
$\min\{wk\log(wk),\,w^2+w\log(wk)\}$.
\item
For the standard sigmoid $\sigma(x)=1/(1+e^{-x})$,
the VC dimension is between $wk$ and $w^4k^2$.
\end{itemize}
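As a rough illustration of the regime $k \gg w$ (a sketch derived from the
bounds listed above, with multiplicative constants ignored; for $k$ large
relative to $w$, the threshold upper bound is attained by the second term of
the min):
\begin{align*}
\text{threshold ${\cal H}$:} &\quad w\log(k/w) \;\le\; \text{VC} \;\le\; w^2+w\log(wk) && \text{(logarithmic in $k$)}\\
\text{fixed nonlinear polynomial:} &\quad \text{VC} \approx wk && \text{(linear in $k$)}\\
\text{piecewise polynomial:} &\quad wk \;\le\; \text{VC} \;\le\; w^2k && \text{(linear in $k$)}\\
\text{standard sigmoid:} &\quad wk \;\le\; \text{VC} \;\le\; w^4k^2 && \text{(at most quadratic in $k$)}
\end{align*}
In particular, these bounds imply that for long input sequences, threshold
recurrent networks have a much smaller VC dimension than networks with
polynomial, piecewise-polynomial, or sigmoidal activations.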
Paper Available at:
ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-56.ps.gz