The most basic definition of recursion is that it takes place when a method calls itself. Recursion also takes place when a method called by a method calls the method that called it originally. Recursive routines are useful when there is a need to process an object that is built from objects like itself. Directories are recursive objects that most computer users know and use often. Inside a directory are files and other directories. The directories inside a directory may also include files and directories. This process can go as deep as there are directories inside directories.
Directories are an example of a tree structure. Trees abound in computer science. This makes recursion a very important concept for computer scientists.
Intentional recursive methods call themselves. Accidentally recursive methods call other methods that call the original method. Accidental recursive methods are best avoided as they are unnecessarily complex. Intentional methods are quite safe and very powerful as long a few guidelines are honored.
The first guideline is that each recursive method must have a well-defined stopping state of terminating condition. This means that the method must have a condition that will cause it to not call itself again. Recursive methods must choose to call themselves. If the routine has no other option but to call itself again there will be an infinite loop.
The second guideline is that the recursive step, the action that calls the method, must do something to move the program closer to the stopping state. If the recursive state passes the same values in its call every time then there is no movement toward a stopping state. Likewise, if the change in values is improperly set then the stopping state may be missed. For example if the stopping state is based on an even number and all calls to the method pass odd numbers an infinite loop will occur.
When a method calls another method, information about the currently running method and the information needed by the next method are placed on the system stack. The information about and data for the calling method will be saved on the stack so that the method can resume at the same place it left off when it made its call. When one instance of a method finishes the system will take its information off the stack and use the information below it to determine where to continue processing. Recursive routines do take more system and memory overhead then iterative methods but this is not a reason to avoid recursive methods. There are several times when recursion should be avoided.
A recursive program in an infinite loop will eventually crash with a stack overflow because more information then the stack can hold will have been placed on it. This is different from iterative infinite loops that usually do not use increasing resources the longer they execute.
Recursive methods are sometimes the cleanest, shortest and most elegant solution to a problem. Some problems, such as searching a tree structure, are very difficult to solve iteratively but easy to solve recursively.
Ideally, a recursive should call itself more then once with different parameter values. A method that calls itself inside a loop while processing several pieces of information is a good example of multiple calls. A method that calls itself only once can generally be designed iteratively.
It is important that recursive methods be documented so that they can be clearly understood by programmers who must maintain them later.
As mentioned earlier, directories are recursive objects and so recursive methods are useful for processing them. In this example, you will list all the files in a directory. C# supports a DirectoryInfo class and this example starts with creating a DirectoryInfo object. The parameter in the DirectoryInfo class constructor is a string that holds the name of a directory.
DirectoryInfo myDirectory = new DirectoryInfo(dirName);
Once the directory is declared, a program can use the GetFiles method to retrieve information about the files in the directory. This information will be copied into an array of FileInfo objects. A foreach loop can be used to process the list of files as in the code below.
FileInfo [] myFiles = myDirectory.GetFiles();
foreach (FileInfo myFile in myFiles)
{
Console.WriteLine("File: {0} \tSize: {1}",myFile.Name, myFile.Length);
}
This code would be enough if directories only held files. Because directories can include other directories, some additional code is required. Fortunately, the code already given will work for subdirectories and can be reused. Think about the process that needs to take place.
As you can see, these steps take us deeper and deeper into the directory. If the method to process a directory is designed properly it can be reused to process each sub directory. The DirectoryInfo class has a GetDirectories method that will return an array of DirectoryInfo objects. This list can be processed using a foreach loop. This loop will pass the name of each directory to the method that processes directories. This meets the suggestion that multiple calls are an indicator for recursion.
The following recursive method lists all the files and all the subdirectories in a directory.
static void listDirectory(string dirName)
{
try
{
// Open a new DirectoryInfo object with the directory name
DirectoryInfo myDirectory = new DirectoryInfo(dirName);
// Get the names of all the subdirectories
DirectoryInfo [] mySubs = myDirectory.GetDirectories();
// Process the sub directories before the parent
foreach (DirectoryInfo Sub in mySubs)
{
// This is where it gets recursive
// Call this method to process the sub directory
listDirectory(Sub.FullName);
}
// Get the files in the directory
// This happens only when there are no more
// sub directories to process for this directory
FileInfo [] myFiles = myDirectory.GetFiles();
foreach (FileInfo myFile in myFiles)
{
Console.WriteLine("File: {0} \tSize: {1}",myFile.Name, myFile.Length);
}
// After the files have been listed, list the directory
Console.WriteLine("Directory: {0} \n", myDirectory.FullName);
}
catch (Exception e) // Handle errors her
{
Console.WriteLine("Problem with: {0} - {1}", dirName, e.Message);
return;
}
}
Some details in this method bear highlighting. The most important is that the important code is executed inside a try block. If there are exceptions thrown, for example if a badly specified or non-existent directory name is passed in, they will be handled with an informative message.
How is an infinite loop avoided? There is no if or switch statement in this code but there is a decision point in it. The method only calls itself inside the foreach loop that processes directories. If there are no subdirectories in a directory the method will not be called again. The nature of directories handles the problem of setting a stopping state for us. When there are no more subdirectories in a directory, the files in the directory will be listed. When there are no more files in the directory the method ends without calling itself. No matter how many files or directories there are, there will still be a finite number of items to process.
Iteration
As discussed earlier, recursion may be used to process iterative actions. Recursion is used for calculations where the answer can be described in a function that relates to itself. For example, the Factorial calculation involves multiplying a set of number from n to 1. In other words, 5 Factorial (shown mathematically as 5!) is 5 is multiplied by 4. That result is multiplied by 3 and so on until the last number is multiplied by 1.
5! = 5 x 4 x 3 x 2 x 1
This can be expressed mathematically as F(n) = n x F(n-1). As you can see, this translates easily into a C# recursive function. A sample is below. Notice that the return type of the function is long. Factorial numbers can become very large.
static long Fact(int N)
{
if (N == 1)
return 1;
else
return (N * Fact(N - 1));
}
The functions stopping state is when the value passed into the function is one. This will always happen as long as the first call is greater then zero. This problem can also be solved using normal iteration. A for loop such as the code below will arrive at the same answer.
long Fac = x;
for (int z = x - 1; z > 1; z--)
Fac = Fac * z;
For many problems the iterative solution is easier for most people to follow. For other people, the recursive solution has an elegance and closeness to the mathematical notation that makes it attractive. The iterative solution uses somewhat fewer system resources but on a large memory system with a fast processor this in not a significant factor for this problem.
Summary
Recursive methods call themselves. Recursive methods are particularly suited for objects, such as directories, that contain objects like themselves. Care must be taken to insure that a recursive method moves toward a clearly defined and understood stopping point.
Recursion can also be used for iterative tasks. Recursive iteration should avoid applications that involve large amount of data.
Stack Overflow The stack holds context information for each subroutine call. When the stack runs out of room to hold this information it "overflows" and the program crashes.
Recursion A method that calls itself.
Projects