Algorithms
· Learn new vocabulary
· Demonstrate searching through unsorted information
· Demonstrate searching through sorted information
· Demonstrate how to swap the contents of variables
· Explain how to sort information in ascending and descending order
The ability of the computer to search through large amounts of information in has been one of the important features of computers since they first started being used. Sorting information so that it can be searched more easily and faster is an important part of the searching process. While C# supports array methods such as Sort and BinarySearch it is important that computer scientists and programmers understand how those methods work. It is not practical to put every dataset into an array. There are many applications where large amounts of data must still be searched and sorted by programmer written code.
People search for information in many different ways. You probably search through an assignment book by looking at the top of the page and reading down until you find a particular assignment. You will search for a phone number quite differently. You search will search for a phone number by jumping somewhere in the middle of the phone book, finding a name close to the one you are looking for and moving forward and backward until you zero in on the name you need.
The way you program a search, just like the ways you perform a search on your own, will depend on how much data must be searched and how the data is stored. It will also depend on how often the data must be searched. If data is searched often then the fastest and most efficient search should be used. This will often mean that the data should be sorted before searching. Sorted data can be searched very quickly and efficiently by using a binary search such as the one used in the BinarySearch method. If the amount of data is small or is seldom searched then the work involved in sorting first may not be a good use of time. A sequential search will meet the need to occasionally search small amounts of data.
Think about searching for a name in a list of names. If the list is not in alphabetical order your only option is to start at the beginning of the list, compare each name in the list until either you find it or you read the end of the page without finding it. That in a nutshell is a sequential search. The next question is turning that idea into C# code.
The foreach statement makes processing an array or a collection of object very easy. These statements form the basis of many sequential search algorithms. The additional step that is needed is to compare the item being processed with a key field that identifies the whole record or object. Once a match has been found, the loop can be exited and the index of the found item can be returned. If the end of the list is reached without a match being found then an appropriate code should be returned.
The following is a static member of a class, the Student class in this case, that performs a sequential search based on a student’s last name.
public static int SequentialSearchL(Student [] studentArray,
string findName)
{
int index = 0;
foreach (Student s in studentArray)
{
if (s.LastName == findName)
return index;
index++;
}
return -1;
}
An if statement inside the foreach loop compares the LastName for each student in the array. If a match is found then the current index value is returned. If a match is not found then an index counter is incremented. If the loop exits then it means that no match was found and a negative one is returned. This code can be used with any array of Student type. Because it is a static method, it is called using the class name. The following code demonstrates this method in action.
index = Student.SequentialSearchL(stu, "Jones");
if (index >= 0)
Console.WriteLine("{0} {1} {2}", stu[index].FirstName, stu[index].LastName, index, stu[index].age);
else
Console.WriteLine("Student not found");
Notice that the return code is checked before being used. This prevents attempts to access no existent elements in the array.
The are times when a file needs to be searched sequentially for multiple matches for a key value. The Student class has a gender property that returns “M” for all male students. The following code demonstrates a simple search for all male students in an array.
foreach (Student s in studentArray)
if (s.Gender == "M")
Console.WriteLine("{0} {1}",s.FirstName, s.LastName);
Sequential searches are adequate for small data samples. They are the only option for unsorted information or for cases when every item must be checked. They are very inefficient for large amounts of sorted data. Consider a list of 1,000 names. On average it will take 500 compares to find the item you want. Half of your searches will find the item in the bottom half of the data but another half will take until the last half of the data. If the data is searched a hundred times a day that means 50,000 compares. That may not take too long on a modern computer but it gets rapidly worse as the amount of data items or searches required increases.
A binary search is how you probably look up words in a dictionary or numbers in a telephone book. You jump to the middle and see if what you want is in the first half or the second half of the book. Then you go to the middle of that half and repeat the process. In a short number of compares you will find yourself at the right place. This is a lot faster then starting at the beginning of the book and checking each item until you get to the one you want.
Using a binary search, you can reach any of 128 items in 7 tries or fewer. To find an item in a list of 256 takes 8 or fewer tries. As you can see, each doubling of the number of elements only increases the maximum number of compares by one. This is the maximum because it is possible to reach an item in fewer compares.
A binary search only works with sorted data because it involves finding where an item relates to a position of the sorted list. If the dictionary or the phone book were not sorted you would not be able to use this method to find words or phone numbers.
Once you understand the theory it is not difficult to code this algorithm in C#. First you must identify the top and bottom of the array. All arrays start at zero and the Length property returns the length. The index of the last element in an array is Length – 1.
The next step is finding the middle and comparing what is there to the value you are looking for. If there is a match, the search is over. If there is not a match then the search range must be adjusted. If the item you are looking for is higher then the middle then the element above the middle becomes the new bottom of the area to search. If the element you are looking for is lower then the middle, the element below the middle becomes the new top of the search area. In either case you do not have to check the middle location again.
Eventually one of two cases must happen. Either a match is found or the bottom gets a larger value then the top. When the latter happens it means that the item you are looking for is not in the list. This is shown in the following code.
public static int BinarySearch(Student [] studentArray, Student whoFind)
{
int bottom = 0;
int top = studentArray.Length - 1;
int middle = (top + bottom)/2;
while (top >= bottom )
{
switch (studentArray[middle].LastName.CompareTo(whoFind.LastName))
{
case 1 : // look below
top = middle - 1;
break;
case -1 : // look above
bottom = middle + 1;
break;
default: // found it
return middle;
}
middle = (top + bottom)/2;
}
return -1; // Item not found
}
The CompareTo method is part of the string class and is used to compare the names in two Student objects. Numeric fields can also be used in searches but a set of if statements might be easier to use in that case.
Sorting means to place in order. You have probably sorted things most of your life. You have put things in order by size, by number, and alphabetized lots of lists. You have probably been asked to put the encyclopedia back into order as well. Have you ever really thought about the actual process of sorting? Sorting actually involves many steps that you do without even thinking. In fact you probably do not even realize it but you sort different things in different ways. Sometimes you place things in small groups or stacks and sort each group before putting the groups in order. Sometimes you create a sorted pile and an unsorted pile. Then you look at each unsorted item, locate the right place for it in the sorted pile and slide it in to place. You can probably think of other ways you sort things if you try.
For you to write a computer program to sort information you actually have to think about all the steps involved and determine how to do them.
In many sort algorithms, the first step is to compare two items and see if one is greater or less than the other. It the items you want are not in the correct order, then you must change or “swap” the order. That means that each must be placed in the others position.
Swapping is not as simple at it seems. If you copy one item into a second item the second item is lost forever. The second item must be saved before a new value can replace it. Temporary variables are used for this purpose. The temporary variable holds a value during the swap so that it is not lost. The following code swaps two values.
Temp = myValue;
myValue = yourValue;
yourValue = Temp;
The temporary variable must be the same type as the variables it is used to swap. Swapping two values of a given class requires a temporary variable of that class. You will see swapping code like this in a number of the sort algorithms in this chapter.
One method of sorting information is called the Selection Sort. This sort method works by selecting one number at a time and putting it in order. The first step is to select the highest value in the list. The value that is last in the list is swapped with the largest value in the list. The next step is to find the second largest value and swap it into the next to last position. This continues until all the values are in the proper order.
The figures below show how this works with a small set of data. Fig 12-x Unsorted Numbers shows a list of numbers in random order.


The largest number in the list is 9. This number will be swapped with the 7 in the last box as shown in Fig. 12-x First Swap.
Fig. 12-x First Swap
After the first swap, the 9 is now in the correct position. (Fig. 12-x Second Swap) The 8 in the first position is the next largest value and must be swapped with the 1 in the next to last position.

Fig. 12-x Second Swap
After the second swap, the 8 and 9 are both in the correct position. The next largest value is the 7. Fig. 12-x Third Swap shows that the 7 and 3 must be swapped.

Fig. 12-x Third Swap
Fig. 12-xThree in Order shows the list after three swaps have been completed.

Fig. 12-x Three in Order
This process will continue until all the numbers are in order.
Programming a selection sort starts search for the largest value in the list. This is done with a loop inside another loop. The outside loop moves a pointer to the last element in the array as each value is place at the end.
The heart of the selection sort is choosing the highest value so that it can be placed into the end of the list. The following code is part of a routine to sort a Student array passed as studentArray. The variable last holds the pointer to the last unsorted element in the array. Initially there are no sorted elements so this is the last element in the array. As elements of the array are compared, the program keeps track of the index of the highest value. The variable largest does not hold the highest value. It holds the index into the array of the highest value. No values are moved until the search is completed. When the search is completed, the largest unsorted value will be placed at the end of the array and the current value at that location will replace it in the unsorted area of the list.
public static void SelectionSort(Student [] studentArray)
{
int index;
int last = studentArray.Length;
int largest;
Student temp = new Student();
while (last > 1)
{
largest = 0;
// Find the largest value
for (index = 1; index < last; index++)
if (studentArray[largest].LastName.CompareTo(studentArray[index].LastName) == -1)
largest = index;
// Place the largest value at the end of the array
temp = studentArray[largest];
studentArray[largest] = studentArray[last-1];
studentArray[last-1] = temp;
last--;
}
}
The counter variable (last) for the outer loop sets the terminating value for the inner loop. As values move into their correct location at the end of the array they no longer need to be compared. The selection loop will compare fewer and fewer elements of the array as more of the values are in their proper place.
The insertion sort works differently from the selection sort. It does not swap elements. The insertion sort slides them to the right.
public static void InsertionSort(Student [] studentArray)
{
int i, j, k;
int max = studentArray.Length;
Student current = new Student();
for (i = 1; i < max; i++)
{
current = studentArray[i];
// Find the correct position for current element
// among the first i-1 elements
for (j = 0; j< i; j++)
if (studentArray[j].LastName.CompareTo(current.LastName) >= 0)
break;
// Slide all the elements after this location one
// element to the right
for (k = i-1; k >= j; k--)
studentArray[k+1] = studentArray[k];
// Insert current element in its new position
studentArray[j] = current;
}
}
A sequential search starts at the beginning of the data to be searched. Each piece of data from the beginning will be compared until the end of the data is read or a match for the search is found. On average, half the data in the file will have to be compared before the search is completed. The times when information is found near the beginning of the data will be balanced by the times when the information will not be found until near the end of the data. If the information is not in the data then all the data will have to be compared before that can be determined.
A binary search starts in the middle of the data. If the information being sought is below the middle then the search will continue in the middle of the lower half of the data. The search will continue to identify a range of data where the information might be found and compare the middle piece of the range. The search will finish when the information is found or when there is no possible search range to check. Only a small number of data pieces relative to the size of the data will ever have to be compared.
Sorting data involves changing the location of information to place it in a specific order. A programmer must be careful when moving a piece of information to a new location that the information already in that location is not lost. The use of temporary variables to hold information during the sorting process helps prevent losing data.
The selection sort works by first finding the piece of information that belongs at one end of the data. The information that belongs at the end is put at the end. The program then searches for the information that belongs in the position next to the end. This process speeds up as more information is put in its proper place and fewer pieces are left to search. Only one piece of data is moved closer to its final location on each swap.
1. Why is a temporary variable needed when the values in two variables must be swapped?
2. Why can sorted data be searched faster then unsorted data?
3. What things determine the value of sorting data before searching?
4. Does having data in sorted order improve the speed of sequential searches? Why or why not?
5. What kind of data is required for a binary search to work properly?
6. What kind of data is required for a sequential search to work properly?
7. What other kinds of data besides arrays may be sorted?
8. Which type of search must compare the most pieces of information on average? Sequential or binary?
Key A key is a field in a record that identifies the whole record. Keys are used to help locate larger pieces of information.
Parallel Arrays Two or more arrays of the same length that sort related information. The information in a location or index in one array is related to the information in the other arrays that have the same location or index. When parallel arrays are sorted all related arrays must be moved in the same ways.
Selection Sort A selection sort orders information by selecting individual items and placing them in their correct position one at a time.
Sort Key A sort key is a field or array that is used to sort data. All the data in a record or group of parallel arrays will be moved during a sort but the sort key is the field that is used to determine the new order of the data.
Temporary variable A temporary variable is a variable that is used to hold information while an operation is processed or a permanent location is being prepared.
Projects