Lesson 3 Linked Lists
n
Understand what kinds of problems require sophisticated methods of
dynamic memory manipulation and the properties of a data structure that can
solve these types of problems.
n Understand the characteristics of a linked list that make it suitable for solving some kinds of problems, but not others.
n Visualize the logical structure of a linked list.
n Understand the operations on a linked list.
n Learn how to use pointers in C++ to manipulate dynamic data and to represent a linked list.
n Understand the difference between the address of a memory location and the value stored in a memory location.
n Understand how computer memory is organized to support different data types.
n Understand how the logical structure of a linked list can be independent of its structure in memory.
n Understand the costs and benefits of using linked lists.
Address-of operator (&) The C++ operator that, when applied to a variable, returns the specific location of that variable in computer memory.
Arrow operator (->) The C++ operator that allows you to select a member of an object or struct that is referenced by a pointer variable.
Dereference The operation by which a program uses a pointer to access the contents of dynamic memory.
Empty link (null pointer) A pointer value in a linked list (or other data structure) used to signify that no specific memory location is being referenced.
External pointers Links in a linked list (or other data structure) that are not contained in nodes within the structure.
Garbage collector A special module in some programming languages that automatically recovers unused dynamic memory when it is needed.
Heap store An area of computer memory where storage for dynamic data is available.
Heap underflow A condition known in which new dynamic memory can no longer be obtained from the heap.
Link (pointer) A pointer within a node of a data structure that references another node in the structure.
Linked list A list of data items where each item is linked to the next one by means of a pointer.
Logical structure The relationship among nodes in a data structure that is indicated by pointers instead of physical position in memory.
Memory leakage A program's exhausting the supply of dynamic memory because locations referenced by pointers have not been returned to the heap when they are no longer needed.
Random access data structure A structure in which we have the ability to access any elements without first accessing all preceding elements.
Run-time stack An area of memory used for the variables and data necessary for function calls, including the main program function.
Sequential access data structure A structure in which, to access a data element, we must start with a pointer to the first node and run a sequence of n - 1 next operations to reach the node at the nth position.
Sequential traversal A traversal that visits each data item in a list, from the beginning in sequence to the end.
To understand what linked lists are is to reflect on the analogy of tape players and CD’s: An array (as we used in previous lessons) is like a CD because random access to data is provided. For example, it would take the same amount of time to retrieve information from location a[3] as it would to reach data in a[100]. A linked list, on the other hand, works more like a tape. For example, when trying to access the fourth song on a tape, one must go through songs one through three. Similarly, linked lists have sequential access such as this. The benefit of linked lists lies within the behavior of inserting and deleting. Linked lists allows for more efficient use of insertion and deletion operations.
Look at the source code files for Example 3-1 are Fileprob.cpp and myfile.old.
Important Items:
· The file input problem. Discuss the original size of the apvector (list) when it is declared. There are 2 downfalls to declaring such a size: 1) It may be too large (causing wasted space) and 2) It may be too small (causing a logic error). One way around this is to use the resize operation. This is a great solution, but point out to students that there is still a price, particularly for large files (space).
·
The data movement problem. Discuss the behavior of
data when it is stored in a vector. When an item is inserted into the middle,
all data following that item must also be moved. Likewise, when a data item is
removed from a vector, all data items following it must be moved. These
situations were outlined in addFirst and removeFirst operations of the ordered
collection class. Similarly, inserting and removing data from files can lack
efficiency. The linked list ADT will allow insertions or removals without
causing the physical movement of any other data values in the list.
1. What are some of the shortcomings of previous methods we used for lists of information?
2. How are linked lists different from vectors?
3. In what ways are linked lists more efficient than using arrays?
Important Items:
· The characteristics and logical structure of a linked list. As you are already familiar with the concept of pointers and arrays, understanding nodes will be an easy concept for you. To understand a node is point at another student. The student is the value and your finger is the pointer.
· “Nodes” are similar to cells in arrays except that instead of only containing data, they also contain a link (or pointer) to the next node. The last node usually points to 0 (null).
·
Linked list operations. When creating a linked list
it is important to initialize three pointers (current, previous, first) and the
length. (Why are the three pointers necessary?)
Whenever a move forward is made the current pointer is moved to the next node after the current one and the previous pointer is moved ahead one node. Look at the figures in the book (see Figure 3-4).
Think of the situation when an array is accessed beyond its length. This will help you to detect the end of the list with nodes.
Look at the examples in the book.
Refer to a list of numbers (or other data) to use as examples to see the flow of
information. Four different cases are discussed in the text: empty list, if the
current pointer is pointing at the first node and the new data element is to be
inserted before the first node, the new data item is to be placed in the middle
of other pre-existing nodes, or if the new data is just to be appended to the
end of the list. Figure 3-12 offers an example, step by step of the process of
inserting a node into the middle of a linked list.
· Removing data from the list. Intuitively, now that you know how nodes work with linked lists, you may have an idea about how the remove operation works. Think of how the current and previous pointers behave. This will help you to start thinking about the remove operation. Again, a step by step process for this operation are offered in Figure 3-13.
·
Using a linked list to solve problems. Walk through
the code offered in Example 3-1 using actual data.
This section is concerned with developing the C++ class
template for a linked list. Handout
3.3A and 3.3B have the declaration file for linklist.h and the operations for
the ADT (given in the last section) on separate sheets which may help you
compare.
The source code for this section is contained in the Linklist.h
file.
Important items:
·
Defining a pointer to a node. Students should be
familiar at this point with the idea of a struct for representing data. This is
the most logical way to represent a node. To define a pointer, the syntax used:
<type of object pointed to> * <pointer variable name>
·
Declaring the get_node function. The line of code:
node * getNode(const E &item);
This function expects a data element as a parameter (item). It will return a pointer to a new node(node *) that contains the data element and a null pointer.
· Creating a linked list. The default constructor should set first, current, and previous pointers all to null and length to zero.
· The copy constructor for a linked list. While looking at the code for the copy constructor, go over each of the following steps as they occur.
To copy all of the data from the parameter list into the receiver list, the following steps are taken:
1. The probe pointer starts at the first node of the list.
2. The loop control condition returns false as long as probe points at a node in the list, but returns true when probe has reached the end of the list.
3. The expression probe -> data accesses the data component in the node pointed to by probe. This value is passed to insert to add it to the end of the receiver list.
4. The assignment statement probe = probe -> next; sets probe to the value of the next component in the node pointed to by probe. This has the effect of advancing probe to the next node in the list.
Also note the syntax for accessing a member of a node is:
<pointer variable> -> <node member name> (the arrow operator directs the computer to follow the arrow from the pointer variable to the designated member in the node.)
· Dereference. This is the operation by which a program uses a pointer to access the contents of dynamic memory. It’s also seen by using *.
· Selection. This is the process by which a member of a struct or class is accessed. It’s also seen by using . . Therefore, probe -> next is the same as *probe.next.
· The destructor for linked lists. Go through the code for the destructor to see how the nodes in the list are returned to the system by using first, empty and remove operations.
· The first and next operations. By walking through the code in both operations, you will see the current and previous pointers are changed.
· Accessing and modifying data in a node. Note that the user must test for the end of the list. If this is not done, the assert function will let the user know.
· Testing for the end of the list. By walking through this short line of code, you will see how either a TRUE or FALSE will be returned depending on what/ where the list is at. If the list is empty a TRUE is returned, or (because myCurrent is a node pointer) if myCurrent is pointing at a 0 (NULL) .
·
Inserting data into a linked list.
· Implementing getNode.
·
Removing data from a linked list.
It is important to either walk through the linked list code, using actual data. For example, if all of the items in the list were integers. Note the delete function is new here, it returns the memory pointed to by its operand to the system.
|
Operation |
Preconditions |
Postconditions |
|
create |
The list is in an unknown state. |
The list is empty. |
|
empty |
The list is initialized. |
Returns true if empty, false otherwise. |
|
atEnd |
The list is initialized. |
Returns true if the current pointer has run off the end of the list, false otherwise. |
|
length |
The list is initialized. |
Returns the number of nodes in the list. |
|
first |
The list is initialized. |
Moves the current pointer to the first node in the list. |
|
next |
The current pointer points to a node. |
Advances the current and previous pointers ahead by one node. |
|
access |
The current pointer points to a node. |
Returns the data element stored in the current node. |
|
modify(newData) |
The current pointer points to a node. |
Sets the data in the current node to the new data. |
|
insert(newData) |
The list is initialized. |
If atEnd is true, then inserts the new data into a new node at the end of the list. Otherwise, inserts the new data into a new node before the current one. Makes the new node the current one. |
|
remove |
The current pointer points to a node. |
Removes the current node from the list, makes the node following this node the current one, and returns the removed data to the caller. |
Handout 3.3B
|
// Class declaration file: linklist.h #ifndef LINKLIST_H template <class E> class LinkedList
// assignment
// accessors
// modifiers E remove(); private:
// Data members // External pointers to the list
node * myFirst;
// The number of nodes in the list
int mySize;
// Member functions
node * getNode(const E &item); |
Important Items:
· Addresses and values of simple variables. So far we have illustrated memory locations where values can be stored/ accessed in an abstract way. By looking at the figures in the text, you will see the way in which integers and real numbers are stored in memory, both abstractly and at the machine level. note that two cells are required for the real number.
· Address and values of vector variables. When looking at the figures in this section, note that the binary address of the first cell in the vector is how the machine “names” the vector. Each cell following is 1 greater than the previous one.
· Addresses and values of pointer variables. The address-of operator (&) is used to access the address of a variable. It obtains the address of any C++ variable and stores this value in a pointer variable. The example given in this lesson helps illustrate this point. (Note how the machine knows the difference between storing an integer value like 10 and an address value like 1000. The deference operator (*) works as the inverse of the address-of operator. It takes a pointer value and returns the value being stored at that address. The segment of code and figures (3-18 and 3-19) will help you understand this concept.
· Pointers and dynamic allocation of memory. You will greatly benefit by carefully inspecting each of the figures in this section(3-21 through 3-25) . A dynamic data structure reserves memory as it is needed where a static data structure occupies the same amount of memory every time the program is run.
· · The costs of using linked lists. Access time is an issue. Random access indexing with linked lists cannot be used. This is because the logical structure of a linked list is not dependant on its physical structure in memory. Instead, linked lists are a sequential access data structure, where we start with a pointer to the first node and step through the nodes in sequence until we reach the one we are after. Therefore, the time of access is dependant on the position of the element being sought in the list.
· A linked list of a million nodes would require a million such cells. By contrast, an array of the same logical size would require one million fewer memory cells. Likelihood of program errors is another potential cost of using linked lists. The manipulation of pointers can be complicated and error prone. Also, if one forgets to return dynamic memory with delete (after asking for it with the new operator) could result in dynamic memory being lost (memory leakage). If this is severe enough, a condition in which dynamic memory becomes unavailable (heap underflow) may occur.
·
The benefits of using linked lists. Two examples of
benefits are: (1) The low cost of inserting/ removing a given data element, and
(2) problems like the file input problem (or data movement problem) which use
less memory than previously studies methods.
1. Discuss the difference between storing information such as an integer and storing information such as a reference to memory. How does the machine know the difference?
2. Discuss the situations where linked list would be beneficial and when they may not.
When using the one-key table class with vectors, we needed
a fixed upper bound. If more memory was needed, the problem would not be solved.
But with the use of linked lists, this is not a problem as the memory is
provided on an as-needed basis. This section redefines and implements the
one-key table class so that it uses linked lists rather than the vector method
used in Lesson 2.
The main focus should be on the protected data members section, as it is the only change in the definition file. Also, there is no need to pass the length as a parameter as this length will be the same as the length of the linked list. Also the implementation code becomes easier, as we no longer have to deal with indices, but rather high level operations (first, next, insert, remove, etc.) will be used to manipulate pointers.
Project 3-14 in the end-of-lesson Projects involves the
large integer case which a past AP exam covered. The college board previously
offered a manual for students involving this case at: http://www.collegeboard.org/ap/computer-science/html/indx001.html.
The source code files supplied for this section are Onetable.cpp, Onetable.h, Assoc.cpp, and Assoc.h. They are contained in Lesson3. A short demo program, named Tabdriv.cpp, is also provided to verify that the one-key table implementation works correctly.
You should use the Radix.cpp file in Lesson3
as the starting point for this Case Study.