Iterators
Goals
- Understand the purpose of the iterator pattern.
- Use iterators in direct form.
- Implement
Iterator<E>
andIterable<T>
. - Use a custom iterable in an enhanced
for(…)
loop.
Concepts
- abstract data type (ADT)
- data structure
- design pattern
- iterable
- iterate
- iterator
- state
Language
for(…)
Library
java.lang.Iterable<T>
java.lang.Iterable.iterator()
java.util.Iterator<E>
java.util.Iterator.hasNext()
java.util.Iterator.next()
java.lang.Iterator.remove()
java.util.ListIterator<E>
Lesson
Abstract Data Types
The name of the LinkedList
interface we have been implementing so far has been a little too restrictive. The interface has a getLength()
method, and you can access any element by index using get(int index)
. But nothing about this interface requires that you use a linked list specifically. You'll note that Java arrays also provide a length, and also allow you to look up elements based upon an index. Arrays don't provide a get(int index)
method, but you could wrap a class around an array and provide your own get(…)
method to get access to the underlying array elements.
Arrays and linked lists are examples of data structures, and their names indicate how information is literally arranged in memory. We can step back and talk about information on a higher level: how the information is logically arranged and accessed. At this level, we could say that arrays and linked lists provide different implementations of how one might store a list
of items. In this view list
is an abstract data type (ADT) and describes how data is logically stored and accessed. Array
and linked list
are two data structures that can be used to implement a list
abstract data type.
With this more refined view, let's rename the LinkedList
interface to simply ListADT
for list abstract data type
. We'll also remove the getFirstNode()
method, because nodes are specific to graph-based data structures such as linked lists, but aren't necessary for other data structures such as arrays.
Array-Based List
To illustrate how the list ADT is distinct from the data structure used in implementation, we can construct a list ADT that is based on an array rather than a series of linked nodes.
Iterator<E>
When working with your LinkedListImpl
you might have stepped through the elements with a for(…)
loop, using an index variable to look up each element using get(int index)
. You might have realized that this approach is somewhat inefficient, because finding a particular node in a linked list required walking through and counting all the nodes, one at a time, until you get to the requested node. The earlier LinkedList
interface exposed a getFirstNode()
method that would have allowed a more efficient traversal of the nodes, but we have removed that from ListADT<E>
, and for good reason: it wouldn't work for other list ADT implementation such as ArrayListADT<E>
.
Java provides an iterator interface java.util.Iterator<E>
which allows us to step through or iterate across a series of items. It keeps track of the underlying state for us, shielding the developer from the underlying implementation of the ADT. If the ADT is based on an array, for instance, the iterator will keep track of the current index. If the ADT is based on a linked list, the ADT will keep track of the current node. The Iterator<E>
interface is simple; it has only two methods that must be implemented (although it has others with default implementations): Iterator.hasNext()
and Iterator.next()
.
If you had an iterator to a ListADT
, for example, you could use the following pattern without worrying about the underlying implementation.
Now we need to get an iterator to our ListADT<E>
. This is usually done by asking the ListADT<E>
itself for an iterator, as the list must provide an iterator that knows about its internal implementation details. We can thus add a method newIterator()
to the ListADT<E>
interface.
Because the iterator must know about the specific implementation of the ListADT<E>
, this indicates that there is a very close relationship between the iterator and list implementation. It is also unlikely that an iterator for a specific list type could be used generally by other classes. These considerations indicate that it might be a good idea to make our iterator implementation an inner class of the list implementation class. Because the iterator will need to access actual data from the list ADT, we will make the iterator a non-static inner class. The iterator will use an internal index
variable to keep track of the state of iteration—in this case the current position in the list.
Now you can use the iterator to iterate over a list.
Removing Items
The Iterable<E>
interface also provides an Iterator.remove()
method. By default this method throws an UnsupportedOperationException
, but if the underlying ADT is mutable and supports removal its iterator will implement Iterable.remove()
so that it removes the last retrieved item from the iterator. Here's how it could be used to remove all low miles-per-gallon vehicles from a list.
ListIterator<E>
An Iterator<E>
is extremely simple, allowing one to walk through any sequence of items in one direction. Java provides an iterator sub-interface called java.util.ListIterator<E>
which provides additional functionality for iterating sequences that behave like lists. A ListIterator<E>
has methods for going forwards and backwards in a sequence, and can even provide the previous and next indexes in a list.
Iterable<T>
Java further enhanced and standardized the way to retrieve iterators from an ADT using an iterable. The java.lang.Iterable<T>
interface indicates that this object knows how to return a new iterator to its items
and provides one central method for retrieving this iterator: Iterable.iterator()
. Java chose to use the simple method name "iterator()
" rather than "newIterator()
" as we used above, but the method functions the same.
We can consequently remove the newIterator()
method from our ListADT<E>
interface and simply extend the Iterable<T>
interface.
Enhanced for(…)
Loop
The Iterable<T>
interface does more than simply provide a standard way for retrieving iterators, however. Any Iterable<T>
implementation can be used in an enhanced for(…)
loop! Because our ListADT<E>
interface (and our ArrayListADT<E>
implementation) now ultimately implement Iterable<T>
, our printAnimals()
method becomes a lot simpler.
Yes, it really is that simple. Behind the scenes Java will invoke animals.iterator()
and use similar logic to what we used above, calling hasNext()
and next()
to step through the elements.
Review
Think About It
How an Iterator<E>
works inside may seem a little mysterious at first, but you should remember that an Iterator<E>
merely does the same job you would do if you were to iterate over your data structure at a low level. If you were manually iterating over an array (not using an enhanced for(…)
loop), you would create an index variable an increment it each time you go to the next item. An Iterator<E>
simply internalizes or encapsulates this variable, allowing you to update the variable using the Iterator<E>
methods.
If you were iterating over a linked list manually, you would create a variable pointing to a particular node. When it came time to go to the next node, you would check the reference to the next node and then update
the node reference to point to the next node, and so on until you run out of nodes. The Iterator<E>
of a linked list then will do the same thing. It will not be based upon an index, but on a reference to a next node
that is updated.
We say that an Iterator<E>
"encapsulates state", or the current conditions of iteration. When we work with an Iterator<E>
, it updates some internal information (whether that is an index or a node reference), so that when we come back to it the internal information is in the same state as we left it. An Iterator<E>
will keep track of this state, updating it as necessary, until we no longer reference the Iterator<E>
(such as when it goes out of scope) and it is discarded by the garbage collector.
Self Evaluation
- What will happen if you call
Iterator.next()
twice in a row? - When is it better to use an enhanced
for(…)
loop, and when is it better to work directly with iterators?
Task
Create an Iterator<E>
implementation for your LinkedListImpl<E>
class.
- Rename
LinkedList<E>
interface toListADT<E>
and have it extendIterable<T>
as illustrated above. - Remove the
getFirstNode()
method fromListADT<E>
and its implementations. - Rename your
LinkedListImpl<E>
class toLinkedListADT<E>
. - Implement an appropriate
Iterator<E>
for yourLinkedListADT<E>
class. - Create a unit test for the iterator that verifies each value returned for some linked list of
String
s.
Add two methods getKeys()
and getValues()
to your hash table. You should be able to use these methods in an enhanced for(…)
loop to iterate over the keys and values, respectively, of your hash table. The order of the returned values does not matter.
Acknowledgments
Iterable<T>
,Iterator<E>
, andListIterator<E>
code snippets extracted from source code Copyright © 1997, 2003, 2011, 2013, Oracle and/or its affiliates.- Some symbols are from Font Awesome by Dave Gandy.