An iterable is just a class that defines an __iter__(self) method; the object that method returns is the iterator. Python calls __iter__() automatically when iter(inst) is called, where inst is an instance of the class. After performing any beginning-of-iteration initialization, __iter__() returns an object that implements a __next__() method. Often it simply returns self, because the class itself implements __next__(). This makes __iter__() a good place to set the iterator's initial values.
The __next__() method is called whenever next() is called on the iterator. It should raise a StopIteration exception to signal that there are no more values. To produce the next value, simply return it (do not use yield!).
An example:
class Fib:
    '''Generate the Fibonacci series up to max.'''
    def __init__(self, max):
        self.max = max  # instance attribute: upper bound for values
    def __iter__(self):
        self.a = 0  # perform initialization
        self.b = 1
        return self
    def __next__(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration  # stop
        self.a, self.b = self.b, self.a + self.b
        return fib  # next value
To use the iterator in a for loop:
for n in Fib(1000):
    print(n)  # do something with each value
The for loop automatically creates the iterator by calling iter(), calls next() on it repeatedly, and stops when StopIteration is raised.
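What the for loop does behind the scenes can be sketched by hand. This only approximates the interpreter's behavior; the helper name manual_for and the list values are made up for illustration.

```python
def manual_for(iterable):
    """Collect items in the order a for loop would visit them."""
    result = []
    it = iter(iterable)          # the for loop first calls iter()
    while True:
        try:
            item = next(it)      # then calls next() repeatedly
        except StopIteration:    # and stops quietly on StopIteration
            break
        result.append(item)      # stands in for the loop body
    return result


values = [10, 20, 30]
print(manual_for(values))
```

The StopIteration exception never reaches the caller, which is why an ordinary for loop ends without any visible error.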
There are several special methods and built-in functions related to iterators:
__reversed__: returns a reversed iterator. The built-in reversed() function calls it. Reversed iteration only works if the object supports the sequence protocol (__len__() and __getitem__()), so that its size can be determined, or if __reversed__ is implemented.
enumerate(iter): returns an enumerate object, which is an iterator that yields successive tuples consisting of a counter (the index, starting at 0 by default) and the value obtained by calling next() on the iterable passed in. It is useful for keeping track of indexes during iteration.
zip(a, b): creates an iterator that produces tuples (x, y) where x is taken from a and y is taken from b, so that several iterables can be iterated over at the same time. Iteration stops as soon as one of the inputs is exhausted. If this behavior is not desired, use itertools.zip_longest() instead. zip can also be used to create a dictionary from two iterables: dict(zip(headers, values)).
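The three helpers above can be tried out in a few lines. This is a minimal sketch: the Countdown class and the headers and values lists are hypothetical and exist only for this example.

```python
class Countdown:
    """Hypothetical iterable that counts down from start to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))

    def __reversed__(self):
        # Called by the built-in reversed()
        return iter(range(1, self.start + 1))


forward = list(Countdown(3))              # [3, 2, 1]
backward = list(reversed(Countdown(3)))   # [1, 2, 3]

headers = ["name", "age", "city"]
values = ["Ada", 36]                      # deliberately one item shorter

indexed = list(enumerate(headers))        # [(0, 'name'), (1, 'age'), (2, 'city')]
pairs = list(zip(headers, values))        # stops at the shorter input
record = dict(zip(headers, values))       # {'name': 'Ada', 'age': 36}
```

Because values is shorter than headers, zip silently drops "city", which is the stopping behavior described above.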
iter function
It returns an iterator object.
iter(obj): obj must be a collection object that supports the iteration protocol or the sequence protocol, or TypeError is raised.
iter(callable, sentinel): the returned iterator calls callable with no arguments on each call to its __next__() method; if the value returned is equal to sentinel, StopIteration is raised, otherwise the value is returned.
One useful application of the second form is to read lines of a file until a certain line is reached.
with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):
        process_line(line)
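The two-argument form also works for fixed-size reads. The sketch below assumes an in-memory io.StringIO as a stand-in for a real file, and uses functools.partial to turn read(4) into the zero-argument callable that iter() expects.

```python
import io
from functools import partial

stream = io.StringIO("abcdefghij")   # stand-in for a real file object

# partial(stream.read, 4) takes no arguments and returns up to 4 characters;
# read() returns '' at end of input, which serves as the sentinel.
chunks = list(iter(partial(stream.read, 4), ""))
print(chunks)  # ['abcd', 'efgh', 'ij']
```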
itertools module
It provides many useful functions on iterators.
itertools.islice(iter, start, end): slices an iterator. It does so by stepping through the iterator and discarding the unwanted items. end can be None to return everything beyond start.
itertools.dropwhile(func, iter): discards the leading items of iter for as long as the supplied function returns True, then yields everything that remains.
itertools.permutations(items, num): generates permutations of items with length num. If num is not specified, each permutation has the same length as items.
itertools.combinations(items, num): generates combinations. itertools.combinations_with_replacement() allows the same item in items to be repeated.
itertools.chain(): chains multiple iterables together. It hides the actual type of each underlying iterable, and it is more efficient than combining the sequences first and then iterating.
itertools.product(a_list, b_list): returns an iterator over the Cartesian product of two sequences.
itertools.groupby(a_list, key): returns an iterator of (key, group) pairs, where each group is itself an iterator over the elements of a_list that share the value generated by the key function. Only consecutive elements are grouped, so it works as expected only if a_list is already sorted by key.
itertools.zip_longest(): does the same thing as the built-in zip() except that it stops at the end of the longest sequence, padding the shorter sequences with None values.
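A short tour of the functions listed above, as a sketch with made-up inputs. The numbers() generator and the words list are assumptions for the example; the results in the comments follow from the documented behavior of each function.

```python
import itertools


def numbers():
    """Hypothetical generator producing 0..9."""
    yield from range(10)


sliced = list(itertools.islice(numbers(), 2, 5))        # [2, 3, 4]
rest = list(itertools.islice(numbers(), 7, None))       # [7, 8, 9]

# Drops leading items while the predicate holds, keeps everything after
dropped = list(itertools.dropwhile(lambda x: x < 3, [1, 2, 5, 1, 7]))  # [5, 1, 7]

perms = list(itertools.permutations("abc", 2))          # 6 ordered pairs
combos = list(itertools.combinations("abc", 2))         # 3 unordered pairs
combos_rep = list(itertools.combinations_with_replacement("ab", 2))

chained = list(itertools.chain([1, 2], ("x", "y")))     # [1, 2, 'x', 'y']
pairs = list(itertools.product([1, 2], "ab"))           # 4 tuples

# groupby only groups consecutive items, so sort by the same key first
words = sorted(["apple", "bob", "avocado", "banana"], key=lambda w: w[0])
grouped = {k: list(g) for k, g in itertools.groupby(words, key=lambda w: w[0])}

padded = list(itertools.zip_longest([1, 2, 3], "ab"))   # [(1, 'a'), (2, 'b'), (3, None)]
```

Each call returns a lazy iterator; list() is used here only to make the results visible.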
Unpacking
Any sequence (or iterable) can be unpacked into variables using a simple assignment operation. The only requirement is that the number of variables and structure match the sequence.
x, y = (4, 5)
a, b, c = [1, 2, 3]
a, b, c = {1, 2, 3}  # works, but a set has no guaranteed order
m, n = {1: "1", 2: "2"}  # unpacks the keys: m = 1, n = 2
Unpacking actually works with any object that happens to be iterable, not just tuples or lists. This includes strings, files, iterators, and generators. When unpacking, you may sometimes want to discard certain values. Python has no special syntax for this, but you can often just pick a throwaway variable name for it. For example:
x, y, _ = [1, 2, 3]
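The same assignment syntax applies to other iterables. A small sketch with invented values (the date string is only an example):

```python
a, b, c = "xyz"                              # strings are iterable
sq0, sq1, sq2 = (n * n for n in range(3))    # so are generators
_, month, _ = "2024-05-17".split("-")        # keep only the middle field

print(a, b, c)         # x y z
print(sq0, sq1, sq2)   # 0 1 4
print(month)           # 05
```

The underscore is an ordinary variable name; using it merely signals that the value is not needed.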
To unpack an iterable longer than the number of variables, use a “star expression”:
first, *middle, last = iterable
It’s worth noting that the middle variable will always be a list, regardless of how many items are unpacked (including none). There is a certain similarity between star unpacking and list-processing features of various functional languages. For example, if you have a list, you can easily split it into head and tail components:
head, *tail = items
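The points above can be checked with a few concrete values. This is an illustrative sketch; the total() function is a hypothetical example of head/tail processing, not a recommended way to sum a list.

```python
first, *middle, last = [1, 2, 3, 4, 5]
print(middle)            # [2, 3, 4]

start, *nothing, end = "ab"
print(nothing)           # [] (still a list, even with no items left over)


def total(items):
    """Recursive sum using head/tail splitting (illustrative only)."""
    if not items:
        return 0
    head, *tail = items
    return head + total(tail)


print(total([1, 2, 3, 4]))   # 10
```

Each recursive call copies the tail into a new list, so for real code the built-in sum() is the better choice.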