Saturday, March 19, 2016

Python generators are pluggable

By Vasudev Ram


Generator image attribution

While working on a Python project, it crossed my mind that generators could be of use in it. A little research made me realize that generators are pluggable, i.e. they can be passed to functions, and then be used within those functions. This is because generators are a kind of Python object, and any Python object can be passed as an argument to a function.

This in turn is because almost everything in Python is an object (including generators), similar to how almost everything in Unix is a file. Both those concepts can enable some powerful operations.

Here is a program that demonstrates passing generator objects as arguments to another function, and then using those generators inside it:
# Program to show that generators are pluggable, i.e.,
# can be passed as function arguments, and then used 
# inside those functions to which they are passed.
# Author: Vasudev Ram - http://jugad2.blogspot.com
# Copyright 2016 Vasudev Ram

def gen_squares(fro, to):
    '''A generator function that returns a generator 
    that returns squares of values in a range.'''
    for val in range(fro, to + 1):
        yield val * val

def gen_cubes(fro, to):
    '''A generator function that returns a generator 
    that returns cubes of values in a range.'''
    for val in range(fro, to + 1):
        yield val * val * val

def use(gen):
    print "In use() function:"
    print "Using:", gen
    print "Items:",
    for item in gen:
        print item,
    print

print "Pluggable Python generators.\n"
print "In main module:"
print "type(use): ", type(use)
print "use:", use
print
print "type(gen_squares): ", type(gen_squares)
print "gen_squares: ", gen_squares
print "type(gen_squares(1, 5)): ", type(gen_squares(1, 5))
print "gen_squares(1, 5): ", gen_squares(1, 5)
print
print "type(gen_cubes): ", type(gen_cubes)
print "gen_cubes: ", gen_cubes
print "type(gen_cubes(1, 5)): ", type(gen_cubes(1, 5))
print "gen_cubes(1, 5): ", gen_cubes(1, 5)
print
for gen_obj in (gen_squares(1, 5), gen_cubes(1, 5)):
    use(gen_obj)
    print
Run the program with:
python pluggable_generators.py
Here is the output:
Pluggable Python generators.

In main module:
type(use):  <type 'function'>
use: <function use at 0x0202C3B0>

type(gen_squares):  <type 'function'>
gen_squares:  <function gen_squares at 0x0207BF30>
type(gen_squares(1, 5)):  <type 'generator'>
gen_squares(1, 5):  <generator object gen_squares at 0x020869B8>

type(gen_cubes):  <type 'function'>
gen_cubes:  <function gen_cubes at 0x0207BFB0>
type(gen_cubes(1, 5)):  <type 'generator'>
gen_cubes(1, 5):  <generator object gen_cubes at 0x020869B8>

In use() function:
Using: <generator object gen_squares at 0x020869B8>
Items: 1 4 9 16 25

In use() function:
Using: <generator object gen_cubes at 0x020869E0>
Items: 1 8 27 64 125
As you can see, I've printed both type(obj) and obj for many of the objects shown, to make it clearer what is going on. Also, a generator function and a generator object (the result of calling a generator function) are two different things, so they are printed separately as well.
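
The difference between a generator function and a generator object can also be checked programmatically. Here is a minimal sketch, written for Python 3 (so print is a function, unlike in the Python 2 listing above). It uses the standard library's inspect module and re-declares gen_squares so that it runs on its own:

```python
import inspect

def gen_squares(fro, to):
    '''Generator function: yields squares of values in a range.'''
    for val in range(fro, to + 1):
        yield val * val

squares = gen_squares(1, 5)  # calling the function creates a generator object

# The function itself is a generator function, not a generator.
print(inspect.isgeneratorfunction(gen_squares))  # True
print(inspect.isgenerator(gen_squares))          # False

# The object returned by the call is a generator, not a function.
print(inspect.isgenerator(squares))              # True
print(inspect.isfunction(squares))               # False

# The values are only computed when the generator is iterated.
print(list(squares))                             # [1, 4, 9, 16, 25]
```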

A few points about generators and their use:

They can potentially lead to less memory usage, since values are only generated on demand, i.e. evaluation is lazy.

They can help with separation of concerns, a key technique that leads to program modularity; the code for the actual generator functions like gen_squares and gen_cubes does not have to be embedded in the use() function, which makes both the generators and the use() function more reusable.

Someone could point out that gen_squares and gen_cubes could be written as regular functions instead of generator functions and simply called from use(), so their code still would not have to be embedded in use(). That is true. But in that case the calls would return lists, and if the lists were very large, they would use a lot of memory and might slow down or crash the program. Generators avoid this, because each item is generated just before it is used and is not stored afterwards. So the memory needed does not grow with the number of items generated.
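
The memory point can be illustrated with a rough sketch (Python 3). The list_squares function and the size n are hypothetical, added only for comparison. Note that sys.getsizeof reports only the size of the container object itself, so treat the numbers as indicative rather than as a precise measurement:

```python
import sys

def list_squares(fro, to):
    '''Regular function: builds and returns the whole list at once.'''
    return [val * val for val in range(fro, to + 1)]

def gen_squares(fro, to):
    '''Generator function: yields one square at a time.'''
    for val in range(fro, to + 1):
        yield val * val

n = 100000  # arbitrary size chosen for illustration
as_list = list_squares(1, n)
as_gen = gen_squares(1, n)

# getsizeof counts the container only, not the items it refers to,
# so the list figure is a lower bound on the memory it really holds.
print("list object size:", sys.getsizeof(as_list), "bytes")
print("generator object size:", sys.getsizeof(as_gen), "bytes")

# Both produce the same values; the generator produces them lazily.
print(sum(as_list) == sum(as_gen))  # True
```

The size of the list grows with n, while the size of the generator object stays the same whatever range it covers.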

Here are some links about Python generators:

Generators - Python Wiki

Stack Overflow - Understanding generators in Python

The image at the top of the post is a Ferranti two-phase AC generator set.

- Vasudev Ram - Online Python training and programming


5 comments:

Vasudev Ram said...

Oops. I initially did not escape the less-than and greater-than characters in the output shown in the above post. So some of the output has disappeared (interpreted as HTML tags by the browser). Fixed now, but people seeing the post via blog aggregators / feed readers may not see the right output, if it was picked up before the correction was made. Sorry, readers, for that.

Wil Cooley said...

If you don't need parameterization, you can also use a generator expression:

cubes = (i**3 for i in range(10))

Vasudev Ram said...

Good point.

But the purpose of this post was to show parameterization using generators, for its potential benefits as described. Also, a generator function is more general than a generator expression, because it can include many statements in it, that do stuff, as part of computing values to be yielded, whereas a generator expression has to be a single expression (by definition, even though it can include a few ifs in it).

However, I agree that it is more concise to use a generator expression when possible.
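
As an illustration of that difference, here is a minimal sketch (Python 3). The name gen_running_totals is hypothetical and not from the post. The generator function keeps state across several statements, while the generator expression covers the single-expression case:

```python
def gen_running_totals(values):
    '''Yield the running total of the values seen so far.'''
    total = 0
    for val in values:
        total += val  # a statement that updates state between yields
        yield total

# A generator expression handles the single-expression case concisely.
cubes = (i ** 3 for i in range(1, 6))

# Generators compose: the expression is passed to the generator function.
print(list(gen_running_totals(cubes)))  # [1, 9, 36, 100, 225]
```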

Wil Cooley said...

I felt it worth mentioning because being able to name and pass generator expressions is surprising and sometimes passed over in introductory writings.

It's an interesting asymmetry that, while generator functions are just functions and can be passed like functions, generator expressions can be named and passed, but ordinary expressions require "lambda" (unless there's something I've missed, which is entirely possible).

Vasudev Ram said...

Interesting indeed. I've noticed that Python has some other such things. Another one is the anomaly about Python 2 being able to compare objects of different types, for which no sensible (or default) ordering exists - something like that - I forget the exact details right now. And that is fixed in Python 3.