Update: fellow algorithms researcher Francisco Claude just posted a great article about using lazy evaluation to solve Tic Tac Toe games in Common Lisp. Niki (my brother) also wrote a post using generators with asynchronous prefetching to hide IO latency. Worth a read I say!
I’ve recently been obsessing over this programming idea called streams (also known as infinite lists or generators), which I learned about from the Structure and Interpretation of Computer Programs book. It is kind of like an iterator that creates its own data as you go along, and it can lead to performance increases and wonderfully readable code when you utilise them with higher order functions such as map and reduce (which many things can be rewritten in). It also allows you to express infinitely large data structures.
When regular lists are processed with higher order functions, you need to compute the entire list at each stage; if you have 100 elements, and you map a function to them, then filter them, then partially reduce them, you may be doing up to 300 operations, but what if you only want to take the first 5 elements of the result? That would be a waste, hence streams are sometimes a better choice.
Although SICP details how to do it in Scheme, in this blog post I will show some languages that have it built in – Haskell and Python – and how to implement streams yourself if you ever find yourself in a language without it1.
Haskell is a lazy language. It didn’t earn this reputation from not doing the dishes when you ask it to (although that is another reason it is lazy). What it means in the context of formal languages is that evaluation is postponed until absolutely necessary (Here is a cute (illustrated) blog post describing this lazy evaluation stuff). Take this code for example:
Prelude> let x = [1..10]
At this stage you might be tempted to say that x is the list of numbers from 1 to 10. Actually it only represents a promise that when you need that list, x is your guy. The above code that creates a list from 1 to 10 still hasn’t been executed until I finally ask it to be (by referring to x):
Prelude> x [1,2,3,4,5,6,7,8,9,10]
It is kind of like telling your mum you’ll do the dishes, but waiting until she shouts your name out again before you put down your DS. Actually, it is sliiiiightly different – if I instead wrote:
Prelude> let x = [1..10] Prelude> let y = x ++ [11..20]
I have referred to x again when I declared y, but x still hasn’t evaluated. Only after I shout y’s name will y shout x’s name and give me back my whole list:
Prelude> y [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
Here you ask your robot to wash half the dishes, but he is too busy playing DS too (stupid robot). Finally when your mum shouts, you shout at the robot, and he does his set of dishes, and you do yours. But what is the benefit here? It isn’t that I can get more DS time in…
Take for example a list of positive integers from 1. Yes, all of them. In other languages it might be hard to express this, but in Haskell it is as simple as
[1..]. This means we have a list of infinite integers, but we will only calculate as far as we need:
Prelude> let z = [1..] Prelude> head z 1 Prelude> take 5 z [1,2,3,4,5]
The syntax here is amazingly terse, and it may make your code more efficient. But even if we don’t have the syntax for it in another language, we can provide it ourselves very easily.
Python has a similar concept called generators, which are made using the
yield keyword in place of
return, more than one time (or in a loop) in a function:
def integers_from(N): while(1): yield N N += 1 >>> z = integers_from(1) >>> z.next() 1 >>> z.next() 2
Note: Python generators are stateful and are hence slightly different to an infinite list in Haskell. For example
z.next() returns different values in two places, and thus is time sensitive – we cannot get z to ‘rewind’ like we could in Haskell, where
z is stateless. Statelessness can lead to easier to understand code, among other benefits.
Rolling Our Own
Let’s reinvent this wheel in Python (but in a stateless manner), so if we ever find ourselves craving infinite lists we can easily roll our own in pretty much any language with Lambdas.
I have chosen Python to implement this, even though it already supports infinite lists through generators, simply because its syntax is more accessible. Indeed, the below can already be done with Python’s built-in-functions (although with state). It is probably not a great idea to do it this way in Python, as it doesn’t have tail call optimisation (unless you use this hack using decorators and exceptions).
First we’ll look at adding lazy evaluation, however the syntax requires it to be explicit:
>>> x = lambda: 5 >>> y = lambda: 2 + x()
x is not 5, and
y is not 7, they are both functions that will evaluate to that when we finally run them; the expression inside the lambda won’t be evaluated until we do so explicitly:
>>> x() 5 >>> y() 7
And that’s pretty much all the heavy lifting. To make an infinite list, we basically make a linked list where we generate each node as we need it:
def integers_from(N): return (N, lambda: integers_from(N+1)) def head((H, _)): return H def tail((_, T)): return T()
And there is our infinite list. To access it use
tail() (recursively if necessary):
>>> z = integers_from(1) >>> head(z) 1 >>> head(tail(z)) 2
First we should make a way for us to look at our streams:
def to_array(stream): return reduce(lambda a, x: a + [x], , stream)
null_stream = (None, None) def reduce(f, result, stream): if stream is null_stream: return result return reduce(f, f(result, head(stream)), tail(stream))
We needed some way to tell if we had reached the end of a stream – not all streams are infinitely long. Meet our next function, which will help us terminate a stream:
def take(N, stream): if N <= 0 or stream is null_stream: return null_stream return (head(stream), lambda: take(N-1, tail(stream)))
This will take the first
N elements from the specified stream. So now we can inspect the first
>>> to_array(take(10, integers_from(1))) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
For our upcoming example, we also need a
filter() method, which will filter out elements that meet a provided predicate:
def filter(pred, stream): if pred(head(stream)): return (head(stream), lambda: filter(pred, tail(stream))) return filter(pred, tail(stream))
Now onto our example :)
Here is the standard example to demonstrate the terseness of streams:
def sieve(stream): h = head(stream) return (h, lambda: sieve( filter(lambda x: x%h != 0, tail(stream))))
Here is a function which recursively filters anything which is divisible by any number we have previously seen in our stream. Math aficionados will notice that this is the Sieve of Eratosthenes algorithm.
>>> primes = sieve(integers_from(2)) >>> to_array(take(10, primes)) [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
Recursively defined data, and only as much of it as we want – pretty neat.
And Now For Something Almost Completely Different
When I first saw this, I wondered what application there might be to have a stream of functions. Here I have defined a stream which recursively applies a function to itself:
def rec_stream(f): return (f, lambda: rec_stream(lambda x: f(f(x))))
When might this be useful? It might yield speed improvements if you commonly want to recursively apply a function a certain amount of times, but have costly branching (so the condition check at each level is slow). It could also be used as a abstraction for recursive iteration 2, which gives you back the function ‘already recursed’ so to speak (although lazily).
One such recursive process I can think of is Newton’s method for approximating roots, defined recursively as:
f(x) = 0.
The more iterations you do the more accurate the solution becomes. One use of Newton’s method is to use it until you have reached a certain error tolerance. Another way, which I learned about recently when reading about the fast inverse square root algorithm, which uses just one step of Newton’s method as a cheap way to improve it’s (already pretty good) initial guess. There is a really great article here which explains it very well.
After reading that, I wondered about a stream that would consist of functions of increasing accuracy of Newton’s method.
def newton(f, fdash): return lambda x: x - f(x)/float(fdash(x))
newton() function accepts
f'(x), and returns a function that accepts a first guess.
def newton_solver(iters, f, fdash): def solve(v): n = newton(lambda x: f(x) - v, fdash) stream = rec_stream(n) return to_array(take(iters, stream))[-1] return solve
This one is a little more complicated. In order to have it solve for a value other than zero, I needed to either define it in
f(x) must equal zero, but I didn’t want the user to have to iterate over the stream each time they wanted to compute the square root of a different number, say. To allow it to return a function that solved for square roots in the general case, I had to make the internal function
solve(), which would bind for the value the caller specifies, hence solving
f(x) = v for
x. Hopefully this becomes clearer with an example:
>>> sqrt = newton_solver(1, lambda x: x**2, lambda x: 2*x) # 1 iter >>> sqrt(64)(4) # Sqrt of 64 with initial guess of 4 10.0 >>> sqrt = newton_solver(3, lambda x: x**2, lambda x: 2*x) # 3 iters >>> sqrt(64)(4) 8.000000371689179
Now we can pass around this square root function and it will always do 3 iterations of Newton’s method.
This may not be practical unless compilers can optimise the resulting function (or if there is a way to do the reduction myself easily), but it was fun to do :) As always comments and suggestions are appreciated. If anyone who reads this is good with compilers, advice would be great :D
What you can do now is read SICP for more cool things like streams and functional programming, or check out Sean Anderson’s bit hacks page for more cool hacks like the fast inverse square root. Or refactor your code to use map, reduce and streams :)
The reason I have chosen Python for this exercise is for reasons of accessibility. Here is a post about implementing streams in Ruby, and here is one for Erlang :) but of course it’s all pretty much the same deal. ↩
If a compiler could optimise this, simplifying the reapplied function, but keeping the generality, that’d be really cool :) I don’t think many compilers would/could do that for lambdas though. Any information would be great. ↩