Regularly Divisble

div3

Update: read the comments at Hacker News to see some succinct approaches to this, as discussed by gjm11, qntm and patio11. Thanks to Robin for providing this demonstration that can find a regex for testing divisibility of any number, in any base (he also made the code available, nice).

Earlier this year, at the advice (once more) of Chad Fowler, I took to the idea of practicing programming every day. Perhaps this appealed to me because it echoed the rituals of my better musician friends, and allowed me to draw parallels between programming and my fading dream of becoming a famous rockstar.

Possibly because of my failed interview at Google (hey, I wouldn’t have hired the back-then me either, so no hard feelings!), I was also interested in job-interview styled problems [1]. Not FizzBuzz though, more like the computer science ‘riddles’ found on this page [2].

At the time I was teaching Computing Theory [3], 80% of which was formal languages: regular expressions, context free and context sensitive grammars, Turing machines and other automata, and their locations in the Chomsky Hierarchy. So, this problem appealed to me:

Construct a finite state machine (or equivalently, write a regular expression) which accepts all strings over the alphabet {0,1} which are divisible by 3 when interpreted in binary.

It is pretty interesting that languages can be defined to communicate patterns in binary sequences that are divisible by 3. Let’s solve it in more detail than necessary :)…

We will be developing this regex incrementally. I will update the regex at each stage. Since it is easier to construct this regex using a finite state machine (which is how I worked this out the first time), I’ll also include a few diagrams along the way. You can test your regexes using the Ruby code here.

To get an idea for the pattern, I started by listing out a few multiples of 3 and their binary representations. You can use the following Ruby code if you want.

>> 30.times {|n| puts "#{n}: #{n.to_bin_s}" if n%3 == 0}
0: 0
3: 11
6: 110
9: 1001
12: 1100
15: 1111
18: 10010
21: 10101
24: 11000
27: 11011

Observe that the binary representations for 3, 6, 12 and 24 have something in common: they have the same sequence, with a little bit of zero-padding to the right. If X is a binary string divisible by 3, then the binary string X0 is divisible by 3 as well. This makes sense, as adding zeros to the right is equivalent to multiplying by 2, and we know that n * 3 * 2 is divisible by 3 for any integer n. This gives us:

r = A0*

A will represent the leftover regex at each stage.

This can trivially be applied to left zero-padding as well, because a binary string X is numerically equivalent to the binary string 0X (issues of endianness aside):

r = 0*A0*

Since we already covered even multiples, to work out A we just need to look at odd multiples of 3. Here are a few:

>> 50.times {|n| puts "#{n}: #{n.to_bin_s}" if n%3 == 0 and n%2 != 0}
3: 11
9: 1001
15: 1111
21: 10101
27: 11011
33: 100001
39: 100111
45: 101101

You might have noticed that 1111 (15) is just 11 (3) concatenated with itself. That’s the same as saying 15 = 3 * 2 * 2 + 3. If you add two multiples of three, you will of course get another multiple of 3 (it’s the same for any multiplier), and concatenating just involves doubling the first number a few times pre-addition. Hence, you will always get another multiple of 3 by concatenating two.

r = (0*A0*)+

From observation, here are the above numbers that are not concatenations:

3: 11
9: 1001
21: 10101
33: 100001
45: 101101

Take note that it always starts and ends with a 1. It should definitely end with a 1 to be odd, and start with a different 1 to be greater than 1. Our updated regex becomes:

r = (0*1A10*)+

autom1

It also appears that we can optionally insert an even number of zeros between the end 1s. A proof of this is attached at the end, if you’re into that sort of thing.

r = (0*1(0A0)*10*)+

autom2

So what about those 1s in the middle then? It appears that we’re allowed to have 0, 1 or 2… maybe more consecutive 1s? Maybe we can say our regex is r = (0*1(01*0)*10*)+. I’ll test the regex for the first 5000 integers:

False Negatives:
  0

Oops! I forgot about 0 being evenly divisible by 3 exactly 0 times. Our regex should account for this:

r = 0|(0*1(01*0)*10*)+

autom3

Running the test again we get this output:

Pass =]

So there you have it, the regex works for the first 5000 integers. I’ll leave a proof of this for all multiples of 3 as an exercise to the reader ;)


[1] I’m not sure I want to work for someone else anymore – I’d rather chase my own stupid dreams. Probably not the rockstar one though. I’ll write about these later.

[2] Other good sources of practice questions are Programming Pearls, Project Euler, or you could take a paper from ACM Transactions on Algorithms
(for example) and implement the algorithm/data structure. While you’re at it, why not learn Erlang (or any other programming language guaranteed to make you more appealing to the opposite sex) and implement it in that?

[3] Recently, a past student of mine told me that they have never found Computing Theory to be of any use. I said “What about regular expressions?” and they shook their head. This was a smart student too, but I find that regular expressions are so damn useful (perhaps a hammer I swing too often). Ironically, Computing Theory was the most practically useful subject in my (watered down) degree. In the near future I intend on climbing on my high-horse and writing a blog post about my ideal CS degree.


Appendix

Here is a proof that shows we can optionally insert an even number of zeros between two 1s to get an odd multiple of 3. This is equivalent to saying that our left-most 1 has to be in an odd position index (if 0 is the rightmost…).

RTP: 2^(2i + 1) + 1 = 3m, for m odd integer and i positive integer or 0

Let i = 0, then
LHS = 2^1 + 1
    = 2 + 1
    = 3 * 1
    = 3m (1 is an odd integer)
    = RHS
    => it holds for i = 0

Let i = k for any integer k, then assume
2^(2k + 1) + 1 = 3m

Let i = k + 1, then
LHS = 2^[2(k + 1) + 1] + 1
    = 2^(2k + 1 + 2) + 1
    = 4 * 2^(2k + 1) + 1
    = 3 * 2^(2k + 1) + 3m'
    = 3[ 2^(2k + 1) + m']
    = 3[ 2^(2k + 1) + 1 + (m' - 1)]
    = 3[3m' + (m' - 1)]
    = 3(4m' - 1)
    = 3m (an even number minus 1 will be odd)
    = RHS
QED

Yes, 2^n + 1 always yields an odd multiple of 3, for n odd. I should really set up a math plugin…