DRY Function Pointers in C

Just a quick post today about C function pointers. Over the past two years I have seen the occasional function pointer introduction post on Hacker News, but I rarely see this one weird trick. The most recent I have read was this one by Dennis Kubes. I haven’t hung out with C for a while […]

Succinct de Bruijn Graphs

This post will give a brief explanation of a Succinct implementation for storing de Bruijn graphs, which is recent (and continuing) work I have been doing with Sadakane. Using our new structure, we have squeezed a graph for a human genome (which took around 300 GB of memory if using previous representations) down into 2.5 […]

FM-Indexes and Backwards Search

Last time (way back in June! I have got to start blogging consistently again) I discussed a gorgeous data structure called the Wavelet Tree. When a Wavelet Tree is stored using RRR sequences, it can answer rank and select operations in $\mathcal{O}(\log{A})$ time, where $A$ is the size of the alphabet. If the size of […]

Wavelet Trees – an Introduction

Today I will talk about an elegant way of answering rank queries on sequences over larger alphabets – a structure called the Wavelet Tree. In my last post I introduced a data structure called RRR, which is used to quickly answer rank queries on binary sequences, and provide implicit compression. A Wavelet Tree organises a […]

RRR – A Succinct Rank/Select Index for Bit Vectors

This blog post will give an overview of a static bit sequence data structure known as RRR, which answers arbitrary-length rank queries in $\mathcal{O}(1)$ time, and provides implicit compression. As my blog is informal, I give an introduction to this structure from a bird's-eye view. If you want, read my thesis for a version […]
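For readers who have not met rank queries before, here is a minimal sketch of what one answers. This is not RRR: it is a naive prefix-count table that spends far more space than RRR does, and the names `build_prefix_counts` and `rank1` are made up for illustration. I also assume the convention that rank counts the 1 bits strictly before position `i`; other write-ups count up to and including `i`.

```python
from itertools import accumulate


def build_prefix_counts(bits):
    """Return a table where entry i is the number of 1s in bits[0:i]."""
    # The leading 0 is the count for the empty prefix.
    return [0] + list(accumulate(bits))


def rank1(prefix_counts, i):
    """Number of 1 bits among the first i positions, via one table lookup."""
    return prefix_counts[i]


bits = [1, 0, 1, 1, 0, 1, 0, 0]
counts = build_prefix_counts(bits)

print(rank1(counts, 4))  # number of 1s in bits[0:4]
```

The lookup is constant time, but the table stores a full counter per position. The point of a succinct structure like RRR is to keep the constant-time query while storing much less.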

Generating Binary Permutations in Popcount Order

I’ve been keeping an eye on the search terms that land people at my site, and although I get the occasional “alex bowe: fact or fiction” and “alex bowe bad ass phd student” queries (the frequency strangely increased when I mentioned this on Twitter) I also get some queries that relate to the actual content. […]

Some Lazy Fun with Streams

Update: fellow algorithms researcher Francisco Claude just posted a great article about using lazy evaluation to solve Tic Tac Toe games in Common Lisp. Niki (my brother) also wrote a post using generators with asynchronous prefetching to hide IO latency. Worth a read I say! I’ve recently been obsessing over this programming idea called streams (also known as infinite lists or […]
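As a rough illustration of the idea (a sketch only, using Python generators as a stand-in for streams; the names `naturals` and `squares` are hypothetical and not taken from the post), an infinite list can be described once and then only evaluated as far as it is consumed:

```python
from itertools import islice


def naturals(start=0):
    """An infinite stream of integers: start, start + 1, start + 2, ..."""
    n = start
    while True:
        yield n
        n += 1


def squares(stream):
    """Lazily map each element of a stream to its square."""
    for x in stream:
        yield x * x


# Nothing is computed until elements are requested; islice takes the first five.
first_five = list(islice(squares(naturals()), 5))
print(first_five)  # [0, 1, 4, 9, 16]
```

Only five elements are ever produced here, even though both streams are conceptually infinite.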

Design Pattern Flash Cards

Last year I studied a subject which required me to memorise design patterns. I tried online flash card web sites, but I was irritated that I didn’t own the data I put up (they had no export option). So I wrote something in Python to generate flash cards for me using LaTeX and the […]

Metaprogramming Erlang the Easy Way

I’ve recently taken Erlang back up, and I wanted to use this blog post to talk about something cool I learned over the weekend. I am implementing a data structure. Reimplementing actually, as it is the structure from my thesis – a succinct text index (I will post a blog on this soon). Why am […]

Au Naturale – an Introduction to NLTK

This blog post is an introduction on how to make a key phrase extractor in Python, using the Natural Language Toolkit (NLTK). But how will a search engine know what it is about? How will this document be indexed correctly? A human can read it and tell that it is about programming, but no search […]