Au Naturale – an Introduction to NLTK

This blog post is an introduction to building a key phrase extractor in Python using the Natural Language Toolkit (NLTK). But how will a search engine know what it is about? How will this document be indexed correctly? A human can read it and tell that it is about programming, but no search [...]

Advice to CS Undergrads

Since I’m starting my PhD this year, I have been reflecting on what I would do differently if I went back in time and started my degree all over again. I am also continuing to tutor, now in my 4th year, and students occasionally approach me for general advice about their studies. [...]

Regularly Divisible

Update: read the comments at Hacker News to see some succinct approaches to this, as discussed by gjm11, qntm and patio11. Thanks to Robin for providing this demonstration that can find a regex for testing divisibility of any number, in any base (he also made the code available, nice). Earlier this year, at the advice [...]
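To give a flavour of the idea, here is a minimal sketch for one small case: testing divisibility by 3 in base 2. It is not the demonstration or the code mentioned above. The pattern is the well-known regex obtained from a three-state automaton that tracks the remainder mod 3, and the names `DIVISIBLE_BY_3` and `is_divisible_by_3` are made up for this example.

```python
import re

# Binary strings whose value is a multiple of 3.
# State 0 loops on "0"; leaving state 0 on "1" and returning on "1"
# may pass through state 2 any number of times via "01*0".
DIVISIBLE_BY_3 = re.compile(r"(0|1(01*0)*1)*")


def is_divisible_by_3(n: int) -> bool:
    """Return True if the binary representation of n matches the regex."""
    return DIVISIBLE_BY_3.fullmatch(format(n, "b")) is not None


if __name__ == "__main__":
    for n in range(10):
        print(n, format(n, "b"), is_divisible_by_3(n))
```

Larger divisors and other bases follow the same recipe, but the resulting expressions grow quickly, which is why generating them mechanically is attractive.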

Hero Typing

“Who is your hero?” is a question I’ve been asked, but never had an answer for. Why is this a question that people are compelled to ask? Are we expected to have a hero, like a favourite colour or number?

Things Smarter People Said #1

I am currently reading Code Complete 2 as per Jeff Atwood’s Recommended Reading for Developers list, where I came across this interesting quote by Glenford Myers: We try to solve the problem by rushing through the design process so that enough time is left at the end of the project to uncover the errors that [...]

I Don’t Know What the F*** I’m Doing

This is a nice article I read back in February which discusses why it’s a good thing when you realize how very little you know. I feel just like that right now, and I’m enjoying it because there is so much territory left to explore… No One Knows What the F*** They’re Doing (or “The [...]

How to Win Friends and Generate People

I’m doing a project for a subject at RMIT which needs to manage thousands of patient records for a hospital. We haven’t been given any sample data though, so I wanted to write a generator (so we can test it with small or large data sets whenever needed). I started with the name generator (in [...]

Nanosecond Timing

My Uni (RMIT) uses a mixture of Solaris, Mac OS X, Linux, and Windows computer labs. Our programs are nearly always tested on Solaris, though. Sometimes we are required to provide nanosecond timings in our experiments using the Solaris high-resolution timer function gethrtime(). Depending on which lab I work in, or if I’m working from home, [...]
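For comparison, here is a rough sketch of the same kind of measurement in Python, which sidesteps the platform differences by using the standard library's `time.perf_counter_ns()` (available since Python 3.7). It is only an analogue of gethrtime(), not a wrapper around it, and the helper name `timed_call` is invented for this example. The values are reported in nanoseconds, but the actual resolution depends on the operating system's clock.

```python
import time


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed time in integer nanoseconds)."""
    start = time.perf_counter_ns()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter_ns() - start
    return result, elapsed


if __name__ == "__main__":
    total, nanos = timed_call(sum, range(1_000_000))
    print(f"sum = {total}, took {nanos} ns")
```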