What does half a decade of email look like?

I have here nearly a hundred megabytes of email from the last 5 years, so I thought I'd have a look at it. But how do you look at something that big? Data visualisation to the rescue! If I plotted each byte as a pixel, I'd have a 125-screen-wide bitmap, so instead I plotted the average value of each 16x16 block as a single pixel. And when the Python Imaging Library finished, I had a revelation: a picture! A picture of this:

[Image of 5 years' email; click for biggitude]
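For the curious, the block averaging could be sketched roughly like this. It is not the script I actually ran, just a minimal reconstruction under some assumptions: the bytes are laid out row by row in an image of a chosen `width` (a made-up parameter), leftover bytes at the end are thrown away, and the names `block_averages` and `save_greyscale` are invented for illustration. Only the saving step needs the imaging library (Pillow here, the maintained fork of PIL).

```python
BLOCK = 16  # block edge from the post: each 16x16 block becomes one pixel


def block_averages(data, width, block=BLOCK):
    """Lay bytes out row-major in rows of `width`, then average each tile.

    Assumption: `width` is a multiple of `block`. Trailing bytes that do not
    fill a complete band of `block` rows are dropped.
    """
    if width % block:
        raise ValueError("width must be a multiple of block")
    band = width * block  # number of bytes in one band of `block` full rows
    rows = []
    for start in range(0, len(data) - band + 1, band):
        row = []
        for bx in range(0, width, block):
            total = 0
            for y in range(block):
                offset = start + y * width + bx
                total += sum(data[offset:offset + block])
            row.append(total // (block * block))  # integer mean, 0..255
        rows.append(row)
    return rows


def save_greyscale(rows, path):
    """Write the averaged rows as an 8-bit greyscale image (needs Pillow)."""
    from PIL import Image  # imported lazily so block_averages works without it

    height, width = len(rows), len(rows[0])
    img = Image.new("L", (width, height))
    img.putdata([value for row in rows for value in row])
    img.save(path)
```

Something like `save_greyscale(block_averages(open("mail.mbox", "rb").read(), 1024), "mail.png")` would then produce the picture, where the file names and the width of 1024 are placeholders.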

The image is mostly grey because each pixel is an average of 256 values; since null and high-ASCII bytes are rare, they get lost in a sea of averageness. Art imitates life, eh? The last 20% or so is spam from the last 2 months, which I keep as fuel for Bayesian magic.
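To see the sea of averageness in action, here is a small illustration with an invented 256-byte block: mostly ordinary text, plus three null bytes and three 0xFF bytes. The sample text is made up, not taken from my mailbox.

```python
# A made-up 256-byte block: mostly printable ASCII plus a few extreme bytes.
text = b"Subject: hello world, this is an ordinary line of email text.\r\n"
block = (text * 4)[:250] + b"\x00\x00\x00" + b"\xff\xff\xff"
assert len(block) == 256

lowest, highest = min(block), max(block)
average = sum(block) // len(block)  # the value the single pixel would get

# The extremes are present in the block, but the average stays mid-grey.
print(lowest, highest, average)
```

The block contains both pure black (0) and pure white (255), yet the averaged pixel lands in the middle greys, because six outliers cannot outvote 250 letters and spaces.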

Of course, please excuse the crudity of this image, I didn't have time to colour it.
