Comparing JPEGs

6:41pm, 11th May 2008

Sometimes hard disks break - this is why we have backups. Sometimes parts of hard disks become silently corrupted and you don’t find out until months or years later when you come to view an old photo and find the top 20% intact followed by a field of grey - this is why we have incremental backups.

To compare today’s version of a file to last year’s version you might do something like this:

$ sha256sum newcopy.jpg oldcopy.jpg
ce5d948cf9bfe9d3709cef57016b4e1c6db5391b9efca4c1ef958fe557d81e1b  newcopy.jpg
56bb613661ebb20ab212f748caff2023f80f9854ada17c7001af4a1f9d30b6aa  oldcopy.jpg

The different checksums reveal differences between the files that a comparison of file sizes might miss. But checksumming JPEGs is complicated by software that modifies the EXIF header. Digikam, for example, will change the image header whenever you correct the timestamp, do a rotation, add a comment or tag, or basically do any metadata operation. The image itself might never be altered, but the JPEG file can go through many iterations. If your incremental backups are perfect, you should still be able to find a backed-up copy of what today’s file should be, but unless you’re using some kind of wicked insane ZFS system, your last backup will have a non-zero age.

In my particular case, I suspected a photo had become corrupted, but couldn’t prove it, as past versions from my rdiff-backup snapshots had different checksums at different points in time, reflecting changes that had been made to the metadata by successive versions of Digikam and occasional tagging operations. Because the file had legitimately changed, I couldn’t use checksums alone to prove the difference was due to corruption.

jhead to the rescue!

jhead is a JPEG metadata utility (and an Ubuntu package of the same name). It rewrites files in place, so work on copies rather than your only originals. Use it like this to strip all metadata out of a JPEG file, leaving only the pure image:

$ jhead -purejpg newcopy.jpg oldcopy.jpg
Modified: newcopy.jpg
Modified: oldcopy.jpg
$ sha256sum newcopy.jpg oldcopy.jpg
42f61a8e8d78153a41cc9c7fcb20055ed3e403c88366395a577bea64166218a9  newcopy.jpg
42f61a8e8d78153a41cc9c7fcb20055ed3e403c88366395a577bea64166218a9  oldcopy.jpg

As it turned out, the image was not corrupted at all! Yay!
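
Since jhead modifies files in place, the same check can be wrapped up so the originals are never touched. This is only a minimal sketch: it assumes jhead, sha256sum and mktemp are installed, and same_image is a made-up function name, not part of jhead.

```shell
# Compare two JPEGs by image data only, working on temporary copies.
# same_image is a hypothetical helper; returns 0 if the stripped copies
# match, 1 if they differ, 2 if copying or stripping failed.
same_image() {
    tmpdir=$(mktemp -d) || return 2
    cp -- "$1" "$tmpdir/a.jpg" && cp -- "$2" "$tmpdir/b.jpg" \
        || { rm -rf "$tmpdir"; return 2; }
    # -purejpg strips EXIF, comments and other non-image sections in place.
    jhead -purejpg "$tmpdir/a.jpg" "$tmpdir/b.jpg" >/dev/null \
        || { rm -rf "$tmpdir"; return 2; }
    # Read from stdin so the file name does not appear in the checksum line.
    sum_a=$(sha256sum < "$tmpdir/a.jpg")
    sum_b=$(sha256sum < "$tmpdir/b.jpg")
    rm -rf "$tmpdir"
    [ "$sum_a" = "$sum_b" ]
}
```

Something like `same_image newcopy.jpg oldcopy.jpg && echo "image data identical"` would then report a match without altering either file.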


Dark Season

9:21pm, 24th April 2008

In 1991 there was a children’s television series called Dark Season. I recall thinking it was brilliant at the time. It was written by Russell T Davies, who is now the executive producer and writer of Doctor Who. The story centred on a sinister plot to take over children’s minds by distributing free computers.

Meanwhile, One Laptop Per Child appears to be running into difficulties. The project’s “extreme dependence on scale to bring down cost” is worryingly familiar; like collective farming, their economies of scale have not yet been great enough to overcome the massive disadvantages of central planning.

And now Brazil is rolling out KDE to 52,000,000 children. The project is called “Um Computador por Aluno” - One Computer Per Student. It may not match the headline features of OLPC, like mesh networking and a cool low-power display, but it looks more likely to work in the real world. In this respect, OLPC may be more like the Soviet space programme than the Soviet state farms: the standout achievements of the Russians, like Sputnik and Yuri Gagarin, were impressive and noteworthy, but the system ultimately couldn’t compete with the more open and dynamic American space programme.


Milkybar White Moments

2:44pm, 20th April 2008

I tried Milkybar White Moments for the first time last week. They are a white-chocolate clone of Minstrels, and they taste like plasticine. They cannot be used for building Wallace & Gromit models, but when combined with a packet of real Minstrels, might make a handy set of Go stones.

2/10, will eat anyway.


War machine

8:24am, 13th April 2008

Apparently the US Army already has thousands of killbots in Iraq:

Killbot

I love the Army’s attitude:

While Fahey said that no inappropriate shots had been fired, and no casualties, Fahey stated sadly that the robot’s control failure might be the end of the program. Says Fahey, “Once you’ve done something that’s really bad, it can take 10 or 20 years to try it again.”

No, once you’ve done something that’s really bad, you get arrested, tried, convicted, sentenced, imprisoned, and you never do it again.


Genocide by default

5:50pm, 7th April 2008

Grain shortages are in the news. There’s now a stronger link than ever between food and fuel. Here’s the conclusion:

Cheap food, like cheap oil, may be a thing of the past.

The article draws an analogy between the current financial crisis and the emerging food crisis. Taking it further, if our collective debt problems are a result of us living beyond our means, then rising food prices paint a much more disturbing picture: hundreds of millions of people, if not billions, will be living beyond their means simply by existing. You can do your best to avoid debt by not doing stupid things like taking out loans to spend on a holiday, but you can’t choose not to eat.

Now, taking a detached, purely cold-blooded just-the-numbers-ma’am-please approach for a moment (it’s what I do best anyway), if x million people can’t feed themselves, is the most ethical course of action to let them die before they reproduce and become 2x million people that the world can’t afford to feed?

You could say “No! The most ethical course of action is to give food aid and help those x million people live!” That works while the rest of the world can afford to feed them, but if the population of chronic destitutes keeps on increasing, there will inevitably come a point where we can’t afford the aid, and there will be no choice in the matter: they will all die. By this time, their population will have doubled and doubled again, many times over, and we are back to the question: let x million people die, or let 2x million people die?

It is possible that our planet and our culture can support 20 billion people sustainably. It is also possible that it can only support 2 billion. Scienticians call this the carrying capacity, and I don’t know which it is, but I suspect it’s closer to 20 than 2, and that we still have the time and technology to keep everybody alive. That’s cautious optimism; we still need to educate people about birth control and make them richer - the two best ways to encourage people to have fewer children. Hence green secular libertarianism is good for my peace of mind: making people richer and opposing Catholic nonsense about condoms prevents me from having to pull a genocide-by-default on billions and billions of people at some point in the future. That would generally be considered a bad thing, wouldn’t it?


Epilepsy is a remote exploit

8:26pm, 30th March 2008

Modern version of the pcAnywhere + flashing Excel macro epilepsy attack


KDE vs GNOME

9:59am, 28th February 2008

Kubuntu 8.04 will mini-fork for KDE 4. I’m not sure I like the way this is going. The commercially supported, stable, feature-complete Kubuntu-KDE3 may draw users away from the still-experimental Kubuntu-KDE4, denying KDE 4.0.1 the extended testing that it needs. But then what choice do they have? Not many people are going to use KDE 4.0.1 anyway if it’s not finished.

The 3-to-4 transition is going to get worse before it gets better. KDE 4 won’t be properly stable until at least 4.1 in July, and won’t reach feature parity with KDE 3 until at least 4.2 in November or later. That will be 3 years since the first release of KDE 3.5, and 3 years is a long time for ordinary KDE users to wait for a new release.

The same thing happened to GNOME back in 2002 when they stripped all functionality from 1.4 and effectively started again with 2.0. I switched to KDE back then. I’d now be tempted to switch back to GNOME - and with it, the more polished Ubuntu - if it wasn’t for apps like Amarok and the promise of future awesomeness from what KDE 4 will still hopefully become.


Model number bling

10:49am, 27th February 2008

Am I the only one to notice that AMD, Intel and NVIDIA now all have products numbered 9500? Previously, ATI, Apple and Nokia have had the Radeon 9500, PowerMac 9500 and the 9500 Communicator. Quick googling reveals many more 9500s.

What is it about this number? I think it’s the 9 that’s important. 9 is the biggest single-digit number, and hence the best. It’s the perfect marketing fiction: you simply can’t get better than 9 (the marketing department hopes you won’t notice the possibility of the existence of 10). The model number obviously has no relation to the actual specifications of the device, so it’s pure braggadocio and bling. To the marketroids, the number 9 and the letter X are talismans of bad-assedness, the computer equivalent of the hip hop chain. Look no further than ATI’s Radeon X1900 XTX. They had to scale it back a little from there with names like HD 2900, just like the James Bond producers had to settle for Casino Royale after they couldn’t figure out any new ways to fit the word “die” into the title. (Before Die Another Day came out, alt.fan.james-bond discussed plausible titles for the next film; the closest was Die Die Die.)

I predict Microsoft will be next. Expect to see in 2010: The X-XBOX X9000 Yo Yo Yo X-Platinum, followed by simply Xbox 4 in 2015.


L2 cache

10:49am, 27th February 2008

Intel’s new Penryn-based Core 2 processors have as much as 12MB of L2 cache on the CPU. Considering my 2005-era Athlon 64 only has 512KB, that’s a lot - 24 times as much, or a 2300% boost in 3 years. So here’s the deal:

Dear Lazyweb, please write software to mount L2 cache as a drive à la RAM disks. Not for any particular reason. Just to do it.
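
For anyone checking the cache arithmetic above, here is a quick sketch in shell, assuming 1MB = 1024KB (the variable names are just for illustration):

```shell
# Cache sizes from the post, both expressed in KB.
new_kb=$((12 * 1024))   # 12MB on the Penryn-based Core 2
old_kb=512              # 512KB on the 2005-era Athlon 64

ratio=$((new_kb / old_kb))                          # how many times bigger
increase_pct=$(( (new_kb - old_kb) * 100 / old_kb ))  # percentage increase

echo "${ratio}x the cache, a ${increase_pct}% increase"
```

That is 24 times the cache, which is a 2300% increase over the original 512KB.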


10 years of lynz

10:59am, 26th February 2008

10 years ago today, Adam received his lines. His lynz. The grand total is now exactly:

Adam's Lynz after 10 years

which is approximately 10 to the power of a 1040-digit number. And if he doesn’t do them by tomorrow, there’ll be more.

The number of lines weighs in at just under the value of a particular universal constant implicated in inflationary universe theories, and well under Skewes’ number, the next pure mathematical entry in the list. However, last year the XKCD blag dug up the lynz’ mad cousin, ¥, and in the comments it was proved that “the clarkkkkson” is seriously large. Considerably bigger than Graham’s number, the very last entry in The Penguin Dictionary of Curious and Interesting Numbers. I’ve had a copy of that book for years and between trips to Rusholme’s curry mile and full-blown bouts of diarrhoea, I have probably read every entry. But it’s nothing compared to the internet’s own Notable Properties of Specific Numbers by Robert Munafo, who’s my kind of geek. He even emailed me last night to remind me it was the Lynz’ 10th anniversary. What a guy!

As for Adam, I’ve not exactly been in regular contact with him, but as of today, his latest Facebook status is “Adam is wasting time.” He’d better get started soon.

