Violining a contemptible fellow

I was going to write about a very strange and excellent dream I had the other day. I'd even written some notes down so that I'd remember it but I managed to delete those notes during some git fiddling.

So I'll write about that instead :)

The back story

I've been working on a photo gallery application to use for my photos as I'd become dissatisfied with everything else out there. Here's an example of where I've got with it so far.

The basic features I wanted were:

  • that it's easy to upload files and put them into a gallery
  • images don't have to go into a gallery
  • galleries and files each get a unique, unguessable id
  • There'll be no real security as such, but:
    • the site will operate over SSL
    • "public" images/galleries are simply those whose IDs have been published (on a blog or whatever)
    • only a gallery's owner can add things to it
  • a user can view a complete list of their files and galleries
  • galleries can also contain galleries
  • galleries and files can store arbitrary meta data
  • the default view for a gallery is a page full of thumbnails
  • selecting a thumbnail expands the image and provides a scrolling bar to move through the others

I've implemented all of those features but the UI needs work. For example, there's currently no good method of adding/editing meta data, and I need to work out how to do multiple file uploads. I think HTML5 has the features I want.

The git fiddling

During the development of the gallery, I've obviously tested it out with quite a number of different images and, at some point, decided to check some of those in to the repository. A while later I realised that was a bad idea (obvious in hindsight, as most stupid decisions are) and removed them from my working tree.

As any git user will know though, the history of those files stays around which means that your .git folder stays very big. After having checked out my repository on another machine and realising why it was taking so long to download, I decided I'd better do something about it.

Enter git filter-branch!

This is a feature of git that I wish I'd known about ages ago. Ok, it's not something that anyone should be using every day (not even every month) but if you find yourself in a situation like mine - the sole user of a repository, misguided check in of a large file in the past - it's a very handy tool to know about.

Here's the man page.

The basic idea is that you ask git to rebuild the current branch by working through every commit from the first to the most recent, running any commands you like against each one and then re-committing it. Eventually, you end up with a branch made up of entirely new commits that are based on the original commits you made.

For my purposes, the command I needed was:

git filter-branch --prune-empty --index-filter "git rm -rf --cached --ignore-unmatch gallery" HEAD

Translated, this means: rebuild the current branch from the ground up, one commit at a time. Before re-committing each commit, run "git rm -rf --cached --ignore-unmatch gallery". That asks git to remove the gallery folder from the index at each commit (--cached), without complaining about commits that don't contain it (--ignore-unmatch). --prune-empty means that if removing the gallery folder leaves a commit with no changes at all, that commit is skipped.

After that was done (it took a while), I was left with a nice clean history that contained no mention of the gallery folder.
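If you'd rather try this somewhere safe before pointing it at a real repository, here's a minimal sketch that runs the same command against a throwaway repo. The file names (app.txt, gallery/photo.jpg), the commit messages and the demo identity are all made up for the example; only the filter-branch line comes from above. FILTER_BRANCH_SQUELCH_WARNING just quietens the warning that newer versions of git print before running filter-branch.

```shell
#!/bin/sh
# Sketch: run the filter-branch command on a throwaway repository.
# All file names and commit messages here are invented for the demo.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # newer gits warn and pause otherwise

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1: some code.
echo "code" > app.txt
git add app.txt
git commit -q -m "add code"

# Commit 2: only touches gallery/, so it should be pruned entirely.
mkdir gallery
echo "pretend this is a big image" > gallery/photo.jpg
git add gallery
git commit -q -m "add test images"

# Commit 3: more code.
echo "more code" >> app.txt
git commit -q -am "more code"

# The command from the article: drop gallery/ from every commit.
git filter-branch --prune-empty --index-filter \
  "git rm -rf --cached --ignore-unmatch gallery" HEAD

# How many commits still mention gallery/, and how many commits are left?
gallery_commits=$(git log --oneline -- gallery | wc -l | tr -d ' ')
total_commits=$(git rev-list --count HEAD)
echo "commits touching gallery: $gallery_commits"
echo "commits remaining: $total_commits"
```

The middle commit only ever touched the gallery folder, so --prune-empty should drop it and leave two commits, neither of which mentions gallery.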

Now, because git isn't reckless, the gallery files were still kicking around and, if sufficiently motivated, I could bring them back. Even the old commits still hang around in the .git folder just in case. To properly clean up, I git cloned the repository folder. Success! The new folder was smaller by around 90% :) I deleted the original folder and did a forced push to my remote. It's necessary to force push when you change history - something every time-travelling Jedi will know well.
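For what it's worth, the filter-branch man page suggests doing that clean-up clone with a file:// URL, because cloning a plain local path can simply hardlink the existing object files, old ones included. Here's a rough sketch of that on another throwaway repo. The paths and file names are invented, and it checks for the old image blob rather than measuring folder sizes.

```shell
#!/bin/sh
# Sketch: after rewriting history, a file:// clone leaves the old objects behind.
# Paths and file names are invented for the demo.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q "$work/original"
cd "$work/original"
git config user.email "demo@example.com"
git config user.name "Demo"

echo "code" > app.txt
mkdir gallery
echo "pretend this is a big image" > gallery/photo.jpg
git add .
git commit -q -m "code and test images"

# Remember the object id of the image so we can look for it later.
old_blob=$(git rev-parse HEAD:gallery/photo.jpg)

git filter-branch --prune-empty --index-filter \
  "git rm -rf --cached --ignore-unmatch gallery" HEAD

# The old blob is still stored in the original repository...
if git cat-file -e "$old_blob" 2>/dev/null; then in_original=yes; else in_original=no; fi

# ...but a file:// clone only transfers objects reachable from the branches.
cd "$work"
git clone -q "file://$work/original" clean
if git -C clean cat-file -e "$old_blob" 2>/dev/null; then in_clone=yes; else in_clone=no; fi

echo "old image blob in original: $in_original"
echo "old image blob in clone:    $in_clone"
```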

Then I realised that any untracked files were lost forever! That's why I'm not writing about the dream I had that I am still convinced was extremely interesting and revealing.
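In hindsight, a quick look at what git wasn't tracking would have saved the notes. A small sketch of that check, again on a throwaway repo with a made-up dream-notes.txt standing in for my lost file:

```shell
#!/bin/sh
# Sketch: list files git is not tracking before deleting a working folder.
# dream-notes.txt is a made-up stand-in for the lost notes.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "code" > app.txt
git add app.txt
git commit -q -m "add code"

# A file that was never added: a clone would not contain it.
echo "strange and excellent" > dream-notes.txt

# --others lists untracked files; without --exclude-standard it also
# includes files matched by .gitignore, which a clone would lose too.
untracked=$(git ls-files --others)
echo "$untracked"
```

Anything that shows up in that list exists only in the working folder, so it needs copying somewhere before the folder goes.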

Unlike this rather rambling article about git. I'm rather tired. Just pretend you read something great and feel a lovely warm glow inside. ALL GLORY TO THE HYPNOTOAD!

Tags: blog code git