Recovering without a reflog
I've been tinkering on a project for a few weeks and I decided it was time to publish it somewhere else, just to have the work in two locations in case something bad happened. Of course, in doing that I caused something very bad to happen.
In the early stages of a project I often have useless git history, so I first wanted to reimport the entire project as a single new commit. I ran these commands:
# Move the old git history aside.
mv .git oldgit
# Start up a new git history.
git init; git add .
# See what I'm about to check in.
git status
But with that I had unintentionally added all the files in the oldgit tree, which isn't what I wanted. So without thinking:
git reset --hard
This restored my tree to its initial state, one without any files, deleting everything including the oldgit directory!
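In hindsight, unstaging just the unwanted directory would have been enough. Here is a minimal sketch of that in a throwaway repository; the stand-in oldgit directory and the file main.c are made up for illustration.

```shell
# Throwaway repository with a stand-in "oldgit" directory (illustrative names).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir oldgit
echo 'stand-in for old history' > oldgit/HEAD
echo 'int main(void) { return 0; }' > main.c
git init -q
git add .

# Drop oldgit from the index only; --cached leaves the files on disk.
git rm -r -q --cached oldgit

# Only main.c should still be staged, and oldgit/ should still exist.
staged=$(git ls-files)
echo "$staged"
```

Unlike a hard reset, git rm --cached only touches the index, so nothing in the working tree is deleted.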
Typically you don't need to worry much about resetting in git because
of the reflog, but there was no reflog here because I hadn't made any
commits yet.
At first I thought I'd lost everything: there were no files, there was no master branch, .git/index was empty. I'd even gone as far as composing an email to the friends I'd demoed the project to lamenting my mistake. But then I remembered that whenever something is added to the git index there's an associated object created under .git/objects, even if you never check it in. (These leftovers are part of why git gc is necessary to find things to delete.)
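To see this for yourself, here is a small sketch in a scratch repository. The file name notes.txt and its contents are invented. It stages a file without committing, deletes it, and reads the contents back from the object store.

```shell
# Scratch repository; notes.txt is a made-up example file.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q
echo 'precious work' > notes.txt
git add notes.txt                  # staged, never committed

# git add stored a blob named after the hash of the file contents.
id=$(git hash-object notes.txt)
dir=$(echo "$id" | cut -c1-2)
rest=$(echo "$id" | cut -c3-)
ls ".git/objects/$dir/$rest"       # the loose object file is on disk

# Simulate the loss, then read the contents back out of the object store.
rm notes.txt
recovered=$(git cat-file -p "$id")
echo "$recovered"
```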
So here's how I recovered. The very first thing to do is to checkpoint the current state, in case something else goes wrong!
cd ..
cp -a myproject what-were-you-thinking
cd myproject
Then I extracted the contents of all the objects.
# Dump every loose object into obj/, named by its hash.
mkdir -p obj
ls .git/objects/??/* | sed -e 's|.git/objects/\(..\)/|\1|' |
while read -r obj; do
git cat-file -p "$obj" > "obj/$obj"
done
(After writing this post, Aristotle told me a better way to do this:
git fsck --unreachable --no-reflogs --no-progress |
while read status objtype objname ; do
git cat-file $objtype $objname > obj/$objname
done
Note that my manual approach will miss packed objects, which wasn't an issue in this particular case but could be in other scenarios.)
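Another way to cover packed objects, sketched here in a scratch repository rather than on the damaged one, is to let git cat-file enumerate everything itself with --batch-all-objects. Note that this lists every object, reachable or not, so it is broader than the fsck approach. The file name and the demo identity are placeholders.

```shell
# Scratch repository with one commit, then everything moved into a pack.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q
echo 'hello' > a.txt
git add a.txt
git -c user.name=demo -c user.email=demo@example.invalid \
    -c commit.gpgsign=false commit -q -m 'initial'
git repack -a -d -q

# The glob over loose objects now finds nothing...
loose=$(ls .git/objects/??/* 2>/dev/null | wc -l | tr -d ' ')

# ...but cat-file can still list and dump every object, packed or loose.
mkdir -p obj
git cat-file --batch-check --batch-all-objects |
while read -r objname objtype objsize; do
    git cat-file "$objtype" "$objname" > "obj/$objname"
done
dumped=$(ls obj | wc -l | tr -d ' ')
echo "loose: $loose, dumped: $dumped"
```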
At first I thought I'd just be able to look through these to find the git "commit" object for my old master branch in there, but recall that all I had run on this git repository was a single git add, which just added all the files as blobs to the index. All of the objects were blobs, no git trees or commits.
The majority of these objects were from a git add of the files within the oldgit directory, so they were blobs whose contents were themselves git objects: in effect, encoded twice. I figured I'd need to identify all the ones containing source code and put them back by hand.
But thankfully, running file obj/* found one file that it identified as "Git index, version 2, 20 entries". That was the .git/index from the old repository, before I ruined everything. So it was easy to just paste that over the new (empty) one.
cp obj/4a92e84106caea0835d918affe4735009b43d147 .git/index
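If the file utility isn't at hand, the same check can be approximated by looking for the 4-byte signature DIRC that a git index file starts with. The sketch below is an illustration, not what I ran. The helper name find_index_candidates and the mystery-* file names are invented.

```shell
# Hypothetical helper: print the files in a directory that start with
# the git index signature "DIRC".
find_index_candidates() {
    for f in "$1"/*; do
        # tr strips NUL bytes so the shell doesn't choke on binary data.
        sig=$(head -c 4 "$f" | tr -d '\0')
        if [ "$sig" = "DIRC" ]; then
            echo "$f"
        fi
    done
}

# Scratch setup: one real index file and one decoy, under made-up names.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q
echo 'hello' > a.txt
git add a.txt
mkdir obj
cp .git/index obj/mystery-1
cp a.txt obj/mystery-2

found=$(find_index_candidates obj)
echo "$found"
```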
With that, git status showed that it expected there to be 20 files, each in their original locations. So finally, for each file mentioned in the index, I could just check it out again. (Recall that "checkout" in git means "make the file on disk match the file in the index".)
# NUL-delimited so file names with spaces survive the pipe.
git ls-files -z | xargs -0 git checkout --
And my files were restored. And this time I will be more careful.
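For anyone who wants to rehearse the last two steps safely, here is a sketch in a scratch repository. It uses git checkout-index -a, a plumbing alternative to the checkout command above that writes every index entry to the working tree. It only works when the blobs the index names are still in .git/objects, which in my case they presumably were because the same git add had stored the source files too. File names are invented, and the loss is simulated with rm rather than a hard reset.

```shell
# Scratch repository with two staged, uncommitted files (made-up names).
scratch=$(mktemp -d)
mkdir "$scratch/work"
cd "$scratch/work" || exit 1
git init -q
mkdir src
echo 'one' > src/a.txt
echo 'two' > b.txt
git add .

# Keep a copy of the index, then simulate losing both files and index.
cp .git/index "$scratch/saved-index"
rm -rf src b.txt .git/index

# Put the saved index back and write every entry out to the working tree.
cp "$scratch/saved-index" .git/index
git checkout-index -a

restored=$(git ls-files | sort | tr '\n' ' ')
echo "$restored"
```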