Friday, February 27, 2015

Nexus 4 still alive and kicking

My everyday phone for the past few years has been a Nexus 4. I also have a Nexus 5, which is slightly larger and has a much better screen, but I never felt a need to switch (I did try to have "Let's use N5 this week" every once in a while, though). Last year's Nexus 6 simply felt too large for me. Besides, it is too expensive for me to buy.

I noticed that my N4 recently stopped picking up NFC and charging wirelessly. Later I learned that this is a typical sign that its battery needs replacement. Not because the battery got too weak to hold a charge, but because the battery started bloating and pushing against the back cover, onto which the necessary antennas are built. By bulging out and slightly raising the back cover, the bloated battery breaks the connection from the motherboard to these antennas, which is made only by contact. And that is how NFC and wireless charging break.

At least, that is the story I read.

After learning how to do it, and getting a replacement battery and a few small screw/torx drivers, I opened the phone (which took me some time) and saw this bloated battery.







No wonder the back cover looked warped. After placing the new battery and closing the back, NFC started picking up very reliably and it charges properly on a wireless charger.

Happy ;-)

Sunday, February 8, 2015

Fun with "git diff -B -M"

Git lets you view a change that renames an original file A to a new location B while doing some minor edits to its contents as a "rename" patch (i.e. "rename A to B, with the following content differences"). You can even view a change that renames two original files, A and B, by swapping their contents and optionally doing some minor edits to them as a patchset that contains two "rename" patches ("rename A to B" and "rename B to A"). These were invented by me early in Git's life, when Linus was still running the project, back in mid 2005. More recently, other tools (including GNU patch) started understanding patches that use these features.

However, I recently noticed a few corner cases in which git diff and friends produce a wrong patchset, or git apply fails to apply correctly constructed patches, and I have been thinking about the right fixes to these issues. This article will illustrate these tricky cases and describe my current thinking.

In this write-up, I'll use these terms:

patchset::
Output from a single git diff invocation, which may contain one or more patches.

patch::
A part of a patchset from a header line that begins with "diff --git" up to (but excluding) the next such header line.

tree::
git diff compares two collections of files; each collection is a tree. Left and right sides of the comparison are called old tree and new tree, respectively. The tree to which we attempt to apply a patchset is the target tree. The tree we would get after a successful application of a patchset is the resulting tree. A tree does not have to be a tree object—we may be comparing the index and the files in the working tree, for example.

preimage::
postimage::
The preimage consists of lines in a patch that are prefixed by "-" (minus) or " " (space) but not "+" (plus) that denote what the patched file ought to have for the patch to apply. The postimage consists of lines that are prefixed by "+" (plus) or " " (space) that denote what the patched result ought to look like.


1. Basics

First the basics. Let's think about a patchset with a single patch. What does this patchset tell us?

    diff --git a/major-08.txt b/major-08.txt
    index 680c5f6..5de90cb 100644
    --- a/major-08.txt
    +++ b/major-08.txt
    @@ -1,3 +1,3 @@
    -8. Fortitude.
    +8. Strength.

     This is one of the cardinal virtues, of which I shall speak later.

It obviously tells us that the new tree changed "Fortitude", which used to be in the old tree, to "Strength", but it actually tells us a bit more about the old tree. For this patchset to apply, the target tree must have a file "major-08.txt" that begins with the lines we see as the preimage in the patch.
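Here is a minimal sketch of that precondition, assuming a throwaway repository created with mktemp; the file contents, the committer identity and the variable names are made up for the illustration. It records the one-patch patchset and then asks git apply --check whether it would apply, first to a target tree that matches the preimage and then to one that does not.

```shell
# Throwaway repository; all paths and messages are made up for this sketch.
work=$(mktemp -d); patch="$work/strength.patch"
git init -q "$work/repo" && cd "$work/repo"
commit () { git -c user.name='A U Thor' -c user.email='author@example.com' commit -q "$@"; }

printf '%s\n' '8. Fortitude.' '' 'This is one of the cardinal virtues.' >major-08.txt
git add major-08.txt && commit -m 'old tree'

# Edit the working tree into the "new tree" and record the one-patch patchset.
printf '%s\n' '8. Strength.' '' 'This is one of the cardinal virtues.' >major-08.txt
git diff >"$patch"
git checkout -q -- major-08.txt            # the target tree is the old tree again

# The file begins with the preimage, so the patchset is accepted.
git apply --check "$patch" && matching=accepted || matching=rejected

# Change the line the preimage expects; the precondition no longer holds.
printf '%s\n' '8. Courage.' '' 'This is one of the cardinal virtues.' >major-08.txt
git apply --check "$patch" 2>/dev/null && altered=accepted || altered=rejected

echo "matching target: $matching, altered target: $altered"
```

The --check option only verifies applicability and leaves the working tree alone, which makes it convenient for probing preconditions like this one.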


2. Renaming a file

Now let's get a bit fancier and study a patchset with a rename patch. What does this patchset tell us?

    diff --git rws/major-08.txt marseille/major-11.txt
    similarity index 97%
    rename from major-08.txt
    rename to major-11.txt
    index 680c5f6..2ab22a0 100644
    --- rws/major-08.txt
    +++ marseille/major-11.txt
    @@ -1,3 +1,3 @@
    -8. Fortitude.
    +11. Fortitude.

     This is one of the cardinal virtues, of which I shall speak later.

We can see that this patchset starts from the same old tree as the previous one, and renames major-08 to major-11 with a slight modification.

It tells us more about the trees, compared to the previous example. For this patchset to apply, the target tree must satisfy the same precondition about major-08 as the previous one, and in addition it must lack major-11; otherwise we wouldn't be renaming a file to that name.

So far, things are straight-forward.

In summary:

Rule 1.
A patch from file A to file A requires that file A exists in the target tree with contents that match the preimage of the patch.

Rule 2.
A patch renaming file A to file B requires that file A exists in the target tree with contents that match the preimage of the patch. It also requires that file B must not exist in the target tree.

Rule 3.
A patch that creates file A requires that file A does not exist in the target tree.

Rule 4.
A patch that deletes file A requires that file A exists in the target tree with contents that match the preimage of the patch.

The latter two I didn't illustrate with examples, but they should be obvious. Also, we can think of Rule 2 (rename) as a natural extension of Rule 3 (creation) and part of Rule 4 (deletion). When you rename file A to file B, optionally with some content changes, you are:

  • creating file B, so the target tree must not have file B already.
  • deleting file A, so the target tree must have file A with contents that match the preimage of the patch.

Similarly, when a patch creates file B by copying file A, optionally with some content changes, you are creating file B, so the target tree must not have file B already. Also, the target tree must have file A with contents that match the preimage of the patch.
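The second half of Rule 2 can be tried out with a sketch like the following. It assumes a throwaway repository; the file contents, the identity and the variable names are invented for the example. The same rename patchset is checked against a target tree that lacks major-11 and against one where an unrelated major-11 already exists.

```shell
work=$(mktemp -d); patch="$work/rename.patch"
git init -q "$work/repo" && cd "$work/repo"
commit () { git -c user.name='A U Thor' -c user.email='author@example.com' commit -q "$@"; }

{ echo '8. Fortitude.'; echo; for i in 1 2 3 4 5; do echo "body line $i"; done; } >major-08.txt
git add major-08.txt && commit -m 'old tree'

# Rename with a slight modification, then take the diff with rename detection.
git mv major-08.txt major-11.txt
{ echo '11. Fortitude.'; echo; for i in 1 2 3 4 5; do echo "body line $i"; done; } >major-11.txt
commit -a -m 'new tree'
git diff -M HEAD^ HEAD >"$patch"

git checkout -q --detach HEAD^             # target tree = old tree, major-11 absent
git apply --check "$patch" && absent=accepted || absent=rejected

echo 'some unrelated file' >major-11.txt   # now the rename destination exists
git apply --check "$patch" 2>/dev/null && present=accepted || present=rejected

echo "destination absent: $absent, destination present: $present"
```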



3. First twist: cross renaming

Now, here is the first twist. What does this patchset mean?

    diff --git rws/major-11.txt marseille/major-08.txt
    similarity index 99%
    rename from major-11.txt
    rename to major-08.txt
    index 517d9f8..44e8d3a 100644
    --- rws/major-11.txt
    +++ marseille/major-08.txt
    @@ -1,3 +1,3 @@
    -11. Justice.
    +8. La Justice

     That the Tarot, though it is of all reasonable antiquity, is not of
    diff --git rws/major-08.txt marseille/major-11.txt
    similarity index 97%
    rename from major-08.txt
    rename to major-11.txt
    index 5de90cb..a101d5f 100644
    --- rws/major-08.txt
    +++ marseille/major-11.txt
    @@ -1,3 +1,3 @@
    -8. Strength.
    +11. La Force

     This is one of the cardinal virtues, of which I shall speak later.

This is a "swap" patchset, which swaps major-08 and major-11 with small edits. You would have done something like this to prepare such a change, starting from an old tree with two files with substantially different contents, both of which are of meaningful sizes:

    $ mv major-11.txt tmp
    $ mv major-08.txt major-11.txt
    $ mv tmp major-08.txt
    $ edit major-08.txt major-11.txt ;# just a bit
    $ git commit -m swap major-08.txt major-11.txt
    $ git diff -B -M HEAD^

A patch renaming major-11 to major-08 (i.e. the first one in this two-patch patchset) still requires that major-11 must exist in the target tree for the patchset to apply, which is the first half of Rule 2.

But the other half of Rule 2 is not satisfied. The target of the rename, major-08, has to exist in the target tree; otherwise we cannot rename it to major-11 in the second patch in the patchset. The rule needs a bit of revising, perhaps like this:

Rule 2.
A patch renaming file A to file B requires that file A exists with contents that match its preimage. And file B must not exist in the target tree, unless another patch in the patchset renames file B to some other file (possibly but not necessarily file A).

Of course, for such an "other patch" to be able to rename file B to somewhere else, the target tree is required to have file B.

It is important to have that "unless" part in the revised Rule 2. We need to make sure that we do not allow the sample patchset in "2. Renaming a file" to overwrite an existing file major-11 in the target tree blindly.
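A sketch of the whole round, under the assumption that each file is a few hundred bytes or more and that the two share no lines (otherwise -B may not break the pairs). The helper names, file contents and identity are made up. It builds the swap, counts the rename patches in the output of git diff -B -M, and applies the patchset to the old tree.

```shell
work=$(mktemp -d); patch="$work/swap.patch"
git init -q "$work/repo" && cd "$work/repo"
commit () { git -c user.name='A U Thor' -c user.email='author@example.com' commit -q "$@"; }
body () { for i in $(seq 1 40); do echo "$1 paragraph line $i"; done; }

# Two files of meaningful size with substantially different contents.
{ echo '8. Strength.'; body strength; } >major-08.txt
{ echo '11. Justice.'; body justice; } >major-11.txt
git add . && commit -m 'old tree'
old=$(git rev-parse HEAD)

# Swap the contents and edit the first line of each, just a bit.
{ echo '8. La Justice'; body justice; } >major-08.txt
{ echo '11. La Force'; body strength; } >major-11.txt
commit -a -m 'swap major-08.txt and major-11.txt'
new=$(git rev-parse HEAD)

git diff -B -M "$old" "$new" >"$patch"
renames=$(grep -c '^rename from' "$patch")

# Apply to the old tree and compare the result with the new tree.
git checkout -q --detach "$old"
git apply --index "$patch" &&
  test "$(git write-tree)" = "$(git rev-parse "$new^{tree}")" &&
  swap=reproduced || swap=failed

echo "rename patches: $renames, applying to the old tree: $swap"
```

Comparing the output of git write-tree with the tree of the new commit checks the whole resulting tree at once, instead of inspecting the two files by hand.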

4. Second twist: rewriting and copying

The previous section showed how git diff -B -M can detect cross-renamed files, and how the resulting patchset can be applied (you can circularly rename more than two files, i.e. A -> tmp, B -> A, ..., Z -> Y, tmp -> Z). It can also detect when you did this:

    $ cp major-08.txt major-11.txt
    $ edit major-08.txt ;# extensively
    $ git add major-08.txt major-11.txt
    $ git commit -m 'create 11 out of 08, rewrite 08'
    $ git diff -B -M HEAD^

And you would see:

    diff --git a/major-08.txt b/major-08.txt
    dissimilarity index 99%
    index 5de90cb..44e8d3a 100644
    --- a/major-08.txt
    +++ b/major-08.txt
    @@ -1,10 +1,31 @@
    -8. Strength.
    -
    -This is one of the cardinal virtues, of which I shall speak later.
    -...
    -the principle of all force.
    +8. La Justice
    +
    +That the Tarot, though it is of all reasonable antiquity, is not of
    +...
    +via prudentiæ.
    diff --git a/major-08.txt b/major-11.txt
    similarity index 97%
    copy from major-08.txt
    copy to major-11.txt
    index 5de90cb..a101d5f 100644
    --- a/major-08.txt
    +++ b/major-11.txt
    @@ -1,3 +1,3 @@
    -8. Strength.
    +11. La Force

     This is one of the cardinal virtues, of which I shall speak later.

The first patch in the patchset is "a patch from file A to file A", even though it is an extensive rewrite. The target tree is required to have major-08 whose contents match the preimage of the patch (Rule 1). The second patch copies from major-08 to create a new file major-11. The target tree is required to lack major-11 (Rule 3; copying into A is creation of A). It also must have major-08 that begins with the preimage of the patch.

Another thing to note is that an application of a patchset in Git is not incremental. Even though the first patch in the patchset talks about extensively rewriting major-08, and the second patch talks about creating major-11 by copying major-08 and then making a minor edit to it, the latter patch is the difference between the major-08 in the old tree and the major-11 in the new tree. It is not the difference between these two files in the new tree, i.e. it is not "modify major-08 and then copy the result to major-11 and then edit". If you think about it, this is also consistent with the previous "cross renaming" section. The first patch in that patchset renames major-11 to major-08, and the second patch, which renames major-08 to major-11, is not about moving the file that originally was major-11 (and that the first patch renamed) back to its original position. The two patches are not applied incrementally (or sequentially).

So far, all the examples shown above will work correctly with today's Git (some reimplementations of Git may lack support, but at least the one I maintain does work correctly). When you use the old tree as the target tree, git apply accepts the patchset and recreates the new tree correctly.

But if you use the new tree of this example as the target tree and try to use git apply -R to apply the patchset in reverse, it does not work correctly. It is a bug.

Currently git apply -R does something nonsensical for a copying patch. To reverse any patch, it just swaps the preimage and the postimage, and then swaps the names of the files in the old tree and in the new tree.

But the reverse of "create major-11 by copying major-08 into it and then change Strength to La Force" (which is the second patch in the patchset in this section) is not "create major-08 by copying major-11 into it and then change La Force to Strength", which you would get by simply swapping the preimage and the postimage and swapping the names of the files in the second patch.

What should we do to "reverse" a patchset that has copies?

Reverse of "create major-11 by copying major-08" should at least be "remove major-11", and preferably accompanied by "while making sure that major-11 matches the postimage of the patch".

The "preferably" part is a moderately strong preference. When the copying was done without any modification, we would not have any preimage or postimage to enable us to check that the target tree of the reverse application is similar enough to the new tree the patchset was taken from. Instead, we would end up just checking "major-11 exists" and then removing it happily, even if the contents of the file major-11 are vastly different from those in the new tree the patchset was taken from, which feels somewhat unsafe.

Admittedly, the same "it feels unsafe" factor exists when applying a bog-standard pure rename patch. Imagine that the example in "2. Renaming a file" was done without editing the first line and kept the original "8. Fortitude." without renumbering it; we would not have any preimage we can use to make sure we are patching the correct file.

But as long as we have patch text that we can use for sanity checking, we should use it, I would think.
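The forward and reverse applications discussed in this section can be tried with a sketch along these lines. The repository, file contents, identity and helper names are made up, and the files are assumed to be large and different enough for -B to kick in. The reverse result is only printed, not asserted: what it reports depends on whether the Git you run still behaves as described above.

```shell
work=$(mktemp -d); patch="$work/copy.patch"
git init -q "$work/repo" && cd "$work/repo"
commit () { git -c user.name='A U Thor' -c user.email='author@example.com' commit -q "$@"; }
body () { for i in $(seq 1 40); do echo "$1 paragraph line $i"; done; }

{ echo '8. Strength.'; body strength; } >major-08.txt
git add . && commit -m 'old tree'
old=$(git rev-parse HEAD)

{ echo '11. La Force'; body strength; } >major-11.txt     # copy of 08, minor edit
{ echo '8. La Justice'; body justice; } >major-08.txt     # extensive rewrite of 08
git add major-08.txt major-11.txt && commit -m 'create 11 out of 08, rewrite 08'
new=$(git rev-parse HEAD)

git diff -B -M "$old" "$new" >"$patch"

# Forward: apply to the old tree, expect the new tree.
git checkout -q --detach "$old"
git apply --index "$patch" &&
  test "$(git write-tree)" = "$(git rev-parse "$new^{tree}")" &&
  forward=reproduced || forward=failed

# Reverse: apply -R to the new tree, hoping to get the old tree back.
git checkout -q -f --detach "$new"
git apply --index -R "$patch" 2>/dev/null &&
  test "$(git write-tree)" = "$(git rev-parse "$old^{tree}")" &&
  reverse=reproduced || reverse=failed

echo "forward: $forward, reverse: $reverse"
```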


5. Third twist: rewriting by copying

If you started from two vastly different files, both of which have contents of meaningful size, and did this:

    $ cp major-08.txt major-11.txt
    $ edit major-11.txt
    $ rm major-08.txt
    $ git commit -m 'rewrite 11 by copying 08' major-08.txt major-11.txt
    $ git diff -B -M HEAD^

You would see this patchset:

    diff --git a/major-08.txt b/major-11.txt
    similarity index 97%
    rename from major-08.txt
    rename to major-11.txt
    index 5de90cb..a101d5f 100644
    --- a/major-08.txt
    +++ b/major-11.txt
    @@ -1,3 +1,3 @@
    -8. Strength.
    +11. La Force

     This is one of the cardinal virtues, of which I shall speak later.

This is another bug. I sent out a warning to both the Git and the Linux kernel mailing lists, not to use the "-B -M" options together for this reason.

The revised Rule 2 from "3. First twist" tells us that major-08 must exist in the target tree, which is OK, but also that major-11, the target of the rename, must not exist. This makes the resulting patchset inapplicable to the old tree the patchset was taken from, which simply does not make sense.

If you take a diff between states X and Y, you should be able to apply that diff to the state X and the resulting state should be identical to the state Y, and you should be able to apply that diff in reverse to state Y to go back to the state X.
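That round-trip property can be written down as a small check. The following is a sketch only: roundtrip is a hypothetical helper, not a Git command, and the demo repository, file names and identity are made up. It is exercised here on an ordinary edit plus a plain rename, a case where the property is expected to hold.

```shell
# roundtrip <old> <new> [diff options...]: succeed if the diff applies to <old>
# giving <new>, and applies in reverse to <new> giving <old> back.
# The body runs in a subshell so its variables and "exit" stay local.
roundtrip () (
    old=$1 new=$2; shift 2
    patch=$(mktemp)
    git diff "$@" "$old" "$new" >"$patch" &&
    git checkout -q -f --detach "$old" &&
    git apply --index "$patch" &&
    test "$(git write-tree)" = "$(git rev-parse "$new^{tree}")" &&
    git apply --index -R "$patch" &&
    test "$(git write-tree)" = "$(git rev-parse "$old^{tree}")"
    status=$?
    rm -f "$patch"
    exit "$status"
)

work=$(mktemp -d)
git init -q "$work/repo" && cd "$work/repo"
commit () { git -c user.name='A U Thor' -c user.email='author@example.com' commit -q "$@"; }

printf '%s\n' one two three >a.txt
{ echo 'header'; for i in 1 2 3 4 5 6 7 8; do echo "line $i"; done; } >b.txt
git add . && commit -m 'state X'
x=$(git rev-parse HEAD)

printf '%s\n' one 2 three >a.txt
git mv b.txt c.txt
{ echo 'HEADER v2'; for i in 1 2 3 4 5 6 7 8; do echo "line $i"; done; } >c.txt
commit -a -m 'state Y'
y=$(git rev-parse HEAD)

roundtrip "$x" "$y" -M && result=holds || result=violated
echo "round trip with -M: $result"
```

Passing -B -M instead of -M, on trees like the one in this section, is the interesting experiment; the helper simply reports whether the property holds.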

Worse, the reverse of this patchset would apply to the new tree without an error, but does not reproduce the old tree correctly, which is a more serious bug. It instead applies the patch in reverse and recreates the original major-08, but the other file, major-11, is lost.

The patchset does not have enough information for us to recreate the original contents of major-11 we had in the old tree. The patchset says that the contents of major-11 in the new tree came from the contents of major-08 in the old tree, and that the major-11 in the new tree does not have any resemblance to major-11 in the old tree. That is not incorrect per se, but it means that we cannot apply this patchset in reverse.

One possible way to fix this is to include another patch in the same patchset that shows the deletion of major-11. Rule 2. would be further revised to something like:

Rule 2 (revised again).
A patch renaming file A to file B requires that file A exists in the target tree with contents that match the preimage. It also requires that file B does not exist in the target tree, unless another patch in the patchset renames file B to some other file (possibly but not necessarily file A) or removes file B.

Again, that "other patch" in the patchset either renames or removes file B, which requires the target tree to have file B with contents that match the preimage of that patch.

More generally, the revised Rule 2. can be split into two parts; the former becomes an extension to Rule 4, and the latter becomes an extension to Rule 3.




  • A patch that causes a file A to disappear (i.e. removing file A, or renaming file A to file B) requires the target tree to have file A, with contents that match the preimage of the patch.
  • A patch that causes a file B to appear (i.e. creating file B, or renaming/copying file A to file B) requires the target tree to lack file B, unless another patch in the patchset makes file B disappear (i.e. removing file B or renaming file B to something else).

In any case, a fixed patchset would look like this:

    diff --git a/major-08.txt b/major-11.txt
    similarity index 97%
    rename from major-08.txt
    rename to major-11.txt
    index 5de90cb..a101d5f 100644
    --- a/major-08.txt
    +++ b/major-11.txt
    @@ -1,3 +1,3 @@
    -8. Strength.
    +11. La Force

     This is one of the cardinal virtues, of which I shall speak later.
    diff --git a/major-11.txt b/major-11.txt
    deleted file mode 100644
    index 517d9f8..0000000
    --- a/major-11.txt
    +++ /dev/null
    @@ -1,31 +0,0 @@
    -11. Justice.
    -
    -That the Tarot, though it is of all reasonable antiquity, is not of
    -...
    -via prudentiæ.

And these patches, under the re-revised rules, would apply cleanly to the old tree.

What about the reverse application? It would be a patchset that creates major-11 from nothingness (which is the reversal of a "deletion" patch), and creates major-08 by renaming major-11 and editing. Is the Rule 2. re-revised above sufficient?

The new tree (which is the target of the reverse application) only has major-11 and not major-08, so this rename should go through. The reverse of the deletion of major-11 is a creation of it with the contents fully given as the preimage of the (original) patch before reversing it, so that should also be OK as long as Rule 3 is revised in a similar way, with the same "unless" clause. That is, creating major-11 requires that the target tree does not have major-11, but if another patch in the same patchset renames major-11 away or deletes it, then it is OK for a patch to create major-11. And the reversal of the first patch does rename major-11 to major-08, so all is well.

One disturbing thing about the above plan is that we have this comment at the end of diffcore-rename.c:

         * We would output this delete record if:
         *
         * (1) this is a broken delete and the counterpart
         *     broken create remains in the output; or
         * (2) this is not a broken delete, and rename_dst
         *     does not have a rename/copy to move p->one->path
         *     out of existence.
         *
         * Otherwise, the counterpart broken create
         * has been turned into a rename-edit; or
         * delete did not have a matching create to
         * begin with.

That is, we have explicit logic to omit the missing "delete major-11" patch from the patchset. This comes from the very first commit that introduced "diff -B" (f345b0a0 (Add -B flag to diff-* brothers., 2005-05-30)). It is plausible that the above comment came from lack of thinking in the original and is not something we did to fix some bugs (if it were the latter, showing the deletion in the case under discussion to "fix" the patchset in this example would end up breaking the original "fix").

So I would think that the right way to fix this is to stop filtering out the deletion half of the broken pair, even when the other creation-half of the pair no longer is in the output.

Thursday, February 5, 2015

Git 2.3

The latest feature release of the Git version control system, version 2.3, is now available at the usual places.

This one ended up being a release with lots of small corrections and improvements without big uncomfortably exciting features. It is a much smaller release than other recent feature releases, consisting of 255 non-merge commits (versions 2.0, 2.1 and 2.2 had 475, 698 and 556 commits, respectively) by 61 contributors (among which 19 are new people; welcome!).

The recent security fix that went to 2.2.1 and older maintenance tracks is also contained in this update.

One of my favorite small changes in this release is that the "Conflicts:" section that is prepared in the buffer to write your commit log message during a merge is now commented out, just like all the other hints to help you prepare the log message (e.g. the list of files with changes you might want to mention in the log, and the list of untracked files you might have forgotten to "git add"). For the full text of the release notes, please visit the list archive.

Enjoy.

Friday, January 2, 2015

Having fun with Crouton

Chromebooks run ChromeOS, which is based on Linux but is made to appear to run only the browser. Even though we can do so many things with just the browser these days, I still have a few reasons why I need to keep a notebook that is not a Chromebook around me: Gimp (very occasionally, when I take photos and need to touch them up), Calibre (to manage and populate a Nook Glowlight with eBooks) and GnuCash (to balance my checkbook).

Since I replaced it with a Toshiba Chromebook 2, my old Samsung ARM Chromebook was looking for a good alternative use, and I thought I might be able to use it to run GnuCash under Crouton. Crouton is a tool to let us run more traditional Linux distros in a chroot environment on ChromeOS devices. I learned that it recently got better by allowing its virtual desktops to be shown in separate windows, side by side with native Chrome browser windows. One downside of Crouton is that it can only run under developer mode, side-stepping ChromeOS's security model.

Even though I cannot turn my primary Chromebook to developer mode (because it has to be enrolled for enterprise access to access the workstations at work), I can sacrifice the ARM Chromebook that has now become redundant.

So, following instructions from the primary site of crouton, here is what I did:
  • Turn Chromebook into developer mode (this wipes the device)
    • Turn off the machine
    • Hold ESC + Refresh and turn the machine on to go into Recovery
    • Ctrl-d to reboot into the developer mode
  • The usual Chromebook activation
  • Download crouton by visiting https://goo.gl/fd3zc
  • Install crouton extension by visiting the webstore
  • Type Ctrl-Alt-t to open a terminal-looking window, type shell and then type
    cd ~/Downloads; sudo bash to get a useful interactive shell running as root
  • Type sh ./crouton -r trusty -t xfce and let it run (takes some time)
  • Type sh ./crouton -r trusty -u -t extension and let it run (takes some more time)
  • Type sh ./crouton -r trusty -u -t xiwi and let it run (takes some more time)
  • Then type startxfce4, which will open an XFCE desktop environment running the Ubuntu trusty distribution.
  • Open a terminal in that Ubuntu environment, install gnucash as I normally would (e.g.
    apt-get update
    apt-get install gnucash

    just like any Debian-derived distribution).
A few tips I had to figure out by trial and error that I didn't find on the Web (I am not saying these tips do not exist elsewhere; I am saying that I didn't find them ;-) are:
  • Even though crouton -t xfce,extension,xiwi is supposed to be the syntax to install multiple targets, I couldn't get it to work well. Adding xiwi as an update (notice the -u option in the above) after everything else seemed to be a way to make it work.
  • After reading about crouton but before trying it out myself, I wondered how to make the two environments talk with each other (especially how to transfer "gnucash" data file across as running it is the primary reason why I am interested in this whole exercise), but it turns out that it was surprisingly easy and straightforward. In the Ubuntu environment that runs under crouton, ~/Downloads is the same Downloads local file shown in the Files application on the ChromeOS side.
  • Every time I turn the Chromebook in developer mode on, it goes into Recovery and needs Ctrl-d to continue booting. The Recovery screen looks scary but this seems normal.
  • Running the crouton environment is done by
    • Type Ctrl-Alt-t for a terminal-looking window
    • Type shell and then
    • Type sudo startxfce4 -b
  • Even though the Samsung ARM Chromebook is not a speed demon and has merely 2GB of memory, it is more than adequate to fill my needs. I've seen people say xiwi (which lets the X session be shown in its own window, instead of occupying the full screen and having to be switched to with the Ctrl-Alt-Back/Forth keys) is too slow to be usable, but I am not running graphical games. I have a suspicion that I will be cursing it when I start using Gimp, but until then ... ;-)
(Left side runs Crouton in its own window, right side is just a normal Chrome browser)




Monday, December 22, 2014

On CVE-2014-9390 and Git 2.2.1

Now that the security-fix releases are behind us, let's briefly talk about the ramifications.

The recent Git/Hg vulnerability on case-insensitive or normalizing filesystems is serious for people who fetch and integrate (either pull or pull --rebase) from untrusted sources.

When you grab a tree that records a malicious path, say, ".Git/hooks/post-checkout" using an older version of Git on such a filesystem (e.g. Windows NTFS or Mac OS X HFS+), Git will tell the filesystem to check it out at ".Git/hooks/post-checkout", but the filesystem overwrites a file different from what Git asked it to write, namely ".git/hooks/post-checkout", which is a path reserved for you to store an executable hook that is run after running "git checkout".

For an attacker to victimize you through this vector, the attacker has to have write access to a repository you pull from. As long as you do not interact with untrustworthy strangers (e.g. only pull from the projects' official history), you will not be affected. That is often true in a corporate setting, where access to the central repository everybody in the product group uses is tightly controlled, and if an untrustworthy stranger has write access there, you already have a bigger problem.

But open source is all about collaboration, and we need to meet and interact with new people every day while doing so. The prudent thing to do is to (1) update to the version of Git recently released to work around this issue, and then (2) respond to a pull request from a stranger, in this order. Don't do it the other way around!

Thanks.

Thursday, December 18, 2014

Git 1.8.5.6, 1.9.5, 2.0.5, 2.1.4 and 2.2.1 and thanking friends in Mercurial land

We have a set of urgent maintenance releases. Please update your Git if you are on Windows or Mac OS X.

Git maintains various meta-information for its repository in files in .git/ directory located at the root of the working tree. The system does not allow a file in that directory (e.g. .git/config) to be committed in the history of the project, or checked out to the working tree from the project. Otherwise, an unsuspecting user can run git pull from an innocuous-looking-but-malicious repository and have the meta-information in her repository overwritten, or executable hooks installed by the owner of that repository she pulled from (i.e. an attacker).

Unfortunately, this protection has been found to be inadequate on certain file systems:
  • You can commit and checkout to .Git/<anything> (or any permutations of cases .[gG][iI][tT], except .git all in lowercase). But this will overwrite the corresponding .git/<anything> on case-insensitive file systems (e.g. Windows and Mac OS X).
  • In addition, the HFS+ file system (Mac OS X) considers certain Unicode codepoints ignorable. Committing e.g. .g\u200cit/config, where U+200C is such an ignorable codepoint, and checking it out on HFS+ would overwrite .git/config because of this.
The issue is shared with other version control systems and has serious impact on affected systems (CVE-2014-9390).
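To make the two cases above concrete, here is a rough sketch of the kind of comparison involved. looks_like_dotgit is a hypothetical helper written for this illustration, not the check Git itself performs, and it only knows about U+200C, the one ignorable codepoint mentioned above (HFS+ ignores others as well).

```shell
zwnj=$(printf '\342\200\214')        # the UTF-8 encoding of U+200C

# Hypothetical helper: would this single path component end up as ".git" after
# dropping U+200C and folding ASCII letters to lowercase?
looks_like_dotgit () {
    folded=$(printf '%s' "$1" | LC_ALL=C sed "s/$zwnj//g" | LC_ALL=C tr 'A-Z' 'a-z')
    [ "$folded" = .git ]
}

flagged=
for name in .Git .GIT ".g${zwnj}it" .gitignore src; do
    if looks_like_dotgit "$name"; then
        flagged="$flagged $name"
    fi
done
echo "components that would collide with .git:$flagged"
```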

Credit for discovering this issue goes to our friends in the Mercurial land (most notably, the inventor of Hg, Matt Mackall himself). The fixes to this issue for various implementations of Git (including mine, libgit2, JGit), ports using these implementations (including Git for Windows, Visual Studio) and also Mercurial have been coordinated for simultaneous releases. GitHub is running an updated version of their software that rejects trees with these confusing and problematic paths, in order to protect its users who use existing versions of Git (also see their blog post).

A huge thanks to all those who were involved.

New releases of Git for Windows, Git OSx Installer, JGit and libgit2 have been prepared to fix this issue. Microsoft (which uses libgit2 in their Visual Studio products) and Apple (which distributes a port of Git in their Xcode) both have fixes, as well.

For people building from the source, fixed versions of Git have been released as versions v1.8.5.6, v1.9.5, v2.0.5, v2.1.4, and v2.2.1 for various maintenance tracks.

Thanks.

Tuesday, September 30, 2014

Fun (?) with GnuPG

We use GnuPG as part of the infrastructure to certify authenticity of development history in Git in various places:
  • A signed tag created by git tag -s says "This tag was created by me, the holder of the private GnuPG key that signed this object". Because the object name of any Git object is computed as a cryptographic hash over what the object records, and because a signed tag object records the object name of a tagged object (typically a commit) and the human readable name (typically a release number or name) the tagger wants to give the tagged object, an attacker cannot forge a phony tag that points at a different commit, as it would have to be signed with a private key the attacker does not have. You are saying "You can verify that it is true that I wanted to make that commit release X" safely because of this. Also, because the commit object records all the objects and their location in a project tree, and the parent commit objects, such a signed tag also ensures that all the development history behind such a tagged commit cannot be tampered with.
  • When you merge a signed tag (either done by git merge or git pull), the content of the tag with its GnuPG signature is copied to the resulting commit object. This lets you ensure that the history behind the side branch that was merged to the history cannot be tampered with and the signature certifies that it came from the signer (typically a subsystem lieutenant).
  • A signed commit created by git commit -S is a way to say "This commit was created by me". It ensures that the history behind the commit cannot be tampered with and certifies that the change it introduces came from the signer.
  • Still under development is git push --signed, a way to certify that you wanted to put a particular commit at the tip of a particular branch.
GnuPG is also used as a mechanism to ensure the integrity and authenticity of tarballs that are sent to the kernel.org servers, which is a common distribution point for open source projects like the Linux kernel and Git itself. A maintainer prepares a tarball and a detached signature, uploads them, and the receiving end will verify that the signature is good.

It is a common practice to specify the expiration date when creating a signing key. For example, the key I have been using to sign Git release tags was originally set up to expire in 3 years since the key was created. But the thing is, a project may outlive that expiry date. An interesting question is what happens to the existing tags when the key expires.

Unluckily, the right thing happens. If the holder of the key does not do anything, the key expires, and the signatures in the signed tags stop validating. Luckily, the validity of a key can be extended by the holder of the key, and once that is done, the signatures made before the key's original expiration date will continue to validate fine.

At least, that is the theory ;-)
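The theory can be written down as a toy model in Python. This is a deliberately simplified sketch of the behaviour described above, not of what GnuPG actually computes; the function name tag_validates and all the dates are hypothetical.

```python
from datetime import date


def tag_validates(key_expires: date, checked_on: date) -> bool:
    """Toy rule: a signature validates as long as the key, as the
    verifier currently knows it, has not expired on the day of checking."""
    return checked_on <= key_expires


original_expiry = date(2020, 1, 1)   # hypothetical expiry set at key creation
extended_expiry = date(2025, 1, 1)   # hypothetical expiry after extension

before = date(2019, 6, 1)
after = date(2021, 6, 1)

print(tag_validates(original_expiry, before))  # key still valid
print(tag_validates(original_expiry, after))   # holder did nothing
print(tag_validates(extended_expiry, after))   # verifier has the updated key
```

The third case only holds for verifiers who have fetched the updated public key, which is why the extended key has to be published.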

As my key was originally set to expire early next month, a few days ago I extended the lifespan of the key 96AFE6CB I have been using and uploaded the updated key to PGP keyservers, so existing signed tags (e.g. v2.0.0) should continue to be valid.

A few tips:
  • Although this page is a specific instruction to Debian contributors, it was very helpful when I had to figure out how to futz with GnuPG subkeys. It does not talk about how to update the expiration date for a subkey, though (you use "gpg --edit-key" and then use the "expire" command).
  • In order to force a specific subkey to be used when signing for Git, you would need to use the ! suffix to the GnuPG key-id, e.g. in my ~/.gitconfig file:
      [user] signingkey = 96AFE6CB!
    Without the ! suffix, GnuPG tries to use the newest subkey you have associated with the same primary key, which may not be the subkey you would want to use.
I signed a new v2.1.2 maintenance release with the same key today. Hopefully it will validate OK for you (otherwise, you may have to fetch the public key from the keyserver).

Wednesday, May 28, 2014

Git 2.0

The real "Git 2.0" is finally out.

From the point of view of end users who are totally new to Git, this release will give them the defaults that are vastly improved compared to the older versions, but at the same time, for existing users, this release is designed to be as low-impact as possible, as long as they have been following recent releases along (instead of sticking to age-old releases like 1.7.x series). Some may even say, without remembering why it was a big deal to bring these new default behaviours to help new users, that the new release does not offer anything exciting—and that is exactly what we want to hear from existing users. In recent releases for the past year or so, we have added knobs to allow users to enable these new defaults before 2.0 happens, and added warnings to let users know when they perform an operation whose outcome will be different between 1.x series and 2.0 release. The existing users are hopefully very well prepared by now, and "Git 2.0" is designed to be the final "flipping the default" step.

We had to delay the final release by a week or so because we found a few problems in earlier release candidates (request-pull had a regression that stopped it from showing the "tags/" prefix in "Please pull tags/frotz" when the user asked to compose a request for 'frotz' to be pulled; a code path in git-gui to support ancient versions of Git incorrectly triggered for Git 2.0), which we had to fix in an extra unplanned release candidate.

Hopefully the next cycle will become shorter, as topics that have been cooking on the 'next' branch had extra time to mature, so it all evens out in the end ;-).

Have fun.

Friday, April 25, 2014

Git 2.0 release candidate 1

This is the first release candidate for the upcoming Git 2.0. There are the usual sort of updates and fixes one would expect to see between any two feature releases, but the primary reason why its name begins with "2" (as opposed to the last feature release whose name was "Git 1.9") is that it has a few backward incompatible changes that are all meant to improve the end-user experience.

  • People almost always push to a single place, and many people would push a single branch they are currently on. The default behaviour of "git push" (that does not say which branches to push out to where on the command line) has been updated to better support this mode of working (as opposed to working on making all branches they are going to publish ready and then push all of them in one go). The old default of pushing out all the matching branches is available by setting the push.default configuration variable to matching.
  • Even though "git commit -a" can be run from any subdirectory to commit changes to all the tracked paths in the working tree, "git add -u" and "git add -A" (without specifying any path on the command line) used to operate only inside the current directory. This inconsistency bothered many people, and these commands have been updated to operate on all modified (for "-u") or all (for "-A") paths. Use "git add -u ." and "git add -A ." to restrict the command to the current directory.
  • "git add path" is now the same as "git add -A path", so that "git add directory/" will notice paths you removed from the directory and record the removal. In older versions of Git, it used to ignore removals. You can say "git add --ignore-removal path" to add only added or modified paths, if you really want to.
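The two "git add" changes in the list above can be modelled in a few lines of Python. This is only a sketch of the path-selection rules as the list describes them, not how Git implements them; the function name paths_to_stage, the mode labels and the sample paths are all made up for illustration.

```python
def paths_to_stage(changes, mode, scope=""):
    """Pick the paths a "git add" variant would record.

    changes: dict mapping path -> "modified" | "untracked" | "deleted"
    mode:    "update" (-u), "all" (-A) or "ignore-removal"
    scope:   directory to restrict to; "" means the whole working tree
    """
    wanted = {
        "update": {"modified", "deleted"},               # tracked paths only
        "all": {"modified", "deleted", "untracked"},
        "ignore-removal": {"modified", "untracked"},     # old "git add path"
    }[mode]
    prefix = scope.rstrip("/") + "/" if scope else ""
    return sorted(p for p, state in changes.items()
                  if state in wanted and p.startswith(prefix))


# Hypothetical working tree state, with the user sitting in src/.
changes = {
    "src/a.c": "modified",
    "src/old.c": "deleted",
    "src/new.c": "untracked",
    "doc/readme": "modified",
}

print(paths_to_stage(changes, "update"))                 # 2.0: "git add -u"
print(paths_to_stage(changes, "update", "src"))          # "git add -u ." in src/
print(paths_to_stage(changes, "all", "src/"))            # 2.0: "git add src/"
print(paths_to_stage(changes, "ignore-removal", "src"))  # --ignore-removal
```

The second call is also what a bare "git add -u" run from src/ used to do before 2.0, and the last call is what "git add src/" used to do.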

Some of the readers may remember that we didn't give users a very good transition experience when we introduced a backward incompatible change in Git 1.6.0. We used to install all the "git-cmd"s in the same directory as "git" itself, and before that release people were used to "git commit" and "git-commit" being interchangeable. At that release we stopped installing what does not have to be on the user's $PATH, a change that broke people's finger-memory and existing scripts. All we did to prepare users for that change was to warn about it in the release notes since Git 1.5.4, and it was apparently not enough. Many people were unhappy.

In retrospect, perhaps we could have done better by adding code to somehow detect when "git-cmd" is invoked as the top-level command and warn that such usage would break in future versions, to train users to use the "git cmd" form well before releasing the version that actually delivered the change.
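As a rough idea of what such a check might have looked like, here is a hypothetical Python sketch (Git never shipped this, and the function name and warning text are invented). It only looks at the name the program was started under; telling a user-typed "git-commit" apart from one that "git" itself runs internally is the "somehow" part, and is not attempted here.

```python
import os


def dashed_form_warning(argv0: str):
    """Return a warning if the program was started as "git-<cmd>", else None."""
    name = os.path.basename(argv0)
    if name.startswith("git-") and len(name) > len("git-"):
        cmd = name[len("git-"):]
        return (f'warning: "{name}" will stop working in a future version; '
                f'use "git {cmd}" instead')
    return None


print(dashed_form_warning("/usr/bin/git-commit"))
print(dashed_form_warning("/usr/bin/git"))
```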

This time around, we have been trying to be a lot more careful. For the past handful of releases, we have added extra code to detect cases where existing versions of Git and the upcoming Git 2.0 will behave differently and to warn about the upcoming change. As a result, the actual difference between Git 1.9 and Git 2.0 is mostly "flipping the default" for these changes.

Have fun.

Tuesday, March 18, 2014

Git 1.9.1

Traditionally, releases numbered with three Dewey-decimal digits were major releases that add new features, while ones with four were maintenance releases with only fixes. This was meant to give us some flexibility to say that the difference between 1.7.12 and 1.8.0 is larger than the difference between 1.8.1 and 1.8.2 (1.7.12 was the last major release in the 1.7.x series), while reserving the difference in the first digit for really big changes (i.e. 2.0 may finally toggle a switch that makes Git incompatible with older 1.x releases out of the box).

But we found out that in practice, we do not need to have three levels of changes (an incremental update that changes the third digit from 1.8.1 to 1.8.2, a larger update that changes the second digit from 1.7.12 to 1.8.0, and a huge update that changes the first digit from 1.9 to 2.0). Hence the last major release was officially called "Git 1.9" when it was released on February 14, 2014.

It logically follows that, because we are dropping the third digit (or the second, depending on how you look at it) from the numbering of major releases, the first maintenance release for Git 1.9 is named with three digits, not four.

Git 1.9.1 is such a release. Among many changes we have been cooking on the development front towards the next major release, which will be called Git 2.0, this maintenance release contains only the fixes, and everybody is encouraged to upgrade to it.
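The old and new numbering rules can be summarised in a short Python sketch. It works on release names as they are written in this post (e.g. "1.9", "1.9.1"); the function name release_kind is made up, and names with other shapes, such as tag names with a trailing ".0", are not handled.

```python
def release_kind(name: str) -> str:
    """Classify a release name as "feature" or "maintenance"."""
    parts = [int(p) for p in name.split(".")]
    if parts[:2] < [1, 9]:
        # Old scheme: up to three numbers is a feature release,
        # a fourth number marks a maintenance release.
        return "feature" if len(parts) <= 3 else "maintenance"
    # New scheme, starting with Git 1.9: two numbers is a feature
    # release, a third number marks a maintenance release.
    return "feature" if len(parts) == 2 else "maintenance"


for name in ["1.7.12", "1.8.1", "1.8.1.1", "1.9", "1.9.1", "2.0"]:
    print(name, release_kind(name))
```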

Fixes since Git 1.9 are as follows:
  • "git clean -d pathspec" did not use the given pathspec correctly and ended up cleaning too much.
  • "git difftool" misbehaved when the repository is bound to the working tree with the ".git file" mechanism, where a textual file ".git" tells us where it is.
  • "git push" did not pay attention to branch.*.pushremote if it is defined earlier than remote.pushdefault; the order of these two variables in the configuration file should not matter, but it did by mistake.
  • Codepaths that parse timestamps in commit objects have been tightened.
  • "git diff --external-diff" incorrectly fed the submodule directory in the working tree to the external diff driver when it knew it is the same as one of the versions being compared.
  • "git reset" needs to refresh the index when working in a working tree (it can also be used to match the index to the HEAD in an otherwise bare repository), but it failed to set up the working tree properly, causing GIT_WORK_TREE to be ignored.
  • "git check-attr" when working on a repository with a working tree did not work well when the working tree was specified via the --work-tree (and obviously with --git-dir) option.
  • "merge-recursive" was broken in 1.7.7 era and stopped working in an empty (temporary) working tree, when there are renames involved.  This has been corrected.
  • "git rev-parse" was loose in rejecting command line arguments that do not make sense, e.g. "--default" without the required value for that option.
  • include.path variable (or any variable that expects a path that can use ~username expansion) in the configuration file is not a boolean, but the code failed to check it.
  • "git diff --quiet -- pathspec1 pathspec2" sometimes did not return correct status value.
  • Attempting to deepen a shallow repository by fetching over smart HTTP transport failed in the protocol exchange, when no-done extension was used.  The fetching side waited for the list of shallow boundary commits after the sending end stopped talking to it.
  • Allow "git cmd path/", when the 'path' is where a submodule is bound to the top-level working tree, to match 'path', despite the extra and unnecessary trailing slash (such a slash is often given by command line completion).
Have fun.