Git – Merges

Git is a distributed version control system (DVCS). It allows, for example, a developer in Japan and
another in New Jersey to make and record changes independently, and it
permits the two developers to combine their changes at any time, all without
a central repository. In this chapter, we’ll learn how to combine two or
more different lines of development.

A merge unifies two or more commit history
branches. Most often, a merge unites just two branches, although Git
supports a merge of three, four, or more branches at the same time.

In Git, a merge must occur within a single repository—that is, all the
branches to be merged must be present in the same repository. How the
branches come to be in the repository is not important. (As you will see in
Chapter 12, Git provides mechanisms for
referring to other repositories and for bringing remote branches into your
current working repository.)

When modifications in one branch do not conflict with modifications
found in another branch, Git computes a merge result and creates a new
commit that represents the new, unified state. But when branches conflict,
which occurs whenever changes compete to alter the same line of the same
file, Git does not resolve the dispute. Instead, Git marks such contentious
changes as unmerged in the index and leaves reconciliation up
to you, the developer. When Git cannot merge automatically, it’s also up to
you to make the final commit once all conflicts are resolved.

Merge Examples

To merge other_branch into branch, you should check out the
target branch and merge the other branches into it, like this:

    $ git checkout branch
    $ git merge other_branch

Let’s work through a pair of example merges, one without conflicts
and one with substantial overlaps. To simplify the examples in this
chapter, we’ll use multiple branches per the techniques presented in Chapter 7.

Preparing for a Merge

Before you begin a merge, it’s best to tidy up your
working directory. During a normal merge, Git creates new versions of
files and places them in your working directory when it is finished.
Furthermore, Git also uses the index to store temporary and intermediate
versions of files during the operation.

If you have modified files in your working directory or if
you’ve modified the index via git add
or git rm, then your repository has a
dirty working directory or index. If you start a
merge in a dirty state, Git may be unable to combine the changes from
all the branches and from those in your working
directory or index in one pass.

Tip

You don’t have to start with a clean
directory. Git performs the merge, for example, if the files affected
by the merge operation and the dirty files in your working directory
are disjoint. However, as a general rule, your Git life will be much
easier if you start each merge with a clean working directory and
index.
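As a minimal sketch of that rule of thumb, the script below checks for a clean state before a merge would be attempted. The helper name is_clean, the throwaway repository, and the example identity are assumptions made for illustration. The check is also stricter than Git itself requires, because git status --porcelain reports untracked files as well.

```shell
#!/bin/sh
# Sketch: detect a dirty working directory or index before merging.
set -e

repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q
git config user.email "you@example.com"   # hypothetical identity
git config user.name "Example User"

echo "Line 1 stuff" > file
git add file
git commit -q -m "Initial file"

# is_clean (hypothetical helper): succeeds only when
# `git status --porcelain` prints nothing, that is, when there are
# no modified, staged, or untracked files.
is_clean() {
    [ -z "$(git status --porcelain)" ]
}

if is_clean; then clean_before=yes; else clean_before=no; fi

echo "uncommitted edit" >> file   # dirty the working directory
if is_clean; then clean_after=yes; else clean_after=no; fi

echo "clean before edit: $clean_before"
echo "clean after edit:  $clean_after"
```

A wrapper script could call is_clean and refuse to run git merge when it fails, leaving you to commit your outstanding changes first.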

Merging Two Branches

For the simplest scenario, let’s set up a repository with
a single file, create two branches, and then merge the pair of branches
together again.

    $ git init
    Initialized empty Git repository in /tmp/conflict/.git/
    $ git config user.email "jdl@example.com"
    $ git config user.name "Jon Loeliger"

    $ cat > file
    Line 1 stuff
    Line 2 stuff
    Line 3 stuff
    ^D
    $ git add file
    $ git commit -m "Initial 3 line file"
    Created initial commit 8f4d2d5: Initial 3 line file
    1 files changed, 3 insertions(+), 0 deletions(-)
    create mode 100644 file

Let’s create another commit on the master branch:

    $ cat > other_file
    Here is stuff on another file!
    ^D
    $ git add other_file
    $ git commit -m "Another file"
    Created commit 761d917: Another file
     1 files changed, 1 insertions(+), 0 deletions(-)
     create mode 100644 other_file

So far, the repository has one branch with two commits, where each
commit introduced a new file. Next, let’s change to a different branch
and modify the first file.

    $ git checkout -b alternate master^
    Switched to a new branch "alternate"

    $ git show-branch
    * [alternate] Initial 3 line file
     ! [master] Another file
    --
     + [master] Another file
    *+ [alternate] Initial 3 line file

Here, the alternate branch is
initially forked from the master^
commit, one commit behind the current head.

Make a trivial change to the file so you have something to merge,
and then commit it. Remember, it’s best to commit outstanding changes
and start a merge with a clean working directory.

    $ cat >> file
    Line 4 alternate stuff
    ^D
    $ git commit -a -m "Add alternate's line 4"
    Created commit b384721: Add alternate's line 4
     1 files changed, 1 insertions(+), 0 deletions(-)

Now there are two branches and each has different development
work. A second file has been added to the master branch, and a modification has been
made to file on the alternate branch. Because
the two changes do not affect the same parts of a common file, a merge
should proceed smoothly and without incident.

The git merge operation is
context sensitive. Your current branch is always the target branch, and
the other branch or branches are merged into the current branch. In this
case, the alternate branch should be
merged into the master branch, so the
latter must be checked out before you continue:

    $ git checkout master
    Switched to branch "master"

    $ git status
    # On branch master
    nothing to commit (working directory clean)

    # Yep, ready for a merge!

    $ git merge alternate
    Merge made by recursive.
     file |    1 +
     1 files changed, 1 insertions(+), 0 deletions(-)

You can use another commit graph viewing tool, a part of
git log, to see what’s been done:

    $ git log --graph --pretty=oneline --abbrev-commit

    *   1d51b93... Merge branch 'alternate'
    |\
    | * b384721... Add alternate's line 4
    * | 761d917... Another file
    |/
    * 8f4d2d5... Initial 3 line file

That is conceptually the commit graph described earlier in the
section Commit Graphs (Chapter 6), except that this graph is turned sideways,
with the most recent commits at the top rather than on the right. The two
branches split at the initial commit, 8f4d2d5; each branch shows one commit
(761d917 and b384721); and the two branches merge again at
commit 1d51b93.

Tip

Using git log --graph is an excellent alternative to graphical tools
such as gitk. The visualization provided by git log --graph is well-suited
to dumb terminals.

Technically, Git performs each merge symmetrically to produce one
identical, combined commit that is added to your current branch. The
other branch is not affected by the merge. Because the merge commit is
added only to your current branch, you can say, “I merged some
other branch into this one.”
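To check these claims, the following sketch rebuilds the example in a temporary directory and inspects the result. It assumes a throwaway repository and an example identity, and it pins the branch name to master so the output matches the text. The variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: confirm that a merge adds one commit with two parents to the
# current branch and leaves the merged-in branch untouched.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git config user.email "you@example.com"   # hypothetical identity
git config user.name "Example User"

printf 'Line 1 stuff\nLine 2 stuff\nLine 3 stuff\n' > file
git add file
git commit -q -m "Initial 3 line file"

echo "Here is stuff on another file!" > other_file
git add other_file
git commit -q -m "Another file"

git checkout -q -b alternate master^
echo "Line 4 alternate stuff" >> file
git commit -q -a -m "Add alternate's line 4"
alt_before=$(git rev-parse alternate)

git checkout -q master
git merge -q --no-edit alternate

# rev-list --parents prints the commit ID followed by each parent ID,
# so the number of parents is the field count minus one.
parent_count=$(git rev-list --parents -n 1 HEAD | awk '{ print NF - 1 }')
second_parent=$(git rev-parse HEAD^2)
alt_after=$(git rev-parse alternate)

echo "parents of merge commit: $parent_count"
echo "alternate unchanged: $([ "$alt_before" = "$alt_after" ] && echo yes || echo no)"
```

The second parent of the merge commit is the tip of alternate, while alternate itself still points at the same commit it did before the merge.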

A Merge with a Conflict

The merge operation is inherently problematic because it
necessarily brings together potentially varying and conflicting changes
from different lines of development. The changes on one branch may be
similar to or radically different from the changes on a different
branch. Modifications may alter the same files or a disjoint set of
files. Git can handle all these varied possibilities, but often it
requires guidance from you to resolve conflicts.

Let’s work through a scenario in which a merge leads to a
conflict. We begin with the results of the merge from the previous
section and introduce independent and conflicting changes on the
master and alternate branches. We then merge the alternate branch into the master branch, face the conflict, resolve it,
and commit the final result.

On the master branch, create a
new version of file with a few
additional lines in it and then commit the changes:

    $ git checkout master

    $ cat >> file
    Line 5 stuff
    Line 6 stuff
    ^D

    $ git commit -a -m "Add line 5 and 6"
    Created commit 4d8b599: Add line 5 and 6
     1 files changed, 2 insertions(+), 0 deletions(-)

Now, on the alternate branch,
modify the same file differently. Whereas you made new commits to the
master branch, the alternate branch has not progressed
yet.

    $ git checkout alternate
    Switched to branch "alternate"

    $ git show-branch
    * [alternate] Add alternate's line 4
     ! [master] Add line 5 and 6
    --
     + [master] Add line 5 and 6
    *+ [alternate] Add alternate's line 4

    # In this branch, "file" left off with "Line 4 alternate stuff"

    $ cat >> file
    Line 5 alternate stuff
    Line 6 alternate stuff
    ^D

    $ cat file
    Line 1 stuff
    Line 2 stuff
    Line 3 stuff
    Line 4 alternate stuff
    Line 5 alternate stuff
    Line 6 alternate stuff

    $ git diff
    diff --git a/file b/file
    index a29c52b..802acf8 100644
    --- a/file
    +++ b/file
    @@ -2,3 +2,5 @@ Line 1 stuff
     Line 2 stuff
     Line 3 stuff
     Line 4 alternate stuff
    +Line 5 alternate stuff
    +Line 6 alternate stuff

    $ git commit -a -m "Add alternate line 5 and 6"
    Created commit e306e1d: Add alternate line 5 and 6
     1 files changed, 2 insertions(+), 0 deletions(-)

Let’s review the scenario. The current branch history looks like
this:

    $ git show-branch
    * [alternate] Add alternate line 5 and 6
     ! [master] Add line 5 and 6
    --
    *  [alternate] Add alternate line 5 and 6
     + [master] Add line 5 and 6
    *+ [alternate^] Add alternate's line 4

To continue, check out the master branch and try to perform the
merge:

    $ git checkout master
    Switched to branch "master"

    $ git merge alternate
    Auto-merged file
    CONFLICT (content): Merge conflict in file
    Automatic merge failed; fix conflicts and then commit the result.

When a merge conflict like this occurs, you should almost
invariably investigate the extent of the conflict using the git diff command. Here, the single file named
file has a conflict in its
content:

    $ git diff
    diff --cc file
    index 4d77dd1,802acf8..0000000
    --- a/file
    +++ b/file
    @@@ -2,5 -2,5 +2,10 @@@ Line 1 stuf
      Line 2 stuff
      Line 3 stuff
      Line 4 alternate stuff
    ++<<<<<<< HEAD:file
     +Line 5 stuff
     +Line 6 stuff
    ++=======
    + Line 5 alternate stuff
    + Line 6 alternate stuff
    ++>>>>>>> alternate:file

The git diff command shows the differences between the file in your
working directory and the index. In the traditional diff command output
style, the changed content is presented between <<<<<<< and =======, with an
alternate between ======= and >>>>>>>. However, additional plus and minus
signs are used in the combined diff format to indicate changes from
multiple sources relative to the final resulting version.

The previous output shows that the conflict covers lines 5 and 6,
where deliberately different changes were made in the two branches. It’s
then up to you to resolve the conflict. When resolving a merge conflict,
you are free to choose any resolution you would like for the file. That
includes picking lines from only one side or the other, or a mix from
both sides, or even making up something completely new and different.
Although that last option might be confusing, it is a valid
choice.

In this case, I chose a line from each branch as the makeup of my
resolved version. The edited file now has this content:

    $ cat file
    Line 1 stuff
    Line 2 stuff
    Line 3 stuff
    Line 4 alternate stuff
    Line 5 stuff
    Line 6 alternate stuff

If you are happy with the conflict resolution, you should git add the file to the index and stage it for
the merge commit:

    $ git add file

After you have resolved conflicts and staged final
versions of each file in the index using git add, it is finally time to
commit the merge using git commit. Git places you in your favorite
editor with a template message that looks like this:

    Merge branch 'alternate'

    Conflicts:
            file
    #
    # It looks like you may be committing a MERGE.
    # If this is not correct, please remove the file
    #       .git/MERGE_HEAD
    # and try again.
    #

    # Please enter the commit message for your changes.
    # (Comment lines starting with '#' will not be included)
    # On branch master
    # Changes to be committed:
    #   (use "git reset HEAD <file>..." to unstage)
    #
    #       modified:   file
    #

As usual, the lines beginning with the octothorp
(#) are comments and meant solely for
your information while you write a message. All comment lines are
ultimately elided from the final commit log message. Feel free to alter
or augment the commit message as you see fit, perhaps adding a note
about how the conflict was resolved.

When you exit the editor, Git should indicate the successful
creation of a new merge commit:

    $ git commit

    # Edit merge commit message

    Created commit 7015896: Merge branch 'alternate'

    $ git show-branch
    ! [alternate] Add alternate line 5 and 6
     * [master] Merge branch 'alternate'
    --
     - [master] Merge branch 'alternate'
    +* [alternate] Add alternate line 5 and 6

You can see the resulting merge commit using:

    $ git log

Working with Merge Conflicts

As demonstrated by the previous example, there are instances
when conflicting changes can’t be merged automatically.

Let’s create another scenario with a merge conflict to explore the
tools Git provides to help resolve disparities. Starting with a common
file named hello that contains just the line hello, let’s create two
different branches with two different variants of the file.

    $ git init
    Initialized empty Git repository in /tmp/conflict/.git/

    $ echo hello > hello
    $ git add hello
    $ git commit -m "Initial hello file"
    Created initial commit b8725ac: Initial hello file
     1 files changed, 1 insertions(+), 0 deletions(-)
     create mode 100644 hello

    $ git checkout -b alt
    Switched to a new branch "alt"
    $ echo world >> hello
    $ echo 'Yay!' >> hello
    $ git commit -a -m "One world"
    Created commit d03e77f: One world
     1 files changed, 2 insertions(+), 0 deletions(-)

    $ git checkout master
    $ echo worlds >> hello
    $ echo 'Yay!' >> hello
    $ git commit -a -m "All worlds"
    Created commit eddcb7d: All worlds
     1 files changed, 2 insertions(+), 0 deletions(-)

One branch says world, whereas
the other says worlds—a deliberate
difference.

As in the earlier example, if you check out master and try to merge the alt branch into it, a conflict arises.

    $ git merge alt
    Auto-merged hello
    CONFLICT (content): Merge conflict in hello
    Automatic merge failed; fix conflicts and then commit the result.

As expected, Git warns you about the conflict found in the hello file.

Locating Conflicted Files

But what if Git’s helpful directions scrolled off the
screen or if there were many files with conflicts? Luckily, Git keeps
track of problematic files by marking each one in the index as
conflicted, or unmerged.

You can also use either the git status command or the git ls-files -u
command to show the set of files that remain unmerged in
your working tree.

    $ git status
    hello: needs merge
    # On branch master
    # Changed but not updated:
    #   (use "git add <file>..." to update what will be committed)
    #
    #       unmerged:   hello
    #
    no changes added to commit (use "git add" and/or "git commit -a")

    $ git ls-files -u
    100644 ce013625030ba8dba906f756967f9e9ca394464a 1       hello
    100644 e63164d9518b1e6caf28f455ac86c8246f78ab70 2       hello
    100644 562080a4c6518e1bf67a9f58a32a67bff72d4f00 3       hello

You can use git diff to show
what’s not yet merged, but it will show all of the gory details,
too!
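When only the names of the conflicted files are wanted, one option is git diff with --name-only and --diff-filter=U, which limits the listing to unmerged paths. The sketch below recreates the hello conflict in a temporary directory. The repository location and example identity are assumptions, and the branch name is pinned to master to match the text.

```shell
#!/bin/sh
# Sketch: list just the paths that are still unmerged after a failed merge.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git config user.email "you@example.com"   # hypothetical identity
git config user.name "Example User"

echo hello > hello
git add hello
git commit -q -m "Initial hello file"

git checkout -q -b alt
printf 'world\nYay!\n' >> hello
git commit -q -a -m "One world"

git checkout -q master
printf 'worlds\nYay!\n' >> hello
git commit -q -a -m "All worlds"

# The merge is expected to stop with a conflict; keep going so the
# conflicted state can be inspected.
git merge alt >/dev/null 2>&1 || true

# --diff-filter=U selects unmerged paths; --name-only drops the diff text.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted files: $conflicted"
```

This prints one path per conflicted file, which is easier to scan than the three stage entries per file reported by git ls-files -u.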

Inspecting Conflicts

When a conflict appears, the working directory copy of
each conflicted file is enhanced with three-way diff or merge markers.
Continuing from where the example left off, the resulting conflicted
file now looks like this:

    $ cat hello
    hello
    <<<<<<< HEAD:hello
    worlds
    =======
    world
    >>>>>>> 6ab5ed10d942878015e38e4bab333daff614b46e:hello
    Yay!

The merge markers delineate the two possible versions of
the conflicting chunk of the file. In the first version, the chunk says
worlds; in the other version, it says
world. You could simply choose one phrase or the other,
remove the conflict markers, and then run git add and git commit, but
let’s explore some of the other features Git offers to help resolve
conflicts.

Tip

The three-way merge marker lines (<<<<<<<, =======, and >>>>>>>) are
automatically generated, but they’re just meant to be read by you, not
(necessarily) a program. You should delete them with your text editor once you resolve the
conflict.

git diff with conflicts

Git has a special, merge-specific variant of git diff to display the changes made against
both parents simultaneously. In the example, it
looks like this:

    $ git diff
    diff --cc hello
    index e63164d,562080a..0000000
    --- a/hello
    +++ b/hello
    @@@ -1,3 -1,3 +1,7 @@@
      hello
    ++<<<<<<< HEAD:hello
     +worlds
    ++=======
    + world
    ++>>>>>>> alt:hello
      Yay!

What does it all mean? It’s the simple combination of two diffs:
one versus the first parent, called HEAD, and one against the second parent, or
alt. (Don’t be surprised if the
second parent is an absolute SHA1 name representing some unnamed
commit from some other repository!) To make things easier, Git also
gives the second parent the special name MERGE_HEAD.

You can compare both the HEAD
and MERGE_HEAD versions against the
working directory (merged) version:

    $ git diff HEAD
    diff --git a/hello b/hello
    index e63164d..4e4bc4e 100644
    --- a/hello
    +++ b/hello
    @@ -1,3 +1,7 @@
     hello
    +<<<<<<< HEAD:hello
     worlds
    +=======
    +world
    +>>>>>>> alt:hello
     Yay!

And then this:

    $ git diff MERGE_HEAD
    diff --git a/hello b/hello
    index 562080a..4e4bc4e 100644
    --- a/hello
    +++ b/hello
    @@ -1,3 +1,7 @@
     hello
    +<<<<<<< HEAD:hello
    +worlds
    +=======
     world
    +>>>>>>> alt:hello
     Yay!

Tip

In newer versions of Git, git diff --ours is a synonym for
git diff HEAD, because it shows the differences between
our version and the merged version. Similarly,
git diff MERGE_HEAD can be
written as git diff --theirs. You
can use git diff --base to see
the combined set of changes since the merge base, which would
otherwise be rather awkwardly written as:

    $ git diff $(git merge-base HEAD MERGE_HEAD)

If you line up the two diffs side by side, all the text except
the + columns is the same, so Git
prints the main text only once and prints the + columns next to each other.

The conflict found by git diff
has two columns of information prepended to each line
of output. A plus sign in a column indicates a line addition, a minus
sign indicates a line removal, and a blank indicates a line with no
change. The first column shows what’s changing versus your version,
and the second column shows what’s changing versus the other version.
The conflict marker lines are new in both versions, so they get a
++. The world and worlds lines are new only in one version or
the other, so they have just a single + in the corresponding column.

Suppose you edit the file to pick a third option, like
this:

    $ cat hello
    hello
    worldly ones
    Yay!

Then the new git diff output
is

    $ git diff
    diff --cc hello
    index e63164d,562080a..0000000
    --- a/hello
    +++ b/hello
    @@@ -1,3 -1,3 +1,3 @@@
      hello
    - worlds
     -world
    ++worldly ones
      Yay!

Alternatively, you could choose one or the other original
version, like this:

    $ cat hello
    hello
    world
    Yay!

The git diff output would
then be:

    $ git diff
    diff --cc hello
    index e63164d,562080a..0000000
    --- a/hello
    +++ b/hello

Wait! Something strange happened there. Where does it show where
the world line was added to the
base version? Where does it show that the worlds line was removed from the HEAD version? As you have resolved the
conflict in favor of the MERGE_HEAD
version, Git deliberately omits the diff because it thinks you
probably don’t care about that section anymore.

Running git diff on a
conflicted file only shows you the sections that really have a
conflict. In a large file with numerous changes scattered throughout,
most of those changes don’t have a conflict; either one side of the
merge changed a particular section or the other side did. When you’re
trying to resolve a conflict, you rarely care about those sections, so
git diff trims out uninteresting
sections using a simple heuristic: if a section has changes versus
only one side, that section isn’t shown.

This optimization has a slightly confusing side effect: once you
resolve something that used to be a conflict by
simply picking one side or the other, it stops showing up. That’s
because you modified the section so that it only changes one side or
the other (i.e., the side that you didn’t choose), so to Git it looks
just like a section that was never conflicted at all.

This is really more a side effect of the implementation than an
intentional feature, but you might consider it useful anyway: git diff shows you only those sections of
the file that are still conflicted, so you can
use it to keep track of the conflicts you haven’t fixed yet.

git log with conflicts

While you’re in the process of resolving a conflict,
you can use some special git log
options to help you figure out exactly where the changes came from and
why. Try this:

    $ git log --merge --left-right -p

    commit <eddcb7dfe63258ae4695eb38d2bc22e726791227
    Author: Jon Loeliger <jdl@example.com>
    Date:   Wed Oct 22 21:29:08 2008 -0500

        All worlds

    diff --git a/hello b/hello
    index ce01362..e63164d 100644
    --- a/hello
    +++ b/hello
    @@ -1 +1,3 @@
     hello
    +worlds
    +Yay!

    commit >d03e77f7183cde5659bbaeef4cb51281a9ecfc79
    Author: Jon Loeliger <jdl@example.com>
    Date:   Wed Oct 22 21:27:38 2008 -0500

        One world

    diff --git a/hello b/hello
    index ce01362..562080a 100644
    --- a/hello
    +++ b/hello
    @@ -1 +1,3 @@
     hello
    +world
    +Yay!

This command shows all the commits in both parts of the history
that affect conflicted files in your merge, along with the actual
changes each commit introduced. If you wondered when, why, how, and by
whom the line worlds came to be
added to the file, you can see exactly which set of changes introduced
it.

The options provided to git log are as follows:

  • --merge shows only commits related to files
    that produced a conflict

  • --left-right displays < if the commit was from the
    left side of the merge (our version,
    the one you started with), or > if the commit was from the
    right side of the merge (their
    version, the one you’re merging in)

  • -p shows the commit message and the patch
    associated with each commit

If your repository were more complicated and several files had
conflicts, you could also provide the exact filename(s) you’re
interested in as a command line option, like this:

    $ git log --merge --left-right -p hello

The examples here have been kept small for demonstration
purposes. Of course, real-life situations are likely to be
significantly larger and more complex. One technique to mitigate the
pain of large merges with nasty, extended conflicts is to use several
small commits with well-defined effects contained to individual
concepts. Git handles small commits well, so there is no need to wait
until the last minute to commit large, widespread changes. Smaller
commits and more frequent merge cycles reduce the pain of conflict
resolution.

How Git Keeps Track of Conflicts

How exactly does Git keep track of all the information
about a conflicted merge? There are several parts:

  • .git/MERGE_HEAD
    contains the SHA1 of the commit you’re merging in. You don’t really
    have to use the SHA1 yourself; Git knows to look in that file
    whenever you talk about MERGE_HEAD.

  • .git/MERGE_MSG
    contains the default merge message used when you git commit after resolving the
    conflicts.

  • The Git index contains three copies of each
    conflicted file: the merge base, our version, and
    their version. These three copies are assigned
    respective stage numbers 1, 2, and
    3.

  • The conflicted version (merge markers and all) is
    not stored in the index. Instead, it is stored
    in a file in your working directory. When you run git diff without any parameters, the
    comparison is always between what’s in the index with what’s in your
    working directory.

To see how the index entries are stored, you can use the
git ls-files plumbing command as
follows:

    $ git ls-files -s
    100644 ce013625030ba8dba906f756967f9e9ca394464a 1       hello
    100644 e63164d9518b1e6caf28f455ac86c8246f78ab70 2       hello
    100644 562080a4c6518e1bf67a9f58a32a67bff72d4f00 3       hello

The -s option to git ls-files shows all the files with
all stages. If you want to see only the conflicted
files, use the -u option instead.

In other words, the hello file is stored three times, and each
has a different hash corresponding to the three different versions. You
can look at a specific variant by using git cat-file:

    $ git cat-file -p e63164d951
    hello
    worlds
    Yay!

You can also use some special syntax with git diff to compare different versions of the
file. For example, if you want to see what changed between the merge
base and the version you’re merging in, you can do this:

    $ git diff :1:hello :3:hello
    diff --git a/:1:hello b/:3:hello
    index ce01362..562080a 100644
    --- a/:1:hello
    +++ b/:3:hello
    @@ -1 +1,3 @@
     hello
    +world
    +Yay!
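The same :N:path syntax also works with git show, which gives a way to print each staged copy without looking up its hash first. The sketch below rebuilds the hello conflict in a temporary directory and reads all three stages. The repository location, identity, and variable names are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: read the merge base, our version, and their version of a
# conflicted file straight from index stages 1, 2, and 3.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git config user.email "you@example.com"   # hypothetical identity
git config user.name "Example User"

echo hello > hello
git add hello
git commit -q -m "Initial hello file"

git checkout -q -b alt
printf 'world\nYay!\n' >> hello
git commit -q -a -m "One world"

git checkout -q master
printf 'worlds\nYay!\n' >> hello
git commit -q -a -m "All worlds"

git merge alt >/dev/null 2>&1 || true     # expected to conflict

base=$(git show :1:hello)     # stage 1: merge base
ours=$(git show :2:hello)     # stage 2: our version (HEAD)
theirs=$(git show :3:hello)   # stage 3: their version (MERGE_HEAD)

printf 'base:\n%s\n\nours:\n%s\n\ntheirs:\n%s\n' "$base" "$ours" "$theirs"
```

These stage names exist only while the path is unmerged. Once the file is added to the index again, only stage 0 remains.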

Tip

Starting with Git version 1.6.1, the git checkout command accepts the
--ours or --theirs option as
shorthand for simply checking out (a file from) one side or the other
of a conflicted merge; your choice resolves the conflict. These two
options can only be used during a conflict resolution.

Using the stage numbers to name a version is different from
git diff --theirs, which shows the differences between their
version and the resulting, merged (or still conflicted) version in your working
directory. The merged version is not yet in the index, so it doesn’t
even have a number.

Because you fully edited and resolved the working copy version in
favor of their version, there should be no difference now:

    $ cat hello
    hello
    world
    Yay!

    $ git diff --theirs
    * Unmerged path hello

All that remains is an unmerged path reminder to add it to the
index.
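The same resolution can be reached without an editor by using the git checkout --theirs option mentioned in the earlier tip. The sketch below, run in an assumed throwaway repository with an example identity, shows that the working copy then matches their version, and that the path still counts as unmerged until git add is run.

```shell
#!/bin/sh
# Sketch: resolve a conflict in favor of "their" side with
# git checkout --theirs, then clear the conflict with git add.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git config user.email "you@example.com"   # hypothetical identity
git config user.name "Example User"

echo hello > hello
git add hello
git commit -q -m "Initial hello file"

git checkout -q -b alt
printf 'world\nYay!\n' >> hello
git commit -q -a -m "One world"

git checkout -q master
printf 'worlds\nYay!\n' >> hello
git commit -q -a -m "All worlds"

git merge alt >/dev/null 2>&1 || true     # expected to conflict

# Replace the conflicted working copy with the stage 3 (their) version.
git checkout -q --theirs -- hello
resolved=$(cat hello)

# The index still holds stages 1, 2, and 3 for the path...
stages_before_add=$(git ls-files -u | wc -l | tr -d ' ')

# ...until git add collapses them into a single stage 0 entry.
git add hello
stages_after_add=$(git ls-files -u | wc -l | tr -d ' ')

echo "unmerged entries before add: $stages_before_add"
echo "unmerged entries after add:  $stages_after_add"
```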

Finishing Up a Conflict Resolution

Let’s make one last change to the hello file before declaring it merged:

    $ cat hello
    hello
    everyone
    Yay!

Now that the file is fully merged and resolved, git add reduces the index to just a single
copy of the hello file
again:

    $ git add hello
    $ git ls-files -s
    100644 ebc56522386c504db37db907882c9dbd0d05a0f0 0       hello

That lone 0
between the SHA1 and the path name tells you that the stage number for a
nonconflicted file is zero.

You must work through all the conflicted files as recorded
in the index. You cannot commit as long as there is an unresolved
conflict. Therefore, as you fix the conflicts in a file, run git add (or
git rm, git update-index, etc.)
on the file to clear its conflict status.

Warning

Be careful not to git add
files with lingering conflict markers. Although that will clear the
conflict in the index and allow you to commit, your file won’t be
correct.
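One way to guard against this mistake is to scan a file for leftover marker lines before adding it. The sketch below is a simple heuristic and not a Git feature. The helper name has_markers and the sample files are made up for illustration, and a file that legitimately contains a line of exactly seven equals signs would trigger a false alarm.

```shell
#!/bin/sh
# Sketch: detect lingering conflict markers before running git add.
set -e

# has_markers FILE (hypothetical helper): succeeds if FILE contains a
# line starting with a seven-character conflict marker that is followed
# by a space or the end of the line.
has_markers() {
    grep -Eq '^(<<<<<<<|=======|>>>>>>>)( |$)' "$1"
}

workdir=$(mktemp -d)

# A file as Git leaves it after a failed merge.
cat > "$workdir/conflicted" <<'EOF'
hello
<<<<<<< HEAD:hello
worlds
=======
world
>>>>>>> alt:hello
Yay!
EOF

# The same file after a complete hand resolution.
printf 'hello\neveryone\nYay!\n' > "$workdir/resolved"

if has_markers "$workdir/conflicted"; then conflicted_result=markers; else conflicted_result=clean; fi
if has_markers "$workdir/resolved"; then resolved_result=markers; else resolved_result=clean; fi

echo "conflicted: $conflicted_result"
echo "resolved:   $resolved_result"
```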

Finally, you can git commit the end result and use git show
to see the merge commit:

    $ cat .git/MERGE_MSG
    Merge branch 'alt'

    Conflicts:
            hello

    $ git commit

    $ git show

    commit a274b3003fc705ad22445308bdfb172ff583f8ad
    Merge: eddcb7d... d03e77f...
    Author: Jon Loeliger <jdl@example.com>
    Date:   Wed Oct 22 23:04:18 2008 -0500

        Merge branch 'alt'

        Conflicts:
            hello

    diff --cc hello
    index e63164d,562080a..ebc5652
    --- a/hello
    +++ b/hello
    @@@ -1,3 -1,3 +1,3 @@@
      hello
    - worlds
     -world
    ++everyone
      Yay!

You should notice three interesting things when you look at a
merge commit:

  • There is a new, second line in the header that says Merge:. Normally there’s no need to show
    the parent of a commit in git log
    or git show, since there is only
    one parent and it’s typically the one that comes right after it in
    the log. But merge commits typically have two (and sometimes more)
    parents, and those parents are important to understanding the merge.
    Hence, git log and git show always print the SHA1 of each
    parent of a merge commit.

  • The automatically generated commit log message helpfully notes
    the list of files that conflicted. This can be useful later if it
    turns out a particular problem was caused by your merge. Usually,
    problems caused by a merge are caused by the files that had to be
    merged by hand.

  • The diff of a merge commit is not a normal diff. It is
    always in the combined diff or
    conflicted merge format. A successful merge in Git is
    considered to be no change at all; it is simply the combination of
    other changes that already appeared in the history. Thus, showing
    the contents of a merge commit shows only the
    parts that are different from one of the merged branches, not the
    entire set of changes.

Aborting or Restarting a Merge

If you start a merge operation but then decide for some
reason that you don’t want to complete it, Git provides an easy way to
abort the operation. Prior to executing the final git commit on the merge commit, use:

    $ git reset --hard HEAD

This command restores both your working directory and the
index to the state immediately prior to the git merge command.

If you want to abort or discard the merge after it has finished
(that is, after it’s introduced a new merge commit), use the
command:

    $ git reset --hard ORIG_HEAD

Prior to beginning the merge operation, Git saves your original
branch HEAD in the ORIG_HEAD ref for just this sort of
purpose.

You should be very careful here, though. If you did not start the
merge with a clean working directory and index, you could get in trouble
and lose any uncommitted changes you have in your directory.

You can initiate a git merge
request with a dirty working directory, but if you execute git reset --hard then your dirty state prior
to the merge is not fully restored. Instead, the reset loses your dirty
state in the working directory area. In other words, you requested a --hard reset
to the HEAD state! (See Using git reset.)
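Git releases newer than the ones this chapter's output comes from also provide git merge --abort, which is not covered in the text above. It stops a merge that halted on conflicts and tries to restore the pre-merge state. The sketch below assumes such a version, a throwaway repository, and an example identity.

```shell
#!/bin/sh
# Sketch: abandon a conflicted merge with git merge --abort
# (assumes a Git version that supports this option).
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git config user.email "you@example.com"   # hypothetical identity
git config user.name "Example User"

echo hello > hello
git add hello
git commit -q -m "Initial hello file"

git checkout -q -b alt
printf 'world\nYay!\n' >> hello
git commit -q -a -m "One world"

git checkout -q master
printf 'worlds\nYay!\n' >> hello
git commit -q -a -m "All worlds"

head_before=$(git rev-parse HEAD)

git merge alt >/dev/null 2>&1 || true     # expected to conflict

# While the merge is in progress, .git/MERGE_HEAD exists.
if [ -f .git/MERGE_HEAD ]; then in_merge=yes; else in_merge=no; fi

git merge --abort

head_after=$(git rev-parse HEAD)
status_after=$(git status --porcelain)    # empty when everything is clean
if [ -f .git/MERGE_HEAD ]; then still_merging=yes; else still_merging=no; fi

echo "merge in progress before abort: $in_merge"
echo "merge in progress after abort:  $still_merging"
```

Because this example started from a clean working directory, the abort returns the repository to exactly the state it was in before the merge.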

Starting with Git version 1.6.1, you have another choice. If you
have botched a conflict resolution and want to return to the original
conflict state before trying to resolve it again, you can use the
command git checkout -m.
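As a sketch of that recovery path, the script below botches a resolution on purpose and then uses git checkout -m on the single conflicted path to bring back the original conflict markers. It assumes a throwaway repository and an example identity, and it assumes the file has not yet been passed to git add.

```shell
#!/bin/sh
# Sketch: restore the original conflicted file after a botched
# hand resolution, using git checkout -m on the path.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git config user.email "you@example.com"   # hypothetical identity
git config user.name "Example User"

echo hello > hello
git add hello
git commit -q -m "Initial hello file"

git checkout -q -b alt
printf 'world\nYay!\n' >> hello
git commit -q -a -m "One world"

git checkout -q master
printf 'worlds\nYay!\n' >> hello
git commit -q -a -m "All worlds"

git merge alt >/dev/null 2>&1 || true     # expected to conflict

# A botched resolution: the "Yay!" line was lost by accident.
printf 'hello\noops\n' > hello

# Re-create the conflicted merge result for this path from the
# three stages still held in the index.
git checkout -q -m -- hello

if grep -q '^<<<<<<<' hello; then markers_back=yes; else markers_back=no; fi
echo "conflict markers restored: $markers_back"
```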

Merge Strategies

So far, our examples have been easy to handle because there
are only two branches. It might seem like Git’s extra complexity of
DAG-shaped history and long, hard-to-remember commit IDs isn’t really worth it.
And maybe it isn’t for such a simple case. So, let’s look at something a
little more complicated.

Imagine that instead of just one person working in your
repository there are three. To keep things simple, suppose that each
developer—Alice, Bob, and Cal—is able to contribute changes as commits on
three separate eponymous branches within a shared repository.

Because the developers are all contributing to separate branches,
let’s leave it up to one person, Alice, to manage the integration of the
various contributions. In the meantime, each developer is allowed to
leverage the development of the others by directly incorporating or
merging a coworker’s branch, as needed.

Eventually, the coders develop a repository with a commit history as
shown in Figure 9-1.

Figure 9-1. Criss-cross merge setup

Imagine that Cal started the project and Alice joined in. Alice
worked on it for a while, then Bob joined in. In the meantime, Cal has
been working away on his own version.

Eventually, Alice merged in Bob’s changes, and Bob kept on working
without merging Alice’s changes back into his tree. There are now three
different branch histories (Figure 9-2).

Figure 9-2. After Alice merges in Bob

Let’s imagine that Bob wants to get Cal’s latest changes. The
diagram is looking pretty complicated now, but this part is still
relatively easy. Trace up the tree from Bob, through Alice, until you
reach the point where she first diverged from Cal. That’s A, the merge base between Bob and Cal. To merge
from Cal, Bob needs to take the set of changes between the merge base,
A, and Cal’s latest, Q, and three-way merge them into his own tree,
yielding commit K. The result is the
history shown in Figure 9-3.

Figure 9-3. After Bob merges in Cal

Tip

You can always find the merge base between
two or more branches by using git merge-base. It is possible for there
to be more than one equally valid merge base for a set of branches.
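For the simple case of two branches that diverged from one commit, a sketch like the following shows what git merge-base reports. The repository, branch names, and files are hypothetical and exist only for the demonstration.

```shell
set -e
cd "$(mktemp -d)"                          # scratch repository, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name used below
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "shared" > file.txt
git add file.txt
git commit -q -m "Initial commit"
fork_point=$(git rev-parse HEAD)           # both branches start here

git checkout -q -b alternate
echo "alternate" > alt.txt
git add alt.txt
git commit -q -m "Work on alternate"

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Work on master"

found=$(git merge-base master alternate)   # best common ancestor
echo "fork point: $fork_point"
echo "merge base: $found"
```

Both lines print the same commit ID, because the initial commit is the only point the two branches have in common.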

So far, so good.

Alice now decides that she, too, wants to get Cal’s latest changes,
but she doesn’t realize Bob has already merged Cal’s tree into his. So she
just merges Cal’s tree into hers. That’s another easy operation because
it’s obvious where she diverged from Cal. The resulting history is shown
in Figure 9-4.

Figure 9-4. After Alice merges in Cal

Next, Alice realizes that Bob has done some more work, L, and wants to merge from him again. What’s the
merge base (between L and E) this time?

Unfortunately, the answer is ambiguous. If you trace all the way
back up the tree, you might think the original revision from Cal is a good
choice. But that doesn’t really make sense: both Alice and Bob now have
Cal’s newest revision. If you ask for the differences
from Cal’s original revision to Bob’s latest then it will also include
Cal’s newer changes, which Alice already has, which is likely to result in
a merge conflict.

What if you use Cal’s latest revision as the base? It’s better, but
still not quite right: if you take the diff from Cal’s latest to Bob’s
latest, you get all Bob’s changes. But Alice already
has some of Bob’s changes, so you’ll probably get a
merge conflict there, too.

And what if you use the version that Alice last merged from Bob,
version J? Creating a diff from there
to Bob’s latest will include only the newest changes from Bob, which is
what you want. But it also includes the changes from Cal, which Alice
already has!

What to do?

This kind of situation is called a criss-cross merge
because changes have been merged back and forth between
branches. If changes moved in only one direction (e.g., from Cal to Alice
to Bob, but never from Bob to Alice or from Alice to Cal), then merging
would be simple. Unfortunately, life isn’t always that easy.

The Git developers originally wrote a straightforward mechanism to
join two branches with a merge commit, but scenarios like the one just
described soon led them to realize that a more clever approach was needed.
Hence, the developers generalized, parameterized, and introduced
alternate, configurable merge strategies to handle
different scenarios.

Let’s look at the various strategies and see how to apply each
one.

Degenerate Merges

There are two common degenerate scenarios in which a merge request
completes trivially: already up-to-date and
fast-forward. Because neither of these scenarios
actually introduces a new merge commit after performing the git merge,[19] some might consider them not to be true merge
strategies.

  • Already up-to-date. When all the
    commits from the other branch (its HEAD) are already present in your target
    branch, even if it has advanced on its own, the target branch is
    said to be already up-to-date. As a result, no new commits are added
    to your branch.

    For example, if you perform a merge and immediately follow it
    with the exact same merge request, then you will be told that your
    branch is already up-to-date.

        # Show that alternate is already merged into master
    
        $ git show-branch
        ! [alternate] Add alternate line 5 and 6
         * [master] Merge branch 'alternate'
        --
         - [master] Merge branch 'alternate'
        +* [alternate] Add alternate line 5 and 6
    
        # Try to merge alternate into master again
    
        $ git merge alternate
        Already up-to-date.

  • Fast-forward. A fast-forward merge
    happens when your branch HEAD is
    already fully present and represented in the other branch. This is
    the inverse of the Already up-to-date
    case.

    Because your HEAD is
    already present in the other branch (likely due to a common
    ancestor), Git simply tacks on to your HEAD the new commits from the other
    branch. Git then moves your branch HEAD to point to the final, new commit.
    Naturally, the index and your working directory are also adjusted
    accordingly to reflect the new, final commit state.

    The fast-forward case is particularly
    common on tracking branches because they simply fetch and record the
    remote commits from other repositories. Your local tracking branch
    HEADs will always be fully
    present and represented, because that is where the branch HEAD was after the
    previous fetch operation. See Chapter 12 for more details.
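Both degenerate cases can be seen in a short sketch, assuming a scratch repository with a hypothetical alternate branch. The first merge is a fast-forward; repeating it is the already up-to-date case. The --ff option is spelled out only to guard against a local configuration that disables fast-forwarding.

```shell
set -e
cd "$(mktemp -d)"                          # scratch repository, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name used below
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "line 1" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b alternate
echo "line 2" >> file.txt
git commit -q -am "Add line 2 on alternate"

git checkout -q master
git merge -q --ff alternate                # master has nothing new: fast-forward
ff_tip=$(git rev-parse master)
alt_tip=$(git rev-parse alternate)

git merge --ff alternate                   # nothing left to merge this time
total=$(git rev-list --count HEAD)         # ordinary commits only, no merge commit
echo "master=$ff_tip alternate=$alt_tip commits=$total"
```

After the fast-forward, master and alternate name the same commit, and the history still contains just the two ordinary commits.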

It is important for Git to handle these cases without introducing
actual commits. Imagine what would happen in the fast-forward case if
Git created a commit. Merging branch A into B
would first produce Figure 9-5. Then merging
B into A would produce Figure 9-6, and merging back
again would yield Figure 9-7.

Figure 9-5. First nonconverging merge
Figure 9-6. Second nonconverging merge
Figure 9-7. Third nonconverging merge

Each new merge is a new commit, so the sequence will never
converge on a steady state and reveal that the two branches are
identical.

Normal Merges

These merge strategies all produce a final commit, added
to your current branch, that represents the combined state of the
merge.

  • Resolve. The
    resolve strategy operates on only two
    branches, locating the common ancestor as the merge basis and
    performing a direct three-way merge by
    applying the changes from the merge base to the tip of the other
    branch HEAD onto the current
    branch. This method makes intuitive sense.

  • Recursive. The
    recursive strategy is similar to the
    resolve strategy in that it can only join two
    branches at once. However, it is designed to handle the scenario
    where there is more than one merge base between the two branches. In
    these cases, Git forms a temporary merge of all of the common merge
    bases and then uses that as the base from which
    to derive the resulting merge of the two given branches via a normal
    three-way merge algorithm.

    The temporary merge basis is thrown away, and the final merge
    state is committed on your target branch.

  • Octopus. The
    octopus strategy is specifically designed to
    merge together more than two branches simultaneously. Conceptually,
    it is fairly simple; internally, it calls the
    recursive merge strategy multiple times, once
    for each branch you are merging.

    However, this strategy cannot handle a merge that requires any
    form of conflict resolution that would necessitate user interaction.
    In such a case, you are forced to do a series of normal merges,
    resolving the conflicts one step at a time.

Recursive merges

A simple criss-cross merge example is shown in Figure 9-8.

Figure 9-8. Simple criss-cross merge

The nodes a and b are both merge bases for a merge between
A and B. Either one could be used as the merge
base and yield reasonable results. In this case, the recursive
strategy would merge a and b into a temporary merge base, using that as
the merge base for A and B.

Because a and b could have the same problem, merging them
could require another merge of still older commits. That is why this
algorithm is called recursive.
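The following sketch reproduces a criss-cross of this shape in a scratch repository. The branch names A and B and the shell variables a and b mirror the labels in the figure; the file names are made up. Passing --all to git merge-base asks for every merge base rather than just one.

```shell
set -e
cd "$(mktemp -d)"                          # scratch repository, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name used below
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b A master
echo "a" > a.txt
git add a.txt
git commit -q -m "Commit a"
a=$(git rev-parse HEAD)

git checkout -q -b B master
echo "b" > b.txt
git add b.txt
git commit -q -m "Commit b"
b=$(git rev-parse HEAD)

git checkout -q A
git merge -q --no-edit "$b"                # A now contains a and b
git checkout -q B
git merge -q --no-edit "$a"                # B now contains b and a

bases=$(git merge-base --all A B | sort)   # every merge base, not just one
expected=$(printf '%s\n%s\n' "$a" "$b" | sort)
echo "$bases"

git checkout -q A
git merge -q --no-edit B                   # the default strategy copes with both bases
```

The merge-base query prints two commit IDs, a and b, and the final merge still completes without any help.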

Octopus merges

The main reasons why Git supports merging multiple branches
together all at once are generality and design elegance. In Git, a
commit can have no parents (the initial commit), one parent (a normal
commit), or more than one parent (a merge commit). Once you have more
than one parent, there is no particular reason to limit that number to
only two, so Git data structures support multiple parents.[20] The octopus merge strategy is a natural consequence of
the general design decision to allow a flexible list of commit
parents.

Octopus merges look nice in diagrams, so Git users tend to use
them as often as possible. You can just imagine the rush of endorphins
a developer gets when merging six branches of a program into one.
Besides looking pretty, octopus merges don’t actually do anything
extra. You could just as easily make multiple merge commits, one per
branch, and accomplish exactly the same thing.
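As an illustration, this sketch merges three hypothetical branches (named after Alice, Bob, and Cal) into master in a single command and then counts the parents of the resulting commit. Each branch touches its own file, so no conflict resolution is needed.

```shell
set -e
cd "$(mktemp -d)"                          # scratch repository, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name used below
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

# One branch per developer, each adding a separate file.
for name in alice bob cal; do
  git checkout -q -b "$name" "$base"
  echo "work from $name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "Work from $name"
done

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "Work on master"

git merge -q --no-edit alice bob cal       # more than one branch: octopus
parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "merge commit has $parents parents"
```

The merge commit records four parents: the previous tip of master plus the three merged branches.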

Specialty Merges

There are two special merge strategies that you should be
aware of because they can sometimes help you solve strange problems.
Feel free to skip this section if you don’t have a strange problem. The
two special strategies are ours and
subtree.

These merge strategies each produce a final commit, added
to your current branch, that represents the combined state of the
merge.

  • Ours. The ours
    strategy merges in any number of other branches, but it actually
    discards changes from those branches and uses only the files from
    the current branch. The result of an ours
    merge is identical to the current HEAD, but any other named branches are
    also recorded as commit parents.

    This is useful if you know you already have all the changes
    from the other branches but want to combine the two histories
    anyway. That is, it lets you record that you have somehow performed
    the merge, perhaps directly by hand, and that future Git operations
    shouldn’t try to merge the histories again. Git can treat this as
    a real merge no matter how it came to be.

  • Subtree. The
    subtree strategy merges in another branch,
    but everything in that branch is merged into a particular subtree of
    the current tree. You don’t specify which subtree; Git determines
    that automatically.
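To see what the ours strategy records, consider this sketch in a scratch repository with a hypothetical branch named other. It compares the tree of the current branch before and after the merge and counts the parents of the new commit.

```shell
set -e
cd "$(mktemp -d)"                          # scratch repository, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name used below
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b other
echo "other work" > other.txt
git add other.txt
git commit -q -m "Work on other"

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "Work on master"

tree_before=$(git rev-parse 'HEAD^{tree}') # files on master before the merge
git merge -q --no-edit -s ours other
tree_after=$(git rev-parse 'HEAD^{tree}')  # files on master after the merge
parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents=$parents"
```

The two tree IDs are the same and other.txt never appears on master, yet the new commit has two parents, so Git considers other fully merged.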

Applying Merge Strategies

So how does Git know or determine which strategy to use?
Or, if you don’t like Git’s choice, how do you specify a different
one?

Git tries to keep the algorithms it uses as simple and
inexpensive as possible, so it first tries the already up-to-date and fast-forward cases to eliminate the trivial, easy
scenarios if possible.

If you specify more than one other branch to be merged into your
current branch, Git has no choice but to try the octopus strategy because that is the only one
capable of joining more than two branches in a single merge.

Failing those special cases, Git must use a default strategy that
works reliably in all other scenarios. Originally, resolve was the default merge strategy used by
Git.

In criss-cross merge situations such as those described
previously, where there is more than one possible merge basis, the
resolve strategy works like this: pick one of the
possible merge bases (either the last merge from Bob’s branch or the
last merge from Cal’s branch) and hope for the best. This is actually
not as bad as it sounds. It often turns out that Alice, Bob, and Cal
have all been working on different parts of the code. In that case, Git
detects that it’s remerging some changes that are already in place and
just skips duplicate changes, avoiding the conflict. Or, if there are
slight changes that do cause a conflict, at least the conflicts should
be fairly easy for a developer to handle.

Because resolve is no longer
Git’s default, if Alice wanted to use it then she would make an explicit
request:

    $ git merge -s resolve Bob

In 2005, Fredrik Kuivinen contributed the new
recursive merge strategy, which has since become
the default. It is more general than resolve and has been shown to result in fewer
conflicts, without fault, on the Linux kernel. It also handles merges
with renames quite well. (Git 2.34 and later default to ort, a
reimplementation of the recursive strategy.)

In the previous example, where Alice wants to merge all of Bob’s
work, the recursive strategy would
work like this:

  1. Start with the most recent revision from Cal that
    both Alice and Bob have. In this case, that’s
    Cal’s most recent revision, Q,
    which has been merged into both Bob’s and Alice’s branches.

  2. Calculate the diff between that revision and the most recent
    revision that Alice merged from Bob, and patch that in.

  3. Calculate the diff between that combined version and Bob’s
    latest version, and patch that in.

This method is called recursive because there may
be extra iterations, depending on how many levels of criss-crossing and
merge bases Git encounters. And it works. Not only does the recursive method make intuitive sense, it has
also been proven to result in fewer conflicts in real-life situations
than the simpler resolve strategy.
That’s why recursive is now the
default strategy for git merge.

Of course, no matter which strategy Alice chooses to use, the
final history looks the same (Figure 9-9).

Figure 9-9. Final criss-cross merge history

Using ours and subtree

You can use these two merge strategies together. For
example, once upon a time, the gitweb program (which is now part of
git) was developed outside the main
git.git repository. But at revision
0a8f4f, its entire history was
merged into git.git under the
gitweb subtree. If you wanted to do
something similar, you could proceed as follows:

  1. Copy the current files from the gitweb.git project into the gitweb subdirectory of your
    project.

  2. Commit them as usual.

  3. Pull from the gitweb.git
    project using the ours
    strategy:

        $ git pull -s ours gitweb.git master

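    You use ours here because
    you know that you already have the latest version of the files and
    you have already put them exactly where you want them (which is
    not where the normal recursive strategy would
    have put them). With Git 2.9 or later, this first pull joins two
    histories that share no commits, so it also needs the
    --allow-unrelated-histories option.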

  4. In the future, you can continue to pull the latest changes
    from the gitweb.git project
    using the subtree
    strategy:

        $ git pull -s subtree gitweb.git master

    Because the files already exist in your repository, Git
    knows automatically which subtree you put them in and performs the
    updates without any conflicts.
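The whole sequence can be tried out with two local scratch repositories. In this sketch, upstream stands in for gitweb.git and project for your own repository; both names, the single gitweb.cgi file, and its contents are assumptions for the demonstration. It uses git fetch followed by git merge rather than git pull so that each step is explicit, and it passes --allow-unrelated-histories, which Git 2.9 and later require when the two histories share no commits.

```shell
set -e
cd "$(mktemp -d)"                          # scratch area, safe to delete

# A stand-in for the external gitweb.git project (hypothetical contents).
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "gitweb v1" > gitweb.cgi
git add gitweb.cgi
git commit -q -m "gitweb: initial version"
cd ..

# The main project, with its own unrelated history.
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "main project" > README
git add README
git commit -q -m "project: initial commit"

# Steps 1 and 2: copy the files into a subdirectory and commit them.
mkdir gitweb
cp ../upstream/gitweb.cgi gitweb/
git add gitweb
git commit -q -m "Import gitweb files"

# Step 3: record the upstream history as merged, keeping our files as they are.
git fetch -q ../upstream master
git merge -q --no-edit -s ours --allow-unrelated-histories FETCH_HEAD

# Upstream moves on.
(cd ../upstream && echo "gitweb v2" > gitweb.cgi && git commit -q -am "gitweb: second version")

# Step 4: bring in the new upstream work; Git shifts it into gitweb/.
git fetch -q ../upstream master
git merge -q --no-edit -s subtree FETCH_HEAD
updated=$(cat gitweb/gitweb.cgi)
echo "gitweb/gitweb.cgi now contains: $updated"
```

The upstream change lands in gitweb/gitweb.cgi, and no stray gitweb.cgi appears at the top level of the project.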

Merge Drivers

Each of the merge strategies described in this chapter
uses an underlying merge driver to resolve and
merge each individual file. A merge driver accepts the names of three
temporary files that represent the common ancestor, the target branch
version, and the other branch version of a file. The driver modifies the
target branch version to have the merged result.

The text merge driver leaves the
usual three-way merge markers (<<<<<<<, =======, and >>>>>>>).

The binary merge driver simply
keeps the target branch version of the file and leaves the file marked
as a conflict in the index. Effectively, that forces you to handle
binary files by hand.

The final built-in merge driver,
union, simply leaves all the lines from both
versions in the merged file.

Through Git’s attribute mechanism, Git can tie specific files or
file patterns to specific merge drivers. Most text files are handled by
the text driver and most binary files
by the binary driver. Yet, for
special needs that warrant an application-specific merge operation, you
can create and specify your own custom merge driver and tie it to your
specific files.
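As a small sketch of the attribute mechanism, the script below ties a hypothetical notes.txt file to the built-in union driver through a .gitattributes entry. Both branches then append a different line at the same place, which would normally conflict under the text driver.

```shell
set -e
cd "$(mktemp -d)"                          # scratch repository, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name used below
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "first entry" > notes.txt
echo "notes.txt merge=union" > .gitattributes   # select the union driver
git add notes.txt .gitattributes
git commit -q -m "Initial commit"

git checkout -q -b alternate
echo "entry from alternate" >> notes.txt
git commit -q -am "Alternate adds an entry"

git checkout -q master
echo "entry from master" >> notes.txt
git commit -q -am "Master adds an entry"

git merge -q --no-edit alternate           # completes without stopping
if grep -q '^<<<<<<<' notes.txt; then conflict=yes; else conflict=no; fi
cat notes.txt
```

The merged notes.txt keeps the lines from both sides and contains no conflict markers. This suits append-only files such as logs, but it is rarely appropriate for source code.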

Tip

If you think you need custom merge drivers, you may want to
investigate custom diff drivers as well!

How Git Thinks About Merges

At first, Git’s automatic merging support seems nothing short of
magical, especially compared to the more complicated and error-prone
merging steps needed in other VCSs.

Let’s take a look at what’s going on behind the scenes to make it
all possible.

Merges and Git’s Object Model

In most VCSs, each commit has only one parent. On such a
system, when you merge some_branch
into my_branch, you create a new
commit on my_branch with the changes
from some_branch. Conversely, if you
merge my_branch into some_branch then this creates a new commit on
some_branch containing the changes
from my_branch. Merging branch A into
branch B and merging branch B into branch A are two different
operations.

However, the Git designers noticed that each of these two
operations results in the same set of files when you’re done. The
natural way to express either operation is simply to say, “Merge
all the changes from some_branch and
another_branch into a single
branch.”

In Git, the merge yields a new tree object with the merged files,
but it also introduces a new commit object on only the target branch.
After these commands:

    $ git checkout my_branch
    $ git merge some_branch

the object model looks like Figure 9-10.

Figure 9-10. Object model after a merge

In Figure 9-10, each
Cx is a commit object and each Tx represents the
corresponding tree object. Notice how there is one common merged commit
(CZC) that has both CC and CZ
as commit parents, but it has only one resulting set of files
represented in the TZC tree. The
merged tree object symmetrically represents both source branches
equally. But because my_branch was
the checked out branch into which the merge happened, only my_branch has been updated to show the new
commit on it; some_branch remains
where it was.
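You can inspect this structure directly. The sketch below, which assumes a scratch repository with the same two branch names and invented file contents, performs the merge and then prints the raw merge commit with git cat-file.

```shell
set -e
cd "$(mktemp -d)"                          # scratch repository, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/my_branch # pin the branch name used below
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b some_branch
echo "theirs" > theirs.txt
git add theirs.txt
git commit -q -m "Work on some_branch"

git checkout -q my_branch
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "Work on my_branch"

mine=$(git rev-parse my_branch)            # tips before the merge
theirs=$(git rev-parse some_branch)

git merge -q --no-edit some_branch
git cat-file -p HEAD                       # one tree line, two parent lines

first=$(git rev-parse 'HEAD^1')            # first parent: the old my_branch tip
second=$(git rev-parse 'HEAD^2')           # second parent: the some_branch tip
trees=$(git cat-file -p HEAD | grep -c '^tree ')
theirs_after=$(git rev-parse some_branch)  # some_branch has not moved
```

The commit names a single tree and two parents, and only my_branch has advanced to it.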

This is not just a matter of semantics. It reflects Git’s
underlying philosophy that all branches are created equal.[21]

Squash Merges

Suppose some_branch had
contained not just one new commit, but instead, 5 or 10 or even hundreds
of commits. In most systems, merging some_branch into my_branch would involve producing a single
diff, applying it as a single patch onto my_branch, and creating one new element in the
history. This is called a squash commit because
it squashes all the individual commits into one big
change. As far as the history of my_branch is concerned, the history of
some_branch would be lost.

In Git, the two branches are treated as equal, so it’s improper to
squash one side or the other. Instead, the entire history of commits on
both sides is retained. As users, you can see from Figure 9-10 that you pay for this
complexity. If Git had made a squash commit, you wouldn’t have to see
(or think about) a diagram that diverges and then rejoins again. The
history of my_branch could have been
just a straight line.

Note

Git can make squash commits if desired. Just give the
--squash option to git merge or git pull.
Beware, however! Squashing commits will upset Git’s history, and that
will complicate future merges because the squashed commits alter the
history of commits (see Chapter 10).
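A brief sketch of a squash merge, again in a scratch repository with made-up file contents: some_branch carries three commits, and my_branch receives them as one ordinary commit. Note that git merge --squash only stages the result; you make the commit yourself.

```shell
set -e
cd "$(mktemp -d)"                          # scratch repository, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/my_branch # pin the branch name used below
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b some_branch
for n in 1 2 3; do
  echo "feature step $n" >> feature.txt
  git add feature.txt
  git commit -q -m "Feature step $n"
done

git checkout -q my_branch
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "Work on my_branch"

git merge -q --squash some_branch          # stages the combined change, no commit yet
git commit -q -m "Squash in some_branch"

parents=$(git cat-file -p HEAD | grep -c '^parent ')
steps=$(grep -c 'feature step' feature.txt)
# Git has no record that some_branch was merged.
if git merge-base --is-ancestor some_branch HEAD; then linked=yes; else linked=no; fi
echo "parents=$parents steps=$steps linked=$linked"
```

The new commit has a single parent and contains all three feature steps, but some_branch is not an ancestor of my_branch, which is why a later merge of the same branch has to work through those changes again.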

The added complexity might appear unfortunate, but it is
actually quite worthwhile. For example, this feature means that the
git blame and git bisect commands, discussed in Chapter 6, are much more powerful than equivalents in
other systems. And as you saw with the recursive merge strategy, Git is able to
automate very complicated merges as a result of this added complexity
and the resulting detailed history.

Tip

Although the merge operation itself treats both parents
as equal, you can choose to treat the first parent as special when you
go back through the history later. Some commands (e.g., git log and gitk) support the
--first-parent option, which follows only the first
parent of every merge. The resulting history looks much the same as if
you had used --squash on all your merges.
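The difference is easy to measure. In this sketch (a scratch repository with hypothetical contents), some_branch contributes two commits before being merged into my_branch, and the history is then counted with and without --first-parent.

```shell
set -e
cd "$(mktemp -d)"                          # scratch repository, safe to delete
git init -q
git symbolic-ref HEAD refs/heads/my_branch # pin the branch name used below
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b some_branch
for n in 1 2; do
  echo "feature step $n" >> feature.txt
  git add feature.txt
  git commit -q -m "Feature step $n"
done

git checkout -q my_branch
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "Work on my_branch"

git merge -q --no-edit some_branch

all=$(git rev-list --count HEAD)                      # every reachable commit
mainline=$(git rev-list --count --first-parent HEAD)  # first parents only
git log --oneline --first-parent
echo "all=$all mainline=$mainline"
```

The full history holds five commits, while the first-parent view shows three: the merge, the commit made on my_branch, and the initial commit. Unlike a real squash, the two feature commits are still there when you want them.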

Why Not Just Merge Each Change One by One?

You might ask: wouldn’t it be possible to have it both ways, a
simple, linear history with every individual commit represented? Git
could just take all the commits from some_branch and apply them, one by one, onto
my_branch. But that wouldn’t be the
same thing at all.

An important observation about Git’s commit histories is that each
revision in the history is real. (You can read more
about treating alternate histories as equal realities in Chapter 13.)

If you apply a sequence of someone else’s
patches on top of your version, you will create a series of entirely new
versions with the union of their changes and yours. Presumably, you will
test the final version as you always would. But what about all those
new, intermediate versions? In reality, those versions never existed:
nobody actually produced those commits, so nobody can say for sure
whether they ever worked.

Git keeps a detailed history so that you can later revisit what
your files were like at a particular moment in the past. If some of your
merged commits reflect file versions that never really existed, then
you’ve lost the reason for having a detailed history in the first
place!

This is why Git merges don’t work that way. If you were to ask,
“What was it like five minutes before I did the merge?”
then the answer would be ambiguous. Instead you must ask about either
my_branch or some_branch specifically, because both were
different five minutes ago and Git can give the true answer for each
one.

Even though you almost always want the standard history
merging behavior, Git can also apply a sequence of patches (see Chapter 14) as described here. This process is called
rebasing and is discussed in Chapter 10. The implications of changing commit
histories are discussed in Changing Public History of Chapter 13.


[19] Yes, you can force Git to create one anyway by using the
--no-ff option in the fast-forward case. However,
you should fully understand why you want to do so.

[20] That’s the “Zero, One, or Infinity Principle” at
work.

[21] And, by extension, so are all complete repository
clones.
