Git manages change. Given that intent, Git shares much with other
version control systems. Many tenets—the notion of a commit, the change log,
the repository—are the same, and workflow is conceptually similar among the
corpus of tools. However, Git offers many novelties, too. The notions and
practices of other version control systems may work differently in Git or
may not apply at all. Yet no matter what your experience, this book explains
how Git works and teaches mastery.
Let’s get started.
The Git Command Line
Git is simple to use. Just type git. Without any arguments, Git lists its
options and the most common subcommands.
$ git
git [--version] [--exec-path[=GIT_EXEC_PATH]]
[-p|--paginate|--no-pager] [--bare] [--git-dir=GIT_DIR]
[--work-tree=GIT_WORK_TREE] [--help] COMMAND [ARGS]
The most commonly used git commands are:
   add        Add file contents to the index
   bisect     Find the change that introduced a bug by binary search
   branch     List, create, or delete branches
   checkout   Checkout and switch to a branch
   clone      Clone a repository into a new directory
   commit     Record changes to the repository
   diff       Show changes between commits, the commit and working trees, etc
   fetch      Download objects and refs from another repository
   grep       Print lines matching a pattern
   init       Create an empty git repository or reinitialize an existing one
   log        Show commit logs
   merge      Join two or more development histories
   mv         Move or rename a file, a directory, or a symlink
   pull       Fetch from and merge with another repository or a local branch
   push       Update remote refs along with associated objects
   rebase     Forward-port local commits to the updated upstream head
   reset      Reset current HEAD to the specified state
   rm         Remove files from the working tree and from the index
   show       Show various types of objects
   status     Show the working tree status
   tag        Create, list, delete, or verify a tag object signed with GPG
For a complete (and somewhat daunting) list of git subcommands, type git help --all.
As you can see from the usage hint, a small handful of options apply
to git. Most options, shown as [ARGS] in the hint, apply to specific
subcommands.
For example, the option --version affects the git command and produces a version number.
$ git --version
git version 1.6.0
In contrast, --amend is an example of an option specific to the git subcommand commit.
$ git commit --amend
Some invocations require both forms of options. (Here, the extra
spaces in the command line merely serve to visually separate the
subcommand from the base command and are not required.)
$ git --git-dir=project.git    repack -d
For convenience, documentation for each git subcommand is available using
git help subcommand, git --help subcommand, or git subcommand --help.
Historically, Git was provided as a suite of many simple, distinct,
standalone commands developed according to the “Unix toolkit”
philosophy: build small, interoperable tools. Each command sported a
hyphenated name, such as git-commit and git-log. However, modern Git
installations no longer support the hyphenated command forms and instead
use a single git executable with a subcommand.
Note
You can visit http://www.kernel.org/pub/software/scm/git/docs/ to read
the complete Git documentation online.
Git commands understand both “short” and “long” options. For example,
the git commit command treats the following examples as equivalents.
$ git commit -m "Fixed a typo."
$ git commit --message="Fixed a typo."
The short form, -m, uses a single hyphen, whereas the long form,
--message, uses two. (This is consistent with the GNU long options
extension.) Some options exist only in one form.
Finally, you can separate options from a list of arguments via the
“bare double dash” convention. For instance, use the double
dash to contrast the control portion of the command line from a list of
operands, such as filenames.
$ git diff -w master origin -- tools/Makefile
You may need to use the double dash to separate and explicitly
identify filenames if they might otherwise be mistaken for another part of
the command. For example, if you happened to have both a file and a tag
named main.c, then you will get
different behavior:
# Checkout the tag named "main.c"
$ git checkout main.c
# Checkout the file named "main.c"
$ git checkout -- main.c
Quick Introduction to Using Git
To see git in action, let’s
create a new repository, add some content, and manage a few
revisions.
There are two fundamental techniques for establishing a Git
repository. You can either create it from scratch, populating it with an
existing body of work, or you can copy, or clone,
an existing repository. It’s simpler to start with an empty repository, so
let’s start there.
Creating an Initial Repository
To model a typical situation, let’s take the directory ~/public_html,
which holds your personal website, and place it in a Git repository.
If you don’t have content for your personal website in ~/public_html, create the directory and place some simple content in
a file called index.html:
$ mkdir ~/public_html
$ cd ~/public_html
$ echo 'My web site is alive!' > index.html
To turn ~/public_html or any
directory into a Git repository, run git
init:
$ git init
Initialized empty Git repository in .git/
Git doesn’t care whether you start with a completely empty
directory or if you start with a directory full of files. In either
case, the process of converting the directory into a Git repository is
the same.
To signify that your directory is a Git repository, the git init command creates a hidden directory,
called .git, at the top level of
your project. Whereas CVS and SVN place revision information in CVS and .svn subdirectories within each of your
project’s directories, Git places all its revision information in this
one, top-level .git directory. The
contents and purpose of the data files are discussed in more detail in
Inside the .git Directory of Chapter 4.
Everything in your ~/public_html directory remains untouched. Git considers it your project’s
working directory, or the directory where you
alter your files. In contrast, the repository hidden within .git is maintained by Git.
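If you want to confirm this for yourself, here is a small sketch that uses a scratch directory rather than your real ~/public_html. The temporary path is an assumption made purely for illustration.

```shell
site=$(mktemp -d)/public_html             # hypothetical scratch location
mkdir "$site"
cd "$site"
echo 'My web site is alive!' > index.html

before=$(cat index.html)
git init -q                               # -q suppresses the "Initialized ..." message

# The repository lives in the hidden .git directory at the top level ...
ls -d .git

# ... and the content of the working directory is unchanged.
after=$(cat index.html)
[ "$before" = "$after" ] && echo "index.html untouched"
```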
Adding a File to Your Repository
The command git init
creates a new Git repository. Initially, each Git repository is empty.
To manage content, you must explicitly deposit it in the repository.
Such a conscious step separates scratch files from important
files.
Use git add file to add file to the repository:
$ git add index.html
Tip
If you have a directory populated with several files, let Git
add all the files in the directory and all subdirectories with
git add . (the argument ., the single period or “dot”
in Unix parlance, is shorthand for the current directory).
After an add, Git knows that
the file, index.html, is to remain
in the repository. However, so far, Git has merely
staged the file, an interim step before
committal. Git separates the add and
commit steps to avoid volatility.
Imagine how disruptive, confusing, and time-consuming it would be to
update the repository each time you add, remove, or change a file.
Instead, multiple provisional and related steps, such as an add, can be
“batched,” keeping the repository in a stable, consistent
state.
Running git status reveals
this in-between state of index.html:
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
# (use "git rm --cached <file>..." to unstage)
#
# new file: index.html
The command reports that the new file index.html will be added to the repository
during the next commit.
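For a more compact view of the same in-between state, the sketch below compares the status before and after git add. It assumes a Git version recent enough to support git status --porcelain, an option that prints one short, script-friendly line per file, and it runs in a scratch repository.

```shell
repo=$(mktemp -d)                         # hypothetical scratch repository
cd "$repo"
git init -q
echo 'My web site is alive!' > index.html

# "??" marks a file that Git can see but is not tracking.
untracked=$(git status --porcelain)

git add index.html

# "A" in the first column marks a file staged for the next commit.
staged=$(git status --porcelain)

printf '%s\n%s\n' "$untracked" "$staged"
```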
In addition to actual changes to the directory and to file
contents, Git records several other pieces of metadata with each commit,
including a log message and the author of the change. A fully qualified
git commit command supplies a log message and an author:
$ git commit -m "Initial contents of public_html" \
--author="Jon Loeliger <jdl@example.com>"
Created initial commit 9da581d: Initial contents of public_html
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 index.html
You can provide a log message on the command line, but it’s more typical to create the
message during an interactive editor session. This gives you an
opportunity to compose a complete and detailed log message in your
favorite editor. To configure Git to open your favorite editor during a git
commit, set your GIT_EDITOR
environment variable.
# In tcsh
$ setenv GIT_EDITOR emacs
# In bash
$ export GIT_EDITOR=vim
After you commit the addition of the new file into the repository,
git status indicates that there are
no outstanding, staged changes to be committed.
$ git status
# On branch master
nothing to commit (working directory clean)
Git also takes the time to tell you that your working directory is clean, which
means the working directory has no unknown or modified files that differ
from what is in the repository.
Git tries hard to determine the author of each commit. If you haven’t set up your name and
email address in a way that Git can find them, you may encounter some
odd warnings.
But there is no need to have an existential crisis if you see a
cryptic error message like one of these:
You don't exist. Go away!
Your parents must have hated you!
Your sysadmin must hate you!
The error indicates that Git is unable to determine your real
name, likely due to a problem (existence, readability, length) with
your Unix “gecos” information. The problem can be fixed
by setting your name and email configuration information as described
in Configuring the Commit Author.
Configuring the Commit Author
Before making many commits to a repository, you should establish
some basic environment and
configuration options. At a bare minimum, Git must know your
name and email address. You may specify your identity on every commit command
line, as shown previously, but that is the hard way and quickly becomes
tedious.
Instead, save your identity in a configuration file using the
git config command.
$ git config user.name "Jon Loeliger"
$ git config user.email "jdl@example.com"
You can also tell Git your name and email address using the
GIT_AUTHOR_NAME and GIT_AUTHOR_EMAIL environment variables. If set,
these variables override all configuration settings.
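To see the override in action, here is a minimal sketch in a throwaway repository. The names "Config Name" and "Env Name", their email addresses, and the file file.txt are made up for illustration.

```shell
repo=$(mktemp -d)                # hypothetical scratch repository
cd "$repo"
git init -q

# Identity stored in the repository's configuration file.
git config user.name "Config Name"
git config user.email "config@example.com"

echo 'hello' > file.txt
git add file.txt

# For this one commit, the environment variables take over as the author.
GIT_AUTHOR_NAME="Env Name" GIT_AUTHOR_EMAIL="env@example.com" \
    git commit -q -m "Author taken from the environment"

# Print the author name and email recorded in the commit.
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```

The commit should list the environment's identity as its author, even though the configuration file names someone else.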
Making Another Commit
To show a few more features of Git, let’s make some modifications
and create a complex history of changes within the repository.
Let’s commit an alteration to the index.html file. Open the file, convert it to
HTML, and save the file.
$ cd ~/public_html
# edit the index.html file
$ cat index.html
<html>
<body>
My web site is alive!
</body>
</html>
$ git commit index.html
If you are already somewhat familiar with Git, you may be
tempted to think “Aha! You need to git add index.html before
you can commit that file!” But that isn’t true.
Because the file was already added to the repository (in Adding a File to Your Repository), there’s no need to tell the index about
the file; it already knows. Furthermore, file changes are captured when
directly committing a file named on the command line! Using a generic
git commit without naming the file
would not have worked in this case.
When your editor comes up, enter a commit log entry such as
“Convert to HTML” and exit the editor. There are now two
versions of index.html in the repository.
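If you would like to convince yourself of this behavior, the following sketch replays the two commits in a scratch repository. The identity "Example User" and the temporary directory are assumptions, and -m is used in place of an editor session so that the script can run unattended.

```shell
repo=$(mktemp -d)                         # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo 'My web site is alive!' > index.html
git add index.html
git commit -q -m "Initial contents of public_html"

# Modify the tracked file, but do not run "git add" again.
printf '<html>\n<body>\nMy web site is alive!\n</body>\n</html>\n' > index.html

# A generic "git commit" finds nothing staged and declines to commit.
if git commit -q -m "Convert to HTML" >/dev/null 2>&1; then
    generic=committed
else
    generic=refused
fi

# Naming the file on the command line commits its current content.
git commit -q -m "Convert to HTML" index.html

count=$(git rev-list HEAD | wc -l)        # number of commits so far
echo "generic commit: $generic, commits: $count"
```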
Viewing Your Commits
Once you have one or more commits in the repository, you can
inspect them in a variety of ways. Some Git commands show the sequence
of individual commits, others show the summary of an individual commit,
and still others show the full details of any commit in the
repository.
The command git log yields
a sequential history of the individual commits within the repository:
$ git log
commit ec232cddfb94e0dfd5b5855af8ded7f5eb5c90d6
Author: Jon Loeliger <jdl@example.com>
Date:   Wed Apr 2 16:47:42 2008 -0500

    Convert to HTML

commit 9da581d910c9c4ac93557ca4859e767f5caf5169
Author: Jon Loeliger <jdl@example.com>
Date:   Thu Mar 13 22:38:13 2008 -0500

    Initial contents of public_html
The entries are listed, in order, from most recent to
oldest[6] (the original file); each entry shows the commit author’s
name and email address, the date of the commit, the log message for the
change, and the internal identification number of the commit. The commit
ID number is explained in Content-Addressable Names of Chapter 4, and commits are discussed in Chapter 6.
To see more detail about a particular commit, use git show
with a commit number:
$ git show 9da581d910c9c4ac93557ca4859e767f5caf5169
commit 9da581d910c9c4ac93557ca4859e767f5caf5169
Author: Jon Loeliger <jdl@example.com>
Date:   Thu Mar 13 22:38:13 2008 -0500

    Initial contents of public_html

diff --git a/index.html b/index.html
new file mode 100644
index 0000000..34217e9
--- /dev/null
+++ b/index.html
@@ -0,0 +1 @@
+My web site is alive!
If you run git show
without an explicit commit number, it simply shows the details of the
most recent commit.
Another view, show-branch,
provides concise, one-line summaries for the current development branch:
$ git show-branch --more=10
[master] Convert to HTML
[master^] Initial contents of public_html
The option --more=10 reveals up to 10 additional versions, but only two
exist so far and so both are shown. (The default in this case would list
only the most recent commit.) The name master is the default branch name.
Branches are covered extensively in Chapter 7. Viewing Branches
describes the git show-branch command in more
detail.
Viewing Commit Differences
To see the differences between the two revisions of
index.html, recall both full commit
ID names and run git diff:
$ git diff 9da581d910c9c4ac93557ca4859e767f5caf5169 \
ec232cddfb94e0dfd5b5855af8ded7f5eb5c90d6
diff --git a/index.html b/index.html
index 34217e9..8638631 100644
--- a/index.html
+++ b/index.html
@@ -1 +1,5 @@
+<html>
+<body>
 My web site is alive!
+</body>
+</html>
This output should look familiar: It resembles what the diff program
produces. As is the convention, the first revision named,
9da581d910c9c4ac93557ca4859e767f5caf5169, is the earlier version of the
content and the second revision, named
ec232cddfb94e0dfd5b5855af8ded7f5eb5c90d6, is the newer one. Thus, a plus
sign (+) precedes each line of new content.
Scared yet? Don’t worry about those intimidating hex numbers.
Thankfully, Git provides many shorter, easier ways to run commands like
this without having to type out large, complicated numbers.
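As a small taste of those shorter forms, here is a sketch that rebuilds the two-commit history in a scratch repository and then names the commits without any long hex strings. The identity and temporary directory are placeholders. HEAD refers to the most recent commit on the current branch, HEAD^ to its parent, and a unique prefix of a commit ID can stand in for the whole ID.

```shell
repo=$(mktemp -d)                         # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo 'My web site is alive!' > index.html
git add index.html
git commit -q -m "Initial contents of public_html"
printf '<html>\n<body>\nMy web site is alive!\n</body>\n</html>\n' > index.html
git commit -q -m "Convert to HTML" index.html

# Same comparison as before, using relative names instead of full IDs.
git --no-pager diff HEAD^ HEAD

full=$(git rev-parse HEAD)                # full ID of the newest commit
short=$(git rev-parse --short HEAD)       # abbreviated, still unique, prefix
resolved=$(git rev-parse "$short")        # the prefix resolves to the same commit
parent=$(git rev-parse HEAD^)             # full ID of the first commit
echo "$short is short for $full"
```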
Removing and Renaming Files in Your Repository
Removing a file from a repository is analogous to adding a file
but uses git rm. Suppose you have the file poem.html in your website content and it’s no
longer needed.
$ cd ~/public_html
$ ls
index.html poem.html
$ git rm poem.html
rm 'poem.html'
$ git commit -m "Remove a poem"
Created commit 364a708: Remove a poem
0 files changed, 0 insertions(+), 0 deletions(-)
delete mode 100644 poem.html
As with an addition, a deletion requires two steps: git rm expresses
your intent to remove the file and stages the change, and then git commit
realizes the change in the repository. Again, you can omit the -m option
and type a log message such as “Remove a poem” interactively in your
favorite text editor.
You can rename a file indirectly by using a combination of git rm and git
add, or you can rename it more quickly and directly with
git mv. Here’s an example of the former:
$ mv foo.html bar.html
$ git rm foo.html
rm 'foo.html'
$ git add bar.html
In this sequence, you must execute mv foo.html bar.html at the outset
lest git rm permanently delete the foo.html file from the filesystem.
Here’s the same operation performed with git mv.
$ git mv foo.html bar.html
In either case, the staged changes must be committed
subsequently:
$ git commit -m "Moved foo to bar"
Created commit 8805821: Moved foo to bar
1 files changed, 0 insertions(+), 0 deletions(-)
rename foo.html => bar.html (100%)
Git handles file move operations differently than most similar
systems, employing a mechanism based on the similarity of the content
between two file versions. The specifics are described in Chapter 5.
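You can get a glimpse of that similarity mechanism before committing a rename. The sketch below uses a scratch repository with a placeholder identity and made-up file contents, and asks git diff to report staged changes with rename detection (-M) turned on.

```shell
repo=$(mktemp -d)                         # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

printf 'line one\nline two\n' > foo.html
git add foo.html
git commit -q -m "Add foo"

git mv foo.html bar.html

# Report staged changes by status letter; R plus a score marks a rename.
renamed=$(git diff --cached -M --name-status)
echo "$renamed"

git commit -q -m "Moved foo to bar"
```

The status letter R is followed by a similarity score. Because the content did not change at all here, Git should report the two files as 100 percent similar.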
Making a Copy of Your Repository
If you followed the previous steps and made an initial
repository in your ~/public_html
directory, then you can now create a complete copy, or
clone, of that repository using the git clone command. This is how people around
the world use Git to pursue pet projects on the same files and keep in
sync with other repositories.
For the purposes of this tutorial, let’s just make a copy in your
home directory and call it my_website:
$ cd ~
$ git clone public_html my_website
Although these two Git repositories now contain exactly the same
objects, files, and directories, there are some subtle differences. You
may want to explore those differences with commands such as:
$ ls -lsa public_html my_website
$ diff -r public_html my_website
On a local filesystem like this, using git clone to make a copy of a repository is
quite similar to cp -a or rsync. However, Git supports a richer set of
repository sources, including network names, for naming the repository
to be cloned. These forms and usage are explained in Chapter 12.
Once you clone a repository, you are able to modify the cloned
version, make new commits, inspect its logs and history, and so on. It
is a complete repository with full history.
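One of those subtle differences is easy to see: the clone remembers where it came from. The following sketch builds a tiny repository in a scratch directory, clones it, and compares the two. The temporary location and the identity are assumptions.

```shell
work=$(mktemp -d)                         # hypothetical scratch area
mkdir "$work/public_html"
cd "$work/public_html"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo 'My web site is alive!' > index.html
git add index.html
git commit -q -m "Initial contents of public_html"

cd "$work"
git clone -q public_html my_website

# Both repositories point at the very same commit ...
orig_head=$(cd public_html && git rev-parse HEAD)
clone_head=$(cd my_website && git rev-parse HEAD)

# ... but only the clone records an "origin" it was copied from.
origin_url=$(cd my_website && git config remote.origin.url)
echo "clone of: $origin_url"
```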
Configuration Files
Git’s configuration files are all simple text files in the style of .ini files. They record various choices and
settings used by many Git commands. Some settings represent purely
personal preferences (should a color.pager be used?); others are vital
to a repository functioning correctly (core.repositoryformatversion);
and still others tweak command behavior a bit (gc.auto).
Like many tools, Git supports a hierarchy of configuration files. In
decreasing precedence they
are:
- .git/config
  Repository-specific configuration settings manipulated with the
  --file option or by default. These settings have the highest
  precedence.
- ~/.gitconfig
  User-specific configuration settings manipulated with the
  --global option.
- /etc/gitconfig
  System-wide configuration settings manipulated with the --system
  option if you have proper Unix file write permissions on it.
  These settings have the lowest precedence. Depending on your actual
  installation, the system settings file might be somewhere else
  (perhaps in /usr/local/etc/gitconfig), or may be entirely absent.
For example, to establish an author name and email address that will
be used on all the commits you make for all of your repositories,
configure values for user.name and user.email in your $HOME/.gitconfig
file using git config --global:
$ git config --global user.name "Jon Loeliger"
$ git config --global user.email "jdl@example.com"
Or, to set a repository-specific name and email address that would
override a --global setting, simply omit the --global flag:
$ git config user.name "Jon Loeliger"
$ git config user.email "jdl@special-project.example.org"
Use git config -l to list the
settings of all the variables collectively found in the complete set of configuration files:
# Make a brand new empty repository
$ mkdir /tmp/new
$ cd /tmp/new
$ git init
# Set some config values
$ git config --global user.name "Jon Loeliger"
$ git config --global user.email "jdl@example.com"
$ git config user.email "jdl@special-project.example.org"
$ git config -l
user.name=Jon Loeliger
user.email=jdl@example.com
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true
user.email=jdl@special-project.example.org
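The listing shows both user.email values, so which one wins? Asking for a single variable returns the value Git will actually use. The sketch below assumes a throwaway HOME directory (so your real ~/.gitconfig is left alone) and a scratch repository, and both paths are made up for illustration.

```shell
tmp=$(mktemp -d)
export HOME="$tmp/home"                   # hypothetical throwaway home directory
export XDG_CONFIG_HOME="$HOME/.config"    # keep any XDG-style config out of the picture
mkdir -p "$HOME" "$tmp/new"
cd "$tmp/new"
git init -q

git config --global user.email "jdl@example.com"
git config user.email "jdl@special-project.example.org"

effective=$(git config user.email)            # value in effect for this repository
global_only=$(git config --global user.email) # value stored in ~/.gitconfig
echo "effective: $effective"
echo "global:    $global_only"
```

The repository-specific setting should be the effective one, matching the precedence order given earlier.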
Because the configuration files are simple text files, you can view
their contents with cat and edit them with your favorite text editor, too.
# Look at just the repository specific settings
$ cat .git/config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[user]
email = jdl@special-project.example.org
Oh, and, if you use a Pacific Northwest-based OS (that is, Microsoft Windows), you may see some differences here. Maybe something like
this:
[core]
repositoryformatversion = 0
filemode = true
bare = true
logallrefupdates = true
symlinks = false
ignorecase = true
hideDotFiles = dotGitOnly
Many of these differences allow for different file system
characteristics.
Use the --unset option to remove a setting:
$ git config --unset --global user.email
The behavior of the git config command changed between versions 1.6.2
and 1.6.3. Earlier versions required option --unset to follow option
--global; newer versions allow either order.
Multiple configuration options and environment variables frequently
exist for the same purpose. For example, Git chooses the editor used for
composing a commit log message by checking these sources, in order:
- GIT_EDITOR environment variable
- core.editor configuration option
- VISUAL environment variable
- EDITOR environment variable
- the vi command
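The first two steps of that search can be observed with git var GIT_EDITOR, which prints the editor Git would launch. This is only a sketch: env-editor and cfg-editor are made-up names, not real programs, and it assumes a Git version that supports the -c option for one-off configuration values.

```shell
repo=$(mktemp -d)                         # hypothetical scratch repository
cd "$repo"
git init -q

unset GIT_EDITOR                          # start without the environment variable

# With only core.editor supplied, the configuration option is used.
from_config=$(git -c core.editor=cfg-editor var GIT_EDITOR)

# When GIT_EDITOR is also set, it takes priority over core.editor.
from_env=$(GIT_EDITOR=env-editor git -c core.editor=cfg-editor var GIT_EDITOR)

echo "core.editor only:     $from_config"
echo "GIT_EDITOR also set:  $from_env"
```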
There are more than a few hundred configuration parameters. I’m not
going to bore you with them, but I will point out important ones as we go
along. A more extensive (yet still incomplete) list can be found on the
git config manual page.
Configuring an Alias
For starters, here is a tip for setting up command aliases.
If there is a common but complex Git command that you type frequently,
consider setting up a simple Git alias for it.
$ git config --global alias.show-graph \
'log --graph --abbrev-commit --pretty=oneline'
In this example, I’ve made up the show-graph alias and made it available for use
in any repository I make. Now when I use the command git show-graph, it is just as if I had typed
that long git log command with all those options.
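To check that the alias and the long command really are interchangeable, you might try a sketch like the following. It defines the alias in a scratch repository's own configuration (omitting --global so your personal settings stay untouched) and uses a placeholder identity.

```shell
repo=$(mktemp -d)                         # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo 'My web site is alive!' > index.html
git add index.html
git commit -q -m "Initial contents of public_html"

# Repository-local alias; add --global to make it available everywhere.
git config alias.show-graph 'log --graph --abbrev-commit --pretty=oneline'

via_alias=$(git show-graph)
direct=$(git log --graph --abbrev-commit --pretty=oneline)
echo "$via_alias"
```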
Inquiry
You will surely have a lot of unanswered questions about how Git
works, even after the actions performed so far. For instance, how does Git
store each version of a file? What really makes up a commit? Where did
those funny commit numbers come from? Why the name master? And is a
“branch” what I think it is? Good questions.
The next chapter defines some terminology, introduces some Git
concepts, and establishes a
foundation for the lessons found in the rest of the book.
[6] Strictly speaking, they are not in
chronological order but rather are a
topological sort of the commits.