Git – Git and GitHub

Although the other chapters of this book have focused on the
Git command line tool, the years since the 2005 inception of Git have
allowed and fostered the growth of a community of tools around it. Those
tools number in the hundreds and take on many forms, from desktop GUIs like
SmartGit to disk backup
tools like SparkleShare. But
out of this ocean of Git tooling, one stands at the forefront of the mind of
many developers and even nondevelopers: GitHub.

This website, shown in Figure 21-1,
introduced the phrase that seemed dismissible just a few years ago, but now
feels like the way many of us should consider working: social coding. This
model of social coding was first applied to open source, but the last two
years have seen this idea of code as a point of geographically distributed
collaboration grow even in the closed source enterprise. Let’s take a look
at what GitHub has to offer.

Figure 21-1. GitHub homepage

Repo for Public Code

Statistics indicate that the first interaction that many
developers have with Git is in cloning a repository from GitHub. This is
the original function of GitHub. It offers an interface to repositories
over the git://, https://, and git+ssh:// protocols. Accounts are free for open
source projects and all accounts can create unlimited publicly accessible
repositories. This has greatly fostered the adoption of Git within the
open source community for languages from JavaScript to
ClojureScript.

Creating an account begins with opening http://github.com in your web browser and clicking the Sign
Up link shown in Figure 21-2.

Figure 21-2. Choosing an account type

GitHub has four types of account and plan combinations: free
personal, paid personal, free organization and paid organization. A
personal account is a prerequisite for joining as an organization. Be
thoughtful in your choice of username because only one rename action is
allowed per account by default (Figure 21-3).
Multiple email addresses can be associated with a single username and can
be changed at any time. Thus, the username is the most permanent part of
the sign-up information.

Figure 21-3. Free personal account

At the conclusion (Figure 21-4) of creating a free personal
account, which is the most common type, users are directed to the GitHub
help pages, which offer tutorials on setting up a few necessary
configuration parameters of the developer desktop installation of
Git.

Figure 21-4. Account creation complete

Creating a GitHub Repository

New Repository Information

Once you’ve created an account, creating a repository
is as easy as clicking on the New Repository button on the top-most
toolbar, visible at all times once logged in, or by navigating
directly to the New Repository page by typing http://github.com/new.

The only required data is the name of the repository, but an
optional description of the project’s objectives and the URL of its
home page signal attentiveness of the maintainers (Figure 21-5).

Next, the repository must be given its initial content. There
are two distinct approaches
based on whether or not you have existing commits to
preserve.

README Seeding (Option 1)

If the first step in working with the project was to
create the GitHub repository before writing any code, you’ll want to
create a placeholder file as the first commit. During the new
repository creation on the GitHub site, you are presented with
choices to optionally seed the repository with an initial README file and a .gitignore file. Projects use
this README text file to
describe the intent of the project.

The project is then ready to be cloned with the command
git clone url, after which new code can
begin to be locally added and committed.
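
The clone-then-commit flow can be sketched as follows. This is a minimal illustration, not GitHub's own procedure: a local bare repository stands in for the README-seeded GitHub repository so that the script runs offline, and the identity, file names, and commit messages are hypothetical. With a real project, url would be the address shown on the repository's GitHub page.

```shell
#!/bin/sh
set -e
# Hypothetical identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)

# Assumption: build a bare repository holding one README commit to play
# the role of the freshly seeded GitHub repository.
git init -q "$work/seed"
cd "$work/seed"
echo "# demo project" > README
git add README
git commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$work/hosted.git"

# The step described in the text: clone, then add and commit locally.
url="$work/hosted.git"
git clone -q "$url" "$work/demo"
cd "$work/demo"
echo "first lines of code" > hello.txt
git add hello.txt
git commit -q -m "Add first file"

# Show the seeded commit followed by the new local commit.
git log --oneline
```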

Adding a Remote (Option 2)

If you already have a local Git repository with commits, you
can connect the GitHub address
to the existing local repo. You do this by adding the GitHub URL (a
Git remote) to an existing local Git repository with the
git remote add name url command, where name is the short name
for the remote, conventionally origin.

Pushing the Local Contents to GitHub

Once one of the two options has been followed to
connect the local repository to the remote repository, the contents
of the local repo can be pushed to GitHub. This is done with the
git push remote branch command. If the branch
has never been published before, the more specific invocation
git push -u origin master is
appropriate: origin names the remote to push to, master limits the
push to just that branch, and -u tells Git to record the pushed
branch as the upstream that the local branch tracks.
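
Option 2 followed by the first push might look like the sketch below. It rests on some assumptions: an empty local bare repository stands in for the newly created GitHub repository so that the script runs offline, and the identity, file name, and commit messages are hypothetical.

```shell
#!/bin/sh
set -e
# Hypothetical identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)

# Assumption: an empty bare repository stands in for the new GitHub
# repository; in practice url is the address GitHub displays.
git init -q --bare "$work/hosted.git"
url="$work/hosted.git"

# An existing local repository that already has a commit.
git init -q "$work/project"
cd "$work/project"
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "First local commit"
git branch -M master            # ensure the branch is named master

# Register the remote under the conventional name origin.
git remote add origin "$url"

# Publish master and record origin/master as its upstream.
git push -q -u origin master

# Later pushes need no arguments because the upstream is recorded.
echo "more" >> notes.txt
git commit -q -am "Second local commit"
git push -q
```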

Figure 21-5. Creating a public repo

Once the connection with the upstream (GitHub) server is established
with one of the previous techniques, further code changes can be easily
pushed with additional git push calls.
This exhibits a core benefit of a centrally accessible Git repository
host, even in a very distributed worker–focused tool such as Git: the
ability to see the changes that all members of the project have completed
and pushed (Figure 21-6), even if they
are offline.

Figure 21-6. Commit history on GitHub

Social Coding on Open Source

GitHub can be minimally thought of as a place to host open
source projects. However, the pedestrian concept of repository hosting
isn’t new and has been well pioneered by forges such as SourceForge and
Google Code, among many others, each with their own user interface
strengths. The extended idea of a forge with organizational policies,
licenses, and commit rights through
meritocracy was furthered by the Apache Software Foundation, Codehaus, and the Eclipse
Foundation.

But GitHub took a different approach to extending the mere concept
of a forge with community contributions (Figure 21-7). GitHub offers social aspects of the Web,
as seen on Twitter, Facebook, and other social networks, mapped onto the
previously considered solitary activity of programming. With the concepts
of watching other users that seem to be making
interesting contributions, repository forks to permit
anyone to copy a project, pull requests to signal
project owners that another programmer has an interesting set of code to
potentially merge in, and line-level comments on
commits to allow simple iterative refinement to contributions, GitHub has
made coding a social activity. In the process, a vast amount of open
source has benefited from a much wider range of contributors than were
afforded in the days of patch files attached to bug tickets.

Figure 21-7. Social coding

Watchers

The simplest of the social coding features to be found on
GitHub is watching, which begins with the press of
the Watch button as shown in Figure 21-8.
Watching, a concept similar to that of Twitter followers or Facebook
friends, signals interest in a GitHub user, organization, or particular
project.

Figure 21-8. Watch button

The watcher count can often be a signal of the usefulness of an open
source project. The GitHub explore page allows for site-wide searching or
browsing based on repository watcher count (Figure 21-9). When combined with a particular
programming language search, watcher count data can yield useful, publicly
available code examples.

Figure 21-9. Explore and search watcher count

News Feed

Besides the technical high-five that watching a user, organization,
or repository provides, it also shapes the content of your personalized
news feed as shown in Figure 21-10. This news feed
reports the interesting activities of the users, repositories, and
organizations that you’re watching.

News feeds are offered both as a web page that you can view
on the GitHub.com site and as an
RSS feed for consumption in the reader application of your
choosing.

Forks

The next idea that GitHub popularized, so much so that the
phrase has spread to other domains, is personal forks of projects (Figure
21-11). The term forking has commonly carried a negative connotation. In
the coding landscape of yesteryear, forking often meant an aggressive
parting of ways with the primary copy of the project with the intent of
taking the program in a different direction.

Figure 21-10. News feed
Figure 21-11. Fork button

GitHub’s idea of forking is a positive one that enables a greater
number of contributors to make a greater number of contributions in a
controlled and highly visible way. Forking is the democratic ability of
any potential contributor to get a personal copy of a
project’s code. This personal copy (a fork in GitHub parlance) can then be
changed at will without any explicit permission from the original author.
This does not pose any risk to the core project because the changes are
happening in the forked repository, not the original repository.

This is the inverse of the protected core concept of an
Apache or Eclipse project, where patches are submitted as file attachments
to bug reports. A primary benefit of this model is the transparency and
public visibility of the community contributions (Figure 21-12), even before they are submitted back to
the core project for discussion and potential incorporation.

The network graph, shown in Figure 21-12, displays the relation of the core
project’s branches and commits to those of other branches and commits,
including repository forks. This provides a high level overview of the
community’s activity on this project and whether a given fork is diverging
significantly from the core project. This allows for a thoughtful review
of the divergent community contributions and if they are beneficial to
merge back into the core, even in the absence of a pull request.

Several years of observing community behavior with the concept of
forks show that a greater number of fringe users of a project decide to
actually submit fixes and small improvements because the ceremony of doing
so is extremely low. Many consumers of open source who have worked in
both the old patches-attached-to-bugs model and the new
fork-and-pull-request approach say that the barrier to making contributions in the
old model was the disproportionate time required to prepare the patch
compared to the time it took to make the actual fix to the code.

Figure 21-12. Network graph

Creating Pull Requests

Forking is the enabling step of creating a personal copy of
a project, but the real value for the core project lies in the second
action, formally called a pull request. Pull requests allow any user with
a commit that she feels makes a useful contribution to the project to
announce that contribution to the core project owners.

Once a contributor has finished coding a feature, committed that new
code to a well-named branch, and pushed that new branch to a fork, it can
be turned into a pull request. A pull request can be accurately but
minimally described as a list of topically focused commits.
Pull requests are most commonly based on the entire
contents of a topic branch, but can be adjusted to a more narrow range of
commits when less than the full branch is ready for offering as a
contribution to the release branch. When the newly pushed branch has been
selected from the branch selector drop-down control, the context-sensitive
Pull Request button, shown in Figure 21-13, is
pressed to initiate the assembly of the pull request announcement.
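
The command-line half of that sequence (a well-named branch, a commit, a push to the fork) can be sketched as follows. The assumptions are that a local bare repository stands in for your fork on GitHub so that the script runs offline, and that the branch name, file, and identity are hypothetical. The pull request itself is still opened on the GitHub site.

```shell
#!/bin/sh
set -e
# Hypothetical identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)

# Assumption: a bare repository with one commit stands in for the fork.
git init -q "$work/seed"
cd "$work/seed"
echo "# demo project" > README
git add README
git commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$work/fork.git"

# Clone the fork and do the work on a well-named topic branch.
git clone -q "$work/fork.git" "$work/clone"
cd "$work/clone"
git checkout -q -b fix-error-message-typo
echo "Corrected message" > messages.txt
git add messages.txt
git commit -q -m "Fix spelling in error message"

# Publish the branch to the fork so it appears in the branch selector.
git push -q -u origin fix-error-message-typo

# The commits a pull request would offer by default: everything on the
# topic branch that the fork's default branch lacks.
git log --oneline origin/HEAD..fix-error-message-typo
```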

Figure 21-13. Pull request button

The default behavior of a pull request is to include all of the
commits on the current topic branch. However, in cases that call for it, a
specific range of commits, as well as the source and target branch, can be
manually altered as shown in Figure 21-14.

Figure 21-14. Pull request ranges

With the pull request now created, it is up to the core project
owners to review, evaluate, comment on, and potentially merge in these
changes. Conceptually, this is often compared to the code review process
of Crucible and Gerrit. However, it is GitHub’s opinion that this process
works well—it strikes the perfect balance of being lightweight, yet
sufficient for a thorough code review. It also automates the most burdensome step of incorporating new code,
merging it in, which can in most cases be done from the GitHub site on a Pull Request page via a
mere button click.

Managing Pull Requests

A successful project on GitHub has a queue of Pull Requests
(Figure 21-15) to manage. Anyone that is a
collaborator on this core instance of the project can manage and process pull requests. It is useful to note
that pull requests do not necessarily have to come from forks. Disciplined
contributors that have the collaborator privilege on the core project may
still decide to use pull requests as a means of soliciting code feedback
before merging it in.

Figure 21-15. Project pull request queue

Pull requests are such an important part of the GitHub ecosystem
that each user has his own custom dashboard to display pull requests
across all the projects he is associated with as a contributor (Figure 21-16).

Figure 21-16. System-wide pull request queue

Part of the concept behind pull requests is turning a typically
binary accept/deny operation into a
conversation. That conversation happens with comments on pull requests or comments on specific commits
(Figure 21-17). Comments can be of an
instructive nature, indicating that
the proposed solution still needs work. If the contributor then makes
further commits on the topic branch that is part of the pull request,
those commits, when pushed, show up sequentially in the pull request
thread.

Figure 21-17. Pull request comments

Comments can be made at one of three levels of precision: pull
request, commit, or line of code. The line-level comments (Figure 21-18) are the most useful for
technical adjustments, letting the
reviewer point the author precisely to a preferred way of coding
the same logic.

Figure 21-18. Pull request line level commit comments

When the solution in the pull request is sufficiently
polished and ready to be merged in, typically to the master branch, it can
be done in one of several ways. The most innovative and time-saving
approach is to use the automatic merge button on the GitHub web user interface (Figure 21-19). This performs a real Git commit,
just as if it had been done from the command line, minus the ceremony of
locally downloading and merging the code and pushing the result back up to
GitHub.
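
For comparison, the manual ceremony that the merge button replaces might look like the sketch below. It assumes that a local bare repository stands in for the core project on GitHub, that the contribution has already been pushed there as a branch named topic, and that the button produces a merge commit, which is approximated here with --no-ff. The identity, file names, and merge message are hypothetical.

```shell
#!/bin/sh
set -e
# Hypothetical identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)

# Assumption: build a stand-in for the hosted core project, holding
# master plus a contributed topic branch.
git init -q "$work/setup"
cd "$work/setup"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch -M master
git checkout -q -b topic
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git checkout -q master
git clone -q --bare "$work/setup" "$work/core.git"

# The manual route: download, merge locally, push the result back up.
git clone -q "$work/core.git" "$work/maintainer"
cd "$work/maintainer"
git merge -q --no-ff -m "Merge topic: Add feature" origin/topic
git push -q origin master

# The hosted master branch now ends in a merge commit.
git log --oneline --merges master
```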

Figure 21-19. Pull request automatic merges

It is natural to think of pull requests as an activity that is done
at the end of working on a feature, bug fix, or other contribution.
However, pull requests can be effectively used at the beginning of a
concept too. It is becoming ever more common to see pull requests
initiated with a mere mockup JPEG image or quick text file outline of the
objectives of the topic branch, followed by solicitation of team feedback
via the pull request comment approaches given previously. The contributors
to the topic branch continue to push their changes to GitHub and the pull
request is automatically updated in conversation style with links to the
latest commits.

Notifications

A social system like GitHub needs a strong notification
mechanism to announce potentially
important changes on the projects, organizations, and users that a
contributor has elected to watch. Notifications, as you might reasonably
guess, are driven by watches of the previously mentioned three types of
items on GitHub.

The summary of all notifications that pertain to you is centrally
located on a notifications page that is reachable by an icon in the
top-level navigation, as shown in Figure 21-20.

Figure 21-20. Notifications button

This list of pertinent notifications is iconified based on the
source of the event. It has icons for repository-, user-, and
organization-level activities. A summary of each activity is provided and
hyperlinked to the details of the event, as shown in Figure 21-21.

Figure 21-21. Notifications list

Notifications can be toggled on and off on a per repository basis by
a hyperlink at the bottom of the repository’s page (Figure 21-22).

Figure 21-22. Notifications repository toggle

System-wide options for notifications are configured under the user’s
administration settings. The types of events to be notified about, as well
as whether these should be routed solely to the web page or additionally
to the user’s email address, are all controlled through this page, as shown in
Figure 21-23.

Figure 21-23. Notifications settings

Finding Users, Projects, and Code

GitHub certainly is keen on hosting and facilitating
collaboration on open source projects. However, the greater portion of the
open source community is focused on finding and using open source
libraries. This discovery is facilitated by the GitHub Explore page (Figure 21-24). The opening Explore page provides a curated set
of repositories that are exhibiting
statistical trends, making them likely to be of interest to the greater
open source community.

Figure 21-24. Explore

If your focus is to find a code example in a particular
programming language, then the Advanced Search page is what you are
looking for, as shown in Figure 21-25. Available
criteria for user, popularity, repository name, and programming language
allow for highly targeted searches.

Figure 21-25. Site search

Wikis

Updating a wiki previously meant editing a page in a
browser. This was a very volatile form of editing with minimal version
control. Changes could be lost at the slightest browser refresh.

GitHub wikis, written in a simple Markdown syntax, are
first-class Git repos that sit alongside their project. They
(Figure 21-26) permit commits, comments, merging,
rebasing, and all the features that Git users have come to enjoy, but that
wiki users previously have not had at their disposal.

Figure 21-26. GitHub wikis

But just because you can edit the wiki via a Git repository cloned
to your local machine doesn’t mean giving up the convenience of the
in-browser editing approach (Figure 21-27). The
in-browser editor is also writing back to the underlying Git repo so users
can trace the author and reasoning history of all page edits.

Figure 21-27. GitHub in-browser editing of wikis

GitHub Pages (Git for Websites)

If the wiki page idea sounded attractive, what about having
Git-tracked Markdown files as the foundation of a tool for publishing
entire websites? GitHub pages based on Jekyll provide exactly
that, and can even be mapped to a Domain Name System (DNS) CNAME record as
content for a subdomain or primary domain name (Figure 21-28).

Octopress (Figure 21-29) is gaining traction as a mashup of Jekyll and
GitHub pages, making it easier than ever to publish dynamic content in a
static way. The security vulnerabilities and the growing set of attack
vectors on dynamically generated sites using live databases and
just-in-time compilation have pushed a large number of individuals back to
statically served content. But that doesn’t mean giving up dynamic site
generation; it just means relocating the dynamic processing to authoring
time rather than its traditional location at the time of page requests
with technologies like JavaServer Pages (JSPs) and PHP: Hypertext
Preprocessor (PHP).

Figure 21-28. GitHub pages how to
Figure 21-29. Octopress home page

In-Page Code Editor

Traditionally, coding is done on the desktop in the user’s
text editor of choice. But for a small fix this seems like a burdensome
ceremony of pulling the code, editing the code, committing the code, and
pushing the code. For something as simple as a spelling correction in an
error message, GitHub supports in-browser code editing, as shown in Figure 21-30.

The in-browser editor is based on Mozilla’s Ace JavaScript–based control. This is
the same control used by the Cloud9 IDE and Beanstalk. This control, shown
in Figure 21-31, supports line numbering,
syntax highlighting, and space and tab formatting. Code changes become as
simple as browsing to the source file on GitHub, clicking Edit this file, and
committing the change with your commit message entered directly below the
in-browser editor. Small fixes have never been so easy.

Figure 21-30. Ace in-browser editor
Figure 21-31. In-browser code editing

Subversion Bridge

Although GitHub certainly believes that Git is the VCS of
the future, there is also an understanding that SVN will be with us for
quite some time. GitHub supports that dual existence in two ways.

Traditionally, Git users have kept their repository in SVN and used
git-svn to bridge the two technologies.
However, this approach means that only the lower fidelity SVN metadata can
be kept for commits, excluding the ever useful Git Author and Git
Committer fields, as well as the Git parent commit refs.

GitHub makes the opposite bridge possible, without the aid of any
client-side conversion software. Every Git repository at GitHub is also
dynamically converted at request-time to be served as an SVN repository at
exactly the same HTTPS URL used for Git cloning, as shown in action in
Figure 21-32. This is a complex dynamic
conversion, unique to Git repositories served from GitHub. This bridge
facilitates continued SVN use in a cautious and stepwise conversion to an
eventual sole use of Git. This server-side bridge (Figure 21-33) permits not just commit GUIs, but other
SVN-connected legacy tools to work with a Git-based repository of commits.
The Git default branch, typically master, is automatically mapped to trunk in the SVN interface, a sign
of the forethought given to matching the idioms of the SVN realm.

Figure 21-32. Subversion clone of Git repository
Figure 21-33. Git-SVN bridge

Tags Automatically Becoming Archives

When an open source project wants to create a compressed
archive of the project on GitHub, there’s an easy shortcut: just tag the
desired revision of code. Git tags are automatically converted into TGZ
and ZIP compressed archives available from the Tags page as shown in Figure 21-34.
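
A minimal sketch of the tagging step follows, together with a local analogue of the archives that GitHub generates. The assumptions are that v1.0 is a hypothetical tag name, that the project-v1.0/ prefix and output paths are illustrative choices, and that git archive is used here only to mimic locally what GitHub produces on the server. The push that would publish the tag is shown as a comment because the sketch has no remote.

```shell
#!/bin/sh
set -e
# Hypothetical identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
echo "release contents" > app.txt
git add app.txt
git commit -q -m "Prepare release"

# Tag the desired revision with an annotated tag.
git tag -a v1.0 -m "Version 1.0"

# With a GitHub remote configured, publishing the tag would be:
#   git push origin v1.0

# Local analogue of the TGZ and ZIP archives built from the tag.
git archive --format=tar.gz --prefix=project-v1.0/ -o "$work/project-v1.0.tar.gz" v1.0
git archive --format=zip --prefix=project-v1.0/ -o "$work/project-v1.0.zip" v1.0

# List the contents of the compressed tarball.
tar -tzf "$work/project-v1.0.tar.gz"
```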

Figure 21-34. Tags as archives

Organizations

Up until now, this book has primarily discussed interactions of
smaller quantities of GitHub users, acting in a relatively independent
fashion. However, the attraction of Git has spread to more cohesive
groups, small businesses, and enterprises. GitHub has an Organizations
(Figure 21-35) set of features to serve those
groups well.

Figure 21-35. Organization selector

GitHub Organizations provide ownership of repositories at a
higher level than mere user accounts. In support of that, there is an
additional security construct: Teams. A team is a grouping mechanism
that associates a set of users with a certain permission level and a set of
repositories. The three permission levels are pull only, pull+push, and
pull+push+administration, as shown in Figure 21-36.

Figure 21-36. Organization permissions

REST API

Having a web application is a great starting point, but
GitHub has a rich community of developers that are eager to use true
services, not just page scraping, to build the next layer of useful
features. To facilitate the community construction of supporting tools,
GitHub has built a full Application Programming Interface (API). GitHub’s
API has evolved in three major eras, and the current v3 of
the API, as it is known, offers almost all UI-accessible features
in an equivalent API form. In some cases, advanced services are offered
through the API that aren’t even part of the GitHub UI yet.

Example 21-1 makes a
call to the API to get the organizations that a user belongs to. The
response, like all responses from the GitHub API, is given in
JavaScript Object Notation (JSON) format. Note that the avatar_url is really one long string value that
has been split here for typography reasons.

Example 21-1. Calling the GitHub API
curl https://api.github.com/users/matthewmccullough/orgs

[
  {
    "avatar_url": "https://secure.gravatar.com/avatar/11f43e3d3b15205be70289ddedfe2de7
        ?d=https://a248.e.akamai.net/assets.github.com
        %2Fimages%2Fgravatars%2Fgravatar-orgs.png",
    "login": "gradleware",
    "url": "https://api.github.com/orgs/gradleware",
    "id": 386945
  },
  {
    "avatar_url": "https://secure.gravatar.com/avatar/61024896f291303615bcd4f7a0dcfb74
        ?d=https://a248.e.akamai.net/assets.github.com
        %2Fimages%2Fgravatars%2Fgravatar-orgs.png",
    "login": "github",
    "url": "https://api.github.com/orgs/github",
    "id": 9919
  }
]
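
As a rough sketch of consuming such a response from a script, the helper below prints only the login of each organization. The function name list_logins is hypothetical, and the plain grep and sed matching is an assumption that holds for a flat response like this one; a real JSON parser would be more robust. The sample data is abbreviated from Example 21-1 so that the sketch runs offline.

```shell
#!/bin/sh
# list_logins: read an /orgs response on stdin, print one login per line.
# Assumption: simple text matching is adequate for this flat structure.
list_logins() {
  grep -o '"login": *"[^"]*"' | sed 's/^"login": *"\(.*\)"$/\1/'
}

# Sample response abbreviated from Example 21-1.
list_logins <<'EOF'
[
  { "login": "gradleware", "id": 386945 },
  { "login": "github", "id": 9919 }
]
EOF
```

Against the live service, the same helper could be fed by piping the output of the curl call from Example 21-1, run with the -s flag to silence progress output, into list_logins.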

The gamut of GitHub operations is exposed through the RESTful API
and is well documented on the GitHub API site, as shown in Figure 21-37. But what enables more ideas than just
calling out for a list of users, repositories, or files is the ability to
use the open standard for authorization, OAuth, to request and gain the
ability to act on behalf of a GitHub user. This opens up the possibility
of querying and manipulating private repository contents, using
repositories as storage containers for versioned artifacts beyond source
code, and building applications that can abstract themselves from the
difficulty of building a version control persistence layer.

Figure 21-37. GitHub REST API

Social Coding on Closed Source

Though the first idea that comes to mind with a
collaborative development model like GitHub is open source, nearly all of
the benefits described can be derived even inside the offices of a
company. Businesses should take full advantage of the talents of all
developers in their employ, even if they aren’t currently assigned to a
given project. Pull requests, in combination with organizations and pull
only teams, enable gated contributions by any authorized employee, but
with the safety net of a code review by core project collaborators.

Eventual Open Sourcing

Although open source is often thought of as being open from
birth, an increasing number of projects are being open sourced after a
certain amount of maturity is achieved or after a certain development
milestone is reached. This eventual open sourcing benefits from the
history having been kept in Git and the repository being maintained on
GitHub. The fully preserved context of "why is this line of code the
way it is?" is available from the Git commit history. The actual
act of converting the repo to gain the full benefit of GitHub’s social
coding aspects is as simple as a Boolean toggle on the repository’s admin
page, as shown in Figure 21-38.

Figure 21-38. Public and private repo toggle

Coding Models

The choice of Git as the VCS of a development team and, more
specifically, the choice of GitHub as the repository host, facilitates
dozens of unique usage patterns. Three of these usage styles are briefly
described.

The centralized model, shown in Figure 21-39, while still offering the local commit
insulation that isn’t afforded by true centralized systems like SVN, is
the simplest, but least interesting of the models. It is an easy first
step because developers push their local commits frequently so as to
simulate the "everything is on the central server" state that
was enforced by their version control tool of yesteryear. Although this
can be a viable starting pattern with Git, it is a mere stepping stone
toward unique and valuable leveraging of the distributed and collaborative
model Git and GitHub have to offer.

Figure 21-39. Centralized model

Next up is the lieutenant and commander model shown in Figure 21-40. You’ll recognize it as very similar to
that enabled by pull request facilities of GitHub. It is important to note
that Git projects in the absence of GitHub have a means of implementing
this model through emails and links passed around, but always with greater
apparent friction and ceremony than real pull requests.

Figure 21-40. Linux lieutenant and commander model

Lastly, for companies that are leveraging open source and want
to donate back their bug fixes but keep their innovations in-house, an
arbitrator for the two repositories can be established. This arbitrator,
as shown in Figure 21-41, chooses
which commits are cherry-picked and pushed back to the public,
open source version of the project. This is being done today by
well-known projects such as Red Hat’s JBoss Server.
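
The arbitrator's selective step can be sketched with git cherry-pick. For brevity this sketch assumes one repository with two branches, public and internal, standing in for the two repositories in the model; the branch names, file names, and identity are all hypothetical.

```shell
#!/bin/sh
set -e
# Hypothetical identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
echo "core" > core.txt
git add core.txt
git commit -q -m "Shared starting point"
git branch -M public

# In-house line of development: one private innovation, one bug fix.
git checkout -q -b internal
echo "secret sauce" > innovation.txt
git add innovation.txt
git commit -q -m "Add in-house feature"
echo "core, fixed" > core.txt
git commit -q -am "Fix bug in core"
fix=$(git rev-parse HEAD)

# The arbitrator copies only the bug fix onto the public branch.
git checkout -q public
git cherry-pick "$fix" >/dev/null

# The public history gains the fix but not the in-house feature.
git log --oneline public
```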

Figure 21-41. Partial open sourcing model

GitHub Enterprise

All of this may sound very attractive, but your business may
be governed by requirements or even laws that prohibit the storage of code
on the public Internet, no matter how secure the offering. The solution
for this is GitHub Enterprise, whose home page is shown in Figure 21-42. It offers the same public GitHub
experience as described in so many of the sections previously, but is
delivered as a virtual machine image (shown in VirtualBox in Figure 21-43) for on-premises hosting.
Additionally, GitHub Enterprise is compatible with the
Lightweight Directory Access Protocol (LDAP) and Central Authentication
Service (CAS) user authentication that many enterprises already have in
place.

Figure 21-42. GitHub Enterprise home page
Figure 21-43. GitHub Enterprise in VirtualBox

GitHub in Sum

Git is a developer tool that has shaken the very foundations of CVS,
SVN, Perforce, and ClearCase installations by showing that high
performance, collaborative, and distributed version control can be found
in an open source solution. Offset by only a short delay from Git’s own
development, GitHub has equally shown that a sharp web application can reduce tool burden,
facilitate quicker fixes, allow a greater number of contributors to
further a project, and most importantly, turn the act of coding into a
truly social activity.
