Git
Revision as of 00:16, 29 January 2013
Until January 7, 2013, MPICH used Subversion (SVN) for its version control system (VCS). Now we use git. This wiki page documents important information about the use of git within MPICH. Historical information about the SVN repository can be found here.
Important URLs
writeable clone URL | [email protected]:mpich.git
read-only clone URL (via git protocol) | git://git.mpich.org/mpich.git
read-only clone URL (via http) | http://git.mpich.org/mpich.git
The HTTP URL is also a git-web instance, so you can view information about the repository and its contents/history in your browser if you navigate there. This is also a good alternative to using trac links in email, documentation, and (sometimes) commit messages.
IF YOU CANNOT ACCESS THE [email protected]:mpich.git URL BUT THINK YOU SHOULD BE ABLE TO, CONTACT devel@mpich.org SO THAT WE CAN ADD YOUR SSH PUBLIC KEY TO THE DATABASE
Quick Start
If you do not have the actual git tool, you can get that from the git website or your preferred software package management system (brew install git, apt-get install git-core, etc.).
The next step is to add the following (substituting your name and email) into your ~/.gitconfig file:
    [user]
        name = Joe Developer
        email = [email protected]
    [color]
        diff = auto
        status = auto
        branch = auto
        ui = auto
    # optional, but helps to distinguish between changed and untracked files
    [color "status"]
        added = green
        changed = red
        untracked = magenta
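If you prefer not to edit the file by hand, the same settings can probably be written with git config. The following is a minimal sketch, not the project's prescribed procedure: it writes to a scratch file (the cfg variable and the joe@example.com address are placeholders introduced here) so that it does not touch a real configuration.

```shell
set -e
# Scratch file so the sketch does not touch a real ~/.gitconfig;
# replace `--file "$cfg"` with `--global` to write ~/.gitconfig instead.
cfg=$(mktemp)
git config --file "$cfg" user.name "Joe Developer"
git config --file "$cfg" user.email "joe@example.com"   # placeholder address
git config --file "$cfg" color.ui auto
# A three-part key lands in the [color "status"] subsection.
git config --file "$cfg" color.status.untracked magenta
# Read a value back to confirm it was stored.
name=$(git config --file "$cfg" user.name)
echo "$name"
```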
Quick start for authorized committers (core MPICH developers):
% git clone [email protected]:mpich.git
This will create an mpich directory in your $PWD that contains a completely functional repository with full project history.
If you do not have access to the writeable repository but think you should (because you are a core MPICH developer or collaborator), contact [email protected] for access. The system works based on SSH keys, so we will need your SSH public key.
Everyone else who wishes to clone the repository must use one of the other URLs. These will still allow you to contribute back to MPICH with git format-patch.
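As a rough illustration of the git format-patch route, the sketch below builds a throwaway repository, makes one commit, and turns it into a patch file that could be mailed to the developers. The repository contents, identity, and commit messages are made up for the example; in practice you would run only the format-patch step inside your own clone.

```shell
set -e
# Hypothetical identity so the sketch works without any global git config.
export GIT_AUTHOR_NAME='Joe Developer' GIT_AUTHOR_EMAIL='joe@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Scratch repository standing in for a read-only clone.
work=$(mktemp -d)
cd "$work"
git init -q .
echo 'int x;' > foo.c
git add foo.c
git commit -q -m 'initial import'
echo 'int y;' >> foo.c
git commit -q -a -m 'add y to foo'

# Write the most recent commit as a mailable patch file;
# format-patch prints the name of the file it created.
patch_file=$(git format-patch -1 HEAD)
echo "$patch_file"
```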
Important Dos and Don'ts
Do:
- Use git pull --rebase instead of git pull when the code in your current branch has never been pushed to the outside world. Otherwise you will end up creating unnecessary merge commits, which can make it difficult to understand development history. Better yet, sometimes it is easier to understand what is happening if you separate your git pull into git fetch followed by an explicit git rebase.
- Use git-style commit messages, with a short (~50 char) subject line and then a separate, more descriptive body.
- Prefer making smaller logical commits rather than single large commits containing loosely related or unrelated features.
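To see why git pull --rebase keeps history readable, here is a small self-contained sketch. It uses local scratch repositories (seed, origin.git, mine, and theirs are names invented for the example) in place of the real server, and it assumes the clone's master branch tracks origin/master, which git clone sets up by default.

```shell
set -e
# Hypothetical identity so the sketch works without any global git config.
export GIT_AUTHOR_NAME='Joe Developer' GIT_AUTHOR_EMAIL='joe@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

scratch=$(mktemp -d)
cd "$scratch"
# Build a local stand-in for the shared repository with one commit on master.
git init -q seed
(cd seed && echo 'int foo;' > foo.c && git add foo.c &&
    git commit -q -m 'initial import' && git branch -M master)
git clone -q --bare seed origin.git
git clone -q origin.git mine
git clone -q origin.git theirs

# Someone else pushes a commit first.
(cd theirs && echo 'int baz;' > baz.c && git add baz.c &&
    git commit -q -m 'add baz.c' && git push -q origin master)

# Our own unpublished commit is replayed on top of theirs.
cd mine
echo 'int bar;' > bar.c
git add bar.c
git commit -q -m 'add bar.c'
git pull -q --rebase
git push -q origin master

merges=$(git rev-list --merges --count HEAD)
total=$(git rev-list --count HEAD)
echo "merge commits: $merges, total commits: $total"
```

Running a plain git pull at the same point would instead record a merge commit joining the two lines of development.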
Don't:
- cherry-pick excessively. Cherry-picking should be a rare activity, not a frequent one. Talk to Dave if you want to understand this better.
- merge just for the sake of merging. If you have a long-running, published topic branch, then don't merge from master (for example), just because "it's been a while". Instead, only merge to pick up specific features/fixes that are not suitable for cherry-picking. Name this feature/fix in the merge commit message.
Background Reading and References
- Scott Chacon's "Pro Git" book is free on the web, and covers everything from installing git to concepts that are extremely advanced (like git replace).
- Git for Computer Scientists provides background about how to think about git's object model. This is very useful for understanding the git man pages and reasoning about the effect of various commands.
Proposed Git Workflow
TBD. Right now, let's use it as a sort of much, much better SVN. One (somewhat heavyweight) option is to adopt git-flow. The upside is that it is a clearly documented process that is used by other groups. The downside is that it assumes a certain level of comfort with git that might be difficult to teach to students and others in a short time frame.
What follows is a comparison of one of the most common activities, committing a bug fix or small feature, shown in the old SVN approach and the recommended new git approach:
Old SVN Approach
From a fresh (or up-to-date and clean) svn working copy:
    % vim foo.c   # edit an existing tracked file
    ...
    % vim bar.c   # edit a new (not tracked by SVN) file
    ...
    % svn status
    M       foo.c
    ?       bar.c
    % svn add bar.c
    % svn status
    M       foo.c
    A       bar.c
Now let's say that you go home for the day without committing these changes. When you get in the next morning, you try to commit the change:
    % svn update
    ... # receive any updates made by others to the repository
    % svn commit -m 'fixed bug in foo.c, using new bar.c to do so'
    transmitting... new revision rXYZ created # (paraphrasing here, don't have output in front of me)
Now if there had been a conflict in foo.c at the svn update step, then we would have needed to do something like:
    % vim foo.c
    ... # search for conflict markers, resolve conflict
    !!! ONE OF: (A) !!!
    % svn resolved foo.c # old-style SVN command
    !!! OR (B) !!!
    % svn resolve --accept working foo.c
    % svn commit -m 'fixed bug in foo.c, using new bar.c to do so'
    ...
New Git Approach
Here is the same simple case in git (with extra git status commands thrown in for pedagogical purposes). This topic is covered more thoroughly in the Basic Merge Conflicts section of the Pro Git book, although in the equivalent context of merging instead of rebasing.
This example assumes that you have the master branch currently checked out, that your working tree and index are both clean (unmodified), and that the master branch is set up to track the origin/master "remote tracking branch":
    % vim foo.c   # edit an existing tracked file
    ...
    % vim bar.c   # edit a new (not tracked by git) file
    ...
    % git status
    # On branch master
    # Changes not staged for commit:
    #   (use "git add <file>..." to update what will be committed)
    #   (use "git checkout -- <file>..." to discard changes in working directory)
    #
    #       modified:   foo.c
    #
    # Untracked files:
    #   (use "git add <file>..." to include in what will be committed)
    #
    #       bar.c
    no changes added to commit (use "git add" and/or "git commit -a")
    % git status -s
     M foo.c
    ?? bar.c
Now let's say that you go home for the day without committing these changes. When you get in the next morning, you try to commit the change:
    % git add bar.c
    % git status
    # On branch master
    # Changes to be committed:
    #   (use "git reset HEAD <file>..." to unstage)
    #
    #       new file:   bar.c
    #
    # Changes not staged for commit:
    #   (use "git add <file>..." to update what will be committed)
    #   (use "git checkout -- <file>..." to discard changes in working directory)
    #
    #       modified:   foo.c
    #
    % git add foo.c
    % git status
    # On branch master
    # Changes to be committed:
    #   (use "git reset HEAD <file>..." to unstage)
    #
    #       new file:   bar.c
    #       modified:   foo.c
    #
    % git commit -m 'fixed bug in foo.c, using new bar.c to do so'
    [master f36baae] fixed bug in foo.c, using new bar.c to do so
     1 file changed, 1 insertion(+)
     create mode 100644 bar.c
At this stage, you have created a new commit that is only present in your local repository. In order to share this commit with others in the remote repository (named origin by default), we need to push the commit as well. But first we need to pull down any changes made by others in the remote repository:
    % git fetch origin
    ...
    % git rebase   # b/c of tracking setup, "origin/master" is the implied argument
    ...
    (the above two commands could be replaced by the equivalent "git pull --rebase")
    % git push origin master
    ...
Now let's assume the same conflict in foo.c occurs as in our SVN example. The conflict will manifest itself at the rebase step. It will look something like this:
    % git rebase   # b/c of tracking setup, "origin/master" is the implied argument
    First, rewinding head to replay your work on top of it...
    Applying: fixed bug in foo.c, using new bar.c to do so
    Using index info to reconstruct a base tree...
    M       foo.c
    Falling back to patching base and 3-way merge...
    Auto-merging foo.c
    CONFLICT (content): Merge conflict in foo.c
    Failed to merge in the changes.
    Patch failed at 0001 fixed bug in foo.c, using new bar.c to do so
    The copy of the patch that failed is found in:
       /Users/goodell/scratch/git-wiki-example/.git/rebase-apply/patch

    When you have resolved this problem, run "git rebase --continue".
    If you prefer to skip this patch, run "git rebase --skip" instead.
    To check out the original branch and stop rebasing, run "git rebase --abort".
Now we need to resolve the conflict. We can do this in one of two ways:
    % git mergetool
    Merging:
    foo.c

    Normal merge conflict for 'foo.c':
      {local}: modified file
      {remote}: modified file
    Hit return to start merge resolution tool (vimdiff): [I PRESSED ENTER]
    ... (search for conflict markers, fix conflict)
OR:
    % vim foo.c
    ... (search for conflict markers, fix conflict)
    % git add foo.c
The first option will evaluate the value of $EDITOR (or similar) in your environment and attempt to provide you with a useful mode for your editor to help you resolve the conflict. In my case, this is a vimdiff window showing the left, right, base, and working copy versions of the conflicted file. git mergetool will then automatically git add the file if all conflict markers have been removed from the file when you exit the mergetool editor. The second option is a slightly more manual version of the first option that looks more like the SVN approach to the problem.
Once all conflicts have been resolved, we simply continue the rebase operation:
    % git rebase --continue
    Applying: fixed bug in foo.c, using new bar.c to do so
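If you would like to rehearse this conflict workflow without touching the real repository, something like the following sketch should reproduce it end to end. Everything here is a stand-in: the scratch repositories (seed, origin.git, mine, theirs), the file contents, and the commit messages are invented for the example, and the manual resolution simply keeps our version of foo.c.

```shell
set -e
# Hypothetical identity so the sketch works without any global git config.
export GIT_AUTHOR_NAME='Joe Developer' GIT_AUTHOR_EMAIL='joe@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

scratch=$(mktemp -d)
cd "$scratch"
git init -q seed
(cd seed && echo 'int foo = 0;' > foo.c && git add foo.c &&
    git commit -q -m 'initial import' && git branch -M master)
git clone -q --bare seed origin.git
git clone -q origin.git mine
git clone -q origin.git theirs

# Someone else changes the same line and pushes first.
(cd theirs && echo 'int foo = 1;' > foo.c &&
    git commit -q -a -m 'set foo to 1' && git push -q origin master)

cd mine
echo 'int foo = 2;' > foo.c
git commit -q -a -m 'set foo to 2'
git fetch -q origin
# The rebase stops on the conflict and exits non-zero; keep `set -e` from aborting.
git rebase origin/master >/dev/null 2>&1 || echo "rebase stopped on a conflict"

# Manual resolution: overwrite the conflicted file with the desired content.
echo 'int foo = 2;' > foo.c
git add foo.c
# GIT_EDITOR=true accepts the existing commit message without opening an editor.
GIT_EDITOR=true git rebase --continue >/dev/null

subject=$(git log -1 --format=%s)
total=$(git rev-list --count HEAD)
echo "$subject ($total commits)"
```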
SVN History Migration
What has been imported?
Much of the history from our previous SVN repository has been migrated over to git. This includes:
- All trunk history, with commit messages prefixed by "[svn-rXXXX]". This history lives in the master branch, which is the git convention corresponding to SVN's trunk. The oldest history that was present in SVN was 1.0.6, so that's as far back as the git history goes.
- All release tags (which are branch-like in SVN), with their history squashed down into a single commit. These commit messages have the format "[svn-synthetic] tags/release/mpich2-1.4.1p1". These commits were then tagged with annotated git tags with names like "v1.4.1p1".
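Because of the "[svn-rXXXX]" prefix, one plausible way to find the git commit for an old SVN revision is to search the commit messages. The sketch below demonstrates the idea in a scratch repository; the revision number 1234 and both commit messages are hypothetical, and in a real clone you would run only the git log command.

```shell
set -e
# Hypothetical identity so the sketch works without any global git config.
export GIT_AUTHOR_NAME='Joe Developer' GIT_AUTHOR_EMAIL='joe@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

scratch=$(mktemp -d)
cd "$scratch"
git init -q .
echo a > a.c
git add a.c
git commit -q -m '[svn-r1234] hypothetical imported trunk commit'
echo b >> a.c
git commit -q -a -m 'a commit made natively in git'

# --fixed-strings treats the pattern literally, so the brackets need no escaping.
found=$(git log --fixed-strings --grep='[svn-r1234]' --format=%s)
echo "$found"
```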
Import Process and Caveats
The history was imported by a custom script because the MPICH SVN repository was more complicated than git-svn could handle. Specifically, the use of svn:externals caused two problems. Problem #1 is that git-svn cannot handle any form of SVN external natively. Problem #2 is that our past use of relative SVN externals (e.g., for confdb) was unversioned. This means that neither svn export -r XYZ $SVN_PATH nor the pinned-revision variant, @XYZ, would actually reproduce the correct working copy at revision XYZ if the confdb directory had been changed since XYZ. So the script jumped through a number of hoops in order to provide the expected result from svn export. Branch points were computed by hand, rather than attempting to teach the script to do this.
Why Git?
SVN has numerous, well-documented deficiencies:
- Branching and merging are nightmares.
- Inspecting history is much more difficult than it is in git.
- Working offline in SVN is limited.
- Performance is slow.
The only three things that SVN had in its favor were inertia (we already had it installed, with other infrastructure built around it), support for fine-grained permissions via the MCS "authz" web page, and familiarity (everyone basically knows how to use it at this point). Eventually our SVN pain began to exceed the inertia benefit, and MCS Systems provided gitosis so that we could self-administer permissions with finer granularity. The education issue is unfortunate, but it is one that must be overcome every time a VCS becomes obsolete (as it was for the CVS-to-SVN migration).
Why was git chosen over another system (Mercurial, bzr, something else...)? Git arguably has the greatest slice of the distributed VCS market right now, so more of the world will know how to use it to interact with our project. Several existing team members already used git regularly, through a limited git-svn clone of the MPICH SVN repository.
Dealing With Development Branches/Repositories
In SVN we had branches in https://svn.mcs.anl.gov/repos/mpi/mpich2/branches/dev, which we would typically refer to as dev/FOO. Many of these branches had restricted permissions, especially when used to collaborate on a research paper or with a vendor. Because of git's distributed nature, it is difficult/impossible to restrict read permissions for a specific branch within a repository. So these development branches have each been put into their own new repositories. Not all development branches were migrated (only ones actively in use).
The basic pattern is that SVN branches named dev/FOO have been placed into a repository named dev/FOO, containing a sole branch also named FOO. These repositories are not listed via git-web or the git daemon, so you must use the SSH form of the clone URL when cloning or adding a remote for these development repositories. The basic procedure for adding these dev branches to a local, already cloned copy of "origin" is:
    % git remote add dev/FOO --fetch [email protected]:dev/FOO.git
    Updating dev/FOO
    remote: Counting objects: 5858, done.
    remote: Compressing objects: 100% (2354/2354), done.
    remote: Total 3596 (delta 1777), reused 3008 (delta 1228)
    Receiving objects: 100% (3596/3596), 2.90 MiB, done.
    Resolving deltas: 100% (1777/1777), completed with 777 local objects.
    From git.mpich.org:dev/FOO
     * [new branch]      FOO        -> dev/FOO/FOO
That is, we are actually doing two things:
- adding a new git "remote" by the name of dev/FOO;
- fetching its content, especially the FOO branch. The "remote tracking branch" in our local repository is then named dev/FOO/FOO (mildly confusing, unfortunately).
You will probably then want to create a local branch to track the remote branch:
    % git branch FOO dev/FOO/FOO
    % git checkout FOO
You can now start hacking away on your local FOO branch.
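The naming can be checked offline with a sketch like the one below, which substitutes a local bare repository (FOO.git, built from a throwaway seed repository) for the SSH URL. The directory names and commit are invented for the example, and --track is spelled out only to make the tracking relationship explicit.

```shell
set -e
# Hypothetical identity so the sketch works without any global git config.
export GIT_AUTHOR_NAME='Joe Developer' GIT_AUTHOR_EMAIL='joe@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

scratch=$(mktemp -d)
cd "$scratch"
# Stand-in for the server-side dev/FOO repository with its sole branch FOO.
git init -q seed
(cd seed && echo 'dev work' > README && git add README &&
    git commit -q -m 'initial dev commit' && git branch -M FOO)
git clone -q --bare seed FOO.git

# Stand-in for an already cloned local repository.
git init -q mpich
cd mpich
# Add the remote under the name dev/FOO and fetch it in one step.
git remote add -f dev/FOO "$scratch/FOO.git" 2>/dev/null
# Create a local branch FOO that tracks the remote tracking branch dev/FOO/FOO.
git branch --track FOO dev/FOO/FOO >/dev/null
git checkout -q FOO

upstream=$(git rev-parse --abbrev-ref 'FOO@{upstream}')
echo "$upstream"
```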
In the unlikely case you just want to check out a dev branch and don't want to bother with the "origin" repo too, you can do this instead:
% git clone --origin dev/FOO --branch BRANCH [email protected]:dev/FOO mpich-FOO.git
(where BRANCH is probably FOO, but could be something else).
At the current moment we do not have any easy way to list all of the available dev/ branches unless you have permissions to view the gitolite-admin.git repository. If you think you should have access to a particular development branch, contact devel@mpich.org or the specific MPICH core developer with whom you are working. You can list the dev branches to which you already have access by running:
    % ssh [email protected] info
    hello goodell, this is [email protected] running gitolite3 v3.2-13-gf89408a on git 1.7.0.4

     R W    dev/FOO
     R W    gitolite-admin
     R W    mpich
The Pro Git book has more information about working with remotes online.
Managing Access Controls
The git repositories on git.mpich.org are hosted at MCS using gitolite. Gitolite has a very informative manual, which I recommend reading if you have questions about the overall setup or detailed permissions issues.
Access to these repositories is controlled through a git repository that is also hosted on the same gitolite server. To access it, clone [email protected]:gitolite-admin.git. This repository contains two primary parts: a configuration file (conf/gitolite.conf) and a directory full of public SSH keys. The configuration file specifies which repositories are valid on the server and which users have particular permissions to access those repositories. At the top of the configuration file is a nice big comment that explains the basic format and permissions rules. If at all in doubt, consult the manual and/or [email protected] before making a change. The rules are not hard, but, just like making firewall rule changes, small mistakes can lead to real problems. The keydir contains files with the format USERNAME.pub or [email protected]. These files should each contain a valid public SSH key for the user given by USERNAME.
Permission changes and repository creation are triggered by pushes to the gitolite-admin.git repository. So once you make your changes to this repository, commit them (git commit ...) and then push the repository back up to [email protected]:gitolite-admin.git.
MPICH-core committers can also create repositories by pushing to any repository that has a name of the format u/USERNAME/REPONAME or papers/REPONAME. Permissions for these repositories are managed by the creator by running ssh [email protected] perms .... The gitolite manual has a section explaining this command. This eliminates the need to fiddle with the gitolite-admin repository, but you will then need to use this alternative ssh-based method if permissions need to be changed.
Updating the OpenPA Subtree
To be written.