Git crash course
By John Cinnamond on Monday 23rd June 10:49 AM
Since the Rails core team switched to git, a lot of Rails developers have followed suit. Not wishing to be left out, I have been moving some projects to git recently, although somewhat unusually I have been migrating from darcs. git seems pretty good now that I understand the basics, but the documentation for getting started did leave me a little confused. To compensate, here is my 10-minute crash course in getting started with git.
This guide assumes that you are working in a small team, have a central server with git installed and are familiar with ssh.
Creating a central repository
In the git world, every developer has their own full copy of the repository. This is where their changes get committed to. However, with each developer having their own repository and these repositories being in different states it is difficult to say what the authoritative state of a project is. This is easily solved by setting up another central copy of the repository which all developers have access to and to which changes are pushed when they are ready to be shared with the world.
I set this up in the quickest and easiest way I could, by creating a new user called 'git' on our development server and adding each developer's public key to that user's authorized_keys file. I then created a directory called repos under the git user's home directory to store all of the git repositories.
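As a rough sketch of the resulting layout (not the exact commands I ran): creating the user needs root and differs between systems, so the example below uses a scratch directory in place of the git user's home directory, and the key shown is a made-up placeholder.

```shell
# Sketch only: a scratch directory stands in for the git user's home
# directory (/home/git in this article).
git_home=$(mktemp -d)

# One line per developer goes into authorized_keys.
# The key below is a made-up placeholder, not a real key.
mkdir -p "$git_home/.ssh"
echo 'ssh-ed25519 AAAA...placeholder... dev@laptop' >> "$git_home/.ssh/authorized_keys"

# sshd's default StrictModes setting rejects key files that other users
# can write to, so keep the permissions tight.
chmod 700 "$git_home/.ssh"
chmod 600 "$git_home/.ssh/authorized_keys"

# All of the central repositories live under ~/repos.
mkdir -p "$git_home/repos"
```

On the real server these files would need to be owned by the git user.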
First steps: getting content into a new repository
I decided to move an existing internal (for now) project called panda from darcs into git. You can use tailor to convert between the two different SCMs, but I was lazy and just ditched the darcs history and started the git repository from the current working version.
On my laptop (where I write most of my code), I got a fresh copy of the latest panda code and removed the _darcs directory. (This is similar to doing an svn export.) Next, I turned it into a git repository with git init. This creates the git repository but does not add any of the existing files to it. You can see this by running git status. You'll see a whole load of files that haven't been added to the repository yet.
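The steps above look roughly like this. The directory and file names are hypothetical stand-ins for a fresh copy of the panda code, created in a scratch directory so the sketch can be run anywhere.

```shell
# Hypothetical stand-in for a fresh copy of the panda code.
work=$(mktemp -d)
mkdir -p "$work/panda/app" "$work/panda/_darcs"
echo 'puts "hello"' > "$work/panda/app/hello.rb"
cd "$work/panda"

rm -rf _darcs      # drop the darcs metadata (and with it the history)
git init -q        # create an empty git repository in ./.git

# Nothing has been added yet, so everything is reported as untracked ("??").
untracked=$(git status --porcelain)
echo "$untracked"
```

The --porcelain flag just gives a terse, script-friendly version of the normal git status output.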
Before we add all the files, it's worth taking the time to tell git to ignore rails' temporary or generated files. (This is the git equivalent of _darcs/prefs/boring or svn propset svn:ignore.) Edit the file .git/info/exclude and add the patterns of the files you want to ignore. I added:
log/*.log
db/schema.rb
db/*.sqlite3
nbproject
(That last one is to ignore NetBeans' files.) You might want to add the various tmp directories too. Now you can add the rails project files to the git repository with git add .
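To check that the patterns are doing their job, here is a small sketch in a throwaway repository. The file names are invented for the example; only the four patterns come from my real setup.

```shell
work=$(mktemp -d)
mkdir -p "$work/panda/log" "$work/panda/app"
cd "$work/panda"
git init -q
echo 'request log' > log/development.log   # should be ignored
echo 'puts "hello"' > app/hello.rb         # should be added

# Append the ignore patterns to the per-repository exclude file.
mkdir -p .git/info
cat >> .git/info/exclude <<'EOF'
log/*.log
db/schema.rb
db/*.sqlite3
nbproject
EOF

git add .

# List what actually ended up in the index.
staged=$(git ls-files)
echo "$staged"
```

Note that .git/info/exclude is local to your copy of the repository and is not shared with other developers.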
One problem remains. The extremely observant amongst you might have noticed that the log and lib (and a few more) directories don't show up in the output of git status. This is because git (stupidly) doesn't track empty directories. To force git to include them, create an empty file in these directories and run git add . again. (You might want to add -v to git add . to see what it's adding.) I did this with:
~/git/panda$ touch log/.gitinclude
~/git/panda$ git add . -v
add 'log/.gitinclude'
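If there are a lot of empty directories, something like the following sketch saves doing them one at a time. It assumes a find that supports -empty and substitutes {} inside a longer argument (GNU and BSD find both do, but neither is required by POSIX); the project layout is again a made-up example.

```shell
work=$(mktemp -d)
mkdir -p "$work/panda/log" "$work/panda/lib" "$work/panda/app"
cd "$work/panda"
git init -q
echo 'puts "hello"' > app/hello.rb

# Drop a placeholder file into every empty directory,
# skipping git's own .git directory.
find . -path ./.git -prune -o -type d -empty -exec touch {}/.gitinclude \;

git add .
tracked=$(git ls-files)
echo "$tracked"
```

The name .gitinclude is only a convention I picked; git does not treat it specially.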
You can now commit your changes (git commit) and we're ready to push them centrally. To prepare the ground, create a repository on the central server to push the content into. On the central server, as the git user, I created /home/git/repos/panda then changed to this directory and ran git init. (Recent versions of git refuse pushes into the checked-out branch of an ordinary repository, so use git init --bare here if your push is rejected.) Back on my laptop, I named this remote repository to save me having to remember and type the full location each time. I did this with git remote add origin git@name.of.server:repos/panda. Now I'm all ready to push the changes, which I can do with git push origin master.
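Here is the whole sequence as a sketch that runs on one machine. A local directory stands in for git@name.of.server:repos/panda, the central repository is created with --bare (a repository without a working tree, which current versions of git expect for a repository that is only pushed to), and the identity and file contents are placeholders.

```shell
scratch=$(mktemp -d)

# "Central server": a bare repository standing in for
# git@name.of.server:repos/panda.
mkdir -p "$scratch/repos"
git init -q --bare "$scratch/repos/panda"

# "Laptop": a project with one commit on master.
mkdir -p "$scratch/laptop/panda"
cd "$scratch/laptop/panda"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
echo 'puts "hello"' > hello.rb
git add .
git -c user.name='Example Dev' -c user.email='dev@example.com' \
    commit -q -m 'Initial import'

# Name the central repository once, then push to it by name.
git remote add origin "$scratch/repos/panda"
git push -q origin master

# Both repositories should now agree on where master points.
central_head=$(git --git-dir="$scratch/repos/panda" rev-parse master)
local_head=$(git rev-parse master)
```

With a real server, only the argument to git remote add changes.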
A crash course in git branches
Branches in git are pretty much the same as branches in any other SCM (except, of course, darcs which is too clever to need branches). However, if you come from an SVN background it is important to understand that the branch/edit/merge cycle is not only best practice, it is thankfully easy to do.
Having each developer always work in a branch is a great way of ensuring that
- the master stays clean and represents the latest stable version of the project
- changes can easily be shared with other developers without breaking their repository
- a single developer can jump from, say, the latest development to bug fixing the currently deployed version quickly and easily (and without either task contaminating the other)
Branches are good. Start using them from the outset and life becomes easy later.
The typical branch/edit/merge cycle in git is easy to do, although it does suffer from unfortunate naming. The checkout command is used to switch between branches, and is the most common way of creating a new branch. The flow for carrying out some work is:
$ git checkout -b <name of branch> (a)
<make some changes>
$ git status (b)
$ git add . (c)
$ git commit (d)
- (a) This creates a new branch to work in. The command says "create a new branch called <name of branch> and change to it immediately"
- (b) This shows you what changes are ready to be committed
- (c) This adds all of the changes, ready to commit. If you want to selectively add changes use git add <path> rather than git add .
- (d) This takes the modified or new files you have added (all of them in our example) and commits them to the repository.
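Steps c and d can be combined into git commit -a if you are confident that you know what has changed. Be aware that git commit -a only picks up changes to files git is already tracking; brand new files still need a git add first.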
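To make the cycle concrete, here is a sketch of steps a to d in a throwaway repository. The branch name story-login, the file names and the identity are all invented for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name 'Example Dev'        # placeholder identity for this throwaway repository
git config user.email 'dev@example.com'
echo 'v1' > README
git add .
git commit -q -m 'Initial commit'

git checkout -q -b story-login        # (a) create the branch and switch to it
echo 'v2' > README                    #     change a tracked file
echo 'notes' > NOTES                  #     and create a brand new one
git status --short                    # (b) README is modified, NOTES is untracked
git add .                             # (c) stage both
git commit -q -m 'Add login story'    # (d) commit them on story-login

branch=$(git rev-parse --abbrev-ref HEAD)
echo "$branch"
```

The master branch is untouched by all of this; the new commit exists only on story-login until it is merged.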
When to branch and when to commit
Branches are not created for every change you make, but rather for groups of changes. I tend to create a new branch for every user story I work on, and commit after each acceptance criterion is met, or more frequently if a criterion requires many steps to satisfy.
Sharing changes: merging and pushing
Once you have finished (and committed) some work that you want to share with other developers then the git process is to merge the changes back into the master branch and push them to the central server so that other people can pick them up. The master branch was created automatically when you ran git init and this is the branch which should hold the latest stable version of the project.
Merging is easy under git. First, change to the master branch using git checkout, and then merge the changes in using ... git merge. The steps are:
$ git checkout master
$ git merge <name of branch>
Assuming there are no conflicts, you can then push this newly merged master back to the central server with:
$ git push origin master
...and everyone can pick up your changes.
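Put together, the merge and push look something like the sketch below. As before, a local bare repository stands in for the central server, and the branch name, file contents and identity are made up.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/repos"
git init -q --bare "$scratch/repos/panda"   # stand-in for the central server

mkdir "$scratch/panda"
cd "$scratch/panda"
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is called master
git config user.name 'Example Dev'          # placeholder identity
git config user.email 'dev@example.com'
git remote add origin "$scratch/repos/panda"
echo 'v1' > README
git add .
git commit -q -m 'Initial commit'
git push -q origin master

# Some work finished on a branch...
git checkout -q -b story-login
echo 'v2' > README
git commit -q -a -m 'Add login story'

# ...then merged back into master and shared.
git checkout -q master
git merge -q story-login
git push -q origin master

central_head=$(git --git-dir="$scratch/repos/panda" rev-parse master)
```

Other developers would typically pick the changes up with git pull origin master in their own copy.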
Final reflections
I have only been dabbling with git for about a month now, and only using it in anger for the past week. Inevitably my current 'best practices' will change as I get to know it better. There may be better ways of doing some of the above steps, and there is a lot that I didn't talk about. However, hopefully this will be enough to get you started with your own shared git development.
