Git 5: Branches
One of the most useful features of Git is called branching.
Branching allows you to diverge from the main line of work and edit or
update your code and files (e.g. to test out a new analysis or
some experimental feature) without affecting your main work. If the work
you did in the branch turns out to be useful you can merge that back
into your main
branch. On the other hand, if the work
didn’t turn out as planned, you can simply delete the branch and
continue where you left off in your main line of work. Another use case
for branching is when you are working in a project with multiple people.
Branching can be a way of compartmentalizing your team’s work on
different parts of the project and enables merging back into the
main
branch in a controlled fashion; we will learn more
about this in the section about working remotely.
- Let’s start trying out branching! We can see the current branch by running:
git branch
This tells us that there is only the main
branch at the
moment.
Main and Master
If your branch is called master
instead of main,
that’s perfectly fine as well, but do check out the Git section of the pre-course setup for more details about the choice of default branch names.
- Let’s make a new branch:
git branch test_alignment
Run
git branch
again to see the available branches. Do you see which one is marked as the active branch?
Let’s move to our newly created branch using the
checkout
command:
git checkout test_alignment
Tip
You can create and check out a new branch in one line with git checkout -b branch_name
.
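The one-line form from the tip can be tried safely in a throwaway repository. This is a minimal sketch: the temporary directory, the user name and e-mail, and the empty initial commit are placeholders for illustration and are not part of the course project.

```shell
# Scratch repository so the course project is left untouched (hypothetical setup)
repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b requires Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"

# Create the branch and switch to it in a single step
git checkout -q -b test_alignment

# Print the name of the active branch
active=$(git rev-parse --abbrev-ref HEAD)
echo "$active"
```

This is equivalent to running git branch test_alignment followed by git checkout test_alignment.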
Let’s add some changes to our new branch! We’ll use this to try out a different set of parameters on the sequence alignment step of the case study project.
- Edit the
Snakefile
so that the shell command of the align_to_genome
rule looks like this (add the --very-sensitive-local
option):
bowtie2 --very-sensitive-local -x results/bowtie2/{config[genome_id]} -U {input.fastq} > {output} 2>{log}
Add and commit the change!
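The add-and-commit step could look roughly like the sketch below. It runs in a throwaway repository with a one-line placeholder Snakefile, and the commit messages are only suggestions. In the course project you would run just the last git add and git commit lines after editing the real Snakefile.

```shell
# Scratch repository standing in for the course project (hypothetical setup)
repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b requires Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"

# Placeholder Snakefile content, not the real workflow
echo 'bowtie2 -x index -U reads.fastq' > Snakefile
git add Snakefile
git commit -q -m "Add placeholder Snakefile"

# Make the edit on the new branch, then stage and commit it
git checkout -q -b test_alignment
echo 'bowtie2 --very-sensitive-local -x index -U reads.fastq' > Snakefile
git add Snakefile
git commit -q -m "Use --very-sensitive-local for alignment"

# Show the subject line of the newest commit on the active branch
subject=$(git log -1 --format=%s)
echo "$subject"
```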
To get a visual view of your branches and commits you can use the command:
git log --graph --all --oneline
It is often useful to see what differences exist between branches.
You can use the diff
command for this:
git diff main
This shows the difference between the active branch
(test_alignment
) and main
on a line-per-line
basis. Do you see which lines have changed between
test_alignment
and main
branches?
Tip
We can also add the --color-words
flag to git diff
, which displays the difference on a word-per-word basis rather than line-per-line.
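To see both views side by side, the following sketch compares two branches in a throwaway repository. The Snakefile content is a placeholder, and the repository setup is assumed purely for illustration.

```shell
# Scratch repository with one commit on main and one on test_alignment (hypothetical setup)
repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b requires Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"
echo 'bowtie2 -x index -U reads.fastq' > Snakefile
git add Snakefile
git commit -q -m "Add placeholder Snakefile"
git checkout -q -b test_alignment
echo 'bowtie2 --very-sensitive-local -x index -U reads.fastq' > Snakefile
git commit -q -a -m "Use --very-sensitive-local for alignment"

# Line-per-line comparison of the active branch against main
git --no-pager diff main

# Word-per-word comparison of the same change
git --no-pager diff --color-words main

# List only the names of the files that differ
changed=$(git diff --name-only main)
echo "$changed"
```

With --color-words only the added option is highlighted, whereas the plain diff shows the whole line as removed and re-added.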
Note
Git is constantly evolving, along with some of its commands. While the checkout
command is quite versatile (it’s used for more than just switching branches), this versatility can sometimes be confusing. The Git team thus added a new command, git switch
, that can be used instead. At the time of writing this command is still marked as experimental, however, so we have opted to stick with checkout
for the course, for now.
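For readers who want to try git switch anyway, here is a small sketch of the equivalents of the checkout commands used in this section. It assumes Git 2.23 or newer (the release that introduced git switch) and uses a throwaway repository with placeholder settings.

```shell
# Scratch repository (hypothetical setup)
repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b requires Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"

# Same effect as: git checkout -b test_alignment
git switch -q -c test_alignment

# Same effect as: git checkout main
git switch -q main

# Print the name of the active branch
active=$(git rev-parse --abbrev-ref HEAD)
echo "$active"
```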
Now, let’s assume that we have tested our code and the alignment
analysis is run successfully with our new parameters. We thus want to
merge our work into the main
branch. It is good to start
with checking the differences between branches (as we just did) so that
we know what we will merge.
- Check out the branch you want to merge into, i.e.
main
:
git checkout main
- To merge, run the following code:
git merge test_alignment
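Run git log --graph --all --oneline
again to see how the
merge brings the changes made in test_alignment
into main
. If main has gained no commits of its own since the branch was created, Git simply fast-forwards main to the tip of test_alignment instead of creating a separate merge commit.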
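The whole branch, commit and merge cycle can be rehearsed in a throwaway repository, as in the sketch below. The Snakefile content and commit messages are placeholders. Because main receives no commits of its own here, the merge is a fast-forward and both branches end up pointing at the same commit.

```shell
# Scratch repository (hypothetical setup)
repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b requires Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"
echo 'bowtie2 -x index -U reads.fastq' > Snakefile
git add Snakefile
git commit -q -m "Add placeholder Snakefile"

# Do the experimental work on a branch
git checkout -q -b test_alignment
echo 'bowtie2 --very-sensitive-local -x index -U reads.fastq' > Snakefile
git commit -q -a -m "Use --very-sensitive-local for alignment"

# Check out the branch to merge into, then merge
git checkout -q main
git merge -q test_alignment

# Inspect the result
git --no-pager log --graph --all --oneline
main_tip=$(git rev-parse main)
branch_tip=$(git rev-parse test_alignment)
```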
Tip
If you are working on different features or parts of an analysis on different branches, while also maintaining a working main
branch for the stable code, it is convenient to periodically merge the changes made to main
into the relevant branches (i.e. the opposite of what we did above). That way, you keep your experimental branches up-to-date with the newest changes and make them easier to merge into main
when the time comes.
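Merging in the opposite direction might look like the sketch below, again in a throwaway repository. The file names and commit messages are made up for illustration. Since both branches have new commits, Git creates a merge commit here, and --no-edit accepts the default merge message without opening an editor.

```shell
# Scratch repository (hypothetical setup)
repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b requires Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"
echo 'placeholder workflow' > Snakefile
git add Snakefile
git commit -q -m "Add placeholder Snakefile"

# Experimental work on a branch
git checkout -q -b test_alignment
echo 'alignment notes' > alignment.txt
git add alignment.txt
git commit -q -m "Add alignment notes"

# Meanwhile, main moves on
git checkout -q main
echo 'project readme' > README.md
git add README.md
git commit -q -m "Add README"

# Bring the newest changes from main into the experimental branch
git checkout -q test_alignment
git merge -q --no-edit main

# The file added on main is now tracked on test_alignment as well
in_branch=$(git ls-files README.md)
echo "$in_branch"
```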
- If we do not want to do more work in
test_alignment
we can delete that branch:
git branch -d test_alignment
- Run
git log --graph --all --oneline
again. Note that the commits and the graph history are still there. A branch is simply a pointer to a specific commit, and only that pointer has been removed.
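The sketch below illustrates this in a throwaway repository with placeholder content: after a merged branch is deleted, only main remains in the branch list, yet the commit made on the deleted branch is still part of the history.

```shell
# Scratch repository (hypothetical setup)
repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b requires Git 2.28 or newer
git config user.name "Example User"
git config user.email "user@example.com"
echo 'bowtie2 -x index -U reads.fastq' > Snakefile
git add Snakefile
git commit -q -m "Add placeholder Snakefile"
git checkout -q -b test_alignment
echo 'bowtie2 --very-sensitive-local -x index -U reads.fastq' > Snakefile
git commit -q -a -m "Use --very-sensitive-local for alignment"
git checkout -q main
git merge -q test_alignment

# Delete the merged branch; -d refuses to delete branches with unmerged work
git branch -q -d test_alignment

# Only main is left in the branch list
remaining=$(git branch --format='%(refname:short)')
echo "$remaining"

# The commit made on the deleted branch is still in the history of main
kept=$(git log --format=%s | grep -c "very-sensitive-local")
echo "$kept"
```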
Tip
There are many types of so-called “branching models”, each with varying degrees of complexity depending on the developer’s needs and the number of collaborators. While there certainly isn’t a single branching model that can be considered to be the “best”, it is very often most useful to keep it simple. An example of a simple and functional model is to have a main
branch that is always working (i.e. it can successfully run all your code and has no known bugs) and to develop new code on feature branches (one new feature per branch). Feature branches are short-lived, meaning that they are deleted once they are merged into main
.
Quick recap
We have now learned how to divide our work into branches and how to manage them:
- git branch <branch> creates a new branch.
- git checkout <branch> moves the repository to the current state of the specified branch.
- git merge <branch> merges the specified branch into the current one.