Why use a Version Control System?
|
Version control software refers to a type of program that records sets of changes made to files
VCS is a ubiquitous tool for software development
Tracking changes makes it easier to maintain neat and functional code
Tracking changes aids scientific reproducibility by providing a mechanism to recreate a particular state of your code base
VCS provides a viable mechanism for hundreds of people to work on the same set of files
VCS lets you undo mistakes and restore a code base to a previous working state
Git is the most widely used version control software
Using Git allows access to online tools for publication and collaboration
|
Committing and History
|
Set up Git with your details using git config --global user.name "FIRST_NAME LAST_NAME" and git config --global user.email "email@example.com"
A git repository is the record of the history of a project and can be created with git init
Git records changes to files as commits
Git must be explicitly told which changes to include as part of a commit (known as staging changes) with git add [file]…
Staged changes can be stored in a commit with git commit -m "commit message"
You can check which files have been changed and/or staged with git status
You can see the full changes made to files with git diff for unstaged changes and git diff --staged for staged changes
The commit history of a repository can be checked with git log
The command git revert commit_ref creates a new commit which undoes the changes of the specified commit
The command git reset --soft HEAD^ removes the most recent commit from the history while leaving its changes staged
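The commands above can be tried end to end in a throwaway repository. The following is a minimal sketch, not a prescribed workflow: the file name notes.txt and the commit messages are invented for illustration, and the identity is set per repository (without --global) so that your own settings are left untouched.

```shell
#!/bin/sh
set -e

# Work in a temporary directory so nothing outside it is touched.
repo=$(mktemp -d)
cd "$repo"

git init -q
# Hypothetical identity, set for this repository only (no --global).
git config user.name "FIRST_NAME LAST_NAME"
git config user.email "email@example.com"

# First commit: stage the file, then record it.
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Second commit: inspect the change before and after staging it.
echo "second line" >> notes.txt
git status --short            # notes.txt is modified but not staged
git --no-pager diff           # unstaged changes
git add notes.txt
git --no-pager diff --staged  # staged changes
git commit -q -m "Extend notes"

# Undo the second commit by adding a new commit that reverses it.
git revert --no-edit HEAD
git --no-pager log --oneline  # three commits

# Remove the revert commit from the history; its changes stay staged.
git reset --soft HEAD^
git --no-pager log --oneline  # two commits
git status --short            # notes.txt is staged
```

Note the difference between the last two steps: git revert adds to the history, whereas git reset --soft rewrites it, which is generally only safe for commits that have not been shared.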
|
Branching and Merging
|
Git supports non-linear commit histories through branches
A branch can be thought of as a label that applies to a set of commits
Branches can and should be used to carry out development of new features
Branches in a project can be listed with git branch and created with git branch branch_name
The HEAD refers to the current position of the project in its commit history
The current branch can be changed using git checkout branch_name
Once a branch is complete the changes made can be integrated into the project using git merge branch_name
Merging usually creates a new commit in the target branch incorporating all of the changes made in a branch (a fast-forward merge only moves the branch label and creates no new commit)
Conflicts arise when two branches contain incompatible sets of changes and must be resolved before a merge can complete
Identify the details of merge conflicts using git diff and/or git status
A merge conflict can be resolved by manual editing followed by git add [conflicted file]… and git commit -m "commit_message"
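A deliberately provoked conflict may make the branching and merging points above more concrete. This is a sketch under stated assumptions: the file settings.txt, the branch name feature and the commit messages are all hypothetical, and the "manual editing" step is simulated by overwriting the file.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "FIRST_NAME LAST_NAME"
git config user.email "email@example.com"

echo "colour: red" > settings.txt
git add settings.txt
git commit -q -m "Add settings"
# Remember the starting branch; its name depends on your Git configuration.
main_branch=$(git symbolic-ref --short HEAD)

# Create a feature branch, switch to it and change the file there.
git branch feature
git checkout -q feature
echo "colour: blue" > settings.txt
git commit -q -am "Use blue"

# Make an incompatible change to the same line on the original branch.
git checkout -q "$main_branch"
echo "colour: green" > settings.txt
git commit -q -am "Use green"

# The merge stops with a conflict, so a non-zero exit status is expected here.
git merge feature || true
git status --short            # settings.txt is reported as unmerged
git --no-pager diff           # shows the conflict markers

# Resolve the conflict (here by overwriting the file), then stage and commit.
echo "colour: blue" > settings.txt
git add settings.txt
git commit -q -m "Merge feature, keep blue"

git branch                    # lists both branches
git --no-pager log --oneline --graph
```

The final log shows the two lines of development joining in a merge commit with two parents.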
|
Sharing your code
|
Public repositories are open to anyone to use and contribute to.
Private repositories are just for yourself or a reduced set of contributors.
README contains a description of the software and, often, some simplified installation instructions.
The LICENSE describes how the software must be distributed and used.
Using one of the licenses approved by the Open Source Initiative (OSI) is recommended if the repository is public.
CONTRIBUTING describes how other users can help develop the software.
CITATION helps others to cite your software in their own papers.
GitHub can be used to set up a software repository, share your code and manage who can access it and how.
|
Remote repositories
|
origin is the default name Git gives to the remote repository that a local repository was cloned from.
Local and remote repositories are not identical, in general.
Local and remote repositories are not synchronized automatically.
By default, the push and pull commands only affect the branch currently checked out.
Only changes to a branch that are committed are pushed to the remote.
A new local branch must be explicitly pushed to the remote, creating a matching remote branch, before it can be shared.
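The behaviour described above can be explored without a network connection. In this sketch a bare repository on disk stands in for the remote, which is an assumption made purely for illustration; in practice origin would usually point at a hosted repository such as one on GitHub. The names remote.git, local and feature are hypothetical.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# A bare repository on disk plays the role of the remote.
git init -q --bare remote.git

# Cloning registers the source as a remote named "origin".
git clone -q remote.git local
cd local
git config user.name "FIRST_NAME LAST_NAME"
git config user.email "email@example.com"
git remote -v

# A commit exists only locally until it is pushed.
echo "hello" > README
git add README
git commit -q -m "Add README"
branch=$(git symbolic-ref --short HEAD)
git push -q -u origin "$branch"

# A new local branch is unknown to the remote until pushed explicitly.
git checkout -q -b feature
echo "more" >> README
git commit -q -am "Extend README"
git ls-remote --heads origin   # lists only the first branch
git push -q -u origin feature
git ls-remote --heads origin   # now lists feature as well

# Uncommitted edits are not pushed: only commits are transferred.
echo "draft" >> README
git push -q origin feature     # nothing new to send
git status --short             # README is still modified locally
```

The -u option records which remote branch the local branch tracks, so later git push and git pull calls on that branch need no further arguments.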
|
Collaborating
|
Forks and pull requests (PRs) are GitHub concepts, not Git ones.
A pull request can be opened against branches in your own repository or in any other fork.
Some branches are restricted, meaning that PRs cannot be opened against them.
Merging a PR does not delete the original branch; it only modifies the target one.
PRs are often created to resolve specific issues.
|