Git is a software tool for project development, where a project can be a paper or computer code or both. It is very convenient for collaborating: it allows several people to work on the same project simultaneously. Unlike Dropbox, Git lets users choose when to "push" their changes to the server and share them with others, so code or documents don't get stuck in an intermediate state. Two users can even make changes to the same file at the same time, and the version manager will merge the changes (up to a point). Users write log messages to go with their changes, which explain to their collaborators what the changes are. The history of the entire project is thus retained and annotated.
GitLab is a cloud-based service that hosts Git repositories. Git and GitLab work hand-in-hand. GitHub is a more popular cloud-based service, but it did not use to allow free private repositories, so I only use it for open-source projects, such as braidlab. GitLab has the further advantage that it's open-source, so if the website goes away it is still possible to set up a GitLab server.
Git may already be installed on your computer: open a terminal window, type git, and see if you get a usage message.
Otherwise, download and install Git on your computer. On Ubuntu and Debian Linux, this is as simple as
$ sudo apt install git
$ ssh-keygen
Press enter to accept the default. (If asked to Overwrite, you already have a key and can answer no.) Then type a passphrase (twice), which is like a password except that spaces are permitted.
$ cat ~/.ssh/id_rsa.pub
This should be a long string of nonsense, beginning with ssh-rsa. This is your public key. The private key (same file but without the .pub extension) should never be shared with anyone.
The SSH key you uploaded will allow a secure connection between your computer and GitLab. On a typical Linux setup, you will only need to type in your passphrase once per session.
The very first thing to do, unless you don't plan on making changes or sharing them with others, is to set up your local Git username and email.
$ git config --global user.name "Eddie Shore"
$ git config --global user.email eshore@oldtimehockey.edu
where you substitute your own name and e-mail address. Make sure to use an e-mail you registered with GitLab.
To get a list of basic Git commands type
$ git
by itself. To get help on a specific command, type
$ git help <command>
To get a local copy (a clone) of a Git/GitHub project (also called a repository) such as the rodent C++ library, type
$ git clone git://github.com/jeanluct/rodent.git
The rodent project is publicly available, so no password is needed. For private projects, if you have access to the project, you will be prompted for the passphrase associated with the key you created above. You only need to enter this passphrase once per session.
By default, the clone command creates a subdirectory rodent in the directory where you ran the command, but you can specify an optional target path.
The URL syntax is slightly different for a Git/GitLab project such as this sample project:
$ git clone git@gitlab.com:jeanluc/sample-project.git
Notice the git@ instead of git://. You can find the required URL in either GitHub or GitLab under a "clone" drop-down menu.
Now you have the entire history of the project available, but by default you will see the latest version. You can edit the files and make changes as usual. If you want to add a new file or directory, just create it as you normally would, and then explicitly tell Git about it with
$ git add <new_file_or_dir>
Note that new additions don't take effect until you commit them, as described below.
$ git add <existing_file_with_changes>
If you make subsequent changes to the same file after adding it, they will not get committed in the next step unless you add the file again. (Note that this works differently from other version-control programs, such as Mercurial.)
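The following is a minimal sketch of this behavior, not part of the original workflow. It assumes a throwaway repository in a temporary directory, a placeholder identity, and a hypothetical file notes.txt; the commit command it uses is described in the next step.

```shell
# Sketch: edits made after 'git add' are not included in the next commit.
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.com"

echo "first draft" > notes.txt             # notes.txt is a hypothetical file
git add notes.txt                          # stage the file as it is right now
echo "late edit" >> notes.txt              # edit again, after staging
git commit -q -m "Added notes.txt with the first draft."

git show HEAD:notes.txt                    # the committed version, without the late edit
git status --short                         # the late edit is still listed as modified
```

Running git add notes.txt a second time before committing would have included the late edit as well.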
To "commit" local changes once you're satisfied with them, use
$ git commit
An editor will pop up for you to enter a log comment (save and exit when done). If the log comment is short, you can specify it directly on the command line:
$ git commit -m "Added file new_file_or_dir, which does stuff."
Write something descriptive (if you use several lines, the first line should be a summary that stands on its own), and a new revision is created (also called a changeset). Revisions can consist of several additions, deletions, and/or modifications of files in the project.
Unlike Subversion and many other version control systems, in Git commits are not uploaded to the server automatically. This encourages 'small commits' with well-defined changes. When you are ready you can upload all your commits by issuing
$ git push
Note that this assumes your project was cloned from a remote repository, rather than initialized as a new, fresh project.
To make sure you have the latest version when you return to a project that you previously cloned, execute the pull command:
$ git pull
from within the local project directory.
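The push and pull cycle can be tried without any server. The sketch below is only an illustration under stated assumptions: a local bare repository (server.git) stands in for GitLab, and the two clones "alice" and "bob", their identities, and the file contents are all hypothetical.

```shell
# Sketch: two local clones exchanging commits through a bare repository.
set -e
work=$(mktemp -d)                               # throwaway directory (assumption)
git init -q --bare "$work/server.git"           # stands in for the remote server

git clone -q "$work/server.git" "$work/alice" 2>/dev/null   # hides the empty-repository warning
cd "$work/alice"
git config user.name "Alice Example"            # placeholder identity (assumption)
git config user.email "alice@example.com"
echo "Introduction" > paper.tex
git add paper.tex
git commit -q -m "Added paper.tex with an introduction."
git push -q -u origin HEAD                      # first upload; -u records the upstream branch

git clone -q "$work/server.git" "$work/bob"     # a collaborator gets a copy

cd "$work/alice"
echo "Methods" >> paper.tex
git add paper.tex
git commit -q -m "Added a methods section to paper.tex."
git push -q                                     # upload the new commit

cd "$work/bob"
git pull -q                                     # fetch and apply the new commit
cat paper.tex                                   # now contains both lines
```

Nothing reaches the second clone until the first one pushes, which is what keeps half-finished work out of collaborators' copies.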
If someone else pushed changes, you might get a message when you pull that a merge is needed. Most of the time, simply issuing
$ git merge
will merge your current state with the changes. When there are no conflicts, Git records the merge for you. If Git reports conflicts, edit the affected files to resolve them, add them with git add, and then commit again:
$ git commit -m "Merged changes."
The reason for this extra commit is that the merge might require some tweaking to make sure everything works (code is not broken, etc.), so Git adds this extra step to allow the user to intervene before accepting the merge.
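A conflicting merge can be reproduced locally to see what this intervention looks like. This is a sketch under assumptions: a local branch named colleague (hypothetical) stands in for changes pulled from the server, and the file contents and identity are placeholders.

```shell
# Sketch: a merge that conflicts, is resolved by hand, and is then committed.
set -e
repo=$(mktemp -d)                               # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"             # placeholder identity (assumption)
git config user.email "user@example.com"

echo "Title: Draft" > paper.tex
git add paper.tex
git commit -q -m "Added paper.tex."
main=$(git symbolic-ref --short HEAD)           # name of the default branch

git checkout -q -b colleague                    # hypothetical collaborator branch
echo "Title: Their title" > paper.tex
git commit -q -a -m "Changed the title in paper.tex."

git checkout -q "$main"
echo "Title: My title" > paper.tex
git commit -q -a -m "Changed the title in paper.tex differently."

git merge colleague || true                     # fails: both sides changed the same line
git status --short                              # paper.tex is listed as unmerged
echo "Title: Agreed final title" > paper.tex    # resolve by hand (here: overwrite the file)
git add paper.tex                               # mark the conflict as resolved
git commit -q -m "Merged changes."
```

Between the failed merge and the final commit, paper.tex contains conflict markers showing both versions; overwriting the file is just the shortest way to resolve them in a sketch.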
Other useful Git commands include
$ git status
which gives a status report of the current state of your local files, and
$ git diff
or
$ git diff paper.tex
which lists the differences between a local file and its last staged (or, if nothing is staged, last committed) version. It is also useful to view the change history (or log) for a file or directory:
$ git log paper.tex
Finally, to remove a file run
$ git rm paper.tex
(add the -r option to remove a directory), which deletes the file, and will also delete it for your collaborators when they pull changes from the server. Note that removals should be explained in the commit log message (as should everything else!).
Some tips:
Do not be afraid to commit changes! Git remembers everything before your commit, so it is virtually impossible to break anything.
Commit changes often! It is better to have a detailed log of small changes than a huge number of simultaneous changes. In particular, if you don't commit changes often you will forget what you changed (though liberal use of 'git diff' helps to figure that out). That said, a committed changeset should usually be 'consistent,' in the sense that it doesn't break things for everyone else. But sometimes it is preferable to have smaller changesets that do break things, as long as you don't push your changes to the server until done. Just make sure the log message reflects this.
Write helpful log messages! It takes a few seconds more but it's worth it. Refer to changes in specific individual files. Your collaborators will thank you, and you will thank yourself when you revisit the project two years down the line. For multiline log messages, the first line should be a summary of the changes.
Do not include binaries, object files, executables, logs, or any other files that can be recreated. The idea behind version control is to keep track of meaningful changes to files. Binary files tend to change often, and are designed to be recreated from sources. Log files also change often, since they usually contain dates. Unless the binary or log information is truly essential (e.g., the PDF figures in a paper), these should not be versioned. See .gitignore below for how to keep them from showing up on 'git status'.
Undoing or correcting a commit: Often you'll notice that you made a typo in a commit message: in this case just issue
$ git commit --amend
to immediately get a chance to correct it, before your changes are pushed to the server.
If you omitted something more serious, such as a file or extra changes to a file, you can type
$ git reset --soft HEAD~1
which will undo the previous commit (unless it was pushed to the server already), and allow you to start over. This preserves the changes and the files staged for commit.
The command can be abbreviated as an easy-to-remember alias as
$ git config --global alias.rollback 'reset --soft HEAD~1'
Now typing 'git rollback' will undo the last commit.
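The effect of the soft reset can be checked in a scratch repository. The sketch below assumes a temporary directory, a placeholder identity, and hypothetical files paper.tex and refs.bib; it uses the full reset command rather than the alias so that it does not touch your global configuration.

```shell
# Sketch: undo a commit that forgot a file, then redo it with the file included.
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.com"

echo "text" > paper.tex
git add paper.tex
git commit -q -m "Added paper.tex."

echo "more text" >> paper.tex
echo "@article{placeholder}" > refs.bib    # hypothetical file we forget to add
git add paper.tex
git commit -q -m "Expanded paper.tex."     # oops: refs.bib was left out

git reset --soft HEAD~1                    # undo the commit; paper.tex stays staged
git status --short                         # paper.tex staged, refs.bib untracked
git add refs.bib
git commit -q -m "Expanded paper.tex and added refs.bib."
git log --oneline                          # two commits, the second one complete
```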
.gitignore file: To avoid unversioned files showing up on 'git status', which is distracting and might lead you to miss important changes, add their file names to the special .gitignore file, which lives at the base of the project folder.
For example, when dealing with LaTeX documents, my .gitignore file usually contains
*.aux
*.bbl
*.blg
*.log
.DS_Store
The final line ignores the clutter files generated by OS X.
Note that the file .gitignore should itself be under version control ('git add .gitignore')!
You can also use a .gitignore file in subfolders, which will then apply only to the subfolder and the ones below it. Use such a file to ignore the PDF generated by latex. For example, if you have a file paper.tex, create a .gitignore file in its folder containing
paper.pdf
This way the PDF file doesn't show up under 'git status'. It is a bad idea to add *.pdf to a .gitignore file, since there are PDF files you might wish to add to the repository (such as figures for a paper), so they should not be totally ignored.
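To check which files a set of .gitignore rules actually hides, the git check-ignore command can be used. The sketch below assumes a scratch repository with a hypothetical subfolder paper/ and empty placeholder files; the ignore patterns are the ones listed above.

```shell
# Sketch: a top-level .gitignore plus one in a subfolder, checked with check-ignore.
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q

printf '%s\n' '*.aux' '*.bbl' '*.blg' '*.log' '.DS_Store' > .gitignore
mkdir paper                                # hypothetical subfolder
echo 'paper.pdf' > paper/.gitignore        # applies to paper/ and below only
touch paper/paper.tex paper/paper.aux paper/paper.pdf paper/fig1.pdf

git check-ignore -v paper/paper.aux        # matched by *.aux in the top-level file
git check-ignore -v paper/paper.pdf        # matched by the subfolder's .gitignore
git status --short --untracked-files=all   # lists fig1.pdf and paper.tex, not the ignored files
```

Because only paper.pdf is named, the figure fig1.pdf remains visible and can still be added to the repository.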
Mathematica notebooks can be versioned, but there are a few important caveats. (i) In Edit -> Preferences -> Advanced, uncheck "enable notebook history tracking." This will prevent Mathematica from adding a timestamp to every cell, which would create lots of changes to the file whenever it is executed. (ii) In Edit -> Preferences -> Advanced -> Open Option Inspector, look in Cell Options -> Cell Labels and set the option CellLabelAutoDelete to True. This will prevent Mathematica from saving the cell label each time. (iii) Important: Before any commit, always do Cell -> Delete all output, so that output equations and graphics are removed from the notebook.
Aliases are essential: to make git commands more memorable, especially as they get more advanced, it is essential to make liberal use of aliases. These can be used to simply shorten the names of frequently-used commands:
$ git config --global alias.stat status
$ git config --global alias.st status
$ git config --global alias.ci commit
$ git config --global alias.di diff
or for more complex commands that you use often or are hard to memorize:
$ git config --global alias.l 'log --stat'           # log with file changes
$ git config --global alias.dic 'diff --cached'      # diff files staged for commit
$ git config --global alias.da 'difftool -d --gui'   # use GUI for diff of all changes
$ git config --global alias.record 'add --patch'     # interactively select changes to add
You wouldn't want to remember this one, for instance, which shows all changes that are not yet pushed to the server on all branches:
$ git config --global alias.unpushed 'log --branches --not --remotes --no-walk --decorate --oneline'
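To experiment with aliases without changing your global configuration, they can also be defined for a single repository by leaving out --global. The sketch below does that in a scratch repository; the temporary directory and the file paper.tex are assumptions made for illustration.

```shell
# Sketch: repository-local aliases (stored in .git/config, not in ~/.gitconfig).
set -e
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q

git config alias.st status                 # no --global: this repository only
git config alias.dic 'diff --cached'

echo "text" > paper.tex                    # hypothetical file
git add paper.tex

git st --short                             # same as 'git status --short'
git dic --stat                             # same as 'git diff --cached --stat'
git config --get-regexp '^alias\.'         # list the aliases visible in this repository
```

Extra arguments after an alias are passed on to the underlying command, which is why 'git st --short' works.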
For more information see the Git Book.