Git is a distributed version control system created by Linus Torvalds in 2005 for the development of the Linux kernel. It is free software distributed under the GNU General Public License. Version control is the management of changes to documents, computer programs, web sites, and other collections of information. Git was created for managing and developing large distributed software projects. The idea is for you to have all the versions of your software, even when a lot of people are making changes.
On every computer, the copy of the project is a full repository with a complete history and full version-tracking capabilities. Git also has strong support for non-linear, distributed development, where branching and merging are very easy. This system helps you share your projects with others and invite new developers.
Git is a very powerful tool when you need to work with a group of people: it helps you keep your projects well organized and safe. There are three characteristics that every good version control system should have:
The first is that when you have a very big project, where every developer is constantly making changes, you want to identify who made a specific change. If that person introduced an error or a bug, it is easier to solve the problem, because you can know exactly which addition or deletion created it.
With Reversibility, you want to be able to go back to a specific version of your project, maybe to fix a bug or to solve a compatibility problem. Going back to the example of a developer adding a bug, if you use a system with reversibility, you can go back to a stable version where everything works fine.
Comparability refers to the capability to compare changes. If two developers make a change in the same part of the project, which change should be added to the final project? By comparing the two versions, one can decide the best option for the project.
If you are really interested in learning about git, we encourage you to read the Pro Git book, which is the book the developers recommend. You do not need to read this book to be able to use git on a daily basis; it is only for people who want to go deeper. For what we need you to do in ARCOS-Lab, the following sections of this tutorial are enough.
GitLab is a service for hosting your git repositories. You can think of it like Google Drive, but for git. It is a server that holds the master record of your project, so that it is easier for you and your peers to sync your work.
In ARCOS-Lab we use two instances of GitLab; one for public projects and another one for internal projects.
For the public projects of the laboratory we created a group in GitLab (the official website), available at https://gitlab.com/arcoslab. In order to have access to this group you need to create a GitLab account and upload your public SSH key to your profile (TODO: link to our ssh tutorial). Then you need to ask any of the administrators of the group to add you. At the time of writing the admins are: Federico Ruiz Ugalde, Ariel Mora Jimenez, Javier Peralta Sáenz, and Daniel García Vaglio. Pro Tip: You will get a faster response if you ask any of the last three.
Do not create new repositories in the public instance of GitLab without prior approval from one of the administrators.
Besides being a service, GitLab is an open source project, so we installed GitLab in our Lab. This is only for internal use.
All the code that we have in these repositories is confidential and should not be shared with people who are not part of the ARCOS-Lab.
In order to have access you need an LDAP account. Our GitLab instance is available at git.arcoslab.org; to log in, use your LDAP credentials. The first step is to upload your SSH public key to your profile. Then you should ask any of the administrators to give you access to the repositories or groups that you are going to be using. At the time of writing the admins are: Federico Ruiz Ugalde, Ariel Mora Jimenez, Javier Peralta Sáenz, and Daniel García Vaglio. Pro Tip: You will get a faster response if you ask any of the last three.
There are 8 groups:
In Linux we have a command called git, which is the recommended way of using git. You can get help for usage with git --help or read the manual with man git.
In the following subsections we will learn about the following git subcommands: init, clone, add, commit, status, log, blame, mv, rm, pull, push, checkout, branch, merge. If you find that what you need is beyond what we teach here, feel free to read the documentation. The only recommendation we make if you want to follow tutorials outside ARCOS-Lab is:
Do not rewrite the history of a repository
There are two ways to start: you can create a new repository from scratch, or get an already existing repository from a server. To create a new repository, go to the directory where you want the repository to be and run:
git init
This will create a new empty git repository.
If there is an already existing repository and you want to get its content, run:
git clone <url>
Where the url is the address of the repo you want to get. If there are two urls available, one with http and one with ssh, prefer the one with ssh. This will create a clone of the remote repository on your computer. From now on, when we talk about the local repository we mean the clone on your PC; when we talk about the remote we mean the repo on the server.
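A minimal sketch of what a clone gives you. It uses a throwaway local repository as a stand-in for the server so that it runs without network access; the names seed, server.git and my-clone, as well as the identity, are made up for the demo. With a real server you would pass its ssh url instead of a path.

```shell
# Throwaway workspace; every name below is made up for this demo.
cd "$(mktemp -d)"

# Build a tiny repository and publish it as a bare repo that plays the server.
git init -q seed
git -C seed -c user.name="Example Dev" -c user.email="dev@example.org" \
    commit -q --allow-empty -m "Initial commit"
git clone -q --bare seed server.git

# Clone it: my-clone is the local repository, server.git is the remote.
git clone -q server.git my-clone
cd my-clone
git remote -v      # lists the remote, which git names "origin" by default
git log --oneline  # the history came along with the clone
```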
Git tries to store the history of your project, but it will not take automatic snapshots of your code. For adding a change to the history we have to run some commands first.
git add <files>
add will add the files that you specify to something called the staging area. The staging area is for git to be aware of the changes; it is like saying: "Yoohoo git, I am planning on adding this to the history!" If you have to work on many files at a time for a particular change in your code base, it is useful to add the files as you finish the changes/fixes.
git commit
This is a very, very, VERY important step of the process. This is when you take all the changes in the staging area and add them to the history of your repository. It is called commit because you are making a commitment to the rest of your fellow developers that what you have added to the history is correct and safe. When you make a commit, it will be added with your name to the history, so please be very sure of what you commit!
Use descriptive commit messages
When you commit changes, git will ask you to write a commit message. This message is a brief explanation of the changes you made in the repository. It helps your fellow developers and your future self to understand what is going on in the repository. Good developers write good commit messages. There is too much to say about commit messages, so I will just ask you to read this guide.
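The add and commit steps can be sketched as follows, in a throwaway repository. The file names, the identity and the messages are placeholders for the demo. The -m option passes the commit message on the command line; without it, git opens an editor for you to write the message.

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.org"

echo "ARCOS-Lab demo" > README.txt
echo "step 1" > notes.txt

# Stage and commit only the README; notes.txt stays out of this commit.
git add README.txt
git commit -q -m "Add README describing the demo"

# Stage and commit the second file as its own change.
git add notes.txt
git commit -q -m "Add first notes"

git log --oneline    # two commits, each with its own message
```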
Another important aspect of commits is knowing what to add to the repository. You do not want to add binaries or autogenerated files to the history of your repository. Git was designed to work with text: to save space, it can store a new version of a file as a delta against a similar version instead of as a full copy. This delta compression works well with text, but with other types of files git usually ends up keeping what amounts to an entire copy of every version. As time goes on, your repository could grow until it is unmanageable. Avoid compiled files, PDFs, images, audio, videos, etc.
Do not add binaries to the repository
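One common way to keep such files out is a .gitignore file, which this tutorial does not cover otherwise. The sketch below is only an illustration: the patterns and file names are assumptions chosen for the demo, not a list recommended by ARCOS-Lab.

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo

# Patterns for files git should ignore (illustrative choices).
printf '%s\n' '*.o' '*.pdf' 'build/' > .gitignore

# One source file and two files that should stay out of the history.
touch main.c main.o report.pdf

git add .
git status --short   # main.o and report.pdf are not listed
```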
git status
status gives you a way of knowing what is going on with your repository. It will tell you if there are changes that have not been added to the staging area, and which files are in staging that have not been committed. It can also tell you if you are ahead of the server, and other useful information.
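As a small sketch, the three situations can be produced in a throwaway repository (file names and identity are placeholders). The --short option prints one line per file with a two-letter code: A in the first column means staged, M in the second column means modified but not staged, and ?? means git does not track the file yet.

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.org"

echo "v1" > tracked.txt
git add tracked.txt && git commit -q -m "Add tracked file"

echo "v2" > tracked.txt      # modified, not added to the staging area
echo "new" > staged.txt
git add staged.txt           # in staging, not committed
echo "tmp" > untracked.txt   # git does not know this file yet

git status --short
```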
git log
This command will give you access to the history of your repository. It will show you a list of commits, newest first, in (almost) reverse chronological order. Every commit has a hash, a message, an author and a timestamp. The hash is an ID that uniquely identifies the commit.
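A short sketch of a few ways to read the history, using empty commits in a throwaway repository (the identity and messages are placeholders). The --oneline and --format options only change how each commit is displayed.

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.org"

# Two empty commits are enough to have a history to look at.
git commit -q --allow-empty -m "First commit"
git commit -q --allow-empty -m "Second commit"

git log                               # full entries: hash, author, date, message
git log --oneline                     # abbreviated hash and first line of the message
git log -1 --format="%H %an %ad %s"   # hash, author, date and subject of the latest commit
```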
git blame <file>
This command does exactly what you would expect. It tells you who has edited a certain file: for every line, it shows the author and the commit in which that line was last changed. It is very useful for finding out who to blame for a bug.
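A minimal sketch with two made-up authors, Alice and Bob. The --author option is used here only to simulate two people working in the same throwaway repository.

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.org"

# Alice writes the file.
printf 'line one\nline two\n' > poem.txt
git add poem.txt
git commit -q --author="Alice <alice@example.org>" -m "Add poem"

# Bob changes only the second line.
printf 'line one\nline 2\n' > poem.txt
git add poem.txt
git commit -q --author="Bob <bob@example.org>" -m "Reword second line"

# Each line is annotated with the commit and author that last changed it.
git blame poem.txt
```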
git mv <file> <path>
This will move a file to a path just as the normal mv command does. The difference is that git mv also stages the move, so git is aware of it right away. If you just use the normal mv, git will see a deleted file and a new, untracked file in a different path until you stage both changes yourself. In order to avoid that, use git mv.
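A small sketch of a move inside a throwaway repository (file names and identity are placeholders). The destination directory has to exist before the move. The --follow option of git log is one way to see the history of a file across the move.

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.org"

mkdir docs
printf 'some\nlonger\ncontent\n' > notes.txt
git add notes.txt && git commit -q -m "Add notes"

git mv notes.txt docs/notes.txt     # move the file and stage the move
git status --short                  # the move shows up as a staged rename
git commit -q -m "Move notes into docs"

git log --follow --oneline -- docs/notes.txt   # history from before and after the move
```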
git rm <file>
This deletes the file and stages the removal, so git is aware of the files you remove.
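A short sketch of removing a file in a throwaway repository (file names and identity are placeholders):

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.org"

echo "keep" > keep.txt
echo "old" > obsolete.txt
git add . && git commit -q -m "Add two files"

git rm -q obsolete.txt     # deletes the file and stages the removal
git status --short         # the removal is staged, waiting for a commit
git commit -q -m "Remove obsolete file"
```

The file disappears from the working directory and from new commits, but older commits in the history still contain it.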
git pull
The main use of having a centralized server is to help sync the history of the local repositories of all the developers in a project. pull will fetch the history from the server and merge it into your current branch, to make it available for you.
git push
This command will push your history to the remote. If the remote has commits that you do not have yet, you will not be able to push: you first need to pull and solve any conflicts in your local repository. After you push, your changes will be available for the rest of the team. In order to avoid history conflicts: pull, then make changes, then add, commit and finally push.
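The pull, change, add, commit, push cycle can be sketched with two clones of the same remote. Everything here is an assumption made for the demo: a local bare repository called server.git plays the server, alice and bob are two made-up developers, and the identity is set through environment variables only so that it applies to every throwaway repository.

```shell
cd "$(mktemp -d)"

# Placeholder identity for every commit in this demo.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

# A bare repository plays the server; alice and bob each get a clone.
git init -q seed
git -C seed commit -q --allow-empty -m "Initial commit"
git clone -q --bare seed server.git
git clone -q server.git alice
git clone -q server.git bob

# Alice: pull, make changes, add, commit, push.
cd alice
git pull -q
echo "usage notes" > USAGE.txt
git add USAGE.txt
git commit -q -m "Add usage notes"
git push -q origin HEAD

# Bob: pull to receive Alice's commit.
cd ../bob
git pull -q
git log --oneline
```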
Git lets you have branches; these are alternative threads of development. Think of a branch as a parallel universe with its own history. The real magic comes when you merge two branches into one. There is a main branch called master. This branch holds the master copy of your history, so it is the most important branch of all. Because master is so important, you do not want to work directly in master, so you create a branch, work on that branch and then merge it into master.
Do not work directly on master, unless you are the only person working in that repository
Do not push broken code to master
git branch <name>
This command will create a new branch of the given name.
git checkout <branch-name>
To go to another branch you can use the checkout sub-command. checkout also works for going to a specific commit. This is useful when you want to examine your code from the past, for example if there is a bug and you want to go back in time to when the bug was not present. For that you use the hash of the commit you want to go to.
git checkout <hash>
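A minimal sketch of going back in time in a throwaway repository (file names, messages and identity are placeholders). While you are on an old commit, git reports a "detached HEAD": you are looking at a past state and are not on any branch. Here git checkout - is used to return to the branch you were on before.

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.org"

echo "works" > app.txt
git add app.txt && git commit -q -m "Add working version"
good=$(git rev-parse HEAD)      # remember the hash of the good commit

echo "broken" > app.txt
git commit -q -a -m "Introduce a bug"

git checkout -q "$good"   # go to the old commit
cat app.txt               # content from before the bug
git checkout -q -         # return to the branch you were on
cat app.txt               # latest content again
```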
Finally, when you are done with a branch, you can merge it into master. This process will make the changes in the branch also available in master. Before merging, you have to make sure that you are not introducing problems to the code base. The merge subcommand will merge the specified branch into the one you are currently on.
git merge <branch-name>
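The whole cycle of branch, checkout and merge can be sketched in a throwaway repository. The branch name feature, the file names and the identity are placeholders. The git branch -M master line is there only because new repositories may start with a different default branch name depending on the git configuration.

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.org"

echo "base" > app.txt
git add app.txt && git commit -q -m "Add app"
git branch -M master         # make sure the main branch is called master

git branch feature           # create the branch (this does not switch to it)
git checkout -q feature      # switch to the new branch
echo "new feature" > feature.txt
git add feature.txt && git commit -q -m "Add feature"

git checkout -q master       # go to the branch that receives the changes
git merge -q feature         # merge feature into master
git log --oneline            # master now contains the commit made on feature
```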
Branching and merging are the most advanced topics in this tutorial. For a better understanding, read this tutorial.
If you want a healthy development process, it is recommended to use merge requests and code reviews. We can all make mistakes, so it is a good idea to ask other people to review your changes before adding them to master. That way the reviewer (probably a more experienced developer) will find problems you didn't see, make suggestions, or ask clarifying questions that will help make your code better. GitLab offers an easy way of doing this with merge requests. Instead of merging the changes yourself, you can open a merge request so that another developer can review your changes and merge them.
This is the recommended way of merging to master.
This is a tutorial for creating merge requests.
This is a tutorial of the general workflow of merge requests and code reviews.