- Git Tutorial
- Why do we need Source Code Management?
- What is Source Code Management?
- Benefits of SCM
- What are the Different VCS available to us?
- What is Git?
- Code Life Cycle
- Getting started with GitHub
- Git Installation on Windows and Linux
- Git Basic Commands
- Git Merge Conflicts
- Git Workflow
Git is so ubiquitous in the software development industry, and so easy to use, that people sometimes forget how important it is.
And just because it is easy to learn and use does not mean it can’t help solve complex problems. So, what is it? Why do we need to learn about it? How can we use it? These are all questions that I will try to answer in this detailed git tutorial.
Let’s start by discussing what source code management is and what its role is in software development.
Why do we need Source Code Management?
If you have ever written code for any reason, be it a school or college project, a hobby, professional software development or learning something new, you will understand the importance of storing that code safely and in an organised manner. This is because we rarely get code right the first time and tend to write better versions of it later. You may have errors that you need to fix, or you may have new ideas that you want to implement.
For example, say you have written an application for your college project that analyses the data of a particular website. You identify that its processing needs improvement, so you go ahead and change the code accordingly. But what happens when you run this new piece of code and get an error?
Do you panic because you have a deadline tomorrow and you don’t have any running piece of code because you overwrote your previous version?
This is exactly where source code management comes into play. You want to store your code in one dedicated place so that you don’t lose track of it or of its different versions. A previous version is then always available if you wish to go back to it.
What is Source Code Management?
Source code management is the practice of tracking and managing changes to the source code of a piece of software. Source code management tools track all the changes made across different versions of the code. They also help the user store and share those versions in a dedicated repository.
Source code management usually involves two parts: a tool that keeps track of all changes and pushes and pulls code, and a remote repository that stores the code.
Benefits of SCM
- Control – A good SCM tool allows you to completely control all aspects of the code you write.
- Security – Implementing a proper source code management system creates a layer of security around the code and allows access only on a need-to-know basis.
- Fast Delivery – An SCM tool also helps in faster building, testing and deployment of the software by being connected to a deployment tool.
- Availability – All the code is stored in a cloud repository. Anyone with access to it can pull the code into their system at any time from anywhere on the planet, provided they have internet access.
What are the Different VCS available to us?
There are multiple tools for source code management available to us. They are usually very similar, with the exception of some features that cater to different requirements. Some of them are:
- Git & GitHub
- Mercurial & Bitbucket
- AWS CodeCommit
As you may have deduced, we are going to be learning git & GitHub. Let’s start.
What is Git?
Git is an open source distributed version control system, also called a source code management system. The name is sometimes jokingly expanded as “global information tracker”. It was created in 2005 by Linus Torvalds and other Linux kernel developers, who were not satisfied with any of the free source code managers available to them at the time.
It has gradually become an industry standard as people gravitated away from legacy source code management systems towards open source tools like git, which provided the same features while being completely free and distributed. This was very attractive to many start-ups and amateur developers.
Here are some of the features of git:
- System Compatibility – Git runs on all major operating systems, be it Windows, Linux or Mac. It is also not tied to one hosting service: git can push code to and pull code from remote repositories other than GitHub, and with the git svn bridge it can even work with an SVN repository.
- Collaboration – Git allows for heavy collaboration. A team of developers can easily maintain a hierarchy and work on different features of the same codebase simultaneously, then merge all their work to form the software that is to be deployed. This allows for easy tracking of who is working on what part of the project and what changes they have made to that part.
- Speed – Git is very efficient at handling code and makes it easy to fork and clone large repositories onto local systems. Most operations run against the local copy of the repository, so they do not have to wait for a network round trip.
- Distributed system & reliability – Every clone of a git repository contains the full history, and the shared copy is hosted in a remote repository accessible from the internet, so multiple users can work on the same code from anywhere in the world. This allows companies to hire people for their abilities without having to worry about their location. It also improves availability, as the code is always accessible to developers, and with a hosted service they needn’t worry about server maintenance and upkeep.
- Security – Git keeps track of all the changes and commits that were made in its logs. This allows for easy investigation when an issue occurs.
Git is overall one of the best source code management tools that anyone can use: software developers, data scientists, students, professors, academic researchers and hobbyists. It’s easy to learn and easy to use.
Code Life Cycle
Let’s understand how the code flows within the git system.
From what we have learned so far, you can deduce that you will write the code on your local system and that the code will then be pushed to a remote repository on the internet. That is indeed what happens in git, but there are a few steps in between. Let’s discuss all of them.
Working Directory – First, a working directory is chosen on the local system. This is the location where all of the working code will be stored, and this is where the user will be working from.
Initializing – Once the working directory has been chosen, the next step is to initialize it. Initializing the directory tells git to treat it as a repository, which can later be pushed to the remote repository when needed. From this point on, git can track the files in the directory, although a file is only tracked once it has been added.
Staging – Once the initialization has been completed, the work on the code can begin, and changes such as additions and deletions are made. Then the code is staged. Staging tells git which of those changes should go into the next commit.
Committing – Once the work on the code is done, the staged files are committed, i.e. the changes made to them are recorded as a new version in the local repository.
Pushing – Then the commits are pushed from the local repository to the remote repository. In this tutorial, that remote repository is hosted on GitHub.
The code can then be pulled later to make new changes and then pushed again repeating the life-cycle.
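The life cycle above can be sketched end to end on a single machine. This is only a minimal sketch under a few assumptions: a local bare repository stands in for GitHub so that no network access is needed, and the directory names, the placeholder identity and the commit message are made up for illustration.

```shell
# Hypothetical scratch location; a local bare repository stands in for GitHub.
work="$(mktemp -d)"
remote="$work/remote.git"
git -c init.defaultBranch=master init -q --bare "$remote"

# Working directory + initializing
git -c init.defaultBranch=master init -q "$work/project"
cd "$work/project"
git config user.name "Demo User"           # placeholder identity, stored in this repo only
git config user.email "demo@example.com"
git remote add origin "$remote"

# Make a change, stage it, commit it, push it
echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"
git push -q origin master

# The commit is now present in the remote repository as well
git --git-dir="$remote" log --oneline master
```

With a real GitHub repository, the only difference would be the URL given to git remote add origin.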
Getting started with GitHub
When we talk about git, it is important to mention GitHub also as it plays a major part in source code management.
The job for source code management is usually divided in this way:
- Git – The tool that tracks all the changes, pushes and pulls code, and helps resolve merge conflicts.
- GitHub – The remote hosting service that stores all of the code.
To learn how to use git effectively, we also need to know how to use GitHub. So let’s start off by creating an account on GitHub. Head to github.com and fill out the sign-up form.
Once you have finished the formalities of creating an account, the next step is to learn how to create a repository.
Create one by going to the home page and clicking the New button on the left-hand side.
Then go ahead and fill in the requisite details for creating a repository.
Click on Create repository to create the repository.
Copy and keep the git URL of the repository you just created.
Git installation on Windows and Linux
Okay, now let’s install git on Windows and Linux.
On Windows, open the Windows download page of the official git website. The download will start automatically. If it doesn’t, click on the suggested link on the page.
Once the executable is downloaded, go ahead and start it up. The installation wizard will ask you for configuration options; you can leave them at their default values.
Once you have finished installing git, right-clicking inside any folder will give you two extra options: Git GUI and Git Bash.
You can use both Git GUI and Git Bash. Most people use Git Bash; it’s very easy to use and behaves much like a simple Linux terminal.
On Linux, we will be using Ubuntu as our distro.
You can install git directly from the default Ubuntu repository using the apt package manager like so:
Update your package index:
$ sudo apt update
Install git:
$ sudo apt install git -y
Check the version of git you installed:
$ git --version
Or you can install git directly from source (this gives you access to the latest version of git):
$ sudo apt update
$ sudo apt install make gcc libssl-dev zlib1g-dev libcurl4-gnutls-dev libexpat1-dev gettext unzip -y
$ cd /usr/src
Go to https://github.com/git/git/releases and copy the link URL of the latest version of git.
Download the file using the wget command with the link URL, saving it as git.tar.gz, like so:
$ sudo wget <link url> -O git.tar.gz
Extract the downloaded file:
$ sudo tar -xf git.tar.gz
Go to the new git directory:
$ cd git-*
Use the make command to install git:
$ sudo make prefix=/usr/local all
$ sudo make prefix=/usr/local install
Check the version of git you installed:
$ git --version
And there you have it: git is installed on your system, and you can now start using git commands.
Now let’s configure git with our username and email address. This works on both Windows and Linux (on Windows you will be using Git Bash for this).
Use the git config command to add your username and email, like so:
$ git config --global user.name "jhondoe45"
$ git config --global user.email email@example.com
You can check that these settings are stored on the system using this command:
$ git config --list
Or edit them directly in the gitconfig file, like so:
$ vi ~/.gitconfig
You can save your changes and exit by pressing Esc, typing :wq and pressing Enter, or exit without saving by typing :q! and pressing Enter.
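Besides the global settings, git can also store an identity for a single repository. The sketch below assumes a throwaway repository and a made-up work email address; it only illustrates that values written with --local live in that repository and take precedence over the global ones there.

```shell
# Throwaway repository in a temporary directory (hypothetical location)
repo="$(mktemp -d)"
git -c init.defaultBranch=master init -q "$repo"
cd "$repo"

# --local writes to .git/config of this repository only,
# leaving ~/.gitconfig untouched
git config --local user.name "jhondoe45"
git config --local user.email "work@example.com"

# Read a single value back, then list everything stored locally
git config --get user.email
git config --local --list
```

This can be handy when one machine is used for both personal and work projects.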
Git basic commands
Now let’s understand all of the basic commands you will need to know to work properly with git.
First, create a folder to use as the working directory, or go to an existing one.
$ mkdir project
$ cd project
Then initialise the directory/folder using the git init command like so:
$ git init
Once the directory is initialised, you can go ahead and add the repository URL you copied earlier as the remote origin. This command tells git where to upload the code once we are done making changes.
$ git remote add origin <URL>
Then go ahead and create a simple HTML file that you will be working with, like so:
$ nano index.html
Inside this file, write some very basic HTML code.
Now that we have added a new file and made changes to it, we can start tracking it. You can do so by using the git add command, either for a single file:
$ git add <name of the file>
or for every change in the current directory:
$ git add .
To check the status of the directory whenever you want you can use the git status command like so:
$ git status
It tells you which changes were made in the directory, which branch you are on, and whether files have been committed or not.
Now you can go ahead and commit the changes using the git commit command like so:
$ git commit -m "Index.html was added"
Make sure to add a relevant commit message so you can remember later what the commit was about.
Then you can go ahead and push the code to the remote repository using the command:
$ git push origin master
This command pushes all the commits on the master branch to the origin repository.
This file will now be available to us from anywhere in the world, provided we have access to the internet and our GitHub account.
To simulate this, create a new directory:
$ mkdir temp
$ cd temp
Now, clone the repository into this directory using the git clone command and the repository URL, like so:
$ git clone <git repo url>
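What a clone gives you can be checked without a GitHub account. In this minimal sketch, a local bare repository named project.git (a made-up name) plays the role of the GitHub repository, and a throwaway seed repository pushes one commit to it first. The identity and file contents are placeholders.

```shell
base="$(mktemp -d)"

# Stand-in for the GitHub repository
git -c init.defaultBranch=master init -q --bare "$base/project.git"

# Seed it with one commit from a throwaway working directory
git -c init.defaultBranch=master init -q "$base/seed"
cd "$base/seed"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Index.html was added"
git push -q "$base/project.git" master

# Clone from a different directory, as the tutorial does with temp
mkdir "$base/temp"
cd "$base/temp"
git clone -q "$base/project.git"

ls project                        # the clone lands in a folder named after the repository
git -C project remote -v          # origin already points at the cloned URL
git -C project log --oneline      # the full history came along
```

Note that git clone sets up the origin remote for you, so git remote add is not needed in a cloned repository.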
Branches allow easy management of the different parts of a piece of software. For example, let’s say your team is working on delivering a website. This website will have various features, right? Each of these features can be developed on its own branch within git, worked on separately by different developers. Once they finish their work on their branches, they open a PR (pull request), and their branch is then merged onto the master branch of the git repository. The master branch is the one that goes into production. This fits the idea of changing a small part of the code without affecting the whole.
You can imagine branches as separate timelines for the main piece of code.
Each branch has its own unique commits. But, you ask, what are commits? Commits are checkpoints that are created whenever you use the git commit command to save changes. They are also created when you merge branches; these are called merge commits.
Let’s create a few branches in our system, but first we will check whether any branches are already present:
$ git branch
To create a new branch type in the following command:
$ git branch <name of the branch>
Make sure you have made a commit previously, otherwise this command will not work.
To switch to the newly created branch you can use the git checkout command, like so:
$ git checkout <name of the branch>
To delete a branch you need to use the following command:
$ git branch -d <name of the branch>
This command won’t work if you are deleting the branch you are currently on.
You can rename the current branch by using the -m flag, like so:
$ git branch -m <new name>
If you want to list all the branches, even the remote ones, you can use the -a flag, like so:
$ git branch -a
There is a shortcut: you can switch to a new branch just as it is created using this command:
$ git checkout -b <name of the branch>
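The branch commands above can be tried in sequence in a scratch repository. This is a small sketch with made-up branch names (feature1, feature-login, feature2) and a placeholder identity, not a prescribed naming scheme.

```shell
repo="$(mktemp -d)"
git -c init.defaultBranch=master init -q "$repo"
cd "$repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

# A branch can only be created once there is at least one commit
echo "v1" > index.html
git add index.html
git commit -q -m "Initial commit"

git branch feature1               # create a branch
git checkout -q feature1          # switch to it
git branch -m feature-login       # rename the branch we are on
git checkout -q -b feature2       # create and switch in one step
git branch -d feature-login       # delete a branch we are not on
git branch                        # feature2 (current) and master remain
```

The delete succeeds here because feature-login has no commits that are missing from the current branch; git branch -d refuses to delete a branch whose work has not been merged.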
Git merge allows a user to merge different branches into one. Unless git can simply fast-forward the receiving branch, it creates a new commit on merging. When you merge a branch into another, the commits of the branch being merged are not copied or rewritten; the merge commit ties the two histories together.
Merging is done when a branch has fulfilled its purpose. Suppose feature1 has been completed; it now needs to be merged onto the master branch to complete the process.
Merging works like this: git first finds the commits that the two branches currently point to, and then looks for their common base. Once the common base is found, git combines the changes each branch has made since then and records the result in a new merge commit that has both branch tips as its parents.
Best practices for merging
- Make sure to switch to the receiving branch when you want to merge.
- Make sure both branches are up to date with the latest version.
Right, let’s start merging then.
Create a new branch:
$ git branch <name of new branch>
Switch to the new branch using the below command:
$ git checkout <name of new branch>
Now add a new file here called “hello.txt”:
$ vi hello.txt
Stage it and then commit it:
$ git add hello.txt
$ git commit -m "added hello.txt"
Switch back to master branch:
$ git checkout master
Make sure HEAD is pointing to the master branch using the git log command like so:
$ git log
Once you have confirmed the above go ahead and merge the two branches using the following command:
$ git merge <name of the merge branch>
You have completed the merging process.
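The walkthrough can be replayed in a scratch repository to see what a merge commit looks like. This sketch adds one assumption that is not in the steps above: master also receives a commit of its own before the merge, so the two branches have diverged and git has to create a merge commit instead of fast-forwarding. File contents and the identity are placeholders.

```shell
repo="$(mktemp -d)"
git -c init.defaultBranch=master init -q "$repo"
cd "$repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > index.html
git add index.html
git commit -q -m "Initial commit"

# Work on the feature branch
git checkout -q -b feature1
echo "hello" > hello.txt
git add hello.txt
git commit -q -m "added hello.txt"

# Meanwhile master moves on too (assumption added for this sketch)
git checkout -q master
echo "v2" >> index.html
git commit -q -am "Update index.html"

# Merge from the receiving branch; --no-edit accepts the default message
git merge -q --no-edit feature1

git log --oneline --graph         # the newest commit has two parents
```

If master had not moved, the same git merge command would simply have fast-forwarded master to the tip of feature1 without creating a merge commit.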
Git merge conflicts
Developers usually work in isolation on their own branches, and when they don’t, they usually know that they are working on the same branch. But when they don’t know that they are making changes to the same file, conflicts arise.
To resolve these issues, we make use of git’s merge conflict resolution tools. Conflicts generally come into the picture when merging happens. Changes that touch different parts of the code are usually merged by git itself, but when two branches change the same lines, or one of them deletes a file that the other has modified, a human decision is required.
One tool that can be used to resolve the conflicts is vimdiff, which can be launched with the git mergetool command.
People usually avoid merge conflicts through a form of locking, making sure only one person is working on a specific file or branch at a time.
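To see what a conflict looks like in practice, the following sketch provokes one on purpose and then resolves it by hand rather than with vimdiff. The file name greeting.txt, its contents and the identity are assumptions made for illustration.

```shell
repo="$(mktemp -d)"
git -c init.defaultBranch=master init -q "$repo"
cd "$repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "Hello" > greeting.txt
git add greeting.txt
git commit -q -m "Initial greeting"

# Two branches change the same line in different ways
git checkout -q -b feature1
echo "Hello from feature1" > greeting.txt
git commit -q -am "feature1 greeting"

git checkout -q master
echo "Hello from master" > greeting.txt
git commit -q -am "master greeting"

# The merge stops and reports the conflict (non-zero exit status)
git merge feature1 || echo "merge stopped: conflict in greeting.txt"
git status --short                # greeting.txt is listed as unmerged (UU)
cat greeting.txt                  # both versions, separated by conflict markers

# Human decision: write the agreed content, stage it, conclude the merge
echo "Hello from master and feature1" > greeting.txt
git add greeting.txt
git commit -q -m "Merge feature1, resolving greeting conflict"
```

Running git merge --abort instead of the last three commands would cancel the merge and restore master to its state before the merge started.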
Git rebase is similar to git merge in the sense that both integrate the changes from one branch into another. The difference lies in the history they produce. With rebasing, the commits of the current branch are replayed one by one on top of the other branch, so the result is a single straight line of commits. With git merge, the commits of both branches are left where they are and the two histories are joined by a merge commit.
In reality, the commits aren’t actually moved. Instead, git creates completely new commits on top of the target branch that contain the same changes as the old ones but have different commit IDs.
You might be asking, “Why do we need to rebase? We already have the git merge command.” Well, you are somewhat right, but rebase serves its own purpose: keeping the history of the commits made to the code organised and tidy. Besides the obvious benefits of tidiness, there is another helpful one. During error investigation in the code life cycle, a linear history makes it easier to find the root cause of the error.
Rebasing is better than merging when you want to avoid adding more commits to the history of the main branch than necessary. Merging is easy and simple, but it introduces a new merge commit whenever the branches have diverged. In a group project, you will have people continuously contributing to the project, which may mean continuous merging and hence a large number of merge commits. This is why rebasing is better in certain scenarios.
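The difference in history can be observed directly. The sketch below builds the same diverged situation as a merge would face, then rebases feature1 onto master and fast-forwards master afterwards. Branch names follow the earlier examples; file contents and the identity are placeholders.

```shell
repo="$(mktemp -d)"
git -c init.defaultBranch=master init -q "$repo"
cd "$repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > index.html
git add index.html
git commit -q -m "Initial commit"

git checkout -q -b feature1
echo "hello" > hello.txt
git add hello.txt
git commit -q -m "added hello.txt"

git checkout -q master
echo "v2" >> index.html
git commit -q -am "Update index.html"

# Replay feature1's commit on top of the current master
git checkout -q feature1
old_tip="$(git rev-parse HEAD)"
git rebase -q master
new_tip="$(git rev-parse HEAD)"   # a different ID: the commit was re-created

# master can now simply be fast-forwarded; no merge commit is needed
git checkout -q master
git merge -q --ff-only feature1

git log --oneline --graph         # one straight line of three commits
```

Because rebasing re-creates commits, it is generally kept to branches that nobody else has already pulled.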
Git Workflow
Git is a powerful tool, and leveraging it effectively in software development is important. This is where the approach of the git workflow comes in. A good git workflow suited to an organisation’s requirements can do wonders. That’s why the senior team members try to make sure the software development pipeline is as optimised as possible.
Different teams may use different workflows based on their individual requirements. These requirements include the culture and mind-set of the individual team.
A good workflow:
- Is as optimized as possible
- Is compatible with the work culture
- Is easy to scale
Normally, you wouldn’t have just one master branch and one or two feature branches in actual software development. There will be many more branches that will serve specific purposes.
Here are a few examples of existing standard workflows:
- Centralized workflow – In this kind of workflow there is only one branch, to which everyone commits their changes.
- Gitflow workflow – Makes use of multiple branches, including intermediary branches between feature work and production.
- Git feature branch workflow – The code for each feature is written in its own dedicated branch, which is then merged into the master branch.
- Forking workflow – Each developer works with two repositories: their own fork on the server and a local clone of it.
Let’s look at the Gitflow workflow:
It uses multiple branches: master, hotfix, release, develop and feature. Let’s understand the roles of these.
- Feature – This represents all the features there may be. These are the branches that most junior developers will be working with, and all features of the software are coded in them. The developers create a feature branch from the develop branch, do their work and then open a pull request so that the work can be reviewed by a senior member. If it is approved, the branch is merged onto the develop branch.
- Develop – This is the branch onto which all the features are merged, so it contains all of the functional code for the software. Once a good number of features have accumulated on this branch, it is merged onto the release branch.
- Release – This is the branch where the code is reviewed for remaining bugs, release preparation is done and the documentation is completed. Then the release branch is merged into the master branch.
- Master – This is the branch where the code is tagged with a version and pushed to building, testing and deployment according to the schedule.
- Hotfix – This branch, as the name suggests, is used to make quick fixes to the code. It allows bugs to be resolved without interrupting the rest of the workflow. It is quick and serves the single purpose of solving bugs.
This is how a generic git workflow works. Organisations usually implement one version or another of this.
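The roles of these branches can be sketched with plain git commands. Everything specific below is an assumption made for illustration: the branch names feature/login, release/1.0 and hotfix/1.0.1, the tags, the file names and the identity. Real teams would add pull requests and reviews between these steps.

```shell
repo="$(mktemp -d)"
git -c init.defaultBranch=master init -q "$repo"
cd "$repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v0" > index.html
git add index.html
git commit -q -m "Initial commit"

# develop: the long-lived integration branch
git checkout -q -b develop

# feature: starts from develop and is merged back into develop
git checkout -q -b feature/login develop
echo "login form" > login.html
git add login.html
git commit -q -m "Add login feature"
git checkout -q develop
git merge -q --no-ff --no-edit feature/login

# release: starts from develop, is prepared, then merged into master and tagged
git checkout -q -b release/1.0 develop
echo "1.0" > VERSION
git add VERSION
git commit -q -m "Prepare release 1.0"
git checkout -q master
git merge -q --no-ff --no-edit release/1.0
git tag v1.0
git checkout -q develop
git merge -q --no-ff --no-edit release/1.0    # keep develop in sync

# hotfix: starts from master, then goes to both master and develop
git checkout -q -b hotfix/1.0.1 master
echo "v0 (fixed)" > index.html
git commit -q -am "Fix index.html"
git checkout -q master
git merge -q --no-ff --no-edit hotfix/1.0.1
git tag v1.0.1
git checkout -q develop
git merge -q --no-ff --no-edit hotfix/1.0.1

git log --oneline --graph --all
```

The --no-ff flag is used here so that every merge leaves a visible merge commit, which makes the branch structure easy to read in the log.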
We learned a lot of things in this git tutorial: what source code management is, why it is so important in software development, what git is, why git is so helpful, what the basic git commands are, and what branching, merging, rebasing and git workflows are. I hope it was as fun for you to learn as it was for me to explain all of these things in detail.
“When you do things right, people won’t be sure you’ve done anything at all”. This quote from Futurama best illustrates git’s role in software development. It is a very helpful tool that doesn’t get the recognition it deserves. It helps us manage and organize all of our code and plays an integral part in the software development process. The best part is that it is completely free. So go ahead and use git wherever you can; it will be good practice for when you are in a professional environment.