Git Tutorials for Beginners – Introduction to Version Control
What is Version Control & why is it needed?
Version control software is an essential part of the modern software team’s professional practices.
It provides you with many capabilities, such as:
- Maintain multiple versions of code.
- Roll back to a previous version.
- Developers can work in parallel.
- Audit traceability, with a clear picture of who changed what, when, and where.
- Synchronize the code.
- Copy/Merge/Undo the changes.
- Find out the difference between versions.
- Provides full backup without occupying much space.
- Review the history of the changes.
- Suitable for both small and large-scale projects.
- Ability to share and use the code amongst remotely located developers.
Version control systems are a category of software tools that help a software team manage changes to artifacts over a period of time. With version control, multiple versions of the same file can be easily maintained and any specific version can be recalled instantly.
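To make "maintaining multiple versions" and "recalling a specific version" concrete, here is a minimal sketch for a single file. The `FileHistory` class and its method names are hypothetical, chosen only for illustration; real version control tools store history far more efficiently and track whole projects, not one file.

```python
import difflib


class FileHistory:
    """Toy history for one file: every saved version is kept, in order."""

    def __init__(self):
        self._versions = []  # version n is stored at index n - 1

    def save(self, text):
        """Store a new version and return its 1-based version number."""
        self._versions.append(text)
        return len(self._versions)

    def recall(self, number):
        """Return the exact text that was saved as version `number`."""
        if not 1 <= number <= len(self._versions):
            raise IndexError(f"no such version: {number}")
        return self._versions[number - 1]

    def diff(self, old, new):
        """Return a unified diff (list of lines) between two versions."""
        return list(difflib.unified_diff(
            self.recall(old).splitlines(),
            self.recall(new).splitlines(),
            fromfile=f"v{old}",
            tofile=f"v{new}",
            lineterm="",
        ))


history = FileHistory()
v1 = history.save("print('hello')\n")
v2 = history.save("print('hello')\nprint('world')\n")

print(history.recall(v1))                # the first version is still available
print("\n".join(history.diff(v1, v2)))   # shows the line added in version 2
```

Saving never overwrites anything, so any earlier version can be recalled or compared at any time.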
Software developers working in teams regularly create new source code and update existing code. The code for a project, application or software component is typically organized in a folder structure, or “file tree”. Suppose one developer on a team is working on a new feature while another fixes an unrelated bug. Each of them may change several parts of the file tree, and without coordination their changes can overwrite or contradict each other.
Version control helps teams manage these problems by tracking every change made by each contributor and by flagging conflicting changes so they can be resolved. Most of us have, at some point, improvised ways to manage multiple versions of a file: adding suffixes or numbers to file names and then working out which copy is the real final version, or commenting out code blocks to disable functionality temporarily for fear of needing it later. Version control removes the need for such workarounds.
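The idea of flagging conflicting changes can be sketched at the file level. In this simplified example, each version of the project is a dictionary mapping a path to its content, and a conflict is any path that both developers changed in different ways relative to a common starting point. The function names and sample data are hypothetical, and real tools compare changes line by line rather than file by file.

```python
def changed_paths(base, work):
    """Return the set of paths added, removed, or edited relative to `base`."""
    return {
        path
        for path in base.keys() | work.keys()
        if base.get(path) != work.get(path)
    }


def conflicting_paths(base, ours, theirs):
    """Return the paths both sides changed, and changed differently."""
    both = changed_paths(base, ours) & changed_paths(base, theirs)
    return sorted(path for path in both if ours.get(path) != theirs.get(path))


# Common starting point, then two developers working in parallel.
base = {"app.py": "v1", "utils.py": "v1", "README": "v1"}
feature = {"app.py": "v2-feature", "utils.py": "v2-helper", "README": "v1"}
bugfix = {"app.py": "v2-bugfix", "utils.py": "v1", "README": "v1"}

print(conflicting_paths(base, feature, bugfix))
```

Only `app.py` is reported: `utils.py` was changed by one developer alone, so it can be combined without manual intervention.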
Types of Version Control
There are two types of VCS:
- Centralized Version Control System (CVCS)
- Distributed Version Control System (DVCS)
A centralized version control system keeps the repository on a single central server, local or remote, to which every programmer’s workstation connects directly.
Each programmer can check out the data in the repository to their workstation, update their working copy from it, or commit changes to it. Every operation is performed directly on the central repository.
It may seem pretty convenient to maintain a single repository, but there are some major drawbacks in the approach, like:
- The repository is not available locally, so you have to be connected to the network to perform almost any action.
- Because everything is centralized, if the central server crashes or is corrupted and no backup exists, the project’s entire history can be lost.
A distributed VCS addresses these drawbacks. Let us look at how it works.
Distributed version control systems do not depend on a central server to store all the versions of a project’s files.
In a distributed version control system, each contributor has their own local copy, or “clone”, of the remote repository, which contains all the files and metadata present in the main repository. As the diagram above shows, every programmer maintains this copy on their local machine and can commit to it and update it without interfering with anyone else.
They can update their local repository with fresh data from the central server by an operation called “pull”, and can send their committed changes to the main repository by an operation called “push”.
Distributed VCS gives you the following advantages:
- All operations except push and pull are very quick because the tool only needs to access the local system, not a remote server. Hence, you do not depend on a network connection for everyday work.
- Committing a new set of changes can be done locally without manipulating the data on the main repository. Once you have a group of changes ready, you can push them all at once.
- Since every developer has a replica of the project repository, developers can share changes with each other and do a peer review before updating the main repository.
- If the central server crashes at any point, the lost data can be recovered from any contributor’s local repository.
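The clone, commit, push and pull workflow, and the recovery advantage listed above, can be sketched with a toy model. The `Repo` class is hypothetical, and a plain list of commit messages stands in for real history. It handles only the simple case where one side is strictly ahead of the other, whereas real tools also merge diverged histories.

```python
class Repo:
    """Toy repository: an ordered list of commit messages stands in for history."""

    def __init__(self, commits=None):
        self.commits = list(commits or [])

    def clone(self):
        """Return a full, independent copy of this repository."""
        return Repo(self.commits)

    def commit(self, message):
        """Record a change locally; no other repository is touched."""
        self.commits.append(message)

    def push(self, remote):
        """Send the commits the remote lacks (simple fast-forward case only)."""
        if remote.commits != self.commits[:len(remote.commits)]:
            raise ValueError("remote has diverged; pull first")
        remote.commits.extend(self.commits[len(remote.commits):])

    def pull(self, remote):
        """Fetch the commits this copy lacks (simple fast-forward case only)."""
        if self.commits != remote.commits[:len(self.commits)]:
            raise ValueError("local history has diverged; a merge is needed")
        self.commits.extend(remote.commits[len(self.commits):])


central = Repo(["initial commit"])
alice = central.clone()
bob = central.clone()

alice.commit("add login form")       # local only, no network needed
alice.commit("validate password")    # still local
alice.push(central)                  # publish both commits at once
bob.pull(central)                    # bob catches up

central = None                       # simulate losing the central server
recovered = bob.clone()              # any full clone can seed a replacement
print(recovered.commits)
```

Because every clone carries the complete history, the replacement server ends up with all three commits.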
Version Control Tools
Version control tools enable collaboration, maintain versions, and track changes across the team. These tools let any number of people work on the same code base without having to pass files back and forth manually.
Below are some of the most widely used open-source version control systems and tools:
CVS
- Client-server repository model.
- Multiple developers might work on the same project in parallel.
- The CVS client keeps the working copy of each file up to date and requires manual intervention only when an edit conflict occurs.
- Keeps a historical snapshot of the project.
- Anonymous read access.
- There is an ‘Update’ command which updates the local copies to the latest version.
- Can maintain different branches of a project.
- Excludes symbolic links to avoid a security risk.
- Uses delta compression technique for efficient storage.
- Excellent cross-platform support.
- Robust and fully featured command-line client permits powerful scripting.
- Helpful support from the vast CVS community.
- Supports good web browsing of the source code repository.
- It’s quite an old, well known & established tool.
- Good option in the collaborative open-source world.
- No integrity checking for source code repository.
- Does not support atomic checkouts and commits.
- Poor support for distributed source control.
- Does not support signed revisions and merge tracking.
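The delta compression technique listed among the CVS features above rests on a simple idea: instead of storing every version in full, store one version plus a description of how the next one differs. The sketch below illustrates that idea with Python's `difflib`. The `make_delta` and `apply_delta` functions and the delta layout are hypothetical and are not the on-disk format CVS actually uses.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe `new` as copy-ranges from `old` plus the lines that are new."""
    delta = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))       # reuse old[i1:i2] unchanged
        elif tag in ("replace", "insert"):
            delta.append(("add", new[j1:j2]))    # store only the new lines
        # "delete" needs no entry: those old lines are simply not copied
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one and its delta."""
    result = []
    for op in delta:
        if op[0] == "copy":
            result.extend(old[op[1]:op[2]])
        else:
            result.extend(op[1])
    return result


v1 = ["line one", "line two", "line three"]
v2 = ["line one", "line 2", "line three", "line four"]

delta = make_delta(v1, v2)
print(delta)                          # copy ranges plus only the changed lines
print(apply_delta(v1, delta) == v2)   # the new version is fully recoverable
```

Unchanged lines are never stored twice, which is why storing a long history this way takes far less space than keeping a full copy of every version.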
SVN (Apache Subversion)
- Client-server repository model.
- Directories are versioned.
- Each operation is versioned.
- Supports atomic commits.
- Versioned symbolic links.
- Free-form versioned metadata.
- Space efficient binary diff storage.
- Branching is a cheap operation whose cost does not depend on file size.
- Other features – merge tracking, full MIME support, path-based authorization, file locking, standalone server operation.
- Good GUI tools, such as TortoiseSVN, are available.
- Supports empty directories.
- Has better Windows support than Git.
- Easy to set up and administer.
- Known to integrate well with Windows, leading IDEs and Agile tools.
- Does not store the modification time of files.
- Does not deal well with filename normalization.
- Does not support signed revisions.
Git
- Provides strong support for non-linear development.
- Distributed repository model.
- Compatible with protocols like HTTP, FTP and SSH.
- Capable of managing projects of different sizes efficiently.
- Cryptographic authentication of history.
- Pluggable merge strategies.
- Toolkit-based design.
- Periodic explicit object packing.
- Garbage accumulates until collected.
- Super-fast and efficient performance.
- Code changes can be tracked very easily and clearly.
- Easily maintainable and robust.
- Offers a capable command-line environment on Windows named Git Bash.
- Also offers Git GUI, where you can rescan, stage changes, sign off, commit and push code with just a few clicks.
- A large, complex history log can become difficult to understand.
- Does not support keyword expansion and timestamp preservation.
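The "cryptographic authentication of history" feature in the Git list above comes from naming every stored object by a hash of its content. The first function below reproduces how Git, with its default SHA-1 object format, computes the ID of a file's content (a "blob"). The second function, `toy_commit_id`, is a hypothetical simplification that is not Git's real commit format. It only illustrates why changing any earlier commit changes every later ID.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def toy_commit_id(parent_id: str, content_id: str, message: str) -> str:
    """Simplified commit ID (NOT Git's format): hashes the parent ID too."""
    payload = f"parent {parent_id}\ncontent {content_id}\n\n{message}"
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# The ID of an empty file, the same value `git hash-object` reports.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Build a two-commit chain, then tamper with the first commit's content.
first = toy_commit_id("", git_blob_id(b"print('v1')\n"), "first commit")
second = toy_commit_id(first, git_blob_id(b"print('v2')\n"), "second commit")

tampered_first = toy_commit_id("", git_blob_id(b"print('evil')\n"), "first commit")
tampered_second = toy_commit_id(tampered_first, git_blob_id(b"print('v2')\n"), "second commit")

print(second != tampered_second)  # the later ID changes as well
```

Because each ID depends on the parent's ID, agreeing on the latest commit ID implicitly verifies everything that came before it.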
Mercurial is a distributed revision-control tool written in Python. The supported operating systems are Unix-like systems, Windows and macOS.
- High performance and scalability.
- Advanced branching and merging capabilities.
- Fully distributed collaborative development.
- Handles both plain text and binary files robustly.
- Possesses an integrated web interface.
- Fast and powerful.
- Easy to learn.
- Lightweight and portable.
- Conceptually simple.
- All the add-ons must be written in Python.
- Partial checkouts are not allowed.
- Quite problematic when used with additional extensions.
Bazaar is a DVCS that aims to provide a friendly user experience. It can be deployed either with a central code base or as a distributed code base.
- It has commands similar to SVN or CVS.
- It lets you work with or without a central server.
- Provides free hosting services through the websites Launchpad and SourceForge.
- Supports file names from the entire Unicode set.
- Directory tracking is well supported in Bazaar (unlike in tools such as Git and Mercurial).
- Its plugin system is fairly easy to use.
- High storage efficiency and speed.
- Does not support partial checkout/clone.
- Does not provide timestamp preservation.
|CVCS (SVN)|DVCS (Git)|
|A central server repository holds the “official copy” of the code.|You don’t “check out” code from a central repository.|
|The server maintains the version history of the repository.|You clone the entire repository and pull changes from it into your local copy.|
|You make “checkouts” on your local machine. You make modifications to the checked-out code. Your changes are not versioned until you check them in.|Your local repo is a replica of everything on the remote server; yours is “just as good” as theirs.|
|When you have finished, you “check in” your code changes back to the server. Your check-in increments the repository’s version.|Many operations are local.|
|A centralized VCS like SVN tracks versions of every file.|Git maintains “snapshots” of the entire state of the project. Each commit records the complete file tree. Not every file changes with each commit, so unchanged files are recorded as references to content that is already stored rather than as new copies, which keeps operations fast.|
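The snapshot model in the last row of the table can be sketched with a small content-addressed store. Every commit lists every file in the project, but file content is stored once per distinct content, so a file that did not change costs only a reference. The `SnapshotStore` class is a hypothetical illustration and leaves out most of what Git actually does (directories, compression, packing).

```python
import hashlib


class SnapshotStore:
    """Toy content-addressed store: each commit maps every path to a blob ID."""

    def __init__(self):
        self.blobs = {}    # blob ID -> file content, stored once
        self.commits = []  # each commit is a dict of path -> blob ID

    def commit(self, files):
        """Record a snapshot of the whole project and return its index."""
        snapshot = {}
        for path, content in files.items():
            blob_id = hashlib.sha1(content.encode("utf-8")).hexdigest()
            self.blobs.setdefault(blob_id, content)  # identical content is reused
            snapshot[path] = blob_id
        self.commits.append(snapshot)
        return len(self.commits) - 1

    def checkout(self, index):
        """Rebuild every file exactly as it was in the given commit."""
        return {path: self.blobs[blob_id]
                for path, blob_id in self.commits[index].items()}


store = SnapshotStore()
first = store.commit({"app.py": "print('v1')", "README": "docs"})
second = store.commit({"app.py": "print('v2')", "README": "docs"})

print(len(store.blobs))                # 3: README is stored only once
print(store.checkout(first)["app.py"])
```

Both commits describe the full project, yet only three pieces of content are stored, because the unchanged `README` is shared between them.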
Git is far more widely used than the other version control tools covered here.
It is time we take a deep dive into Git.