Infrastructure-as-Code (IaC) is a key part of the cloud technology landscape, which means I’ve had many discussions with people who traditionally haven’t had to worry about software development practices in their roles as IT Operators.
One key area that seems to undo a lot of folks is file versioning using source control, with the prevailing tech in this space being Git.
I’ll be honest and say that Git can seem like a bit of a beast if you try and digest anything more than introductory content!
In this post I’m going to look at what I think are the basics you need to know and show you a quick way to start working with Git without needing to install a bunch of tools.
Why Git?
It’s a good question! 🙂
Git has become the prevailing source control system of choice, but it is by no means the only one. Many concepts you learn with Git can be easily re-used with other source control systems.
The main features of Git that have made it appeal to people (in my opinion) are:
- It’s a distributed (and not centralised) version control system. This means you can work with files and use source control while not connected to the origin system
- Actions on the system are relatively light-weight. A key example is creating a branch: Git records a branch as a pointer to a commit rather than a full copy of the source branch, which keeps the overhead of creating and managing branches low (🤓 advanced reading)
- It’s widely used, which means choosing it for your environment won’t make it hard to hire (or train) people who can use it.
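If you do have Git installed locally, here is a minimal sketch of the first two points. It assumes a POSIX shell; the temporary directory, the `main.tf` file, the `experiment` branch name and the user identity are all placeholders I made up for illustration.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory (hypothetical location).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Commit one file so there is something to branch from.
echo "resource group definition" > main.tf   # hypothetical IaC file
git add main.tf
git commit -q -m "Add initial template"

# Create a branch. No origin or network connection is involved.
git branch experiment

# Both names resolve to the same commit: the branch is a pointer,
# not a second copy of the files.
default_commit="$(git rev-parse HEAD)"
branch_commit="$(git rev-parse experiment)"
echo "HEAD:       $default_commit"
echo "experiment: $branch_commit"
```

The two printed IDs should match, which is why creating a branch is cheap.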
On a side note, Microsoft has adopted Git (as part of Azure DevOps) as its source control platform of choice. Read Brian Harry’s 2017 blog post which is really interesting and touches on Microsoft’s contributions back into Git based on using it for source control for Windows’ source code (300 GB and 6 million files!)
Important Git Concepts and Commands
There are a few key concepts and commands you need to understand in order to work with Git. I’ve tried to order these so that each concept builds on the ones before it.
- Repository
- The core of Git. A repository contains all your source code and tracks changes to it over time. Think of it as a journaling filesystem with an inbuilt recycling bin for deleted items.
- Clone
- If you want to work on an existing repository you can clone (or copy) it locally to your machine. When you clone a repository a link is created between your local clone and the origin repository. When you run most commands locally you will be interacting with your local copy and not the origin.
- Fork
- Forking is a concept that originated with GitHub, and strictly speaking isn’t a part of Git itself. When you fork a repository you make a linked copy of it, but it is not a clone. You typically fork a repository when you want to contribute content to the repository but don’t have contributor rights on it. A forked repository still has a relationship with the original, which means you can submit changes to it, or receive changes from it, in the form of Pull Requests which we cover below (🤓 advanced reading).
- Branch
- Think of a branch as a folder in a filesystem. By default a new repository has one branch, traditionally named master (newer setups, including GitHub, often name it main). You can create multiple, related branches and name them as you see fit (🤓 advanced reading). You can even branch locally without ever publishing the branch back to the origin!
- Checkout
- If you’ve worked with other source control platforms this command might trip you up! When you perform a ‘checkout’ with Git you are, in fact, changing the active branch you are working on and not “checking out” files from Git (🤓 advanced reading).
- HEAD
- This is a reference to the most recent commit in the branch you are working with. The name fairly describes what it is – it’s the head of the branch!
- Add
- You must perform this action for a file or folder to be included in a repository. Some development software can automatically watch for new files and add them. A file that has not been added to Git is said to be untracked.
- Commit
- When you update existing tracked files or add new files you must commit them to Git for the changes or additions to be source controlled. This is an explicit action you have to take, and it is always recommended to provide a commit message that describes what the commit consists of.
- Commit hash (ID)
- Every commit you make to Git has an ID associated with it. These IDs are not guaranteed to be unique, as they are a SHA-1 hash of the contents of a commit, but in practice it’s extremely unlikely you’ll ever find two commits with the same ID in a repository. This ID is usually displayed abbreviated to its first seven characters, similar to ‘acf87c2’ (🤓 advanced reading).
- Gitignore (Git ignore)
- You create a .gitignore file in your repository to stop certain files or directories from being added – think of things like temporary files or folders or sensitive config files (🤓 advanced reading).
- Pull
- When you want to update your cloned repository from its origin you use a pull command to retrieve changes.
- Push
- When you want to submit your local changes to the origin you push them.
- Merge conflict
- This occurs when a file (or files) has been modified in both your local and the origin repository. You will typically hit it when you pull, since Git rejects a push until your local copy includes the origin’s latest changes. When this happens Git will try to merge the files for you automatically, but if it can’t then you will need to fix the conflict manually. Manual merges and committing merged files can be one of the more confusing aspects of working with Git, and I’m actually going to recommend reading existing content from GitHub on this as it will give you good approaches to dealing with conflicts.
- Pull Request (or PR)
- A Pull Request (PR) allows you to push changes between branches and between forks. PRs are typically used when you don’t have the right to contribute directly to the branch or upstream repository you are submitting to, or in situations where you would like the submitted changes to be reviewed before being accepted and merged. Pull Requests originated with GitHub and aren’t a part of the core Git platform (much like Forks) (🤓 advanced reading).
On that last one – why is it a “Pull” Request when you are pushing? Well, it’s because you are asking the owner of the target branch or repository to accept, or pull, your changes into their branch or repository.
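To make the local concepts above a bit more concrete, here is a rough sketch of how repository, add, commit, commit hash, .gitignore, branch, checkout and HEAD fit together on the command line. It assumes Git is installed and a POSIX shell is available; the file names, branch name and identity are hypothetical and only there for illustration.

```shell
#!/bin/sh
set -e

# Repository: create an empty one in a temporary directory.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Gitignore: keep a (hypothetical) sensitive file out of source control.
echo "secrets.env" > .gitignore
echo "PASSWORD=example" > secrets.env
echo "vm_size = small" > settings.conf

# New files show up as untracked (??); secrets.env is not listed at all.
git status --short

# Add + commit: both are explicit steps.
git add .gitignore settings.conf
git commit -q -m "Add settings and ignore rules"

# Commit hash: print the abbreviated ID of the commit just made.
git rev-parse --short HEAD

# Branch + checkout: create a branch, then make it the active one.
git branch update-vm-size
git checkout -q update-vm-size
echo "vm_size = large" > settings.conf
git commit -q -a -m "Increase VM size"    # -a stages changes to tracked files

# HEAD: now refers to the latest commit on the active branch.
git --no-pager log --oneline
active_branch="$(git symbolic-ref --short HEAD)"
echo "Active branch: $active_branch"
```

Nothing here talks to an origin, so it is a safe place to experiment before working with a shared repository.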
OK, those are the important baseline concepts out of the way!
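Clone, push and pull are easier to follow when you can watch them work. The sketch below is one way to try them without a hosted service: a bare repository on local disk stands in for the origin, which is my assumption for the demo (normally the origin would be a URL on something like GitHub). The directory names, file name and identity are placeholders.

```shell
#!/bin/sh
set -e

work="$(mktemp -d)"
cd "$work"

# A bare repository on local disk plays the role of the origin.
git init -q --bare origin.git

# Clone the (still empty) origin; Git may warn that it is empty.
git clone -q origin.git first
cd first
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Push: submit the local commit to the origin.
git push -q origin HEAD

# A second clone, standing in for a colleague's machine.
cd "$work"
git clone -q origin.git second

# Make and push another change from the first clone.
cd "$work/first"
echo "second line" >> notes.txt
git commit -q -a -m "Extend notes"
git push -q origin HEAD

# Pull: the second clone retrieves the new commit from the origin.
cd "$work/second"
git pull -q --ff-only
cat notes.txt
```

After the pull, both clones point at the same latest commit. The `--ff-only` option is a choice I made for the sketch so the pull only succeeds when no merge is needed.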
Actually getting started!
I think one of the easiest ways to get started today is by using GitHub. If you haven’t already, you can head over and join the community.
Once signed up you can start creating repositories and learn about working with Git. You can make as many mistakes here as you want so go nuts! 🙂
You can even choose to fork an existing repository and explore how it hangs together. No need to install any tooling locally on your machine either.
Once you’re ready to go to the next level you can try the following:
- Good First Issues: these are tagged by GitHub maintainers as a good entry point into helping them with their software / repository. Use GitHub’s search to find them.
- Fixing Microsoft Docs: this is a really cool way to work with Git. If you find a mistake in the Docs site you can make an edit which uses GitHub to drive the workflow. I recorded the video below to show you how you can do this.
Note that in the video I’m already logged into my GitHub account and that I have previously forked the Microsoft Docs repository.
Hopefully you’re now at the point where you are comfortable working with Git and can start to use it in your day-to-day workflow.