---
title: "Git Introduction"
date: 2021-04-14T18:54:16+02:00
<!-- This: is way too large -->
<!-- cover: -->
<!-- image: /images/git-logo.png -->
categories:
- misc
- git
- software
tags:
- english
- software engineering
- tools
summary: This post will help you to get started with using and understanding git!
showToc: false
draft: true
---
Currently, I'm working on my slides for this semester's tutorials for our Software Engineering course.
I aim to give the students a good introduction to git, as it is an integral part of our toolchain.
So I decided to share my thoughts on what a beginner should know about git here.
I'm probably going to expand on this post in later posts.
---
![The struggles with git](/images/xkcd/1597.png)
<center><i>Source: <a href="https://xkcd.com/1597/">xkcd.com</a></i></center>
## What is it?
So, what is Git? The xkcd already implies that Git is a collaborative, distributed source control system.
But that is a rather abstract definition. Let's simplify a bit for now: Git tracks changes to your files.
Essentially, Git keeps different versions of your files around, so you can compare them and refer back to old versions.
This is really useful: it makes it hard to lose working code, and easy to compare a broken version with a
working one to find the culprit.
## So how do we use it?
The comic above is not wrong. A lot of git comes down to remembering different commands (or button locations, for that matter).
The first thing you need to do is prepare a repository. A repository encapsulates everything git tracks. So you move to
your project directory and type:
```bash
$ git init
Initialized empty Git repository in $project_dir/.git
```
Git just told you exactly what it did: it prepared a folder for the data it has to track. From now on we refer to `$project_dir`
as the working directory, and to `$project_dir/.git` as the repository. Let's create some files, track some changes and take a
look at how we can interact with those.
```bash
$ echo "our first change" > file_one.txt
$ git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        file_one.txt

nothing added to commit but untracked files present (use "git add" to track)
```
So, we created a new file with the content "our first change". `git status` reports the current state of the working
directory and the repository. Right now it tells us that we have not committed anything yet, and that
there is an untracked file. Let's stage it. Staging means that we tell git to *prepare* that file, in its current version,
for a snapshot.
```bash
$ git add file_one.txt
$ git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   file_one.txt
```
Git now tells us that there are still no commits and no unstaged changes, and that we have prepared a new file for the next commit.
So let's create our first commit. A commit is basically a snapshot of our files, frozen in time. We'll not concern ourselves
with the semantics of a good commit for now.
A commit consists of a list of files, their contents, author information, and a message describing the commit.
```diff
$ git commit -m "This is our first commit"
$ git show
commit cf0475067740275c6469836ff743855fcf4e4f85 (HEAD -> master)
Author: Tobias Manske <tobias.manske@example.com>
Date:   Wed Apr 14 19:37:00 2021 +0200

    This is our first commit

diff --git a/file_one.txt b/file_one.txt
new file mode 100644
index 0000000..103f365
--- /dev/null
+++ b/file_one.txt
@@ -0,0 +1 @@
+our first change
```
`git show` shows us what our last commit did. In my case, the author information is filled in with my name and my email.
The commit has the id cf04... and is currently the commit the master branch points to (more on that later).
In the bottom part we see a so-called diff between the state before the commit and the new state. We can obtain more information
about the history with the `git log` command. Without further options it just prints the top part of `git show` for every
commit in the history. Before we can explore that further, we need some more changes. I'll not go into detail here. Just
repeat what we did before: change stuff, add it to git and commit it.
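As a rough sketch, that cycle could look like the script below. It runs in a throwaway directory; the scratch path, the second file, the file contents, the `commit` helper and the `demo` identity are all assumptions made up for illustration. Only the commit messages are taken from this post.

```bash
set -e

# Work in a scratch directory so no real project is touched (assumption).
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Hypothetical helper: commits with a made-up identity, so the script
# also works on a machine where git is not configured yet.
commit() { git -c user.name="demo" -c user.email="demo@example.com" commit --quiet "$@"; }

# First commit, as in the walkthrough so far.
echo "our first change" > file_one.txt
git add file_one.txt
commit -m "This is our first commit"

# Change stuff, add it to git and commit it.
echo "another change" >> file_one.txt
echo "a second file" > file_two.txt
git add file_one.txt file_two.txt
commit -m "Changes some more stuff"

# Removing a tracked file is a change too; git rm deletes it and stages the removal.
git rm --quiet file_two.txt
commit -m "remove a file"

# Count the commits reachable from HEAD.
commit_count="$(git rev-list --count HEAD)"
echo "commits: $commit_count"
```

Every pass through the loop is the same three steps: edit, `git add` (or `git rm`), `git commit`.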
```diff
$ git log
commit 1849c7644425684cea7039e825168d92df4e6936
Author: Tobias Manske <tobias.manske@example.com>
Date:   Wed Apr 14 19:57:38 2021 +0200

    remove a file

commit 4db54105f028d23339bbd1eb51fb397ab0d45719
Author: Tobias Manske <tobias.manske@example.com>
Date:   Wed Apr 14 19:57:23 2021 +0200

    Changes some more stuff

commit cf0475067740275c6469836ff743855fcf4e4f85
Author: Tobias Manske <tobias.manske@example.com>
Date:   Wed Apr 14 19:37:00 2021 +0200

    This is our first commit
```
As we can see, this is our history. It can be rather cumbersome to look at it this way, so let's use some options to make
it easier to digest:
```bash
$ git log --oneline
1849c76 remove a file
4db5410 Changes some more stuff
cf04750 This is our first commit
```
This is a lot more concise, but it leaves out quite a lot of information. I have a shorthand `git lg` in
[my configuration](https://git.tobiasmanske.de/user/rad4day/public/dotfiles/tree/.gitconfig)
which would show (in color, unlike here):
```bash
$ git lg
* 1849c76 - (4 minutes ago) remove a file - Tobias Manske (HEAD -> master)
* 4db5410 - (4 minutes ago) Changes some more stuff - Tobias Manske
* cf04750 - (25 minutes ago) This is our first commit - Tobias Manske
```
The leftmost column is an abbreviation of the commit hash (that long number we've seen before) that git uses to identify
and distinguish commits. The abbreviation gets longer if git ever needs more characters to tell two commits apart.
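If you want to try something similar yourself, here is a hedged sketch. The alias name `lg-demo` and its format string are assumptions chosen to resemble the output above, not a copy of the linked configuration. The script also checks that the abbreviated hash really is just the beginning of the full one. The scratch directory and the `demo` identity are made up as well.

```bash
set -e

# Throwaway repository with a single commit (assumption, for illustration only).
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
echo "our first change" > file_one.txt
git add file_one.txt
git -c user.name="demo" -c user.email="demo@example.com" commit --quiet -m "This is our first commit"

# Hypothetical alias, stored in this repository's config only (no --global).
# %h = abbreviated hash, %cr = relative commit date, %s = subject,
# %an = author name, %d = ref names such as (HEAD -> master).
git config alias.lg-demo "log --graph --pretty=format:'%h - (%cr) %s - %an%d'"
git lg-demo
echo

# Full and abbreviated hash of the same commit.
full_hash="$(git rev-parse HEAD)"
short_hash="$(git rev-parse --short HEAD)"
echo "full:  $full_hash"
echo "short: $short_hash"

# The abbreviation is a prefix of the full hash.
case "$full_hash" in
  "$short_hash"*) is_prefix=yes ;;
  *)              is_prefix=no ;;
esac
echo "prefix: $is_prefix"
```

Anywhere git expects a commit you can pass the short form, as long as it is unambiguous within the repository.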
## Looks complicated, can you explain this again?
I will! With a graphic. Below you can see what happened step-by-step, when we committed.
![How commits changed our graph](/images/git-graph.png)
You can see four columns, one for each state our repository was in; every commit moves the state one column to the right.
In the beginning we had no commits. Then we created our first commit and git placed the HEAD pointer at it (more on that later).
Each new commit has a parent pointer to its predecessor and becomes the new location of our HEAD. This pattern continues until
we made our last commit.
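You can inspect these parent pointers yourself. The sketch below builds a two-commit history in a scratch directory (the path, the `commit` helper and the `demo` identity are assumptions for illustration) and then looks at the raw commit object and at the parent of HEAD.

```bash
set -e

# Throwaway repository (assumption, for illustration only).
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Hypothetical helper that commits with a made-up identity.
commit() { git -c user.name="demo" -c user.email="demo@example.com" commit --quiet "$@"; }

echo "our first change" > file_one.txt
git add file_one.txt
commit -m "This is our first commit"
first="$(git rev-parse HEAD)"

echo "another change" >> file_one.txt
git add file_one.txt
commit -m "Changes some more stuff"

# The raw commit object: tree (the file snapshot), parent, author, committer, message.
git cat-file -p HEAD

# HEAD^ names the parent of the commit HEAD points to.
parent="$(git rev-parse HEAD^)"
[ "$parent" = "$first" ] && echo "the parent of HEAD is our first commit"

# The very first commit has no parent line at all.
root_parents="$(git cat-file -p "$first" | grep -c '^parent ' || true)"
echo "parent lines in first commit: $root_parents"
```

The `parent` line in the `git cat-file` output is the arrow between two commits in the graphic.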
## So how do I access my data?
Until now, we just stored data into git. That's nice to know, but how could we recover code we deleted later on?
For example: How do we get back our initial version of file_one.txt, how do we look at it? How can we replace our current
version with it?