title: Git Introduction
date: 2021-04-14T18:54:16+02:00
categories/tags: misc, git, software, english, software engineering, tools
summary: This post will help you to get started with using and understanding git!
showToc: false
draft: true

Currently, I'm working on my slides for this semester's tutorials for our Software Engineering course. I aim to give the students a good introduction to git, as it is an integral part of our toolchain. So I decided to share my thoughts on what a beginner should know about git here. I'm probably going to expand on this post in later posts.


[Comic: The struggles with git. Source: xkcd.com]

What is it?

So, what is Git? The xkcd comic already implies that Git is a collaborative, distributed source control system. That is a rather abstract definition, so let's simplify for now: Git tracks changes to your files.

Essentially, Git keeps different versions of your files around so that you can compare them and go back to old versions. This is really useful: it makes it hard to lose working code and easy to compare a broken version with a working one to find the culprit.

So how do we use it?

The comic above is not wrong. A lot of git comes down to remembering different commands (or button locations, for that matter). The first thing you need to do is prepare a repository. A repository encapsulates everything git tracks. So you move to your project directory and type:

$ git init
Initialized empty Git repository in $project_dir/.git

Git just told you exactly what it did: it prepared a folder for the data it has to track. We now refer to $project_dir as the working directory and to $project_dir/.git as the repository. Let's create some files, track some changes, and take a look at how we can interact with them.
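If you want to double-check where the repository ended up, you can ask git itself. The following is a minimal sketch, assuming a throwaway directory created with mktemp stands in for $project_dir; the variable names are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway project directory (stands in for $project_dir).
project_dir=$(mktemp -d)
cd "$project_dir"

git init --quiet

# Where git keeps its data, relative to the current directory.
git_dir=$(git rev-parse --git-dir)
# Whether the current directory belongs to a working directory.
in_work_tree=$(git rev-parse --is-inside-work-tree)

echo "repository:      $git_dir"
echo "in working tree: $in_work_tree"
```

Run from the top of the project, this should report .git as the repository and true for the working tree check.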

$ echo "our first change" > file_one.txt
$ git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        file_one.txt

nothing added to commit but untracked files present (use "git add" to track)

So, we created a new file with the content "our first change". git status is used to get information about the current state of the working directory and the repository. It currently tells us that we have not committed anything at all and that there is an untracked file. Let's stage it. Staging means that we tell git to prepare that file, in its current version, for a snapshot.

$ git add file_one.txt
$ git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   file_one.txt

Git now tells us that we still have no commits and that we have prepared a new file for commit; nothing is left unstaged. So let's create our first commit. A commit is basically a snapshot of our files, frozen in time. We won't concern ourselves with what makes a good commit for now.
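The step from untracked to staged is easier to see in the short form of git status. This is only a sketch in a throwaway repository; before and after are illustrative variable names.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init --quiet

echo "our first change" > file_one.txt
# "??" marks a file git does not track yet.
before=$(git status --short)

git add file_one.txt
# "A" in the first column marks a new file staged for the next commit.
after=$(git status --short)

echo "before add: $before"
echo "after add:  $after"
```

The first column of the short format describes the staging area, the second the working directory, which is why the staged file shows an A followed by a blank.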

A commit consists of a list of files, their contents, author information, and a message describing the commit.
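Once a commit exists, each of these parts can be looked at on its own. The sketch below builds a throwaway repository to do that; the name and email address are placeholders set for that repository only, not a recommendation for your real configuration.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init --quiet
# Placeholder identity, stored in this repository's configuration only.
git config user.name "Example Author"
git config user.email "author@example.com"

echo "our first change" > file_one.txt
git add file_one.txt
git commit --quiet -m "This is our first commit"

files=$(git ls-tree -r --name-only HEAD)    # list of files in the snapshot
content=$(git show HEAD:file_one.txt)       # content of one of those files
author=$(git log -1 --format='%an <%ae>')   # author information
message=$(git log -1 --format='%s')         # first line of the commit message

printf 'files:   %s\ncontent: %s\nauthor:  %s\nmessage: %s\n' \
  "$files" "$content" "$author" "$message"
```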

$ git commit -m "This is our first commit"
$ git show
commit cf0475067740275c6469836ff743855fcf4e4f85 (HEAD -> master)
Author: Tobias Manske <tobias.manske@example.com>
Date:   Wed Apr 14 19:37:00 2021 +0200

    This is our first commit

diff --git a/file_one.txt b/file_one.txt
new file mode 100644
index 0000000..103f365
--- /dev/null
+++ b/file_one.txt
@@ -0,0 +1 @@
+our first change

git show shows us what our last commit did. The author information is filled in with my name and my email address. The commit has the id cf04... and is currently the commit the master branch points to (more on that later). In the bottom part we see a so-called diff between the state before the commit and the new state.

We can obtain more information about the history with the git log command. Without further options, it prints just the top part of git show for every commit in the history. Before we can explore that further, we need some more changes. I won't go into detail here; just repeat what we did before: change something, add it to git, and commit it.
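In case you want to follow along, the "repeat what we did before" part could look roughly like this. The post does not say what exactly was changed or removed, so file_two.txt and the file contents are made up; only the commit messages are taken from the history shown in this post. The identity is a placeholder.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# First commit, as in the walkthrough.
echo "our first change" > file_one.txt
git add file_one.txt
git commit --quiet -m "This is our first commit"

# Second commit: change one file and add another (file_two.txt is hypothetical).
echo "a second change" >> file_one.txt
echo "another file" > file_two.txt
git add file_one.txt file_two.txt
git commit --quiet -m "Changes some more stuff"

# Third commit: "git rm" deletes the file and stages the deletion in one step.
git rm --quiet file_two.txt
git commit --quiet -m "remove a file"

commit_count=$(git rev-list --count HEAD)
echo "commits in history: $commit_count"
```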

$ git log
commit 1849c7644425684cea7039e825168d92df4e6936
Author: Tobias Manske <tobias.manske@example.com>
Date:   Wed Apr 14 19:57:38 2021 +0200

    remove a file

commit 4db54105f028d23339bbd1eb51fb397ab0d45719
Author: Tobias Manske <tobias.manske@example.com>
Date:   Wed Apr 14 19:57:23 2021 +0200

    Changes some more stuff

commit cf0475067740275c6469836ff743855fcf4e4f85
Author: Tobias Manske <tobias.manske@example.com>
Date:   Wed Apr 14 19:37:00 2021 +0200

    This is our first commit

As we can see, this is our history. Looking at it this way can be rather cumbersome, so let's use some options to make it easier to digest:

$ git log --oneline
1849c76 remove a file
4db5410 Changes some more stuff
cf04750 This is our first commit

This is a lot more concise, but it lacks quite a lot of information. I have a shorthand, git lg, in my configuration that shows the following (in color, unlike here):

$ git lg
* 1849c76 - (4 minutes ago) remove a file - Tobias Manske (HEAD -> master)
* 4db5410 - (4 minutes ago) Changes some more stuff - Tobias Manske
* cf04750 - (25 minutes ago) This is our first commit - Tobias Manske

The leftmost column is an abbreviation of the commit hash (that long number we've seen before), which git uses to identify and distinguish commits. The abbreviation gets longer if git ever needs more characters to tell two commits apart.
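git lg is not a built-in command, and the exact definition behind it is not shown in this post. One way to get similar output is a git alias with a custom log format. The format string below is my guess at something comparable, not the original configuration, and the identity is a placeholder. The sketch also checks that an abbreviated hash resolves to the same commit as the full one.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "our first change" > file_one.txt
git add file_one.txt
git commit --quiet -m "This is our first commit"

# Define the alias for this repository only (use --global to have it everywhere).
# %h = abbreviated hash, %cr = relative date, %s = subject,
# %an = author name, %d = names pointing at the commit, such as HEAD
git config alias.lg "log --graph --pretty=format:'%h - (%cr) %s - %an%d'"

git lg
echo   # the custom format does not end with a newline

# The abbreviated hash identifies the same commit as the full hash.
short=$(git rev-parse --short HEAD)
full=$(git rev-parse HEAD)
resolved=$(git rev-parse "$short")
echo "$short resolves to $resolved"
```

Colors are left out on purpose to keep the format string readable; the log format supports color placeholders if you want them.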

Looks complicated, can you explain this again?

I will! With a graphic. Below you can see what happened step-by-step, when we committed.

[Figure: How commits changed our graph]

You can see four columns, which are the four states our repository was in; every commit moves the state one column to the right. In the beginning we had no commits. Then we created our first commit, and git placed the HEAD pointer at it (more on that later).

Each new commit has a parent pointer to its predecessor and becomes the new location of our HEAD. This pattern continues until we have made our last commit.
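The parent pointers can also be inspected from the command line. This is a small sketch in a throwaway repository with three placeholder commits; the commit messages and the identity are made up for the example.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init --quiet
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Three commits in a row, each one changing the same file.
for n in 1 2 3; do
  echo "change $n" >> file_one.txt
  git add file_one.txt
  git commit --quiet -m "commit $n"
done

head_commit=$(git rev-parse HEAD)   # the commit HEAD points to right now
second=$(git rev-parse HEAD~1)      # one step back along the parent pointer
first=$(git rev-parse HEAD~2)       # two steps back

parent_of_head=$(git log -1 --format=%P HEAD)        # parent of the newest commit
parent_of_first=$(git log -1 --format=%P "$first")   # empty: the first commit has no parent

# Print every commit with its parent (abbreviated hashes) and its message.
git log --format='%h  parent: %p  %s'
```

The last command lists the newest commit first; the parent column of the oldest commit stays empty because nothing came before it.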

So how do I access my data?

Until now, we have only stored data in git. That's nice to know, but how can we recover code we deleted later on?

For example: how do we get back our initial version of file_one.txt? How do we look at it? How can we replace our current version with it?