
Git usage guide: understanding working principles of Git

Josh Cena

This article is migrated from the first section of the README.md file in the new member practice commit repo.

You can add a file either through GitHub Desktop, which has a graphical user interface (GUI), or through the command line. You may begin with the GUI, but you will someday move to the command line since it offers better control over your repo. Visual Studio Code users can also try out its built-in source control.

We find it necessary to tell you what you are actually doing at each step instead of having you follow the written instructions mechanically. This especially helps since things rarely go as smoothly as a tutorial expects. This section assumes no prior knowledge of any git operations.

You can understand git as a version control system. It keeps track of how each file has been created, modified, and deleted, and the repository owner can switch to any saved version (commit), just like the save files in a game.

Basic operations: clone, branch, commit, push, pull request

Suppose there's a repository out on GitHub named Hello, created by Alice to host perhaps the most famous piece of code in history. Its directory looks like this:

.
├── Hello.cpp
├── Contributors.txt
└── README.md

// Hello.cpp
#include <iostream>
using namespace std;
int main() {
  cout << "Hello world!" << endl;
}
// Contributors.txt
XiaoLi
// README.md

# Hello world

A piece of C++ code that prints `Hello world!`

View each commit as a snapshot of the entire repo. In practice it's much more lightweight than a complete copy, but in essence it contains all the information at that moment. This version can be viewed as commit C0. (Every commit has a unique hash that's too long to be human-readable, so C0 will suffice.) The master branch currently points to it. A branch can be pictured as a series of commits that form a linear relationship of succession, with the master branch being the "main" branch. Under the hood, a branch is just a pointer to its latest commit.

Git tree 1
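To make the "snapshot plus pointer" idea concrete, here is a minimal Python sketch. It is a toy model, not how Git is actually implemented: the `Commit` class, the `snapshot` helper, and the way the hash is computed are all assumptions made for illustration, and the file contents are shortened.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    files: tuple          # sorted (filename, content) pairs: the snapshot
    parents: tuple = ()   # hashes of the commits this one succeeds

    @property
    def hash(self) -> str:
        # Toy hash over the commit's content; real Git computes it differently.
        raw = repr((self.message, self.files, self.parents)).encode()
        return hashlib.sha1(raw).hexdigest()


def snapshot(files: dict) -> tuple:
    """Freeze a {filename: content} mapping into an immutable snapshot."""
    return tuple(sorted(files.items()))


# C0: the state of the whole repo at one moment (contents shortened).
c0 = Commit("Initial commit", snapshot({
    "Hello.cpp": 'cout << "Hello world!" << endl;',
    "Contributors.txt": "XiaoLi",
    "README.md": "# Hello world",
}))

commits = {c0.hash: c0}           # every commit, looked up by its hash
branches = {"master": c0.hash}    # a branch is only a name -> commit hash

print(len(c0.hash))                            # 40 hex characters
print(commits[branches["master"]].message)     # Initial commit
```

The point of the sketch is the last two dictionaries: the commits hold the content, while a branch holds nothing but a reference to one of them.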

Bob the linguist finds this repo. He finds great interest in it, but his strong objection to the clichéd Hello world! motivates him to make a contribution. To do this, he needs to download this folder (or repository, as people call it) first. This is basically what cloning does.
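Cloning can be pictured as making a full, independent copy of the repository. The following Python sketch models that with a deep copy; the dictionary layout is a hypothetical stand-in for a real repository, and file contents are shortened.

```python
import copy

# Hypothetical stand-in for the repository hosted on GitHub.
origin = {
    "files": {
        "Hello.cpp": 'cout << "Hello world!" << endl;',
        "Contributors.txt": "XiaoLi",
        "README.md": "# Hello world",
    },
    "branches": {"master": "C0"},
}

# "Cloning": Bob gets his own complete copy on his machine.
local = copy.deepcopy(origin)

# Editing the local copy leaves the original untouched.
local["files"]["Hello.cpp"] = 'cout << "Bonjour le monde!" << endl;'

print(origin["files"]["Hello.cpp"])  # cout << "Hello world!" << endl;
print(local["files"]["Hello.cpp"])   # cout << "Bonjour le monde!" << endl;
```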

After he has a complete copy of the code on his own machine, Bob sets out to make edits. He updates all three files:

// Hello.cpp
#include <iostream>
using namespace std;
int main() {
  cout << "Bonjour le monde!" << endl;
}
// Contributors.txt
XiaoLi
XiaoMing
// README.md

# Hello world

A piece of C++ code that prints `Bonjour le monde!`

And now he needs to send it back to GitHub, as his instinct tells him. The most direct way would be to commit straight to master. However, there's a critical problem here: Alice would have zero control over Bob's action. In fact, most public repos (including Computerization's) restrict people from committing directly to master, because such a commit lands without anyone reviewing it. Once the commit is made, Alice would come back and be surprised to find her code turned French. And it's bad practice anyway: in any collaborative repository, never commit directly to master (unless you are resigning in a week). There's another problem with committing to master, which we will see shortly.

So to fix this problem, Bob creates a new branch and names it Bob/change-output-language. We may see this as Bob working on a separate but identical folder, so that nothing he does affects master. This not only clarifies his purpose and identifies him as the author, but also prevents conflict and confusion.

He makes the said changes. But then he remembers another code of honor: one commit should serve only one purpose. Looking back at his changes, he believes that changing the language and adding his name to Contributors.txt should be separate things (the difference is minute here, but in real projects it gets obvious). Therefore he makes two commits called Change output to French and Add Bob's name to Contributors.txt, which are neatly given the hashes C1 and C2. Note that he did not necessarily work on them sequentially, but git treats C2 as a successor of C1 because C2 was committed on top of C1. Now the branch Bob/change-output-language points to the commit C2.

Git tree 2
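The branching and committing steps can be sketched in Python as follows. This is a simplified model under stated assumptions: the `Commit` class and the `commit` and `history` helpers are hypothetical, and the labels C0, C1, and C2 stand in for real hashes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    name: str                          # short label standing in for the hash
    message: str
    parent: Optional["Commit"] = None  # the commit this one succeeds


def history(tip):
    """Walk parent links from a branch tip back to the first commit."""
    names = []
    while tip is not None:
        names.append(tip.name)
        tip = tip.parent
    return names


c0 = Commit("C0", "Initial commit")
branches = {"master": c0}

# Creating a branch copies a pointer, not the files.
branches["Bob/change-output-language"] = branches["master"]


def commit(branch, name, message):
    """Add a commit on top of the branch tip and move the branch to it."""
    new = Commit(name, message, parent=branches[branch])
    branches[branch] = new
    return new


commit("Bob/change-output-language", "C1", "Change output to French")
commit("Bob/change-output-language", "C2", "Add Bob's name to Contributors.txt")

print(history(branches["Bob/change-output-language"]))  # ['C2', 'C1', 'C0']
print(history(branches["master"]))                       # ['C0']
```

Succession here comes purely from the parent link recorded when each commit is made; master never moves, because nothing was committed on it.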

At this point his changes are still local: no one can see them on the GitHub page. So he publishes the branch, pushing it to origin. This uploads the branch, Bob/change-output-language, with all the commits it contains, to the GitHub remote.
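As a rough illustration of what publishing a branch involves, the Python sketch below treats the local repository and the remote as two separate stores, each with its own commits and branch pointers. The dictionary layout and the `push` helper are assumptions made for illustration, not Git's actual transfer protocol.

```python
# Each repository keeps its own commits (commit -> parent) and branch pointers.
local = {
    "commits": {"C0": None, "C1": "C0", "C2": "C1"},
    "branches": {"master": "C0", "Bob/change-output-language": "C2"},
}
origin = {
    "commits": {"C0": None},
    "branches": {"master": "C0"},
}


def push(src, dst, branch):
    """Copy the branch's missing commits to dst, then set dst's pointer."""
    tip = src["branches"][branch]
    node = tip
    # Walk back from the tip until reaching a commit dst already has.
    while node is not None and node not in dst["commits"]:
        dst["commits"][node] = src["commits"][node]
        node = src["commits"][node]
    dst["branches"][branch] = tip


push(local, origin, "Bob/change-output-language")

print(sorted(origin["commits"]))   # ['C0', 'C1', 'C2']
print(origin["branches"])
# {'master': 'C0', 'Bob/change-output-language': 'C2'}
```

Note that master on the remote is untouched: pushing a branch only adds commits and sets that one branch's pointer.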

After he has published the branch, the local branch is linked to its counterpart on GitHub, so any further commits he makes on it can be sent up with another push.

Directly after, he makes a pull request to ask the code owner (a.k.a. Alice) to merge this branch. When a branch is merged, all the changes will now be reflected in master.

This is really simple when there's only one branch to merge: all of its commits are successors of master, which still points to C0, so git simply adds everything to master by moving the master pointer forward from C0 to C2 (a fast-forward):

Git tree 3
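Moving the master pointer forward can be sketched in a few lines of Python. This is a toy model: the `parents` table, the `is_ancestor` check, and the `fast_forward` helper are hypothetical names, and the commit labels stand in for hashes.

```python
# commit -> list of parent commits
parents = {"C0": [], "C1": ["C0"], "C2": ["C1"]}
branches = {"master": "C0", "Bob/change-output-language": "C2"}


def is_ancestor(old, new):
    """True if `old` can be reached from `new` by following parent links."""
    stack, seen = [new], set()
    while stack:
        node = stack.pop()
        if node == old:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def fast_forward(target, source):
    """Move `target` to the tip of `source` if source only adds successors."""
    if not is_ancestor(branches[target], branches[source]):
        raise ValueError("histories diverged; a merge commit is needed")
    branches[target] = branches[source]


fast_forward("master", "Bob/change-output-language")
print(branches["master"])  # C2
```

No new commit is created in this case; only the pointer changes.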

And in fact, the above is all that new git users need to understand. But as future GitHub admins, you deserve to think a little deeper.

Resolving conflicts

So to make things more fun, Cindy the orthodox arrives at the same time as Bob. Outraged by Alice dropping the return 0 in the cpp file, she decides to fix it. Similarly, she creates a new branch Cindy/improve-code-style from master, which at this point is still C0. She changes the files to:

// Hello.cpp
#include <iostream>
using namespace std;
int main() {
  cout << "Hello world!" << endl;
  return 0;
}
// Contributors.txt
XiaoLi
XiaoHong
// README.md

# Hello world

A piece of C++ code that prints `Hello world!`

She publishes the branch and opens a pull request.

Git tree 4

Alice, after her tea break, comes back to find the two pull requests. She happily merges Bob's;

Git tree 5

but Cindy's she cannot merge at the click of a button. GitHub shows a warning, telling her there's a conflict with master. She has to resolve the conflict manually, because git cannot know which version to keep when the two changes were made in parallel. The resolution page shows something like:

// Hello.cpp
#include <iostream>
using namespace std;
int main() {
<<<<<<< master
  cout << "Bonjour le monde!" << endl;
=======
  cout << "Hello world!" << endl;
  return 0;
>>>>>>> Cindy/improve-code-style
}
// Contributors.txt
XiaoLi
<<<<<<< master
XiaoMing
=======
XiaoHong
>>>>>>> Cindy/improve-code-style
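Why some files conflict and others do not can be illustrated with a three-way comparison against the common starting point, C0. The Python sketch below is a deliberate simplification: it compares whole files, whereas Git works on regions within a file, and the `merge_file` helper and the shortened file contents are assumptions made for illustration.

```python
def merge_file(base, ours, theirs):
    """Return (content, conflicted) for one file, compared as a whole."""
    if ours == theirs:      # both sides agree
        return ours, False
    if ours == base:        # only their side changed it
        return theirs, False
    if theirs == base:      # only our side changed it
        return ours, False
    return None, True       # both changed it differently: a human decides


# File contents are shortened stand-ins for the real files.
base = {    # C0, where both branches started
    "Hello.cpp": 'cout << "Hello world!" << endl;',
    "Contributors.txt": "XiaoLi",
    "README.md": "prints `Hello world!`",
}
master = {  # after Bob's branch was merged
    "Hello.cpp": 'cout << "Bonjour le monde!" << endl;',
    "Contributors.txt": "XiaoLi\nXiaoMing",
    "README.md": "prints `Bonjour le monde!`",
}
cindy = {   # tip of Cindy/improve-code-style
    "Hello.cpp": 'cout << "Hello world!" << endl;\nreturn 0;',
    "Contributors.txt": "XiaoLi\nXiaoHong",
    "README.md": "prints `Hello world!`",
}

conflicts = sorted(
    name for name in base
    if merge_file(base[name], master[name], cindy[name])[1]
)
print(conflicts)  # ['Contributors.txt', 'Hello.cpp']
print(merge_file(base["README.md"], master["README.md"], cindy["README.md"]))
# ('prints `Bonjour le monde!`', False)
```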

Note that the README.md file needs no resolution because only Bob made a change to it; git is smart enough to realize that. But the conflicts above need attention. How to resolve them is pretty apparent to humans like Alice, though, so she quickly fixes them and merges Cindy's branch without further conflict. The files are now (C4):

// Hello.cpp
#include <iostream>
using namespace std;
int main() {
  cout << "Bonjour le monde!" << endl;
  return 0;
}
// Contributors.txt
XiaoLi
XiaoMing
XiaoHong
// README.md

# Hello world

A piece of C++ code that prints `Bonjour le monde!`

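When the branch being merged is not a successor of the target branch, as is the case with Cindy's branch and the current master, Git cannot simply move a pointer forward; it creates a new commit that joins the two histories: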

Git tree 6
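A merge commit differs from an ordinary commit in that it has two parents. The following Python sketch shows this under stated assumptions: the label C3 for Cindy's single commit is a guess (the text only names C0, C1, C2, and C4), and the `merge` and `ancestors` helpers are hypothetical.

```python
# commit -> list of parent commits; C3 is assumed to be Cindy's commit
parents = {"C0": [], "C1": ["C0"], "C2": ["C1"], "C3": ["C0"]}
branches = {"master": "C2", "Cindy/improve-code-style": "C3"}


def merge(target, source, new_name):
    """Create a commit with both branch tips as parents; move target to it."""
    parents[new_name] = [branches[target], branches[source]]
    branches[target] = new_name


def ancestors(name):
    """Collect every commit reachable from `name` through parent links."""
    found, stack = set(), list(parents[name])
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents[node])
    return found


merge("master", "Cindy/improve-code-style", "C4")

print(parents["C4"])            # ['C2', 'C3']
print(sorted(ancestors("C4")))  # ['C0', 'C1', 'C2', 'C3']
print(branches["master"])       # C4
```

Because C4 has both C2 and C3 as parents, the work of Bob and Cindy is reachable from master, while Cindy's branch pointer itself stays on C3.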

And Alice, Bob, and Cindy are happy about their successful collaboration over GitHub on an open-source project.

To understand more about git and its tree structure (as already implied above), check this out: Learn Git Branching

Further Reading

To discover what happens when you pull, push, commit, add, or checkout, you can refer to the following sites:

Alternatively, if you don't like reading long texts, you can watch this YouTube video (about 82 minutes long):

Or try the commands out on this visualized webpage: