Areas of Interest, as counted by my cat

Month: September 2021

Learning Git – 0: Introduction

Welcome to yet another Git tutorial, because let’s be honest with ourselves: there aren’t enough of them.

Another name for this series might be “Learning Git: I try things that don’t work, so that you don’t have to”. I’ve had several start-stop relationships with Git over the last few years, and for this most recent effort I decided to go all-out, deep-dive, and then write up my learning experience in the way I wish I could have learned it the first time.

For many folks learning Git, the first command they are introduced to is clone. I’m going to do it differently: In four chapters and four appendices, the last command we’ll talk about is clone – by which time, hopefully, we’ll know exactly what it does and why.

I’m starting a new project, cow-tipper, and I would like to leverage source control. I’m going to use Git. I have my project directory containing my source files, so all I need is a repository.

My experience installing Git can be read here: Appendix A: Installing Git.

The Full Series:

Part 1: The Basics

Part 2: Branching

Part 3: Remotes

Part 4: Remotes II – Reverse the Polarity!

Appendix A: Installing Git

Appendix B: Customizing the Git Message Editor

Appendix C: Petra Rabbit’s Authentication-palooza

Appendix D: What’s the Diff?

Learning Git – 1: The Basics

(Previously: Part 0 – Introduction)

Initializing a local repository

The git init command sets up a local repository in the current folder:

$ cd ~/Projects/cow_tipper
$ git init
Reinitialized existing Git repository in /home/buster/Projects/cow_tipper/.git/

Change history is stored in a .git sub-directory.

Tracking changes, Staging and Committing

Tracked source files

A tracked source file is one that is understood by Git to require a change history to be saved in the repository. Our source directory may contain “tracked” and “untracked” files. We won’t be able to recover change history for an untracked file.

Committing

A commit saves a set of source file changes as a “snapshot” that can be recovered later. Commits are sequential, and have a pointer to their parent snapshot. A text message is saved along with the commit.

Staging

Staging is the process of adding a file to the set of changes that will be included in the next commit operation.

Now I need to stage my source files for the first commit:

$ git add README.md
$ git add cow_tipper.py

The add command does two things:

  • It marks a file as “tracked”;
  • It copies the file into a staging area

Alternatively, I could just add all the files in the directory in one go. (The shell expands the *, so hidden dot-files are skipped; git add . would include those too.)

$ git add *

At this point, all files in the folder are “tracked” and ready to be committed. We can double-check where we are at any time by asking Git using the git status command:

$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>" to unstage)
   new file: README.md
   new file: cow_tipper.py

Nice. Let’s save this (“commit” it) to a recoverable snapshot. We’ll use the -m switch to supply a text message directly on the command-line:

$ git commit -m "First Commit"
*** Please tell me who you are.
Run   git config --global user.email "you@example.com"
      git config --global user.name "Your Name"
to set your account's default identity.
Omit --global to set the identity only in this repository.
fatal: unable to auto-detect email address

Oops! Of course Git needs to know who we are, and even tells us what commands we need to run to do this:

$ git config --global user.email "buster@spacefold.com"
$ git config --global user.name "Buster Kitten"

Now let’s re-try that commit:

$ git commit -m "First commit"
[master c9b7b58] First commit
 2 files changed, 8 insertions(+)
 create mode 100644 README.md
 create mode 100644 cow_tipper.py

Success! Note that we don’t have to use the -m “message” command-line switch. If we just run git commit, the default text editor is launched so that we can edit the message. We can customize which editor is used in the configuration: see Appendix B: Customizing the Git Message Editor.

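If we’re really in a hurry we can Stage+Commit in one command. Note that -a only stages files Git is already tracking (modified or deleted ones); brand-new files still need an explicit git add first: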

$ git commit -a
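Here is a small sketch of that behaviour in a throwaway repository. The directory and the file names tracked.txt and untracked.txt are made up for illustration; the point is only that commit -a picks up the edit to the tracked file and leaves the new file alone.

```shell
# Scratch repository; the path and file names are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "buster@spacefold.com"
git config user.name "Buster Kitten"

echo "moo" > tracked.txt
git add tracked.txt
git commit -q -m "First commit"

echo "moo again" >> tracked.txt   # modify a file Git already tracks
echo "new" > untracked.txt        # create a file Git has never been told about

git commit -q -a -m "Second commit"

# Files recorded in the latest commit, and whatever is still uncommitted.
committed=$(git ls-tree --name-only HEAD)
leftover=$(git status --porcelain)
echo "committed: $committed"
echo "leftover:  $leftover"
```

The new file still shows up as untracked ("??") in the status output until we git add it.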

Speaking of configuration parameters, at any time we can ask Git to show us the configuration it’s working with, using git config:

$ git config --list --show-origin
file:/home/buster/.gitconfig     user.email=buster@spacefold.com
file:/home/buster/.gitconfig     user.name=Buster Kitten
file:.git/config        core.repositoryformatversion=0
file:.git/config        core.filemode=true
file:.git/config        core.bare=false
file:.git/config        core.logallrefupdates=true

Read more about Git configuration here: https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration.

The Git Log

Note that each commit gets an identifying hash value assigned to it. We can see the first 7 characters “c9b7b58” in that commit message output above.

The command git log will list recent commits along with their hash, author, date, and comment. The format of the output is very controllable, for example:

$ git log --pretty=format:"%h - %an, %ar : %s"
e875278 - Buster Kitten, 36 minutes ago : trialling the secondary pump control block
82c55cb - Buster Kitten, 14 hours ago : Working on the next problem.
fc5fea3 - Buster Kitten, 22 hours ago : I just added a line for test purposes.
aaee8e5 - Buster Kitten, 24 hours ago : Removing files we don't need.
c9b7b58 - Buster Kitten, 3 days ago : First commit

I’m not going to go into depth about the Git log, so for further reading I recommend: Git Basics – Viewing the Commit History (The Git Book)

(I’ve also written about git log before: Obtaining a useful log of recent check-ins.)

Reverting

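Git makes it very easy to undo the most recent commit, or an older one. Note that revert does not rewind history: it adds a new commit that reverses the changes introduced by the commit we name.

$ git revert HEAD
$ git revert HEAD~1

If you know the unique portion of the hash code (see the log output above), you can undo the changes made by one explicit commit:

$ git revert 82c55cb

To actually move the branch back to an earlier commit point and discard everything after it, the command is git reset --hard <commit>, which should be used with care.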
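A minimal sketch of what revert does, in a scratch repository (the file name notes.txt is my own invention): after two commits we revert the latest one, and end up with three commits, not one.

```shell
# Scratch repository; the path and file name are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "buster@spacefold.com"
git config user.name "Buster Kitten"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"

echo "two" >> notes.txt
git commit -q -a -m "Add a second line"

# Undo the most recent commit by adding a new commit that reverses it.
# --no-edit accepts the default "Revert ..." message without opening an editor.
git revert --no-edit HEAD >/dev/null

commit_count=$(git rev-list --count HEAD)
contents=$(cat notes.txt)
echo "commits:   $commit_count"
echo "notes.txt: $contents"
```

The file is back to its first state, but the history still records both the change and its reversal.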

Tagging

One important feature of a source control system is to retrieve a known snapshot from the change history. Although we can use the unique hash identifier to retrieve a specific snapshot from the repository, it is easier if we use a Tag to mark important points in the history.

Tags are named pointers to specific commit points in the repository change history. We can retrieve a copy of your tracked source files as they were at the time the tag was created. Typically tags are used to mark release points in a development process, such as “Beta 1”, “Beta 2”, “version 0.0.7”, etc.

Creating a tag

We use the git tag command to label the most recent commit:

$ git tag v0.0.0

Or if we like, we can create an “annotated” tag complete with a text message:

$ git tag -a v0.0.0 -m "This is the initial state of my source code"

For more on tagging, see https://git-scm.com/book/en/v2/Git-Basics-Tagging
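To see the difference between the two kinds of tag, here is a sketch in a scratch repository. The tag names and the single file are assumptions for illustration. As far as I can tell, a plain tag is just a name for a commit, whereas an annotated tag is a small object of its own that carries the message.

```shell
# Scratch repository; the path, file name and tag names are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "buster@spacefold.com"
git config user.name "Buster Kitten"

echo "moo" > cow_tipper.py
git add cow_tipper.py
git commit -q -m "First commit"

git tag v0.0.0                                                 # lightweight tag
git tag -a v0.0.1 -m "This is the initial state of my source code"   # annotated tag

# Ask Git what kind of object each tag name refers to.
light=$(git cat-file -t v0.0.0)
annotated=$(git cat-file -t v0.0.1)
echo "v0.0.0 is a: $light"
echo "v0.0.1 is a: $annotated"
```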

How do I…?

Protect files from being tracked (e.g. .log or .bak files)

git add * is very convenient but if we want to prevent some files from being included in that broad scope, we can leverage the .gitignore file.
See https://www.atlassian.com/git/tutorials/saving-changes/gitignore
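A minimal sketch of a .gitignore at work, in a scratch directory. The patterns and file names are my own examples, and I use git add . here rather than git add * so that the .gitignore file itself (a dot-file) is picked up too.

```shell
# Scratch repository; the path and file names are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Ignore log and backup files anywhere in the project.
printf '%s\n' '*.log' '*.bak' > .gitignore

touch cow_tipper.py debug.log README.md.bak
git add .

# List what actually landed in the staging area.
staged=$(git ls-files)
echo "$staged"
```

Only .gitignore and cow_tipper.py are staged; the .log and .bak files are left alone.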

Change a tracked file name

The mv command both renames a file in the file system and stages the change for the next commit:

$ git mv <old> <new>

Delete a tracked file

The rm command stages a file for removal, and also deletes the file in the file system:

$ git rm <file>

Unstage a file added incorrectly

To unstage a file, either of these commands works. “restore --staged” is the newer way:

$ git reset HEAD <file>
$ git restore --staged <file>

Replace a staged file with a more up-to-date version

If you make a second edit to a file after you’ve already staged it, just stage it again to replace the staged copy:

$ git add <file>

Revert just one file

To discard changes in the working directory, either of these commands work:

$ git checkout -- <file>
$ git restore <file> 

Delete multiple files using a wildcard spec

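This command removes the matching files from the staging area, so that Git stops tracking them, but leaves them on disk (drop --cached to delete them from the file system as well). The backslash escapes the wildcard character so that Git, rather than the shell, expands it: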

$ git rm --cached \*.log

Find out what tags I have

$ git tag
v0.0.0
v0.0.1

Next up, Part 2: Branching.

Learning Git – 2: Branching

(Previously: Part 1 – The Basics)

Remember that a Git repository is a sequence of commits, where each commit includes a pointer to its “parent”.

A branch in Git is just a pointer to one of those commits. There is a default branch typically called “master”. As we make commits, the master branch pointer moves forward along the sequence.

We can create a second branch pointer named, say, “secundo”:

$ git branch secundo

Now we have a new named pointer (branch), which currently also points to the same commit as “master”.

Switching branches

We can switch to this branch, and continue making commits on it:

$ git checkout secundo
$ git switch secundo

Either of these commands will change the current branch to be “secundo” instead of “master”. switch is the more recent syntax and I think it is clearer in intent.

As we make more commits, the “secundo” branch pointer advances to keep up. Meanwhile, “master” remains unchanged. We can switch back to it:

$ git switch master

When we do this, the source files revert to what they were at the commit point that the “master” branch is pointing to. Any commits made now will advance the “master” branch pointer independently from the “secundo” branch: they will be pointing to different commit points in the repository history tree.

Note: Git uses the name “HEAD” for the commit that the source directory is currently based on. HEAD normally points at the current branch, and so it advances as commits are made on that branch.

So, when we switch back to “secundo”, HEAD is moving to that branch, and (naturally) the source files are restored to reflect the latest state of “secundo”.

If you have files open in an editor, be alert for “file has changed outside the editor; reload?” warnings. And if your editor isn’t smart enough to warn you about file system changes, I recommend you find another editor.

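Important: uncommitted changes are not tied to a branch. When you switch, Git carries staged files and working-tree edits across if they don’t clash with the branch being switched to, and refuses to switch at all if they would be overwritten. Commit your work first if you want it to stay behind on the branch where you made it.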
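A quick experiment in a scratch repository shows what happens to an uncommitted edit across a switch, in the simple case where the edit does not clash with anything on the other branch. The path and file contents are made up for illustration.

```shell
# Scratch repository; the path and file contents are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "buster@spacefold.com"
git config user.name "Buster Kitten"

echo "moo" > cow_tipper.py
git add cow_tipper.py
git commit -q -m "First commit"
git branch secundo

echo "# work in progress" >> cow_tipper.py   # an edit we have not committed

git switch -q secundo

# Which branch are we on now, and is the edit still in the working tree?
branch=$(git rev-parse --abbrev-ref HEAD)
after_switch=$(git status --porcelain)
echo "branch: $branch"
echo "status: $after_switch"
```

The status line still reports cow_tipper.py as modified: the edit came along for the ride.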

Branch management

Creating branches in Git is a very low impact operation. We can create and switch to a new branch in one step:

$ git switch -c temp_stuff

Delete it when we’re done experimenting (you can’t delete the currently checked-out branch):

$ git switch master
$ git branch -d temp_stuff

Or rename it, if it turns out our experiments are worth keeping for later:

$ git branch --move temp_stuff keep_stuff

Merging

So, I’ve made some changes in my “secundo” branch and I want to keep them and move them into the “master” branch. I could do this manually but one of the powerful features of using Git is that it can do this for us: Merging combines source file changes made in one branch with existing and possibly different changes made in another branch.

First, we need to switch to the target branch, and then perform the merge:

$ git switch master
$ git merge secundo
Updating 82c55cb..fba73e5
Fast-forward
 README.md     |  4 ++--
 cow_tipper.py | 14 +++++++++++++-
 2 files changed, 15 insertions(+), 3 deletions(-)

The source files in the directory have been updated, and contain the changes from the “secundo” branch. We end up with changes made in both branches.

I could now delete the “secundo” branch if I wanted.

Resolving conflicts

What happens if different changes are made to the same section of code in both branches? Let’s try it:

$ git merge secundo
Auto-merging cow_tipper.py
CONFLICT (content): Merge conflict in cow_tipper.py
Automatic merge failed; fix conflicts and then commit the result.

We can get some more advice from git status:

$ git status
On branch master 
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge) 

Unmerged paths: 
  (use "git add <file>..." to mark resolution) 
     both modified: cow_tipper.py 

no changes added to commit (use "git add" and/or "git commit -a")

Git is telling us that it couldn’t perform the merge automatically, and that we’ll have to correct it manually. Here’s what cow_tipper.py looks like in the “master” branch now:

<<<<<<< HEAD
output_text = "Cow Tipping For Beginners"
print( output_text )
=======
first_cut = "Cow Tipping: A Beginners Guide"
print( first_cut )
>>>>>>> secundo

I’ll have to decide which variable name is best, and which output text string is correct, and make the edits, and then commit the changes into “master”.
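For anyone who wants to reproduce the whole cycle, here is a sketch that manufactures a conflict in a scratch repository and then resolves it. The one-line file contents are simplified stand-ins for the real cow_tipper.py, and which version wins is an arbitrary choice made for the example.

```shell
# Scratch repository; the path and file contents are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "buster@spacefold.com"
git config user.name "Buster Kitten"

echo 'print("Cow Tipping")' > cow_tipper.py
git add cow_tipper.py
git commit -q -m "First commit"
trunk=$(git rev-parse --abbrev-ref HEAD)   # "master" or "main", depending on configuration

# Change the same line differently on each branch.
git switch -q -c secundo
echo 'print("Cow Tipping: A Beginners Guide")' > cow_tipper.py
git commit -q -a -m "Title from secundo"

git switch -q "$trunk"
echo 'print("Cow Tipping For Beginners")' > cow_tipper.py
git commit -q -a -m "Title from the default branch"

# The merge stops with a conflict, so a non-zero exit status is expected here.
git merge secundo >/dev/null 2>&1 || echo "merge stopped with conflicts"
unmerged=$(git diff --name-only --diff-filter=U)
echo "unmerged: $unmerged"

# Resolve by hand: write the version we want, stage it, then commit to conclude the merge.
echo 'print("Cow Tipping: A Beginners Guide")' > cow_tipper.py
git add cow_tipper.py
git commit -q -m "Merge branch 'secundo'"

# A merge commit has two parents: the line below prints the commit plus its parents.
words=$(git rev-list --parents -n 1 HEAD | wc -w)
echo "hashes on the merge commit line: $words"
```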

How do I…?

Return to the previously checked-out branch

$ git switch -

Retrieve files by tag

Probably the best way is to create a temporary branch that starts at the tag. (Checking out the tag directly also works, but it leaves us in a “detached HEAD” state, not on any branch.)

$ git switch -c tmp_v0_0_0 v0.0.0

Further reading

There’s a whole chapter on this and it is worth digging into:

That’s all for this chapter. Next: Part 3: Remotes.

Learning Git – 3: Remotes

(Previously: Part 2: Branching)

Having Git providing a source control repository for local development is great, but there’s more. Git repositories can also reference other remote repositories out on the network. The network might be our corporate LAN, or perhaps the Internet.

Remote references allow us to get changes made by other developers. They also allow us to share our changes with other developers.

Two popular repository vendors are GitHub and Atlassian BitBucket. Either of them will allow us to set up our own public or private Git repositories for free.

Some conceptual stuff

What can we do with remote references?

  • We can fetch the latest change history from the remote repository;
  • We can reference the state of remote branches through remote-tracking branches, named “alias/name”, e.g. “origin/master”;
  • We can create a local tracking branch that replicates the contents of a remote branch;
  • We can merge the changes into our local tracking branch;
  • We can make local source code edits in a tracking branch; and push the committed changes to the remote repository.

I found the terminology of “remote-tracking branches” versus “tracking branches” confusing at first. Let’s re-iterate:

  • A branch is a pointer to a specific commit in the change history;
  • A remote-tracking branch represents the last known state of a remote branch (updated whenever we fetch from, or push to, the server);
  • A tracking branch is a local branch checked out from a remote branch (or “upstream branch”).

I’m going to move my local repository into a remote, shareable version, and then pretend to be a second developer collaborating on the project from a different machine.

I’m going to do it step-by-step, with no shortcuts.

Creating a remote repository on GitHub

I’ve already created a GitHub account, so I log in and create a new empty repository called cow-tipper.

CAUTION: GitHub prompts us to initialize the repository with README, .gitignore, and license files. Don’t be tempted to do this! We require the repository to be empty in order to upload our local repository history into it. Leave the checkboxes un-ticked.

Another thing to note is that GitHub uses a default branch name of “main” instead of the traditional “master”. We could have changed it back at the time of repository creation, but let’s leave it as-is for now.

Remote URLs and Protocols

A Git repository can store a remote URL that represents a remote repository. Usually, the URL will use one of two common network protocols: HTTPS or SSH. Either works well but each has its own quirks of user authentication.

GitHub is helpful and allows us to select which protocol we want to use, showing us the appropriate URL, along with some commands for uploading a local repo:

  • SSH: git@github.com:buster-kitten/cow-tipper.git
  • HTTPS: https://github.com/buster-kitten/cow-tipper.git

Connecting our repository to the remote one

Here’s what we need to do:

  • Add a remote reference to the new empty GitHub repository in our local repository;
  • Redefine the default branch name (“master”) so that it matches the remote one (“main”);
  • Initialize the remote repository with our local repo’s change history

Let’s double-check where we are:

$ cd ~/Projects/cow_tipper 
$ git switch master 
$ git status
On branch master
nothing to commit, working tree clean

Add remote reference

Let’s add a reference to our remote repository, named “skippy”, to our local one, using the git remote command:

$ cd ~/Projects/cow_tipper
$ git remote add skippy git@github.com:buster-kitten/cow-tipper.git

Most of the time in literature you’ll see remote references use the name “origin”, but I want to underline the fact that “origin” is not a magic word, it is just a label, and could be anything. Ours is “skippy”.

We can see what remote references our local repository knows about:

$ git remote --verbose
skippy git@github.com:buster-kitten/cow-tipper.git (fetch) 
skippy git@github.com:buster-kitten/cow-tipper.git (push)

…but that’s not important right now.

Synchronize the default branch name

Now we rename the local repository’s default branch from “master” to “main”:

$ git branch --move master main
$ git status 
On branch main
nothing to commit, working tree clean

Two observations:

  • It is not required that the local branch name match the tracked remote branch name, but I think you’ll agree that it is a good idea to reduce confusion.
  • If we had created our local repository to use “main” as the name of the default branch, this rename step would not have been necessary, obviously.

Make it a tracking branch (attempt 1)

$ git branch --set-upstream-to skippy/main
error: the requested upstream branch 'skippy/main' does not exist
hint: 
hint: If you are planning on basing your work on an upstream
hint: branch that already exists at the remote, you may need to
hint: run "git fetch" to retrieve it.
hint: 
hint: If you are planning to push out a new local branch that
hint: will track its remote counterpart, you may want to use
hint: "git push -u" to set the upstream config as you push.

My first instinct was “oh, of course skippy/main doesn’t exist – we don’t have any remote-tracking branches until we fetch from the server”.

This is true, but not appropriate in our situation: there’s no actual history to fetch from our brand-new empty remote repository. It has no commit history, no branches. We could perform a Fetch, but it wouldn’t help.

Let’s try it anyway:

Attempting to fetch the remote repository commit history

A fetch operation will need to authenticate against the remote server, which doesn’t know who we are. We’ll get a “Permission denied (publickey)” error. We need to authenticate ourselves over SSH by creating a public+private key pair and uploading our public key to the server.

For more detail on authentication with remote repositories,
see Appendix C: Petra Rabbit’s Authentication-palooza.

Having set up my RSA key pair and uploaded my public key to GitHub, I’ll try the fetch:

$ git fetch skippy

That took a few seconds before displaying the prompt, so I think this means it worked. No error, anyway. Now let’s try setting the upstream branch again:

$ git branch --set-upstream-to skippy/main
error: the requested upstream branch 'skippy/main' does not exist

Well, I said fetching wouldn’t help. The second hint that Git echoes to the console is more relevant:

hint: If you are planning to push out a new local branch that
hint: will track its remote counterpart, you may want to use
hint: "git push -u" to set the upstream config as you push.

Okay. Because no branches exist in our target empty repository, we have to push from our local repo in order to create them remotely.

Push to create the remote branch

This is a one-time thing: we push our work to the empty remote repository:

$ git push skippy main:main

Further reading on this, see: https://stackoverflow.com/questions/1519006/how-do-you-create-a-remote-git-branch/1519032#1519032

If you’re following along, but omitted the attempt to fetch, then you might still get the “permission denied (publickey)” error at this point. Ensure your public key is uploaded to your GitHub account.

$ git push skippy main:main
Enumerating objects: 44, done.
Counting objects: 100% (44/44), done.
Delta compression using up to 8 threads
Compressing objects: 100% (41/41), done.
Writing objects: 100% (44/44), 4.53 KiB | 928.00 KiB/s, done.
Total 44 (delta 8), reused 0 (delta 0)
remote: Resolving deltas: 100% (8/8), done.
To github.com:buster-kitten/cow-tipper.git
 * [new branch]      main -> main

At this point, we can go to the GitHub site and select our cow-tipper repository and see that there are now two files, a branch “main”, and 15 commits. Nice! We’ve successfully uploaded our local repository into a remote one.
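The hint’s suggestion, git push -u, creates the remote branch and records the upstream in the same step. Here is a sketch of that, with a local bare repository standing in for the empty GitHub one so it can be tried without a network or SSH keys. The paths and the single file are assumptions for illustration.

```shell
# Everything lives in a scratch directory; paths and file names are illustrative assumptions.
work=$(mktemp -d)
cd "$work"

# A local bare repository stands in for the empty remote on GitHub.
git init -q --bare remote.git

git init -q local
cd local
git config user.email "buster@spacefold.com"
git config user.name "Buster Kitten"
echo "moo" > cow_tipper.py
git add cow_tipper.py
git commit -q -m "First commit"
git branch -M main                 # make sure the branch is called "main"

git remote add skippy "$work/remote.git"

# -u (--set-upstream) pushes the branch AND records skippy/main as its upstream.
git push -q -u skippy main

upstream=$(git rev-parse --abbrev-ref 'main@{upstream}')
echo "main tracks: $upstream"
```

With the upstream recorded, a bare git push or git pull on “main” knows where to go without being told.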

Let’s recap:

  • we have a remote called “skippy”
  • Our “master” branch was renamed “main” and pushed to the remote, where it appears as “skippy/main”
  • It is not a tracking branch yet: we pushed without the -u switch, so no upstream has been recorded for it
  • All our locally-created source code change history is available on the remote repository on GitHub, available for cloning.

Remote Workflow

Now that our repository is replicated on the remote server, it is available to other developers. I’ve sent out invites using the tools available on GitHub.com, and I’m pretty sure there’s been some activity.

Fetch

Now that we’re working with changes from other developers, we need to refresh our local remote-tracking branches with any new changes they may have saved to the server.

The fetch command retrieves from the remote repository all the source file changes that we don’t already have in our local repo:

$ git fetch skippy
remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (10/10), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 7 (delta 3), reused 7 (delta 3), pack-reused 0
Unpacking objects: 100% (7/7), 777 bytes | 388.00 KiB/s, done.
From github.com:buster-kitten/cow-tipper
   84b9985..5e9ae78  main       -> skippy/main

Fetch downloads the changes into our remote-tracking branch. Our “main” branch is unchanged at this point:

$ git log --pretty="format:%h %ce %ci %d %n        %s" -1 
84b9985 buster@spacefold.com 2021-09-07 15:40:32 -0700  (HEAD -> main) 
        Merge branch 'secundo'

Yeah, that’s right – the last code change I made was to merge in my “secundo” branch. However we can ask git to show the log for the remote-tracking branch:

$ git log --pretty="format:%h %ce %ci %d %n        %s" -3 skippy/main
5e9ae78 zach@litterbox.com 2021-09-11 20:25:34 -0700  (skippy/main) 
        Zach The Cat was here and added some code
48e6c91 petra@mcgregor_garden.co.uk 2021-09-11 17:54:59 -0700  
        Additional description about stuff to do
84b9985 buster@spacefold.com 2021-09-07 15:40:32 -0700  (HEAD -> main) 
        Merge branch 'secundo'

We see Zach and Petra have been busy, and committed a couple of changes.

Merge

Let’s merge their changes into our local branch:

$ git merge skippy/main
Updating 84b9985..5e9ae78
Fast-forward
 README.md     | 3 ++-
 cow_tipper.py | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

No conflicts, that’s good. Now our local source files on our file system (in our “main” branch) are up-to-date with the latest changes from the remote repo.

The Next Day…

This morning I’ve made some changes to the code: I moved a function into a separate module. I’ve tested my code change, and I’m ready to commit.

But I suspect there may be some changes from my offshore developer. I’ll fetch and check the remote log:

$ git fetch skippy 
[..] 
$ git log --pretty="format:%h %ce %ci %d %n %s" -2 skippy/main
cee496c petra@mcgregor_garden.co.uk 2021-09-13 17:00:39 -0700 (skippy/main)
        Testing negative tax rates 
5e9ae78 zach@litterbox.com 2021-09-11 20:25:34 -0700 
        Zach The Cat was here and added some code

Yup, Petra added a test. What should we do now? Should I:

  • commit my changes locally
  • merge from remote and deal with potential conflicts

Or:

  • stash my current working tree
  • merge from remote
  • un-stash and commit (and deal with potential conflicts)

(Aside: I haven’t talked about stashing in this tutorial yet, and it is already epic in size. So go here if you’re curious: https://git-scm.com/book/en/v2/Git-Tools-Stashing-and-Cleaning.)
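For the curious, the second strategy can be sketched in a scratch repository like this. To keep it self-contained there is no real remote: a local branch that I have named “incoming” stands in for the fetched “skippy/main”, and the file contents are invented.

```shell
# Scratch repository; the path, branch name "incoming" and file contents are assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "buster@spacefold.com"
git config user.name "Buster Kitten"

printf 'moo\n' > cow_tipper.py
printf 'notes\n' > README.md
git add .
git commit -q -m "First commit"
trunk=$(git rev-parse --abbrev-ref HEAD)

# "incoming" plays the part of the remote-tracking branch with a colleague's commit on it.
git switch -q -c incoming
printf 'more notes\n' >> README.md
git commit -q -a -m "Additional description"
git switch -q "$trunk"

printf 'work in progress\n' >> cow_tipper.py   # our uncommitted edit

git stash push -q        # shelve the edit; the working tree is clean again
git merge -q incoming    # bring in the colleague's changes (a fast-forward here)
git stash pop -q         # re-apply the shelved edit on top

state=$(git status --porcelain)
echo "status after un-stashing: $state"
```

In this sketch the two sets of changes touch different files, so the pop applies cleanly. If they overlapped, the conflict would surface at the stash pop step instead of at the merge.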

What’s best? I’m not sure. I’m going to go with the first strategy (commit, then merge):

$ git add *
$ git commit -m "moving function to library module"
$ git log --pretty="format:%h %ce %ci %d %n        %s" -1 
3e60b54 buster@spacefold.com 2021-09-15 17:14:39 -0700  (HEAD -> main) 
        moving function to library module

No errors, and all our work is safe in commit “3e60b54” in branch “main”.

What horrors await us in the remote branch?

$ git diff main skippy/main

Aside: To avoid derailing the narrative, I’ve put the introduction and examples of DIFF into Appendix D: What’s the Diff? Check it out if you wish.

This gets messy, because the diff shows us what we need to do to make the source in “main” look like the source in “skippy/main”, but we don’t want to accept all those changes, because it would undo the work we did this morning. We have to be selective.

Worst case, we can roll back to our “safe” commit using git reset --hard 3e60b54. Let’s merge and see what we get:

$ git merge skippy/main
Auto-merging cow_tipper.py
CONFLICT (content): Merge conflict in cow_tipper.py
Automatic merge failed; fix conflicts and then commit the result.

The results are actually pretty good. We do need to manually resolve (i.e., edit) cow_tipper.py because blindly accepting either the remote or local version would be incorrect: I need to keep Petra’s addition (from the remote version), whilst changing it to invoke the function from the new library I added (in the local version).

You do not need to see the code – it’s a trivial and contrived example.

Having resolved the conflicts, and tested it (hey, Petra had a bug in her code!), we can commit and push:

$ git add cow_tipper.py
$ git commit -m "resolving merge conflicts; fixing Petra's bug"
$ git push skippy
fatal: The current branch main has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream skippy main

Uh.. what? Has it forgotten? Do we have to remind it? Okay:

$ git branch --set-upstream-to skippy/main
Branch 'main' set up to track remote branch 'main' from 'skippy'.
$ git push skippy
Enumerating objects: 14, done.
Counting objects: 100% (14/14), done.
Delta compression using up to 8 threads
Compressing objects: 100% (9/9), done.
Writing objects: 100% (10/10), 1.03 KiB | 1.03 MiB/s, done.
Total 10 (delta 4), reused 0 (delta 0)
remote: Resolving deltas: 100% (4/4), completed with 1 local object.
To github.com:buster-kitten/cow-tipper.git
   cee496c..d9ac925  main -> main

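Git hadn’t forgotten anything: our one-time git push skippy main:main created the remote branch but, without the -u (--set-upstream) switch, it never recorded “skippy/main” as the upstream of “main”. With that set, we’re back. The remote repo has our combined changes.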

Pull

The Git pull command combines fetch and merge: it fetches from the remote, then merges the upstream remote-tracking branch into the current branch. As Petra enjoys her first coffee of the day, she executes:

$ git pull origin

(No, that’s not a typo. In Petra’s local repository, the remote reference is named “origin”. Only in my local repository is the remote ref called “skippy”.)

And Petra now has the latest source files in her local “main” branch.

How do I…?

Show branches with upstream remote-tracking branches:

$ git branch -vv

Show detailed information about remotes and remote-tracking branches:

$ git remote show <remote>

Further reading:

That’s all for this section. Next, Part 4: Remotes II – Reverse the Polarity!

Learning Git – 4: Remotes II – Reverse the Polarity!

(Previously: Part 3: Remotes)

In the previous chapter/section, we learned about remotes and pushed our local source files up to a new, empty, remote repository for other developers to collaborate.

This time, we’re going to do the opposite: Create an empty local repository and set it up so that we can work on an existing remote repo, our “cow-tipper” project on GitHub.

Building a local repository

I’m pretty sure I know what we need to do:

  • Initialize a new empty repository with git init
  • ensure the default branch is “main” (instead of “master”)
  • add a <remote> referencing git@github.com:buster-kitten/cow-tipper.git
  • fetch from that remote to populate the remote-tracking branch “<remote>/main”
  • set the upstream branch on local “main” to be the “<remote>/main”

In a new directory:

$ git init --initial-branch=main
error: unknown option `initial-branch'

Wut? It turns out that the initial-branch option is new in Git version 2.28, and here in Linux I’m using 2.25. We’ll have to use an alternative method: I’ll create it with the default, and then re-name it:

$ git init
Initialized empty Git repository in /home/buster/Projects/cow_tupper/.git/
$ git branch -m master main
error: refname refs/heads/master not found
fatal: Branch rename failed

Wut!? “master” isn’t real? But:

$ git status
On branch master
No commits yet
nothing to commit (create/copy files and use "git add" to track)

Well, then, let’s just create a “main” branch directly and worry about deleting “master” later:

$ git checkout -b main
Switched to a new branch 'main'

And add the remote, which we’ll call “pluto” for fun. It’s pretty remote. Then we can fetch:

$ git remote add pluto git@github.com:buster-kitten/cow-tipper.git
$ git fetch pluto
remote: Enumerating objects: 68, done.
remote: Counting objects: 100% (68/68), done.
remote: Compressing objects: 100% (46/46), done.
remote: Total 68 (delta 18), reused 68 (delta 18), pack-reused 0
Unpacking objects: 100% (68/68), 6.90 KiB | 441.00 KiB/s, done.
From github.com:buster-kitten/cow-tipper
 * [new branch]      main       -> pluto/main

Cool. Now we can make “main” a tracking branch, right?

$ git branch --set-upstream-to pluto/main
fatal: branch 'main' does not exist

Wut!? Now “main” isn’t real?

$ git branch --verbose
$

No results? But:

$ git status
On branch main
No commits yet
nothing to commit (create/copy files and use "git add" to track)

So we’re “on branch main”, but “branch ‘main’ doesn’t exist”. This is confusing… some might call it a “bug”, others “by design”. It is, for sure, unfortunate.

Okay, let’s not waste any more time: If you remember from earlier, “A branch in Git is just a pointer to a specific commit.” But we don’t have any commits yet. However, the repository metadata knows that the branch will be called “main”, just as soon as the first commit is made. That’s why we get the mixed messages.

So what we actually need to do at this point, is this:

$ git checkout main
Branch 'main' set up to track remote branch 'main' from 'pluto'.
Already on 'main'
$ git status
On branch main
Your branch is up to date with 'pluto/main'.

nothing to commit, working tree clean
$ git branch --verbose
* main 92502d2 Added header comment lines

Here’s what happened, and I’m going to quote and paraphrase StackOverflow author torek directly because they wrote a great explanation:

“You had a repository that is in a peculiar state: it has no commits, so it has no branches. At the same time, it does have a current branch, which is master. In other words, the current branch is a branch that does not exist.

Whenever you run git checkout <name> and there is no branch named <name>, Git checks to see if there is exactly one remote-tracking branch named <remote>/<name>. If so, Git creates a new branch <name> that has <remote>/<name> as its upstream branch.”

torek @ stack overflow

This is a bit funky, and I really am not a fan of received wisdom, but it explains what we observed above.

Let’s recap the necessary commands, but skip the stuff that didn’t work, the duplicate commands, and do things in the optimal sequence:

$ git init
$ git remote add pluto git@github.com:buster-kitten/cow-tipper.git
$ git fetch pluto
remote: Enumerating objects: 68, done.
remote: Counting objects: 100% (68/68), done.
remote: Compressing objects: 100% (46/46), done.
remote: Total 68 (delta 18), reused 68 (delta 18), pack-reused 0
Unpacking objects: 100% (68/68), 6.90 KiB | 415.00 KiB/s, done.
From github.com:buster-kitten/cow-tipper
 * [new branch]      main       -> pluto/main
$ git checkout -b main
Switched to a new branch 'main'
$ ls -l
total 0

No files? Where are my files?

$ git branch --verbose
$

That returns nothing. Huh. Try checking out the branch a second time:

$ git checkout main
Branch 'main' set up to track remote branch 'main' from 'pluto'.
Already on 'main'
$ git branch -v
* main 92502d2 Added header comment lines
$ ls -l
total 12
-rw-rw-r-- 1 buster buster 152 Sep 17 13:22 cow_lib.py
-rw-rw-r-- 1 buster buster 533 Sep 17 13:22 cow_tipper.py
-rw-rw-r-- 1 buster buster 265 Sep 17 13:22 README.md

That’s better.

I have not found an explanation of why we needed to do a second git checkout in order to get our working tree populated. Suggestions on a postcard, please.
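One possible explanation, going by torek's description above: with no commits in the repository, git checkout -b main has nothing to check out, so it only renames the unborn branch. The plain git checkout main is the step that notices pluto/main and creates the real branch from it. If that is right, the -b step can simply be dropped. The sketch below tests that guess against a local stand-in for the GitHub repository; all paths and the demo identity are invented for the test.

```shell
# Throwaway identity so the demo commit works on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
work="$(mktemp -d)"

# Stand-in for the remote: a local repository with one commit on "main".
git init -q "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/main
echo "# cow-tipper" > "$work/server/README.md"
git -C "$work/server" add README.md
git -C "$work/server" commit -q -m "Initial commit"

# The recap, minus "checkout -b":
git init -q "$work/local"
git -C "$work/local" symbolic-ref HEAD refs/heads/main
git -C "$work/local" remote add pluto "$work/server"
git -C "$work/local" fetch -q pluto
# No local "main" exists yet, so Git creates it from pluto/main.
git -C "$work/local" checkout -q main

ls "$work/local"
upstream="$(git -C "$work/local" rev-parse --abbrev-ref 'main@{upstream}')"
echo "main tracks: $upstream"
```

Here a single checkout populates the working tree and sets up tracking.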

Attack of the Clones

None of that stuff really matters because you’re much more likely to use a Git command that bundles all those steps together into one:

Clone

$ git clone <url>

The Git clone command replicates a remote repository into a brand new local repository on your local file system, complete with remotes, remote-tracking branches, and a working tree of source files. Of course, it makes some choices for you:

  • creates a local repository
  • creates a remote called “origin”
  • creates a remote-tracking branch for each branch on the remote (e.g. “origin/main”, or “origin/master” in older repositories)
  • fetches the latest changes
  • creates a local branch (e.g. “main”) tracking the remote’s default branch
  • checks out that branch (i.e. populates the working tree from it)

Example:

$ git clone git@github.com:buster-kitten/cow-tipper.git
Cloning into 'cow-tipper'...
remote: Enumerating objects: 68, done.
remote: Counting objects: 100% (68/68), done.
remote: Compressing objects: 100% (46/46), done.
remote: Total 68 (delta 18), reused 68 (delta 18), pack-reused 0
Receiving objects: 100% (68/68), 6.92 KiB | 3.46 MiB/s, done.
Resolving deltas: 100% (18/18), done.
$ cd cow-tipper
$ ls -l
total 12
-rw-rw-r-- 1 buster buster 152 Sep 17 13:34 cow_lib.py
-rw-rw-r-- 1 buster buster 533 Sep 17 13:34 cow_tipper.py
-rw-rw-r-- 1 buster buster 265 Sep 17 13:34 README.md

And that’s why clone is typically the first Git command you’re likely to encounter, even before you learn about add, commit, push, fetch, merge, and pull.
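The choices clone makes are only defaults. As a sketch, the --origin option names the remote something other than "origin", which gets us back to the "pluto" setup from earlier in one command. The server here is a local stand-in repository, and its paths and demo identity are assumptions for the test.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
work="$(mktemp -d)"

# Stand-in for the remote repository, with one commit on "main".
git init -q "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/main
echo "# cow-tipper" > "$work/server/README.md"
git -C "$work/server" add README.md
git -C "$work/server" commit -q -m "Initial commit"

# Clone, but call the remote "pluto" instead of "origin".
git clone -q --origin pluto "$work/server" "$work/cow-tipper"

remotes="$(git -C "$work/cow-tipper" remote)"
branch="$(git -C "$work/cow-tipper" symbolic-ref --short HEAD)"
echo "remote: $remotes, local branch: $branch"
git -C "$work/cow-tipper" branch -r    # lists the remote-tracking branches
```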

How do I…?

Show more information about a remote

$ git remote show <alias>

Delete a remote branch

$ git push <remote> --delete <branch>

This deletes the pointer representing <branch> on the <remote> server.
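To watch it happen without touching a real server, here is a sketch that uses a local bare repository as the remote. The branch name "experiment", the paths, and the demo identity are all invented for the test.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
work="$(mktemp -d)"

# A bare repository plays the part of the server.
git init -q --bare "$work/server.git"

git init -q "$work/local"
git -C "$work/local" symbolic-ref HEAD refs/heads/main
echo "moo" > "$work/local/cow.txt"
git -C "$work/local" add cow.txt
git -C "$work/local" commit -q -m "Initial commit"
git -C "$work/local" remote add pluto "$work/server.git"

# Publish "main", plus a second branch on the server pointing at the same commit.
git -C "$work/local" push -q pluto main
git -C "$work/local" push -q pluto main:refs/heads/experiment

# Delete the branch on the server, then ask the server what is left.
git -C "$work/local" push -q pluto --delete experiment
heads_after="$(git -C "$work/local" ls-remote --heads pluto)"
echo "$heads_after"
```

Only the pointer goes away; the commits it pointed to are still reachable from "main".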

Delete a remote reference

The remote remove command deletes all remote-tracking branches and config settings related to the remote:

$ git remote remove <alias>
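A quick sketch to confirm what gets removed and what stays, again with a local stand-in for the server (the paths and demo identity are assumptions for the test):

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
work="$(mktemp -d)"

git init -q "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/main
echo "moo" > "$work/server/cow.txt"
git -C "$work/server" add cow.txt
git -C "$work/server" commit -q -m "Initial commit"
git clone -q --origin pluto "$work/server" "$work/local"

tracking_before="$(git -C "$work/local" branch -r)"   # pluto/main and friends
git -C "$work/local" remote remove pluto
remotes_after="$(git -C "$work/local" remote)"        # empty
tracking_after="$(git -C "$work/local" branch -r)"    # empty

# The local branch and its files are untouched.
git -C "$work/local" branch --verbose
```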

And that’s all for now. I hope you enjoyed following along and maybe even found it useful. Cheers.

Learning Git – Appendix A: Installing

Linux

My Linux Mint box seems to have a default version of Git installed:

$ git --version
git version 2.17.1

However, we can get the latest version (for Mint/Ubuntu/Debian):

$ sudo add-apt-repository ppa:git-core/ppa
You are about to add the following PPA:
The most current stable version of Git for Ubuntu.
For release candidates, go to https://launchpad.net/~git-core/+archive/candidate .
More info: https://launchpad.net/~git-core/+archive/ubuntu/ppa
Press Enter to continue or Ctrl+C to cancel
Executing: /tmp/apt-key-gpghome.RVLrIDy8yS/gpg.1.sh --keyserver hkps://keyserver.ubuntu.com:443 --recv-keys E1DD270288B4E6030699E45FA1715D88E1DF1F24
gpg: key A1715D88E1DF1F24: public key "Launchpad PPA for Ubuntu Git Maintainers" imported
gpg: Total number processed: 1
gpg: imported: 1

And then:

$ sudo apt-get update
Ign:1 http://packages.linuxmint.com tina InRelease
Hit:2 http://packages.linuxmint.com tina Release
[...]
Get:12 http://ppa.launchpad.net/git-core/ppa/ubuntu bionic/main Translation-en [2,252 B]
Fetched 281 kB in 2s (140 kB/s)
Reading package lists... Done

Then, re-install:

$ sudo apt-get install git
Reading package lists ... Done
Building dependency tree
Reading state information ... Done
The following packages were automatically installed and are no longer required:
gir1.2-mate-desktop gir1.2-mate-panel libllvm6.0:i386 libllvm7 libllvm7:i386 libwayland-client0:i386 libwayland-server0:i386 python-psutil python-xapp
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
git-man libpcre2-8-0
Suggested packages:
git-daemon-run | git-daemon-sysvinit git-doc git-el git-email git-gui gitk gitweb git-cvs git-mediawiki git-svn
The following NEW packages will be installed:
libpcre2-8-0
The following packages will be upgraded:
git git-man
2 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 8,050 kB of archives.
After this operation, 9,595 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
[...]
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...

After which:

$ git --version
git version 2.28.0

Further reading:

https://stackoverflow.com/questions/19109542/installing-latest-version-of-git-in-ubuntu

Windows

Install Git for Windows, and include the Git Bash Prompt.

The Git Bash prompt maps “home” or “~” to C:\users\Buster.

Through a bit of jiggery-pokery, “/” is mapped to the Git install folder (i.e. “C:\Program Files\Git”), so in order to get to the C: or D: drive you use “/C/” or “/D/”.

Hey, it’s better than what Cygwin does.

As a long-time user of TortoiseSVN, at first I thought I would use TortoiseGit but in hindsight, it is better to just bite the bullet and experiment with the command line. There’s just too much difference in how the two source control systems work. I got used to the Git Bash prompt quickly, and now it seems like the best way.

Learning Git – Appendix B: Customizing the message editor

I started out thinking it would be smart to use a GUI editor for Git commits, but after admiring Andreas Kling’s workflow (YouTube) I realized that it would be much more efficient to embrace lighter “terminal” editors, such as nano or vi.

On Linux Mint, the default Git editor appears to be nano.

On Windows, using Git Bash, the default Git editor appears to be vi, and it seems to understand Git comment syntax highlighting “out of the box”.

I did experiment with using nano, and adding git comment syntax highlighting using the recommendation here. But eventually I decided that learning the few key commands of vi was the best way to go.

So on Linux, I switched:

$ git config --global core.editor "vi"
$ git commit

The terminal window now shows the vi editor.

Important Key Commands

$ git commit

The editor opens. Hit A (to “Append”) or I (to “Insert”) before you start typing your commit message.

When you’re done, hit [ESCAPE] : W Q to “Write” and “Quit” and return to the command line. Don’t forget the colon!

Alternatively, you can configure Git so that vi goes into Insert mode automatically:

$ git config --global core.editor "vi -c 'startinsert'"

(Thanks to perreal@stackoverflow)
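If you would rather try the setting out before making it global, here is a sketch that applies it to a single scratch repository (leave off --global and the setting is repository-local). The temporary path is an assumption for the demonstration. Note that startinsert is a Vim command, so this assumes your vi is really Vim.

```shell
# Scratch repository; only its own .git/config is modified.
repo="$(mktemp -d)"
git init -q "$repo"

# Repository-local version of the setting (no --global).
git -C "$repo" config core.editor "vi -c 'startinsert'"

# Read the value back.
editor="$(git -C "$repo" config --get core.editor)"
echo "core.editor = $editor"

# Ask Git which editor it would actually launch here.
# Environment variables such as GIT_EDITOR take precedence over core.editor.
git -C "$repo" var GIT_EDITOR
```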

Learning Git – Appendix C: Petra Rabbit’s Authentication-palooza

Petra Rabbit is a developer. She would like to contribute to some private repositories that Zach Cat has set up on BitBucket and GitHub.

1. Using SSH to clone a repository on BitBucket

Zach has a private repository on BitBucket called “X_Files” that Petra needs to clone locally:

$ git clone git@bitbucket.org:zach-the-cat/x_files.git
Cloning into 'x_files'...
The authenticity of host 'bitbucket.org (104.192.141.1)' can't be established.
RSA key fingerprint is SHA256:zzXQOXSRBEiUtuE8AikJYKwbHaxvSc0ojez9YXaGp1A.
Are you sure you want to continue connecting (yes/no/[fingerprint])? 

This is the first time Petra has used SSH to talk to the BitBucket server on this computer. Consequently, she sees that prompt from SSH asking her if it is okay to add the server’s host key to its list of known hosts.

yes
Warning: Permanently added 'bitbucket.org' (RSA) to the list of known hosts.
Forbidden
fatal: Could not read from remote repository.
Please make sure you have the correct access rights and the repository exists.

Yeah… it’s just not that easy. In order for this to happen:

  • Petra has to create a user account on BitBucket;
  • Zach has to grant Petra’s BitBucket user account read+write permission on his repository;
  • Petra has to create a public+private RSA key pair for her development laptop;
  • She has to upload the public key to her BitBucket account.

Zach goes to BitBucket > X_Files > Repository Settings > User and Group Access, and searches for Petra by email address; selects WRITE access; and confirms.

And now Petra sees an email from Atlassian, “Zach The Cat invited you to collaborate on the X_Files repository on BitBucket”. Of course, she accepts the invitation… When she logs in to BitBucket in her browser, she can now see Zach’s repo. Try again, Petra:

$ git clone git@bitbucket.org:zach-the-cat/x_files.git
Cloning into 'x_files'...
git@bitbucket.org: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights and the repository exists.

Well, at least it’s a different error, because she’s not quite done yet: She has to create an RSA key pair identifying her on her development laptop, and upload the public key to her account on BitBucket:

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/petra/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): *********
Enter same passphrase again: *********
Your identification has been saved in /home/petra/.ssh/id_rsa
Your public key has been saved in /home/petra/.ssh/id_rsa.pub

She was prompted to enter a passphrase, and she’s going to need it every time she talks to the remote repository, so let’s hope she picked something memorable!

Now Petra goes to BitBucket > Profile and Settings > Personal settings > SSH Keys. It says “There are no keys configured” but she can click on the Add Key button, and paste the contents of the id_rsa.pub file, along with a descriptive label. The label could be any text but it helps to use something that identifies the current user and client computer (i.e. the development machine), because if she switches machines (e.g. uses a virtual machine or another laptop) then she’ll need to upload a separate public key for that environment also.

Having uploaded the public key to BitBucket, and with the private key accessible to SSH locally, Petra should now be able to clone the repository successfully:

$ git clone git@bitbucket.org:zach-the-cat/x_files.git
Cloning into 'x_files'...

At this point she’s prompted to enter her passphrase for the RSA key pair, and does so.

remote: Enumerating objects: 13, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 13 (delta 2), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (13/13), done.
Resolving deltas: 100% (2/2), done.

Let’s review:

$ ls -l
total 4
drwxrwxr-x 3 petra petra 4096 Sep 10 11:48 x_files
$ cd x_files
$ git status
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean

Git Passphrase Persistence

Petra may get prompted for her RSA key passphrase every time she pushes or fetches changes, but a Git Bash session can run an agent process that takes care of this for her. It is per-session: during the remainder of that session, she won’t get prompted for the passphrase, except for one time while setting up the agent:

$ eval $(ssh-agent)
Agent pid 227166
$ ssh-add ~/.ssh/id_rsa
Enter passphrase for /home/petra/.ssh/id_rsa: *************
Identity added: /home/petra/.ssh/id_rsa (petra@mcgregor_garden)

For further reading: https://smallstep.com/blog/ssh-agent-explained/

Differences between Windows and Linux

If you use the Git Bash console on Windows, then the workflow is almost exactly the same as that described above. The only difference is that the RSA key files are located in C:\Users\<user>\.ssh\

If you don’t use Git Bash or ssh-keygen, then there are tutorials out there on how to use PuTTYgen or OpenSSH.

2. Using HTTPS to clone a private repository on BitBucket

Petra Rabbit experiments with using HTTPS instead. Typically BitBucket or GitHub will tell us that the Git URL to use with HTTPS is:

https://user-name@bitbucket.org/user-name/repo-name.git

It’s easy to forget that the first user-name is the user we are authenticating as, and the second user is the owner of the repository. So, Petra needs to use her BitBucket account name in place of the first, authenticating user:

$ git clone https://petra-rabbit@bitbucket.org/zach-the-cat/x_files.git
Cloning into 'x_files'...
Password for 'https://petra-rabbit@bitbucket.org': *******
Unpacking objects: 100% (13/13), 2.65 KiB | 677.00 KiB/s, done.

BitBucket accepts her “petra-rabbit” account password.
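To make the two user names in that URL harder to mix up, here is a small sketch that pulls the URL apart with plain shell parameter expansion. The variable names are my own; nothing here talks to BitBucket.

```shell
url="https://petra-rabbit@bitbucket.org/zach-the-cat/x_files.git"

rest="${url#https://}"       # petra-rabbit@bitbucket.org/zach-the-cat/x_files.git
auth_user="${rest%%@*}"      # the user we authenticate as
hostpath="${rest#*@}"        # bitbucket.org/zach-the-cat/x_files.git
host="${hostpath%%/*}"       # the server
path="${hostpath#*/}"        # zach-the-cat/x_files.git
owner="${path%%/*}"          # the owner of the repository
repo="${path#*/}"
repo="${repo%.git}"          # the repository name

echo "authenticating as: $auth_user"
echo "server:            $host"
echo "repository owner:  $owner"
echo "repository:        $repo"
```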

3. Using HTTPS to clone a repository on GitHub

Petra has created her own private repository on GitHub called “Y_Files” and she’d like to clone it locally so that she can develop offline. GitHub tells her the URL to use for HTTPS:

https://github.com/petra-rabbit/Y_Files.git

It’s interesting that it is different from the URL that BitBucket suggests: it’s missing the “user-name@” prefix. No problem, it will just prompt for a user name:

$ git clone https://github.com/petra-rabbit/Y_Files.git
Cloning into 'Y_Files'...
Username for 'https://github.com': petra-rabbit
Password for 'https://petra-rabbit@github.com': ********
remote: Support for password authentication was removed on August 13, 2021. Please use a personal access token instead.
remote: Please see https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/ for more information.
fatal: Authentication failed for 'https://github.com/petra-rabbit/Y_Files.git/'

Interesting. GitHub is going to require Petra to learn about Personal Access Tokens:
https://docs.github.com/en/github/authenticating-to-github/keeping-your-account-and-data-secure/creating-a-personal-access-token

Petra opens GitHub in her web browser and navigates to: GitHub Profile > Settings > Developer Settings > Personal access tokens.

She clicks on Generate New Token; ticks the [x] repo checkbox; generates the token; and saves it in a text file in a secret location.

Try again, this time including the user-name prefix, just to prove it works:

$ git clone https://petra-rabbit@github.com/petra-rabbit/Y_Files.git
Cloning into 'Y_Files'...
Password for 'https://petra-rabbit@github.com': 

This time, instead of her GitHub account password, Petra pastes the PAT string:

Password for 'https://petra-rabbit@github.com': ********************************
remote: Enumerating objects: 9, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 9 (delta 1), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (9/9), 1.95 KiB | 664.00 KiB/s, done.

Repository successfully cloned locally:

$ cd Y_Files
$ git status
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean

Sorted.

4. Granting permission to other developers in GitHub

Petra would like to invite Zach The Cat to contribute to her Y_Files repo on GitHub.

  • She navigates to GitHub > Repositories > Y_Files > Settings > Manage Access
  • She presses the Invite a collaborator button
  • She searches for the user zach-the-cat, selects him, and waits.
  • Zach receives an invitation via email and accepts.
  • Petra’s Y_Files repository now has one collaborator

Exercise for the Student: Save your Git repository remote authentication credentials in your favorite IDE.

That’s all for this Appendix. Go back to the Top.

Learning Git – Appendix D: What’s the DIFF?

Ray is writing the next great American detective novel, and we’ve started by creating a remote BitBucket Git repository to manage his changes.

Ray clones the repository:

$ git clone https://ray-chandler@bitbucket.org/ray-chandler/noir.git

Consider that Ray now has a tracking branch called “master”; a remote reference called “origin”; and a remote-tracking branch “origin/master”.

He has many possible versions of his novel:

  • The working tree contains Ray’s manuscript with any current un-staged, un-committed changes;
  • The staging area may contain a copy that was saved for inclusion in the next commit (via add);
  • HEAD is the most recent commit in the current branch (i.e. “master”);
  • The remote-tracking branch “origin/master” may also have some un-merged differences from the last fetch operation;
  • The “master” branch in the remote repository on the server may have some un-fetched changes recently checked in by Ray’s editor.

Wow, that’s five possible versions.

Scene: Ray sips at a glass of bourbon, and types furiously, making changes to Chapter 1. He saves his work periodically.

While Ray is accessing his muse, let’s learn about Git’s diff tool:

Using DIFF

The Git diff command is a tool we can use to compare different versions of source files:

$ git diff <target> <source>

This produces a report of changes needed to make “target” look like “source”. There are some more common, special cases:

To make <target> look like the working tree:

$ git diff          # target = staging area
$ git diff HEAD     # target = HEAD     
$ git diff <target> 

To make <target> look like the staging area:

$ git diff --staged          # target = HEAD
$ git diff --staged HEAD     # target = HEAD     
$ git diff --staged <target>

What would git commit -a do? Find out with:

$ git diff HEAD

What would git commit do? Find out with:

$ git diff --staged

Examples

Consider the following sequence:

echo "This line is pushed to remote/master." > source.txt
git commit -a -m "Step1"
git push origin
git switch -c secundo
echo "This line is committed to secundo." > source.txt
git commit -a -m "Step2"
git switch master
echo "This line is committed to master." > source.txt
git commit -a -m "Step3"
echo "This line is in the staging area." > source.txt
git add source.txt
echo "This line is in the working tree." > source.txt

We’ve now got different versions of source.txt in all possible locations. Let’s find out the differences:

$ git diff
diff --git a/source.txt b/source.txt
index 16f0b21..c260d9d 100644
--- a/source.txt
+++ b/source.txt
@@ -1 +1 @@
-This line is in the staging area.
+This line is in the working tree.

Yup, that describes how to update the staging area to match the version in the working tree. (From here on out, I’ll omit the first few lines of the diff output).

$ git diff --staged origin/master
:
-This line is pushed to remote/master.
+This line is in the staging area.
$ git diff secundo master
:
-This line is committed to secundo.
+This line is committed to master.

Exercise for the student: Try out the other variations.
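If you want a sandbox for that exercise, the sketch below rebuilds the sequence in a throwaway repository with a local bare repository standing in for origin. The paths, the demo identity, and the helper function changed are my own additions. I also use checkout -b instead of switch -c so that it runs on older Git versions, and add an explicit git add before the first commit, because commit -a ignores untracked files.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
work="$(mktemp -d)"
git init -q --bare "$work/origin.git"
git init -q "$work/noir"
cd "$work/noir"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

echo "This line is pushed to remote/master." > source.txt
git add source.txt
git commit -q -m "Step1"
git push -q origin master
git checkout -q -b secundo
echo "This line is committed to secundo." > source.txt
git commit -q -a -m "Step2"
git checkout -q master
echo "This line is committed to master." > source.txt
git commit -q -a -m "Step3"
echo "This line is in the staging area." > source.txt
git add source.txt
echo "This line is in the working tree." > source.txt

# Show only the removed/added lines of a diff, skipping the file headers.
changed() { git diff "$@" | grep '^[-+][^-+]'; }

echo "HEAD -> working tree:";          changed HEAD
echo "HEAD -> staging area:";          changed --staged
echo "secundo -> working tree:";       changed secundo
echo "origin/master -> working tree:"; changed origin/master
```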

Back to that detective story

Cut scene: Editor’s office. Penny White is sitting at her computer.

Penny cloned the repository yesterday, but she’s pretty sure there’ll be some updates from Ray on the server:

$ git fetch origin
:
Unpacking objects: 100% (6/6), 738 bytes | 22.00 KiB/s, done.
From https://bitbucket.org/ray-chandler/noir
   14c6528..713d32a  master -> origin/master
$ git status
On branch master
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
 (use "git pull" to update your local branch)
nothing to commit, working tree clean

Penny could merge at this point, but before she does, she’d like to see what the changes are.

From our experiments above, we know that in order to see what will change during the merge, Penny will need to request a diff using “master” as the <target> and “origin/master” as the <source>:

$ git diff master origin/master
diff --git a/detective.txt b/detective.txt
index df79c59..8616f4d 100644
--- a/detective.txt
+++ b/detective.txt
@@ -1,7 +1,9 @@
 Chapter 1

 There was a knock at the door. I quickly hid the comic book under the WIRED magazine
-and took out some official looking papers and scattered them about the desk.
+on the desk and took out some official looking papers and scattered them about.

 "Jess!" I yelled.

+The door opened. The siloette blocking the light from the door was a dame,
+but it wasn't Jess.

Ray has been busy. The first thing Penny is going to do after merging is correct the spelling of “silhouette”.
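Besides the diff, Penny could also list the incoming commits themselves before merging. The sketch below replays her situation with a local stand-in for the BitBucket server. The paths, commit messages, and demo identity are invented; the range master..origin/master means "commits reachable from origin/master but not from master".

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
work="$(mktemp -d)"

# Stand-in for the server, with Ray's first draft.
git init -q "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
echo "Chapter 1" > "$work/server/detective.txt"
git -C "$work/server" add detective.txt
git -C "$work/server" commit -q -m "First draft"

# Penny clones yesterday...
git clone -q "$work/server" "$work/penny"

# ...and Ray commits twice since then.
echo "There was a knock at the door." >> "$work/server/detective.txt"
git -C "$work/server" commit -q -a -m "Add the knock"
echo '"Jess!" I yelled.' >> "$work/server/detective.txt"
git -C "$work/server" commit -q -a -m "Add the yell"

# Penny fetches, then inspects what a merge would bring in.
git -C "$work/penny" fetch -q origin
git -C "$work/penny" log --oneline master..origin/master
incoming="$(git -C "$work/penny" rev-list --count master..origin/master)"
echo "incoming commits: $incoming"

# Happy with that, she fast-forwards.
git -C "$work/penny" merge -q --ff-only origin/master
remaining="$(git -C "$work/penny" rev-list --count master..origin/master)"
```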

A Graphical Diff

Penny is not a fan of the text output from diff, so she has read up on the difftool command. It is essentially identical to diff, except Git will launch your preferred graphical utility to display the diff information.

If you’re interacting with source code from inside an IDE that includes Git integration, this probably won’t be something you’ll need to do. But it helps to understand what is going on behind the scenes.

I recommend meld, as it is reviewed positively and has versions for both Windows and Linux: https://meldmerge.org/

After we’ve installed it, we can enable it in Git by adding the following lines to .gitconfig:

[diff]
    tool = meld
[difftool "meld"]
    path = meld    ; Windows: c:\\Program Files (x86)\\Meld\\Meld.exe
[difftool]
    prompt=false

[merge]
    tool = meld
[mergetool "meld"]
    path = meld   ; Windows: c:\\Program Files (x86)\\Meld\\Meld.exe
[mergetool]
    keepBackup = false
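If you would rather not edit the file by hand, the same settings can be written with git config. This sketch writes to a scratch file via --file so that it can be tried safely; the scratch file is an assumption for the demonstration, and swapping --file "$cfg" for --global should target your real .gitconfig instead.

```shell
# Scratch config file, so the real ~/.gitconfig is left alone.
cfg="$(mktemp)"

git config --file "$cfg" diff.tool meld
git config --file "$cfg" difftool.meld.path meld
git config --file "$cfg" difftool.prompt false
git config --file "$cfg" merge.tool meld
git config --file "$cfg" mergetool.meld.path meld
git config --file "$cfg" mergetool.keepBackup false

# Inspect the generated sections.
cat "$cfg"
```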

Penny views the changes using Meld:

$ git difftool master origin/master
Using Meld to view the diff

Meld presents each changed file in a separate window, opening them in sequence as the previous one is closed. For many files, this can get clumsy.

An alternative would be to generate a list of files with differences, then launch the difftool for each file on your own schedule:

$ git diff --compact-summary master origin/master
 detective.txt   | 7 ++-----
 references.txt  | 1 -
 2 files changed, 2 insertions(+), 6 deletions(-)
$ git difftool master:detective.txt origin/master:detective.txt

That last command demonstrates how Penny would specify a single file, using the “branch_path:file_spec” notation.

That’s all for this Appendix. Go back to the article here.

© 2021 More Than Four
