Git is one of the most powerful tools in a developer’s arsenal, but its flexibility and depth can sometimes make it challenging to master. This post explores some of the fundamental Git concepts that every developer should understand to work effectively with version control.

Local vs. Remote: Understanding the Relationship

When working with Git, one of the first things to grasp is the relationship between your local repository and the remote repository (typically hosted on platforms like GitHub, GitLab, or Bitbucket).

The message Your branch is ahead of 'origin/master' by 2 commits is a common sight for Git users. This simply means your local repository contains two commits that haven’t been pushed to the remote repository yet. These commits exist only on your machine until you synchronize them with the remote using git push origin master.

Handling Divergent Histories

This relationship becomes more complex when changes occur in both locations. Consider this scenario:

Developer: What happens when one also cleans up code that was part of a pull request on GitHub and not locally?

When you edit files on GitHub directly but also have local changes, your histories diverge. This creates a situation where both repositories have different changes to the same codebase. When you try to synchronize them, Git will inform you of this divergence:

$ git pull origin symbolic-tensor-constants
From https://github.com/c5huracan/tinygrad
 * branch              symbolic-tensor-constants -> FETCH_HEAD
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint: 
hint:   git config pull.rebase false  # merge
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only

In this case, you need to decide how to reconcile these divergent histories. For branches that are part of pull requests (which are considered public), merging is typically the safer approach:

git pull --no-rebase origin symbolic-tensor-constants

Fetch vs. Pull: Different Ways to Get Remote Changes

When you need to get changes from the remote repository, Git offers two primary commands:

Developer: What is the difference between a fetch and a pull?

Git Fetch

git fetch downloads changes from the remote repository but doesn’t integrate them into your working files. It’s like checking what’s new without actually applying those changes.

git fetch origin

This command:

  • Updates your remote-tracking branches (like origin/master)
  • Doesn’t modify your working directory
  • Allows you to review changes before integrating them

Git Pull

git pull is more direct - it downloads changes and immediately integrates them into your current branch.

git pull origin master

Essentially, git pull is a combination of git fetch followed by either git merge or git rebase, depending on your configuration.

A safer workflow is often to fetch first, review the changes (using git diff origin/master), and then decide how to integrate them.

Developer: What is the : prompt when I run the git diff trying to tell us?

When running commands like git diff, you might encounter a colon (:) prompt. This indicates that Git is using a pager (typically “less” on Unix-like systems) to display the output.

This interactive mode allows you to:

  • Navigate through large outputs with Space (page down) or b (page up)
  • Search for specific text with /
  • Quit and return to the command line with q

If you prefer to see all output at once, you can use the --no-pager option:

git --no-pager diff

Merge vs. Rebase: Two Paths to Integration

Developer: What is the difference between merge and rebase?

When integrating changes from one branch into another, Git offers two distinct approaches:

Git Merge

Merging creates a new “merge commit” that combines changes from both branches, preserving the complete history.

A---B---C (master)
     \
      D---E (feature)

After git merge feature while on master:

A---B---C---F (master)
     \     /
      D---E (feature)

Where F is a new merge commit that combines the changes.

Merging is non-destructive and clearly shows when and where branches were integrated, but it can create a more complex, non-linear history.

Git Rebase

Rebasing takes your commits from one branch and replays them on top of another branch, creating a linear history.

A---B---C (master)
     \
      D---E (feature)

After git rebase master while on feature:

A---B---C (master)
         \
          D'---E' (feature)

Rebasing creates a cleaner, linear history but rewrites commit history in the process.

Developer: So if a branch is part of a pull request, am I correct to say that it is public and therefore I need to merge it?

Yes! The golden rule is never to rebase branches that others might be working on. Once a branch is part of a pull request:

  1. It should be considered public
  2. Other developers may have already reviewed it
  3. Comments might be tied to specific commits
  4. Others might have based their work on it

In these cases, using merge preserves the history that reviewers have already seen and commented on in the PR.

Handling Changes Made on GitHub

Developer: What if I made updates to the file already committed on GitHub?

If you’ve edited files directly on GitHub’s web interface, you have several options:

  1. Create a new commit (safest approach):
    git add <updated-files>
    git commit -m "Update to address review comments"
    git push origin <branch-name>
    
  2. Amend the last commit (if your changes are small fixes):
    git add <updated-files>
    git commit --amend
    git push --force origin <branch-name>
    
  3. Interactive rebase to modify specific commits:
    git rebase -i HEAD~n  # where n is the number of commits to go back
    # Then edit the commits as needed
    git push --force origin <branch-name>
    

If the PR has already been reviewed, adding a new commit is generally best as it preserves the review context.

Developing an Effective Git Workflow

Based on these concepts, here’s a recommended workflow for handling common scenarios:

  1. Before starting new work:
    git fetch origin
    git diff origin/master  # Review changes
    git pull  # Or merge/rebase as appropriate
    
  2. When integrating remote changes with local work:
    git fetch origin
    git diff origin/master  # Review differences
    # Choose either:
    git merge origin/master  # Preserve history (for public branches)
    # OR
    git rebase origin/master  # Create linear history (for private branches)
    
  3. When resolving divergent histories:
    git pull --no-rebase origin <branch-name>  # For public/PR branches
    # OR
    git pull --rebase origin <branch-name>  # For private branches
    git push origin <branch-name>  # Push your integrated changes
    

Conclusion

Git’s power comes from its flexibility, but with that flexibility comes complexity. By understanding the relationship between local and remote repositories, the differences between fetch and pull, and when to use merge versus rebase, you’ll be better equipped to handle version control effectively.

Remember that Git is designed to protect your work, not lose it. When in doubt, you can always create a backup branch before attempting complex operations:

git branch backup-before-rebase

This ensures you can always return to a known good state if things don’t go as planned.