Introduction to Git, from ground zero. Includes basic concepts, most common commands and workflows. Describes rebase vs merge, what push and pull do, and what the SHA1 does for you.
eAuditor Audits & Inspections - conduct field inspections
Git: a brief introduction
1. GIT: A BRIEF INTRODUCTION
Randal L. Schwartz, merlyn@stonehenge.com
Version 5.1.7 on 20 July 2017
This document is copyright 2006-2017 by Randal L. Schwartz, Stonehenge Consulting Services, Inc.
This work is licensed under Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License
http://creativecommons.org/licenses/by-nc-sa/3.0/
2. ABOUT ME
➤ Been tracking git since it was created (April 2005)
➤ Took Linus to lunch when he moved to Portland: “What are you working on?”
➤ Used git on everything from individual tasks to large teams
➤ Formerly participated in the mailing list
➤ But it’s too noisy now
➤ Provided some patches to git, and suggestions for user interface changes
➤ Still build git every day from master
➤ Gave the first Google TechTalk (October 2007) that talked about what Git actually is
➤ Linus spoke six months earlier about what Git isn’t
3. WHAT IS GIT?
➤ Git manages changes to a tree of files over time
➤ Git is optimized for:
➤ Distributed development
➤ Large file counts
➤ Complex merges
➤ Making trial branches
➤ Being very fast
➤ Being robust
4. BUT NOT FOR...
➤ Tracking file permissions and ownership
➤ Tracking individual files with separate history
➤ Making things painful
5. WHY GIT?
➤ Essential to Linux kernel development
➤ Created in 2005 as a replacement when BitKeeper suddenly became “unavailable”
➤ Now used by millions of projects
➤ Everyone can:
➤ Clone the tree
➤ Make and test local changes
➤ Submit the changes as patches via mail
➤ OR submit them as a published repository
➤ Track the upstream to revise if needed
6. HOW DOES GIT DO IT?
➤ Universal public identifiers
➤ Computed from the constituent bits, so they can be verified
➤ Multi-protocol transport: HTTP, SSH, GIT
➤ Efficient object storage
➤ Everyone has entire repo (disk is cheap)
➤ Easy branching and merging
➤ Common ancestors are computable
➤ Patches (and repo updates) can be transported or mailed
➤ Binary “patches” are supported
7. THE SHA1 IS KING
➤ Every object has a SHA1 to uniquely identify it
➤ Objects consist of:
➤ Blobs (the contents of a file)
➤ Trees (directories of blobs or other trees)
➤ Commits:
➤ A tree
➤ Plus zero or more parent commits
➤ Plus a message about why
➤ And tags
8. TAGS
➤ An object (usually a commit)
➤ Plus an optional subject (if anything else is given)
➤ Plus an optional payload to sign it off
➤ Plus an optional GPG signature
➤ Designed to be immobile
➤ Changes not tracked after cloning
➤ Use a branch if you want to move around
9. OBJECTS LIVE IN THE REPO
➤ Git efficiently creates new objects
➤ Objects are generally added, not destroyed
➤ Unreferenced objects will garbage collect
➤ Objects start “loose”, but can be “packed”
➤ Packs represent objects as deltas
➤ Packs are also created for push and pull
10. COMMITS RULE THE REPO
➤ One or more commits form the head of object chains
➤ Typically one head called “master”
➤ Nothing special about the name “master” though
➤ Others can be made at will (“branches”)
➤ Usually one commit in the repo that has no parent commit (“root” commit)
11. REACHING OUT
➤ From a commit, reaching the components:
➤ Chase down the tree object to get to directories and files as they existed at this
commit time
➤ Chase down the parent objects to get to earlier commits and their respective trees
➤ Do this recursively, and you have all of history
➤ And the SHA1 depends on all of that!
12. THE GIT REPO
➤ A “working tree” has a “.git” dir at the top level
➤ Unlike CVS, SVN: no deeper pollution
➤ This makes it friendly to recursive greps
➤ Even though “git grep” is also provided
13. THE .GIT DIR CONTAINS:
➤ config – Configuration file (.ini style)
➤ objects/* – The object repository
➤ refs/heads/* – branches (like “master”)
➤ refs/tags/* - tags
➤ logs/* - logs
➤ refs/remotes/* - tracking others
➤ index – the “index cache” (described shortly)
➤ HEAD – points to one of the branches (the “current branch”, where commits go)
14. THE INDEX (OR “CACHE”)
➤ A directory of blob objects
➤ Represents the “next commit”
➤ “Add files” to put current contents in
➤ “Commit” takes the current index and makes it a real commit object
➤ Diff between HEAD and index:
➤ changed things not yet committed
➤ Diff between index and working dir:
➤ changed things not yet added
➤ untracked things
15. WHAT’S IN A NAME?
➤ Git doesn’t record explicit renaming
➤ Nor expect you to declare it
➤ Exact renaming determined by SHA1
➤ Copy-paste-edits detected by similarity
➤ Computer better than you at that
➤ Explicit tracking will be wrong sometimes
➤ Being wrong breaks merges
16. GIT SPEAKS AND LISTENS
➤ Many protocols to transfer between repos
➤ http, https, git, ssh, local files
➤ In the core, git also has:
➤ import/export with CVS, SVN
➤ I use SVN import to have entire history of an SVN project at 30K feet
➤ Third party solutions handle others
➤ Git core also includes cvs-server
➤ A git repository can act like a CVS repository for legacy clients or humans
17. GETTING GIT
➤ Get the latest: https://git-scm.com/downloads
➤ Linux packages also exist
➤ Track the git-developer archive:
➤ git clone https://github.com/git/git
➤ Maintenance releases are very stable
➤ I install mine “prefix=/opt/git”
➤ add /opt/git/bin to PATH
18. GIT COMMANDS
➤ All git commands start with “git”
➤ “git MUMBLE-FOO bar” can also be written as “git-MUMBLE-FOO bar”
➤ This allows a single entry “git” to be added to the /usr/local/bin path
➤ This works for internal calls as well
➤ Manpages are still under “git-MUMBLE-FOO”
➤ Unless you use “git help MUMBLE-FOO”
➤ Or “git MUMBLE-FOO --help”
19. PORCELAIN AND PLUMBING
➤ Low-level git operations are called “plumbing”
➤ Higher level actions are called “porcelain”
➤ The git distro includes both
➤ Use porcelain from command line
➤ But don’t script with it
➤ Future releases might change things
➤ Past releases have definitely changed things
➤ Use plumbing for scripts
➤ Intended to be upward compatible (or at least versioned)
➤ Strangely, the --porcelain flag often enables plumbing mode (WTF?)
20. CREATING A REPO
➤ git init
➤ Creates a .git in the current dir
➤ Optional: edit .gitignore
➤ “git add .” to add all files (except .git!)
➤ Then “git commit” for the initial commit
➤ Creates current branch named “master”
➤ The name “master” is otherwise unspecial
➤ Could also do this on a tarball, once unpacked
21. CLONING
➤ Creates a git repo from an existing repo
➤ Generally creates a subdirectory
➤ Your workfiles and .git are in there
➤ Clone repo identified as “origin”
➤ But the name is otherwise unspecial
➤ Remote branches are “tracked”
➤ On fetch operations, the remote branch heads are copied to origin/$branchname
➤ Remote “HEAD” branch checked out as your initial “master” branch as well
22. COMMITTING
➤ Your work product is more commits
➤ These are always on a “branch”
➤ A branch is just a named commit
➤ When you commit, the former branch head becomes the parent
➤ The branch head moves to be the new commit
➤ Thus, you’re creating a directed acyclic graph
➤ ... rooted in branch heads
➤ A merge is just a commit with multiple parents
23. TYPICAL WORK FLOW
➤ Edit edit edit
➤ git add files/you/have changed/now
➤ This adds the files to the index
➤ “git add .” for adding all interesting files (except ignored files)
➤ “git add -p” selects patch by patch (my favorite)
➤ git status
➤ Tells you differences between HEAD, index, and working directory
24. MAKING THE COMMIT
➤ “git commit” pops you into a text editor for the commit message
➤ Good commit message rules:
➤ Separate subject from body with a blank line
➤ Limit the subject line to 50 characters
➤ Capitalize the subject line
➤ Do not end the subject line with a period
➤ Use the imperative mood in the subject line
➤ Wrap the body at 72 characters
➤ Use the body to explain what and why vs. how
➤ Current branch is moved forward
25. BUT WHICH BRANCH?
➤ Git encourages branching
➤ A branch is just 41 text bytes!
➤ Typical work flow:
➤ Think of something to do
➤ git checkout -b topic-name master
➤ work work work, commit to topic-name
➤ When your thing is done:
➤ git checkout master
➤ git merge topic-name
➤ git branch -d topic-name
26. WORKING IN PARALLEL
➤ You can have multiple topics active:
➤ git checkout -b topic1 master
➤ work work; commit; work work; commit
➤ git checkout -b topic2 master
➤ work work work; commit
➤ git checkout topic1; work work; commit
➤ Decide how to bring them together
➤ Merge: parallel histories
➤ Rebase: serial histories
➤ Each has pros and cons
27. THE MERGE
➤ git checkout master
➤ git merge topic1; git branch -d topic1
➤ This should be trivial (“fast forward”) merge
➤ git merge topic2
➤ Conflicts may arise:
➤ overlapping changes in text edits
➤ files renamed two different ways
➤ You need to resolve, and continue:
➤ git commit -a -m “describe the merge fix here”
28. THE REBASE
➤ Rewrites commits
➤ Breaks SHA1s: commits are lost!
➤ Don’t rebase if you’ve published commits!
➤ git checkout topic2; git rebase master
➤ topic2’s commits rewritten on top of master
➤ May result in merge conflicts:
➤ git rebase --continue or --abort or --skip
➤ git rebase -i (interactive) is helpful
➤ When rebased, merge is a fast forward:
➤ git checkout master; git merge topic2
29. READ THE HISTORY
➤ git log
➤ print the changes
➤ git log -p
➤ print the changes, including a diff between revisions
➤ git log --stat
➤ Summarize the changes with a diffstat
➤ git log -- file1 file2 dir3
➤ Show changes only for listed files or subdirs
30. WHAT’S THE DIFFERENCE?
➤ git diff
➤ Diff between index and working tree
➤ These are things you should “git add”
➤ “git commit -a” will also make this list empty
➤ git diff HEAD
➤ Difference between HEAD and working tree
➤ “git commit -a” will make this empty
➤ git diff --cached
➤ between HEAD and index
➤ “git commit” (without -a) makes this empty
31. OTHER DIFFS
➤ git diff OTHERBRANCH
➤ Other branch and working tree
➤ git diff BRANCH1 BRANCH2
➤ Difference between two branch heads
➤ git diff BRANCH1...BRANCH2
➤ changes only on branch2 relative to common ancestor
➤ git diff --stat (other options)
➤ Nice summary of changes
➤ git diff --dirstat (other options)
➤ Summarize directory changes
32. BARKING UP THE TREE
➤ Most commands take “tree-ish” args
➤ SHA1 picks something absolutely
➤ Can be abbreviated if not ambiguous
➤ HEAD, some-branch-name, some-tag-name, some-origin-name
➤ Optionally followed by @{historical}
➤ “historical” can be:
➤ yesterday, 2011-11-22, etc (date ref)
➤ 1, 2, 3, etc (prior version of this ref)
➤ “upstream”, “u” (upstream version of local)
33. MEET THE PARENTS
➤ Any of those on the prior slide, followed by:
➤ ^n - “the n-th parent of an item” (default 1)
➤ ~n - n ^1’s (so ~3 is ^1^1^1)
➤ :path - pick the object from the tree
➤ ^{/some text here} - first parent with a commit message matching the text
34. RANGES
➤ $tree1 - all commits reachable from $tree1 back to the root
➤ ^$tree1 $tree2 - all commits from $tree2 MINUS all commits from $tree1
➤ Common enough that it can be written $tree1..$tree2 (“from $tree1 to $tree2”)
➤ $tree1…$tree2 is $m..$tree1 with $m..$tree2, where $m is the merge base
➤ Omitting either operand in .. or … defaults to HEAD
➤ $tree1.. - all commits on current branch since $tree1
➤ ..$tree2 - all commits introduced on $tree2 since this branch forked
35. RANGE EXAMPLES
➤ git diff HEAD^ HEAD
➤ most recent change on current branch
➤ Also: git diff HEAD~ HEAD
➤ git diff HEAD~3 HEAD
➤ What damage did last three edits do?
36. SEEING THE CHANGES
➤ gitk mytopic origin
➤ Tk widget display of history
➤ Shows changes back to common ancestor
➤ gitk --all
➤ show everything
➤ To skip the tags: gitk HEAD --branches --remotes
➤ gitk from..to
➤ Just the changes in “to” that aren’t in “from”
➤ git show-branch from..to
➤ Same thing for the Tk-challenged
37. PLAYING WELL WITH OTHERS
➤ git clone creates “tracking” branches
➤ Typically named “origin/master” etc
➤ To share your work, first get up to date:
➤ git fetch origin
➤ Now rebase your changes on upstream:
➤ git rebase origin/master
➤ Or fetch/rebase in one step
➤ git pull --rebase
➤ To push upstream:
➤ git push
38. RESETTING
➤ git reset --soft
➤ Makes all files “updated but not checked in”
➤ git reset --hard # DANGER
➤ Forces working dir to look like last commit
➤ git reset --hard HEAD~3
➤ Tosses most recent 3 commits
➤ use “git revert” instead if you’ve published
➤ git checkout HEAD some/lost/file
➤ Recover the version of some/lost/file from the last commit
39. IGNORING THINGS
➤ Every directory can contain a .gitignore
➤ lines starting with “!” mean “not”
➤ lines without “/” are checked against basename
➤ otherwise, shell glob via fnmatch(3)
➤ Leading / means “the current directory”
➤ Checked into the repository and tracked
➤ Every repository can contain a .git/info/exclude
➤ Both of these work together
➤ But .git/info/exclude won’t be cloned
40. CONFIGURATION
➤ Many commands have configurations
➤ git config name value
➤ set name to value
➤ name can contain periods for sub-items
➤ git config name
➤ get current value
➤ git config --global name [value]
➤ Same, but with ~/.gitconfig
➤ This applies to all git repos for a user
41. THE STASH
➤ Creates temporary commits to represent:
➤ current index (git add ...)
➤ current working directory (git add .)
➤ Can rebase those onto new index later
➤ Many uses, such as pull into dirty workdir:
➤ git stash; git pull ...; git stash pop
➤ Might result in conflicts, of course
➤ Multiple stashes can be in play
➤ “git stash list” to show them
42. OTHER USEFUL PORCELAIN
➤ git archive: export a tree as a tar/zip
➤ git bisect: find the offensive commit
➤ git cherry-pick: selective merging
➤ git mv: rename a file/dir with the right index manipulations
➤ git rm: ditto for delete
➤ git push: write to an upstream
➤ git revert: add a commit that undoes a previous commit
➤ git blame: who wrote this?
43. COMMIT ADVICE
➤ Split changes into small logical steps
➤ Ideally ones that pass the test suite again
➤ This helps for “blame” and “bisect”.
➤ Easier to squash commits later than to break up
➤ “git rebase -i” can squash, omit, reorder
44. PICKING FROM BRANCHES
➤ Two main tools: “merge” and “cherry-pick”
➤ Merge brings in all commits
➤ Scales well for large workflows
➤ Cherry-pick brings in one or more
➤ Great when a single patch is needed
45. GIT.GIT’S WORKFLOW
➤ Four branches:
➤ maint: fixes to existing releases
➤ master: next release
➤ next: testing for next master
➤ pu: experimental features
➤ Each one is a descendent of the one above
➤ Commit to the oldest branch needing patch
➤ Then merge it upward:
➤ maint to master to next to pu
46. TOPIC BRANCHES
➤ Most features require several iterations
➤ Commit these to topic branches during design
➤ Easier to rehack or abandon this way
➤ Fork topic from the oldest main branch
➤ Refresh-merge from that branch if needed
➤ But don’t do that routinely
➤ Rebase topic branch if forked from wrong branch
➤ More details at “man 7 gitworkflows”
47. TESTING INTEGRATION
➤ Merge from base branch to topic branch
➤ ... on a new throw-away branch
➤ This branch is never merged back in
➤ Just for testing
➤ Can be published publicly, if you make it clear that it’s just for testing
➤ Otherwise, typically used only locally
➤ If integration fails, fix, and cherry-pick those back to the topic branch before final
merge
48. TIME TO “GIT” DIRTY
➤ Make a git repository:
➤ mkdir git-tutorial
➤ cd git-tutorial
➤ git init
➤ git config user.name “Randal Schwartz”
➤ git config user.email merlyn@stonehenge.com
➤ Add some content:
➤ echo "Hello World" >hello
➤ echo "Silly example" >example
49. WHAT’S UP?
➤ git status
➤ git add example hello
➤ git status
➤ git diff --cached
50. “GIT ADD” TIMING
➤ Change the content of “hello”
➤ echo "It's a new day for git" >>hello
➤ git status
➤ git diff
➤ Now commit the index (with old hello)
➤ git commit -m initial
➤ git status
➤ git diff
➤ git diff HEAD
51. GIT COMMIT -A
➤ Note that we committed the version of “hello” at the time we added it!
➤ Fix this by adding -a nearly always:
➤ git commit -a -m update
➤ git status
52. WHAT HAPPENED?
➤ Ask for logs:
➤ git log
➤ git log -p
➤ git log --stat --summary
➤ Tag, you’re it:
➤ git tag my-first-tag
➤ Now we can always get back to that version later
53. SHARING THE WORK
➤ Create the clone:
➤ cd ..
➤ git clone git-tutorial my-git
➤ cd my-git
➤ The git clone will often have some sort of transport path, like git: or rsync: or http:
➤ See what we’ve got:
➤ git log -p
➤ Note that we have the entire history
➤ And that the SHA1s are identical
54. BRANCHING OUT
➤ Create branch “my-branch”
➤ git checkout -b my-branch
➤ git status
➤ Make some changes:
➤ echo "Work, work, work" >>hello
➤ git commit -a -m 'Some work.'
55. CONFLICTS
➤ Switch back, and make other changes:
➤ git checkout master
➤ echo "Play, play, play" >>hello
➤ echo "Lots of fun" >>example
➤ git commit -a -m 'Some fun.'
➤ We now have conflicting commits
56. SEEING THE DAMAGE
➤ In an X11 display:
➤ gitk --all
➤ The --all means “all heads, branches, tags”
➤ For the X11 challenged:
➤ git show-branch --all
➤ git log --pretty=oneline —abbrev-commit --graph --decorate --all
➤ Handy for a mail message
57. MERGING
➤ We’re on “master”, and we want to merge in the changes from my-branch
➤ Select the merge:
➤ git merge my-branch
➤ This fails, because we have a conflict in “hello”
➤ See this with:
➤ git status
➤ Edit “hello”, and commit:
➤ git commit -a -m “Merge work in my-branch”
58. DID IT WORK?
➤ Verify the merge with:
➤ gitk --all
➤ git show-branch --all
➤ See changes back to the common ancestor:
➤ gitk master my-branch
➤ git show-branch master my-branch
➤ Note that master is only one edit from my-branch now (the merge patch-up)
➤ “git show” handy with merges:
➤ git show HEAD
59. MERGING THE UPSTREAM
➤ Master is now updated with my-branch changes
➤ But my-branch is now lagging
➤ We can merge back the other way:
➤ git checkout my-branch
➤ git merge master
➤ This will succeed as a “fast forward”
➤ This means that the merge-from branch already has all of our change history
➤ So it’s just adding linear history to the end
60. UPSTREAM CHANGES
➤ Let’s change origin a bit
➤ cd ../git-tutorial
➤ echo "some upstream change" >>other
➤ git add other
➤ git commit -a -m "upstream change"
➤ And now fetch it downstream
➤ cd ../my-git
➤ git fetch
➤ gitk --all
➤ git diff master..origin/master
61. MERGE IT IN
➤ Explicit merging
➤ git checkout master
➤ git merge origin/master
➤ Implicit fetch/merge
➤ git pull
➤ Eliminating the bushy tree
➤ git pull --rebase
➤ (Fails in our example.. sigh.)
62. SPLITTING UP A PATCH
➤ Sometimes, your changes are logically separate
➤ echo “this change” >>hello
➤ echo “unrelated change” >>example
➤ Now make two commits:
➤ git add -p # interactively select hello change
➤ git commit -m “fixed hello” # not -a!
➤ git commit -a -m “fixed example”
63. FIXING A COMMIT
➤ Oops, left out something on that last one
➤ echo "another unrelated" >>example
➤ Now “amend” the patch:
➤ git commit -a --amend
➤ This replaces the commit
➤ Be careful that you haven’t pushed it!
64. FOR FURTHER INFO
➤ See “Git (software)” in Wikipedia
➤ And the git homepage http://git-scm.com/
➤ Git wiki at https://git.wiki.kernel.org/
➤ Wonderful (but dated) Pro Git book: https://git-scm.com/book/en/v2
➤ Get on the mailing list
➤ Helpful people there
➤ You can submit bugs, patches, ideas
➤ And the #git IRC channel (on Freenode)
➤ Now “git” to it!