Submodules management in Git

Git’s submodule tool allows you to create a child repository as a submodule inside a repository. This article shows you how to manipulate a submodule.

Add a submodule from a remote repository

To create a new sub repository whose content does not exist in the repository:

# Add a submodule named a-submodule from an remote repository
# This command will create a .gitmodules which contains the mapping relation of
# the submodule and its remote repository.
# It also stages the changes.
$ git submodule add https://github.com/xxx/a-submodule

# Use status to check what the above command stages
$ git status
...
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   .gitmodules
        new file:   a-submodule

# You need to commit staged changes: .gitmodules and a-submodule
$ git commit -m 'add a submodule'
[master 9d7d3b3] add: add a submodule
 2 files changed, 4 insertions(+)
 create mode 100644 .gitmodules
 create mode 160000 a-submodule

As you see, Git treats the new submodule as a file with special mode (1600000) in a commit.

For a repository with submodules inside it, it is called the main repository.

Note: In new git versions, the submodule’ s .git data is stored the main repository’s .git directory while they stay in the submodule’s directory in the old versions.

Add a submodule from an subdirectory

You can also switch a subdirectory to a submodule. To add a submodule from an subdirectory which has been tracked, you need to remove it first:

# First remove the subdirectory named csvlib from index
$ git rm --cached csvlib

Then add a subdirectory as a submodule:

# Init csvlib as a repo
$ cd csvlib
$ git init
$ git add .
$ git commit -m 'initial commit'

# Add csvlib repo as a submodule
$ git submodule add ./csvlib/
Adding existing repo at 'csvlib' to the index

# Check status
$ git status
...
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   .gitmodules
        new file:   csvlib
        deleted:    csvlib/csv.py

# Commit the changes
$ git commit -m 'switch subdirectory csvlib as a submodule'
[master 9d7d3b3] add: add submodule git-submodule
 2 files changed, 4 insertions(+)
 create mode 100644 .gitmodules
 create mode 160000 a-submodule

Note: If you switch to a branch which the subdirectory not being as a submodule,you may get an error for the subdirectory will be overwritten. At this time, an option -f is needed to checkout. Afterwards if you get back to the branch treating it as a submodule, you need to execute git checkout . in the submodule’s directory to get the content back. See more on git submodules.

Commit changes for a submodule

To commit changes in a submodule, first commit in its own repository and then commit for the main repository:

# Commit changes in the submodule directory
$ git add a.py 
$ git commit -m 'feat: add some functionly'
# Push changes to its remote repository
$ git push

# Switch to the main repository directory
$ cd ..
# Check changes from a submodule
$ git status
...
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)
  (commit or discard the untracked or modified content in submodules)

        modified:   a-submodule (modified content)

# Commit the submodule's changes to the top repository
$ git add a-submodule
$ git commit -m 'update changes of submodule a-submodule'
# Push changes if you want
$ git push

Pull updates for a submodule

To pull updates from remote for a submodule, an easy way is to run:

# Let Git to go into the submodule's directory and get updates for a-submodule
# --merge, merge local work
$ git submodule update --remote --merge a-submodule

Or you can do it manually:

# Switch to the submodule's directory
$ cd a-submodule

# Pull and merge updates form upstream branch
$ git pull
$ git merge origin/master

Manipulate multiple submodules at a time

Git has a submodule foreach command for you to manipulate multiple submodules. For example, stashing changes for all your submodules:

# Stash changes for all submodules
$ git submodule foreach 'git stash'

Remove a submodule from a repository

To remove a submodule named a-submodule from a repository:

# Frist remove the from index
$ git rm --cached a-submodule

# If you want to keep submodule's files in working tree.
# Rename it for the next command will delete it from working tree
$ mv a-submodule a-submodule-renamed

# Unregister the submodule
# This command removes it both from .git/config and working tree
$ git submodule deinit a-submodule

Listing changed files in a commit

Use below command to list the changed files in a commit:

# List changed files in the specified commit
$ git show --name-only <commit>

# Examples

# List changed files in HEAD
$ git show --name-only HEAD
commit ea34837d870e48106ae9ad09f41297a64ad6a6a1 (HEAD -> master)
Author: xxx <xxx@xxx.com>
Date:   Wed Mar 13 20:27:46 2019 +0800

    feat: add localization

my-plugin.php
languages/my-plugin-zh_CN.po

If you only want the file names, use:

# List only names of changed files in HEAD
$ git diff --name-only HEAD~ HEAD
my-plugin.php
languages/my-plugin-zh_CN.po

Git FAQ

Config

Clone

  • Partial clone

    How to clone only part of commits of a huge repository

Branch

Stash

  • Stash changes

    How to save changes temporally, apply changes, list stashes, delete a stash?

Commit

Undo

Remote

  • Remote management

    How to add a remote, delete a remote, change the URL of a remote, etc.

Diff

  • Differences

    How to check differences between working tree, index, a specific commit?

View

Delete

Log

  • Check log

    How to view the last several commits, search the commit history with string, list the commits graphically, check which commit changed a string, etc.

Deleting files from Git commit history

If you commits an sensitive file or a huge unwanted file, you may want to remove it from every commit. Git provides a nuclear-level command git filter-branch which allows you rewrite the history.

git filter-branch executes the specified command for each commit specified by you and generates new commits.

Before you start, you must keep it in mind that this operation changes the existing history. If it is a public repository and someone have did some work based on the commits you want to rewrite, you’d better not do this. If you have to, remember to notify them to run git pull --rebase command.

Delete a huge folder from every commit

Here is an example of how to remove a huge folder from each commit which is committed accidentally at first.

You’d better test below commands in a temporary repository to make sure that they work properly for your git version. It is met that git filter-branch below removes the folder in working tree as well but it should not.

We do it in a new testing branch, when the result is what we want then reset it as the prior branch.

# Do it in a new testing branch
$ git checkout -b test

# Remove 'build' folder from every commit on the new branch
# --index-filter, rewrite index without checking out
# -r, remove recursively in subfolders
# --cached, remove it from index but not include working tree
# --ignore-unmatch, ignore if files to be removed are absent in a commit
# --prune-empty, remove empty commits generated by 'git rm' command
# HEAD, execute the specified command for each commit reached from HEAD by parent link
$ git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch build' --prune-empty HEAD
Rewrite fee4b8ee9df321a877cd2663b20b293eea4a1f8c (1/2)rm 'build/main.app'
Rewrite 63f272ab5152c66693614efae77567799837c6e0 (2/2)
Ref 'refs/heads/test-filter' was rewritten

# The output is OK, reset it to the prior branch master
$ git checkout master
$ git reset --soft test

# Remove test branch
$ git branch -rm test

# Push it with force
$ git push --force origin master

If you changed commits in remote repository, remember notice other members execute below command:

# Tell others to execute below command if you changed commits in remote repository.
$ git pull --rebase

Note:

  1. If --ignore-unmatch option is not added, it will fail when the files to be removed do not exist in a commit.
  2. The files you removed will stay in disk for a while, they will be removed entirely in the next automic garbage collection of git.

Tips: To avoid adding unwanted files accidently, you should ignore it.

Other useful options:

# Execute the specified command for the last 5 commit
$ git filter-branch --index-filter 'git rm --cached --ignore-unmatch user.pem' HEAD~6..HEAD

# Execute the specified command for all branches
$ git filter-branch --index-filter 'git rm --cached --ignore-unmatch user.pem' -- --all

# Update tags when executing filter-branch, remember to push them to remotes afterwards
$ git filter-branch --index-filter 'git rm --cached --ignore-unmatch user.pem' --tag-name-filter cat HEAD

# Remove empty commits generated by 'git rm' command
$ git filter-branch --index-filter 'git rm --cached --ignore-unmatch user.pem' --prune-empty HEAD

Remotes management in Git

In Git, you can have more than one remote repository. git remote command is used to manage them. Operations illustrated include how to add a remote, remove a remote, view a remote’s information, change a remote’s URL, etc.

Add a remote

# Add a remote
$ git remote add <name> <url>

# Examples:
# Add a remote named test
$ git rmeote add test https://github.com/username/repository.git

Note: When you clone a remote repository, Git sets the remote as origin for you automatically.

List remotes

# Show remotes with names
$ git remote
origin

# Show remotes with names, urls
$ git remote -v
origin    git@github.com:mojombo/grit.git (fetch)
origin    git@github.com:mojombo/grit.git (push)

Change URL of a remote

# Chang url of origin
$ git remote set-url origin https://github.com/USERNAME/REPOSITORY.git

Verify its URL has changed:

$ git remote get-url origin
origin https://github.com/USERNAME/REPOSITORY.git

Get URL of a remote

# Get the url of origin
$ git remote get-url origin
origin https://github.com/USERNAME/REPOSITORY.git

Fetch updates from a remote

# Fetch updates from the remote named origin
$ git remote update origin

Rename a remote

# Rename a remote
$ git remote rename <old> <new>

# Examples:
# Rename origin to main
$ git remote rename origin main

Remove a remote

# Remove a remote
# Note: all the remote-tracking branches and settings 
# for the remote are removed.
$ git remote remove <name>

# Examples:
# Remove the remote nameed server2
$ git remote remove server2