Understanding Git Submodules

Posted on Mon 23 March 2026 by sanyam_khurana in DevOps

Have you ever needed to use another Git repository inside your own project? Maybe a shared library, a theme, or a set of configuration files maintained separately? Git submodules let you do exactly that - embed one Git repository inside another while keeping their histories independent. Let's understand how they work and when to use them.

What is a Git Submodule?

A Git submodule is a reference to another Git repository at a specific commit, embedded inside a parent repository. Think of it as a pointer - your parent repo doesn't store the submodule's files directly in its own history. Instead, it records:

  1. The URL of the external repository
  2. The commit hash it should point to
  3. The path where it should be checked out

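The URL and the path are stored in a file called .gitmodules at the root of your repository. The commit hash is not in that file - it is recorded in the parent repository's own tree, as a special entry at the submodule's path.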
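To see where each of the three pieces lives, here is a minimal sketch that builds a throwaway sandbox. The names lib, parent and vendor/lib are made up for illustration. The protocol.file.allow override is an assumption needed only because the demo uses a local path as the submodule URL, which recent Git versions refuse by default.

```shell
set -eu
sandbox=$(mktemp -d)            # throwaway directory; all names below are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "lib" stands in for the external repository
git init -q lib
echo v1 > lib/README
git -C lib add README
git -C lib commit -q -m "lib: v1"

# "parent" embeds it at vendor/lib
git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add "$sandbox/lib" vendor/lib
git commit -q -m "add vendor/lib"

# URL and path are read back from .gitmodules
url=$(git config -f .gitmodules --get submodule.vendor/lib.url)
sub_path=$(git config -f .gitmodules --get submodule.vendor/lib.path)

# The pinned commit is a tree entry of type "commit" with mode 160000
entry=$(git ls-tree HEAD vendor/lib)
pinned=$(git rev-parse HEAD:vendor/lib)

printf 'url:   %s\npath:  %s\nentry: %s\n' "$url" "$sub_path" "$entry"
```

The mode 160000 entry is often called a gitlink. It is the only place the parent repo stores the commit hash.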

When Should You Use Submodules?

Submodules are useful when:

  • You want to include an external project (e.g., a theme, a library) inside your repo
  • The external project has its own release cycle and version history
  • You want to pin to a specific version of that external project
  • Multiple projects need to share the same code without copy-pasting

Common real-world examples:

  • A blog using a theme maintained in a separate repo (like this very blog!)
  • A project depending on a shared internal library
  • A monorepo-like structure where components are independently versioned

Adding a Submodule

To add a submodule to your project:

git submodule add https://github.com/user/repo.git path/to/submodule

This does three things:

  1. Clones the repository into path/to/submodule
  2. Creates (or updates) the .gitmodules file
  3. Stages both the .gitmodules file and the submodule path

Your .gitmodules file will look something like this:

[submodule "path/to/submodule"]
    path = path/to/submodule
    url = https://github.com/user/repo.git

git submodule add has already staged both entries, so you can commit straight away:

git commit -m "feat: add submodule for external dependency"

Cloning a Repository with Submodules

When someone clones your repository, the submodule directories will be empty by default. They need to initialize and fetch the submodules:

git clone https://github.com/user/parent-repo.git
cd parent-repo
git submodule init
git submodule update

Or do it all in one step:

git clone --recurse-submodules https://github.com/user/parent-repo.git
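The difference between the two approaches is easy to observe in a sandbox. The sketch below is illustrative only: lib, parent, plain-clone and full-clone are hypothetical names, and the protocol.file.allow override is an assumption that is needed only because the demo submodule URL is a local path.

```shell
set -eu
sandbox=$(mktemp -d)            # throwaway directory; all names below are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# An "external" repo and a parent that embeds it
git init -q lib
echo v1 > lib/README
git -C lib add README
git -C lib commit -q -m "lib: v1"
git init -q parent
git -C parent -c protocol.file.allow=always submodule --quiet add "$sandbox/lib" vendor/lib
git -C parent commit -q -m "add vendor/lib"

# Plain clone: the submodule directory exists but is empty
git clone -q parent plain-clone
empty_listing=$(ls -A plain-clone/vendor/lib)
before=$(git -C plain-clone submodule status)    # line starts with "-" (not initialized)

# init + update in one command fills it in
git -C plain-clone -c protocol.file.allow=always submodule --quiet update --init --recursive
after=$(git -C plain-clone submodule status)

# One-step variant
git -c protocol.file.allow=always clone -q --recurse-submodules parent full-clone

printf 'before: %s\nafter:  %s\n' "$before" "$after"
```

git submodule update --init --recursive is also handy when you forgot --recurse-submodules during the original clone.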

Updating a Submodule

There are two common scenarios:

Pulling the Latest Changes from Upstream

If the external repository has new commits and you want to update:

cd path/to/submodule
git fetch
git checkout main       # or the branch you want
git pull
cd -                    # back to the parent repo root
git add path/to/submodule
git commit -m "chore: update submodule to latest"

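Or use the shorthand, which fetches and checks out the tip of the submodule's tracked branch. You still need to git add and commit the new pointer in the parent repo afterwards: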

git submodule update --remote path/to/submodule
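A sandbox run shows what update --remote changes in the parent and what it leaves for you to do. This is a sketch under assumptions: the repository names are hypothetical, and protocol.file.allow is overridden only because the demo uses a local path as the submodule URL.

```shell
set -eu
sandbox=$(mktemp -d)            # throwaway directory; all names below are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
echo v1 > lib/README
git -C lib add README
git -C lib commit -q -m "lib: v1"

git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add "$sandbox/lib" vendor/lib
git commit -q -m "add vendor/lib"
pinned_before=$(git rev-parse HEAD:vendor/lib)

# Upstream gains a commit
echo v2 > "$sandbox/lib/README"
git -C "$sandbox/lib" commit -q -am "lib: v2"

# Fetch and check out the new upstream tip inside the submodule
git -c protocol.file.allow=always submodule --quiet update --remote vendor/lib

# The parent now sees a modified submodule, but nothing is committed yet
dirty=$(git status --short vendor/lib)

# Record the new pointer
git add vendor/lib
git commit -q -m "chore: update submodule to latest"
pinned_after=$(git rev-parse HEAD:vendor/lib)

printf 'status after update: %s\n' "$dirty"
```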

Making Changes Inside the Submodule

If you've forked the external repo and want to make changes:

cd path/to/submodule
git checkout -b my-feature
# make your changes
git add .
git commit -m "feat: my changes"
git push origin my-feature
cd -                    # back to the parent repo root
git add path/to/submodule
git commit -m "chore: update submodule to include my changes"

The parent repo now points to your new commit in the submodule.

How the Parent Repo Tracks Submodules

This is the key concept that confuses most people. The parent repository does not track the submodule's files. It only tracks a commit hash - a pointer to a specific snapshot of the submodule.

When you run git diff in the parent repo after updating a submodule, you'll see something like:

-Subproject commit abc1234def5678...
+Subproject commit 789abcd012345...

That's it - just a pointer change. The actual file changes live in the submodule's own repository.
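If the bare hashes are hard to read, git diff --submodule=log lists the submodule commits that lie between the old and new pointer. Here is a minimal sketch in a sandbox. All repository names are hypothetical, and the protocol.file.allow override is an assumption required only for the local-path URL used in the demo.

```shell
set -eu
sandbox=$(mktemp -d)            # throwaway directory; all names below are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
echo v1 > lib/README
git -C lib add README
git -C lib commit -q -m "lib: v1"

git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add "$sandbox/lib" vendor/lib
git commit -q -m "add vendor/lib"

# Move the submodule forward by one upstream commit, without staging it
echo v2 > "$sandbox/lib/README"
git -C "$sandbox/lib" commit -q -am "lib: v2"
git -c protocol.file.allow=always submodule --quiet update --remote vendor/lib

# Summarize the pointer change as a list of submodule commits
summary=$(git diff --submodule=log)
printf '%s\n' "$summary"
```

Setting diff.submodule to log in your Git config makes this the default output.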

Using a Fork as a Submodule

A common pattern is to fork an external project, make your customizations, and point the submodule to your fork. This gives you:

  • Your own customizations on a branch
  • The ability to pull upstream updates via git fetch upstream
  • A clean separation between upstream code and your changes

To set this up:

# Fork the repo on GitHub first, then:
git submodule add git@github.com:your-user/forked-repo.git path/to/submodule
cd path/to/submodule
git remote add upstream https://github.com/original-user/repo.git
git checkout -b my-customizations
# make changes, commit, push

To pull upstream updates later:

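cd path/to/submodule
git checkout my-customizations   # git submodule update leaves HEAD detached
git fetch upstream
git merge upstream/main
# resolve any conflicts with your customizations
git push origin my-customizations
cd -                             # back to the parent repo root
git add path/to/submodule
git commit -m "chore: merge upstream changes into submodule"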

Specifying a Branch

You can configure a submodule to track a specific branch:

git config -f .gitmodules submodule.path/to/submodule.branch my-branch

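Commit the updated .gitmodules so collaborators get the same setting. From then on, git submodule update --remote fetches from that branch instead of the default one.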
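Git 2.22 and later also offer git submodule set-branch, which writes the same .gitmodules key. The sketch below tries it in a sandbox. The repository names and the branch my-branch are hypothetical, and the protocol.file.allow override is an assumption needed only for the local-path URL.

```shell
set -eu
sandbox=$(mktemp -d)            # throwaway directory; all names below are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
echo v1 > lib/README
git -C lib add README
git -C lib commit -q -m "lib: v1"

git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add "$sandbox/lib" vendor/lib
git commit -q -m "add vendor/lib"

# Upstream grows a second branch with its own commit
git -C "$sandbox/lib" checkout -q -b my-branch
echo feature > "$sandbox/lib/FEATURE"
git -C "$sandbox/lib" add FEATURE
git -C "$sandbox/lib" commit -q -m "lib: feature work"

# Tell the parent which branch update --remote should follow
git submodule set-branch --branch my-branch vendor/lib
tracked=$(git config -f .gitmodules --get submodule.vendor/lib.branch)

# Now --remote follows my-branch
git -c protocol.file.allow=always submodule --quiet update --remote vendor/lib
git add .gitmodules vendor/lib
git commit -q -m "track my-branch in vendor/lib"

printf 'tracked branch: %s\n' "$tracked"
```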

Removing a Submodule

Removing a submodule requires a few steps:

# Remove the submodule entry from .gitmodules and stage the change
# (git rm refuses to touch a submodule while .gitmodules has unstaged edits)
git config -f .gitmodules --remove-section submodule.path/to/submodule
git add .gitmodules

# Remove the submodule entry from .git/config
git config --remove-section submodule.path/to/submodule

# Remove the submodule from the index, then delete its files and cached repository
git rm --cached path/to/submodule
rm -rf path/to/submodule
rm -rf .git/modules/path/to/submodule

# Commit
git commit -m "chore: remove submodule"
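An alternative that lets Git do most of the bookkeeping is git submodule deinit followed by git rm. The following is a sketch of that route in a sandbox, not the only valid sequence. The repository names are hypothetical, and the protocol.file.allow override is an assumption needed only for the local-path URL.

```shell
set -eu
sandbox=$(mktemp -d)            # throwaway directory; all names below are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
echo v1 > lib/README
git -C lib add README
git -C lib commit -q -m "lib: v1"

git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add "$sandbox/lib" vendor/lib
git commit -q -m "add vendor/lib"

# 1. Empty the working tree and drop the section from .git/config
git submodule --quiet deinit -f vendor/lib

# 2. Remove the gitlink and the .gitmodules entry, staging both
git rm -q -f vendor/lib

# 3. Delete the cached copy of the submodule's repository
rm -rf .git/modules/vendor/lib

git commit -q -m "chore: remove submodule"

# Check that nothing is left behind
leftover_entry=$(git ls-files --stage vendor/lib)
leftover_url=$(git config -f .gitmodules --get submodule.vendor/lib.url || true)
leftover_cfg=$(git config --get submodule.vendor/lib.url || true)
```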

Common Pitfalls

1. Detached HEAD Inside Submodules

When you run git submodule update, the submodule is checked out in a detached HEAD state at the pinned commit. If you want to make changes, always create a branch first:

cd path/to/submodule
git checkout -b my-branch  # don't work in detached HEAD

2. Forgetting to Push Submodule Changes

If you commit changes in a submodule and update the parent repo's pointer, but forget to push the submodule - others won't be able to fetch that commit. Always push the submodule first, then the parent repo.
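Git can guard against this mistake: git push --recurse-submodules=check refuses to push the parent while it references submodule commits that are not on any of the submodule's remotes. The sketch below demonstrates that in a sandbox. The bare repositories lib.git and parent.git are hypothetical stand-ins for hosted remotes, and the protocol.file.allow override is an assumption needed only for the local-path URL.

```shell
set -eu
sandbox=$(mktemp -d)            # throwaway directory; all names below are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Bare repos play the role of the hosted remotes
git init -q --bare lib.git
git init -q --bare parent.git

# Seed the library remote with one commit
git init -q lib-seed
echo v1 > lib-seed/README
git -C lib-seed add README
git -C lib-seed commit -q -m "lib: v1"
git -C lib-seed push -q "$sandbox/lib.git" HEAD

# Parent repo with the submodule, pushed once
git init -q parent
cd parent
git remote add origin "$sandbox/parent.git"
git -c protocol.file.allow=always submodule --quiet add "$sandbox/lib.git" vendor/lib
git commit -q -m "add vendor/lib"
git push -q origin HEAD

# Commit inside the submodule and bump the pointer, but do not push the submodule
echo v2 > vendor/lib/README
git -C vendor/lib commit -q -am "lib: v2"
git add vendor/lib
git commit -q -m "bump vendor/lib"

# The guarded push is refused because lib.git does not have the new commit
if git push -q --recurse-submodules=check origin HEAD 2>/dev/null; then
    first_attempt=pushed
else
    first_attempt=refused
fi

# Submodule first, then the parent
git -C vendor/lib push -q origin HEAD
git push -q --recurse-submodules=check origin HEAD

printf 'first attempt: %s\n' "$first_attempt"
```

Setting push.recurseSubmodules to check in your Git config applies the same guard to every push.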

3. Stale Submodule References

After pulling changes in the parent repo, always run:

git submodule update --init --recursive

Otherwise, your submodule might be stuck at an older commit.
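You can spot a stale submodule with git submodule status: a leading + means the checked-out commit differs from the one the parent expects. Here is a sandbox sketch of the situation. The names lib, parent and teammate are hypothetical, and the protocol.file.allow override is an assumption needed only for the local-path URL.

```shell
set -eu
sandbox=$(mktemp -d)            # throwaway directory; all names below are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
echo v1 > lib/README
git -C lib add README
git -C lib commit -q -m "lib: v1"

git init -q parent
git -C parent -c protocol.file.allow=always submodule --quiet add "$sandbox/lib" vendor/lib
git -C parent commit -q -m "add vendor/lib"

# A teammate clones everything while the pointer is at v1
git -c protocol.file.allow=always clone -q --recurse-submodules parent teammate

# Meanwhile the parent moves its pointer to v2
echo v2 > lib/README
git -C lib commit -q -am "lib: v2"
git -C parent -c protocol.file.allow=always submodule --quiet update --remote vendor/lib
git -C parent add vendor/lib
git -C parent commit -q -m "bump vendor/lib"

# The teammate pulls the parent only; the submodule stays at v1
git -C teammate pull -q --ff-only --no-recurse-submodules
stale=$(git -C teammate submodule status vendor/lib)    # starts with "+"

# Bring the submodule in line with the new pointer
git -C teammate -c protocol.file.allow=always submodule --quiet update --init --recursive
fresh=$(git -C teammate submodule status vendor/lib)    # starts with a space

printf 'stale: %s\nfresh: %s\n' "$stale" "$fresh"
```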

Submodules vs. Alternatives

Approach                    | Best For
Submodules                  | Pinning to a specific version of an external repo
Subtrees                    | Merging external code directly into your repo's history
Package managers (pip, npm) | Published libraries with proper versioning
Monorepos                   | When all code should share one history

Submodules work best when you need a clear boundary between repositories while maintaining a specific version dependency.

Summary

Git submodules are a powerful way to compose projects from multiple repositories. The key things to remember:

  • A submodule is a pointer to a commit in another repo, not a copy of its files
  • Use --recurse-submodules when cloning repos that contain submodules
  • Always create a branch before making changes inside a submodule
  • Push the submodule first, then update and push the parent repo
  • Consider forking external repos when you need customizations

They have a learning curve, but once you understand the mental model - a repo inside a repo, each with its own history - they become a natural tool in your Git workflow.

Sanyam Khurana

Principal Engineer & Cloud Architect with 10+ years of experience. MSCS from Georgia Tech. Maintainer at DjangoProject.com, CPython bug triager. Goes by "CuriousLearner" across the web. Find him on GitHub.