repo_reorg

When converting a large git repo into smaller git repos, preserving history can be difficult if files from different subdirectories need to be combined into the same smaller repo. To support this kind of operation, the two utility scripts in this repo can be used together or independently.

repo_reorg.py creates a new repo from chunks of an existing repo by selectively moving files to new locations in the new repo while preserving their history. It reads its path mappings from the command line or from mapping files, which can be written by hand or generated by repo_map_gen.py.

repo_reorg.py clones a copy of ORIGIN, then calculates the minimum number of fragments of ORIGIN necessary to create a new repo. It clones a new copy of ORIGIN for each fragment, then filters each with git filter-branch --subdirectory-filter. Then, repo_reorg.py reorganizes the files in each fragment into their new locations using git mv and deletes any files which are not in the map. Finally, it merges each fragment into the new repo.
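The per-fragment steps described above can be sketched as a list of git commands. This is a minimal illustration, not the script's actual implementation: the function names, the `clone_dir` argument, the commit message, and the use of `git pull --allow-unrelated-histories` for the merge step are all assumptions made for the example. Nothing is executed; the functions only build argument lists.

```python
from typing import List, Tuple


def fragment_commands(origin: str, branch: str, subdir: str,
                      moves: List[Tuple[str, str]],
                      clone_dir: str) -> List[List[str]]:
    """Build the git commands for one fragment (hypothetical helper)."""
    cmds = [
        # Fresh clone of ORIGIN for this fragment.
        ["git", "clone", "--branch", branch, origin, clone_dir],
        # Rewrite history so that `subdir` becomes the root of the clone.
        ["git", "-C", clone_dir, "filter-branch",
         "--subdirectory-filter", subdir],
    ]
    for old, new in moves:
        # Paths are relative to the filtered root. The parent directory
        # of `new` has to exist before git mv is run.
        cmds.append(["git", "-C", clone_dir, "mv", old, new])
    # Record the moves as a commit (message is an assumption).
    cmds.append(["git", "-C", clone_dir, "commit",
                 "-m", "Move files to new locations"])
    return cmds


def merge_command(new_repo: str, clone_dir: str, branch: str) -> List[str]:
    """Build the command that merges one fragment into the new repo.

    Assumption: filtered fragments usually share no common ancestor, so
    the merge is allowed to join unrelated histories (git 2.9 or later).
    """
    return ["git", "-C", new_repo, "pull", "--no-edit",
            "--allow-unrelated-histories", clone_dir, branch]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. Deleting files that are not in the map is left out of the sketch.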

usage: repo_reorg.py [-h] -o ORIGIN [-b BRANCH] [-d DESTINATION] [-n NAME]
                     [-f FILE]
                     [P [P ...]]

Create a new git repo, filtering the contents out of an existing one

positional arguments:
  P                     Paths to include in the new repo. Paths must be of the
                        form: <old repo path>:<new repo path>

optional arguments:
  -h, --help            show this help message and exit
  -o ORIGIN, --origin ORIGIN
                        The source repository
  -b BRANCH, --origin-branch BRANCH
                        The branch of the origin to clone
  -d DESTINATION, --destination DESTINATION
                        The destination repository (optional)
  -n NAME, --name NAME  The name of the repo to create
  -f FILE, --file FILE  Load the paths to merge out of FILE. The format of the
                        path arguments must be followed, one path per line.
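The mapping format described in the help text, one `<old repo path>:<new repo path>` pair per line, can be read with a few lines of Python. This is a sketch under assumptions rather than the script's own parser: `parse_mapping` is a hypothetical name, and skipping blank lines and lines that start with `#` is assumed behaviour.

```python
from typing import List, Tuple


def parse_mapping(text: str) -> List[Tuple[str, str]]:
    """Parse mapping text into (old path, new path) pairs."""
    pairs = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Assumption: blank lines and '#'-prefixed lines are ignored.
        if not line or line.startswith("#"):
            continue
        old, sep, new = line.partition(":")
        if not sep or not old or not new:
            raise ValueError(
                f"line {lineno}: expected <old repo path>:<new repo path>, "
                f"got {raw!r}"
            )
        pairs.append((old, new))
    return pairs


example = "lib/foo/a.py:src/a.py\n#lib/bar/a.py:src/a.py\n"
print(parse_mapping(example))  # [('lib/foo/a.py', 'src/a.py')]
```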

Writing the mapping files for repo_reorg.py by hand can be tedious. Instead, copy the files manually to their new locations in a destination directory, then run repo_map_gen.py to generate the mapping file. Once the mapping file exists, run repo_reorg.py with it to create a new repo that preserves the history of the copied files in their new locations.

There are two exceptional conditions that repo_map_gen.py handles explicitly:

  • If a file has no source element in the origin repo, a warning will be printed and no mapping will be generated for that file.
  • If a file has more than one possible source element in the origin repo, repo_map_gen.py will try to match the file's path to determine the correct file to use. If that fails, repo_map_gen.py will print a warning that multiple matches were found for the file, and generate a mapping for each possible match, marking each with '#' at the beginning of the line.

usage: repo_map_gen.py [-h] -o ORIGIN -d DESTINATION [-e EXCLUDE_ORIGIN]
                       [-E EXCLUDE_DESTINATION] [-f OUTPUT_FILE]

Generate a mapping file to translate between two directories

optional arguments:
  -h, --help            show this help message and exit
  -o ORIGIN, --origin ORIGIN
                        The source directory
  -d DESTINATION, --destination DESTINATION
                        The destination directory
  -e EXCLUDE_ORIGIN, --exclude-origin EXCLUDE_ORIGIN
                        Exclude DIR from origin path search
  -E EXCLUDE_DESTINATION, --exclude-destination EXCLUDE_DESTINATION
                        Exclude DIR from destination path search
  -f OUTPUT_FILE, --output-file OUTPUT_FILE
                        Output File
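The two exceptional conditions listed above can be illustrated with a small sketch. It is not the logic of repo_map_gen.py itself: it assumes that source candidates are found by file name, and that "matching the file's path" means preferring the candidate that shares the most trailing path components with the destination. Both are assumptions for illustration, as are the function names.

```python
import posixpath
from collections import defaultdict
from typing import Dict, List, Tuple


def shared_suffix_len(a: str, b: str) -> int:
    """Count the trailing path components that two paths have in common."""
    n = 0
    for x, y in zip(reversed(a.split("/")), reversed(b.split("/"))):
        if x != y:
            break
        n += 1
    return n


def generate_mapping(origin_files: List[str],
                     dest_files: List[str]) -> Tuple[List[str], List[str]]:
    """Return (mapping lines, warnings) for the given file lists."""
    # Index origin files by base name (assumed matching criterion).
    by_name: Dict[str, List[str]] = defaultdict(list)
    for path in origin_files:
        by_name[posixpath.basename(path)].append(path)

    lines: List[str] = []
    warnings: List[str] = []
    for dest in dest_files:
        candidates = by_name.get(posixpath.basename(dest), [])
        if not candidates:
            # Condition 1: no source element, so warn and emit no mapping.
            warnings.append(f"no source found for {dest}")
            continue
        if len(candidates) > 1:
            # Condition 2: try to pick one candidate by path similarity.
            scores = [shared_suffix_len(c, dest) for c in candidates]
            best = max(scores)
            winners = [c for c, s in zip(candidates, scores) if s == best]
            if len(winners) > 1:
                # Still ambiguous: emit every candidate, commented out.
                warnings.append(f"multiple matches found for {dest}")
                lines.extend(f"#{c}:{dest}" for c in candidates)
                continue
            candidates = winners
        lines.append(f"{candidates[0]}:{dest}")
    return lines, warnings
```

With this approach, an ambiguous file stays out of the active mapping until someone removes the `#` from the correct line.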

Recovering history on a repo which has already been split

If a repo has already lost its history in a reorganization, the history can be recovered:

  1. Check out the first commit of files to the new repo
  2. Use repo_map_gen.py to generate a mapping file
  3. Use repo_reorg.py to create a new repo with history
  4. Cherry-pick all commits from the lost-history repo into the recovered-history repo
  5. Use git push --force to override the lost-history repo
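Steps 4 and 5 above can be sketched as a command list. This is one possible reading of the procedure, not a prescribed one: the remote name `lost`, the function name, and the choice to skip the first commit when cherry-picking (on the assumption that its contents were already used in step 1) are all assumptions. The commands are only built, not run.

```python
from typing import List


def recovery_commands(recovered_repo: str, lost_repo_url: str,
                      branch: str, first_commit: str) -> List[List[str]]:
    """Build git commands that replay the lost-history repo's commits.

    `first_commit` is the ID of the first commit in the lost-history repo.
    """
    git = ["git", "-C", recovered_repo]
    return [
        # Make the lost-history repo's commits available locally.
        git + ["remote", "add", "lost", lost_repo_url],
        git + ["fetch", "lost"],
        # Step 4: replay every commit after the first one. The range
        # A..B excludes A itself.
        git + ["cherry-pick", f"{first_commit}..lost/{branch}"],
        # Step 5: overwrite the lost-history repo's branch.
        git + ["push", "--force", "lost", f"HEAD:{branch}"],
    ]
```

A plain range cherry-pick stops on merge commits, so a history that contains merges would need extra handling. Because the final command rewrites the remote branch, it is worth reviewing the recovered history before running it.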