Skip to content

Latest commit

 

History

History
165 lines (116 loc) · 5.44 KB

README.adoc

File metadata and controls

165 lines (116 loc) · 5.44 KB

Renumber UIDs/GIDs on file systems efficiently

Use case

Typically when consolidating environments, or moving standalone systems into a consolidated directory existing users and groups with conflicting UIDs and GIDs have to be renumbered. Part of this process means you have to safely and efficiently modify the file systems and bring the AS-IS situation into line with the wanted TO-BE end-result.

This is exactly what renumid intends to help with.

Two-phase approach

Intuitively one would scan the filesystems for a specific UID or GID, once identified you’d change the ownership to the new state, continue with the next files until all files, and all UIDs/GIDs are dealt with. Often the UNIX 'find' utility is used for this like:

find / -uid 1234 -exec chown 3456 {} \;
find / -uid 3456 -exec chown 5678 {} \;

However this poses a few problems:

  • You will need to scan the whole filesystem for each UID/GID to process. This takes a lot of time and is far from efficient

  • You risk ending up with the wrong end-result. If you are not careful (like the above example) you may be renumbering already mapped files in a subsequent action. Creating a situation that you cannot restore easily.

  • Changes that require both UID and GID changes cannot be easily dealt with.

  • If the process (for whatever reason is interrupted) replaying the remaining actions is tedious and time-consuming.

The solution to this is to use a two-phased approach.

  1. First index the file system and identify all paths that require a change.

  2. Use this Index file to efficiently perform the needed renumbering.

Since this index not just stores the paths, but also the original state. One can extract exactly the intended end-result, and the required work to get there. (e.g. the number of atomic file system actions to perform)

A two phased approach also allows to prepare the required actions before the migration itself takes place, and can help to assess the required migration time for individual systems. As such file system renumbering can potentially be very time-consuming one can much better predict the required downtime needed for the migration.

For this reason we added 2 optional steps.

  1. A step that can report on what changes are to be expected from the Index file

  2. A step that can restore the original state or can be used to benchmark the process (without impact)

Implementation details

The following potential issues should be considered:

  • Symlinks should not be followed (no-dereference) - DONE

  • Pseudo file systems should be excluded - DONE

  • Crossing file systems boundaries may not be wanted

  • File systems are mounted inside file systems (so device nr is important) - DONE

  • Python os.walk provides every directory as a (new) root, so we need to store excluded device nrs - DONE

  • Bind mounts need to be handled carefully (keep a list of scanned device nrs ?)

Usage

Commands

renumid index   - Create a filesystem index of impacted paths using a map
renumid status  - Show a status report of impacted paths and affected UIDs/GIDs
renumid renumber   - Renumber the impacted paths according to the provided map
renumid restore - Restore the original situation using the filesystem index

General options

-f <index>  - the index file to create/use
              (default: renumid-<date>-<time>.idx)
-m <map>    - the map file to use
              (default: renumid.map)
-d          - enable debugging output
-v          - be more verbose
-x          - do not cross filesystem boundaries
-T          - include only these file system types (eg. ext3,ext4,xfs)

Status options

-u          - list the files for specific UIDs
-g          - list the files for specific GIDs

Renumber options

-t          - test the changes, without actually changing

Restore options

No specific options.

Examples

renumid index -f renumid-20151126-1912.idx -m mapping.yaml /home /usr
renumid status -f renumid-20151126-1912.idx
renumid renumber -f renumid-20151126-1912.idx
renumid restore -f renumid-20151126-1912.idx

File format

Map

The map is created as YAML or JSON file.

Below is an example file showing how it can be used:

---
# The below UIDs/GIDs are examples to use on a RHEL /dev and /tmp file system

# These maps would match for all hosts
uidmap:
  10: 10010      # uucp
  42: 10042      # gdm
gidmap:
  5: 10005       # tty
  16: 10016      # oprofile
  39: 10039      # video

# But you can define per-host maps, or use wildcards on FQDN
moria*:
  uidmap:
    48: 10048    # apache
    69: 10069    # vcsa
  gidmap:
    42: 10042    # gdm
    69: 10069    # vcsa
    484: 10484   # tmux
    505: 10505   # vboxusers

lisse.gent.wieers.com:
  uidmap:
    8: 10008     # mail
  gidmap:
    8: 10008     # mail

File system index

The file system index will be a pickle file with a specific renumid data structure.

Or could become something more efficient (storage and speed-wise).

The file system index stores the following information:

  • A version number of the file format (to detect incompatible files)

  • A set of excluded devices (so it is clear what the scope is)

  • Statistics (time, duration, paths scanned, …​)

  • The time it took to create the file system index

  • The scope (parents) that require changes

  • The actual mapping used for the index

  • The list of paths and original uids/gids