-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reorganization #748
base: reorganization
Are you sure you want to change the base?
Reorganization #748
Conversation
Fix duplicate imports
All other similar "form . import X" statements are fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this is currently listed as deleted, since it was correctly moved to hexrd/core/fitting/calibration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was deleted by the package mapper.
But it doesn't appear to be necessary, given the ignores already in the root:
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as before. This is correctly copied to hexrd/core/instrument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was correctly moved to hexrd/core/material
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was correctly moved to hexrd/hedm/sampleOrienations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was correctly moved to hexrd/powder/wppf
Reorganization Plan
1 - File Movement
The reorganization consists primarily of creating sub-folders under
hexrd
and moving most files into one of them. The goal of the first PR will be to move the files into this new location while making the minimum changes necessary to maintain functionality. The reorganization plan suggests that some files / classes be split up across workflows. During this phase, we will instead duplicate the files across workflows.2 - Reconciliation
During this phase, we will consider in smaller chunks code changes necessary to split up the classes and files across workflows. This should be limited to approximately the following code areas:
3 - Migration Tooling
During this phase, we will consider what the best options are for making code migration for users as painless as possible.
Phase 1 File Migration
All of the mapping was done according to the workflow labeling spreadsheet. A complete list of the file mappings can be found in
file_map.pkl
. It is being kept temporarily to support migration workflows.File Mapping
The files were mapped automatically using a new
package_remapper
tool. This works in a couple of stages to correctly move the files while keeping imports coherent. This tool was used to accomplish the bulk of the work and help ensure that no imports are missed, but all code changes made by the tool were reviewed and verified by hand.File Mappings
The package_remapper requires a 1-to-many map of files from one repository to the other. It will copy the files in the
source
to everywhere listed in thedestination
. The file mapping used was generated from the spreadsheet by rolling up the workflows to the file level. Things likehexrd/valunits.py
clearly map to one workflow,core
. However, things likehexrd/config/instruments.py
need to be split into multiple workflow sub-folders (core
andhedm
in this case. During the file mapping stage, we will just copy the whole file in both locations and log it for a human (me) to disentangle later. It is far too complicated to safely do this disentanglement automatically.Example:
It is not necessary that all files in the source directory have an explicit destination. If files are omitted for some reason, they will be brought over with the containing module. If there is no containing module (e.g.
setup.py
) they will be copied over in the same location relative to the package source. When generating the list of files, the relevant.gitignore
is used to filter out files we don’t care about if any are present.Quick Git Commit
Once the files are moved and before they are patched, the move is committed so that the file history is maintained and tracked through the move. The move and the changes necessary to make the code work after the move will be attributed to me, but no existing history is affected.
As far as I can tell, when copying files to multiple destinations, one is ~arbitrarily picked as the “rename” and others are picked as new files. The rename gets the source file history. In a later stage we will determine which files should get the history, and I will fix the git history then.
Import Patching
Now that all of the files are in potentially different locations, the referencing import statements are fixed*. We do this by parsing the python files and looking for 3 things:
import X.Y as Z
from hexrd.config import Config as ConfigAlias
from ..config import Config as ConfigAlias
*wppf/RietveldHEDM.py does not parse and will not be patched.
During the patching, we will keep any aliases that are listed, and if the ImportFrom statement is relative in the source, we will keep it relative after the transformation. Again we run into the 1-to-many file mapping issue. Here we take the same approach as the file copying, we will transform 1 import statement into multiple, one for each new location, and log the file and imports for a human (me) to resolve later.
Note: There are also ast.Name statements that need patching. Because of the low-frequency of this style of module reference (i.e.
import hexrd
followed by laterhexrd.module.something
) its just easier to do these by hand.Note: this import mapping tool will be used later to help users automatically update their imports.
Fixing things by hand
There are several things that either were too hard to patch automatically or just couldn’t be patched automatically that needed addressing.
setup.py + Extensions
There are 4 extensions in that are built in the setup for HEXRD: Convolution, Old XF, New XF and CPP Transforms. These were changed to take their source code from their new respective folders and install the module in their new respective locations:
There is also a console script that needs to be updated. This was updated in the
setup.py
and in theconda.recipe/meta.yaml
.ast.Name
The following references were found to hexrd modules that are not direct imports and were patched as follows —
Selecting duplicate imports
Duplicating the files across workflows will result in duplicate imports on dependent files. This is resolved by removing the duplicates and selecting the most appropriate file reference. As an example:
hexrd/hedm/somefile.py
During this process, it was found that several locations had non-trivially resolvable inter-workflow dependencies. A couple of
core
files reference HEDM, several things referenceWPPF
, and some others. These were all marked withTODO
comments to address in the next phase once the duplicate file reconciliation is completed. If there are still issues that cannot be clearly resolved, I will solicit additional feedback from SMEs to make the decisions.Old Module Aliasing
This falls squarely under “migration tooling”, but it is included in this initial PR because the test files contain pickled objects that reference the old structure. So to be able to load the data in https://github.com/HEXRD/examples we need to be able to use the old style references. These test files will be updated when we deprecate the old structure and switch to the new one.
Because of the limited remapping we are doing (just inserting a file layer between
hexrd
and the existing code), we only have to consider 2 cases.import hexrd.config.utils
hexrd
and then reference things in the module later:import hexrd; hexrd.config.utils
Case 1 With Import Hooks
The python import system is quite extensible, so we can actually install special hooks to resolve old imports to the correct new imports. Note that users will not get type support or auto-complete when referencing modules imported this way.
To create our import hook, we will make use of sys.meta_path. As described in the documentation, this allows us to add a Finder that is essentially a map from fully-qualified module names to file locations. We will use it in conjunction with the
file_map.pkl
created during the remapping to automatically re-direct import references.The core of our hook will be the following:
This will simply take the full path name that the user was trying to import and figure out what we should actually import.
Fixing module references
For users that import hexrd and then reference its attributes:
we need to be able to intercept these accesses and redirect them to the correct object. This is nicely done with a module level
__getattr__
on hexrd. When users access hexrd throughhexrd.module
they will get pointed to the location of the new module.We are only putting a getattr like this at the top level because the structure changes only occurred just below the top level. If users get to a module inside of one of the workflows, the references are the same. Additionally, the import hooks will handle any direct references lower than
hexrd
.Additional Fixes / Changes
In keeping with the spirit of this PR, the additional changes are kept to a minimum, however, I think there are a couple of things that are non-invasive and useful.
pip install hexrd[test]
. Making a new virtual environment and install the optional test dependencies allows all tests to run.Review Requests
There are a number of things that I think should review should focus on: