Terminology: most rapidly varying dimension mislabelling of storage order #583
Comments
I think we should use: "C and Python NumPy uses the same order as CDL...." since it begins with the definition for CDL. (I suspect that was the original intent, as saying C uses the same order as C wouldn't have been intentional :-) ) I agree with the addition of R as an example. What does confuse me, and maybe this isn't the place to clarify it in the docs: if a netCDF file has:
in it, and you open it in Fortran or R, do you access it as variable[z, y, x]? And the same for writing? (I don't use Fortran or R, so ....)
Proposed text updated as suggested. On the confusion: this is common even among the best of us, but in the end it really doesn't matter. Dimensions can be stored in any order, so a reader has to examine the relevant attributes to determine how to orient the data. I am not sure about the details of the

Where it does matter is in processing the data. Getting a time profile for a specific location from a COARDS-compliant 3-dimensional data set is painfully slow compared to getting an area of data for a specific time, because the area is contiguous on file and the I/O is therefore more efficient. That, however, holds for both storage orders; it just applies to different dimensions.
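To illustrate the contiguity point, here is a minimal sketch in plain Python. It assumes a hypothetical variable with dimensions (time, lat, lon) of sizes (4, 3, 5), stored contiguously in row-major order without chunking; the helper `c_order_offsets` is made up for this example and is not part of any netCDF library. It lists the flat element offsets that each kind of read would touch.

```python
from itertools import product


def c_order_offsets(shape, selection):
    """Flat element offsets touched by `selection` in a row-major array.

    `selection` holds one entry per dimension: an int (a fixed index)
    or a range (a span of indices).
    """
    # Row-major strides: the last dimension has stride 1.
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]

    axes = [[s] if isinstance(s, int) else list(s) for s in selection]
    return [
        sum(i * st for i, st in zip(index, strides))
        for index in product(*axes)
    ]


# Hypothetical sizes for (time, lat, lon); an assumption for illustration.
shape = (4, 3, 5)

# All lat/lon values at time step 1: one contiguous run of offsets.
area_at_time = c_order_offsets(shape, (1, range(3), range(5)))

# All time steps at lat index 1, lon index 2: one element per time slab.
profile_at_point = c_order_offsets(shape, (range(4), 1, 2))

print(area_at_time)      # offsets 15 through 29, consecutive
print(profile_at_point)  # [7, 22, 37, 52], spaced 15 elements apart
```

The area read maps to a single consecutive block, while the time profile jumps by a full lat/lon slab between elements. With column-major storage the same pattern appears, only with the roles of the dimensions swapped.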
Well, yes, which is why CF recommends an order but does not require it, and why it uses "most rapidly varying" rather than first [last] dimension. With modern file formats (netCDF4, zarr, ???), though, this ends up being more an issue of how the data are chunked than of the dimension order.
Thanks for opening the issue, @pvanlaake, and for spotting the mistake. I agree with Chris's suggested change, which you've made, and also with his explanation of why CF relaxed the COARDS requirement for the ordering of dimensions. Patrick should be added to the list of contributors to the convention once this issue has been concluded. I've added the
Mis-labelling of storage order in definition of "most rapidly varying dimension"
In #530 a new definition for "most rapidly varying dimension" was developed, and it has since been included in 1.12. Unfortunately, row-major ordering and column-major ordering are reversed in that text. In fact, C-style is row-major and Fortran-style is column-major.
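As a quick check of the two terms, the following sketch computes where an element lands in a flat buffer under each ordering. The function `linear_offset` and the 2 x 3 shape are hypothetical, chosen only for illustration; the sketch uses zero-based indices and no external libraries.

```python
def linear_offset(index, shape, order="C"):
    """Position of `index` in a flat buffer for an array of `shape`.

    order="C": row-major, the last dimension varies most rapidly.
    order="F": column-major, the first dimension varies most rapidly.
    """
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same length")
    if order == "C":
        dims = range(len(shape) - 1, -1, -1)  # start at the last dimension
    elif order == "F":
        dims = range(len(shape))              # start at the first dimension
    else:
        raise ValueError("order must be 'C' or 'F'")

    offset, stride = 0, 1
    for d in dims:
        offset += index[d] * stride
        stride *= shape[d]
    return offset


shape = (2, 3)  # hypothetical array with 2 rows and 3 columns

# Row-major (C-style): stepping along a row moves one element in memory.
print(linear_offset((0, 1), shape, "C"))  # 1
print(linear_offset((1, 0), shape, "C"))  # 3

# Column-major (Fortran-style): stepping down a column moves one element.
print(linear_offset((0, 1), shape, "F"))  # 2
print(linear_offset((1, 0), shape, "F"))  # 1
```

In row-major order the elements of one row sit next to each other, so the last index is the most rapidly varying one; in column-major order it is the first index.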
Moderator
Not assigned.
Requirement Summary
Update the text in the terminology section to correctly reflect the storage order.
Additionally, some minor textual changes are proposed to make the text more accurate and inclusive:
Associated pull request
PR will be made after any comments and suggestions have been processed.