Testing rsdmx with Canada Statistics #47
Comments
The Canada Statistics portal provides a download facility in SDMX-ML format. This download lets you save a zip file containing two
|
Hello Emmanuel, I installed the package with update.packages("rsdmx"). So, I rebooted and ran from the command line without any windowing system running (I normally run KDE 4.14.3) and left it overnight. I came back this morning and it had returned to the command prompt and printed the error message "Killed". How do I turn on more comprehensive error messaging? Jan |
Hello, regarding the bugs I highlighted above, I've solved it (it was a minor bug), but I still need to commit the fix to the code repository (this is voluntary work on my side, so I have to do it after work). With this, reading the data as Afterwards I will look closely at the issue of This being said, the datasets provided by Canada Statistics are big files. Parsing such a document logically takes a lot of time (although there is still room to improve performance), and above all it requires memory. On this aspect, By the way, I will also test against huge datasets. |
@nordicgnome I've pushed the first bug fix (dealing with the The sample code is as follows:
require(rsdmx)
sdmx <- readSDMX("myfile.xml", isURL = FALSE)
sdmx.df <- as.data.frame(sdmx)
I've tested it on a smaller dataset (a file of ~50 MB). It works, but it takes about 20 minutes for a dataset of more than 127,000 records. I will open a separate ticket to investigate performance gains (processing time). Canada Statistics datasets will be a good test case. Once I have a bit more time, I will look into the second fix. Anyway, your feedback is welcome. |
@nordicgnome the second minor bug has been fixed. DataStructureDefinition files from Canada Statistics are now properly read in R. For an example, you can follow the one provided in the wiki, except that you will need to use Note that following these fixes, I've opened 2 tickets that I will investigate further, one dealing with codelist content & Your feedback is welcome, |
I'm having an issue with DataStructures and I'm not sure what is going on. I'm using the most current version of rsdmx (0.5-10) and I can't read the StatsCan structure data. I'm downloading this file: http://www12.statcan.gc.ca/nhs-enm/2011/dp-pd/dt-td/OpenDataDownload.cfm?PID=105470 Then dropping it into my RStudio Server. Following the instructions on the wiki (e.g. sdmx <- readSDMX(sdmx_files[2], isURL = FALSE)). Then trying to read that into a data.frame and getting the following error: Error in as.data.frame.default(sdmx) : Thoughts? I can read the data file but I don't get any of the codes mapped in that case. |
@ghawkins-ott an SDMX DataStructureDefinition can't be read as a data.frame because it's a complex object (meaning it includes several subparts, such as codelists and concepts, that can then be read individually as data.frames). To extract codelists and concepts from the DSD and read them as data.frames, you can look at the DSD example in https://github.com/opensdmx/rsdmx/wiki#sdmx-datastructuredefinition-dsd |
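The extraction described above might look roughly like the sketch below. It assumes the slot names (`codelists`, `concepts`, `id`) and the `codelistId` argument as used in the rsdmx wiki's DSD example. The helper name `read_dsd_parts` and its arguments are hypothetical, and the DSD file path and codelist id have to be supplied by the caller.

```r
# Sketch only: assumes rsdmx is installed and that `dsd_file` is the path
# to a locally downloaded DSD (structure) file.
# `read_dsd_parts` is a hypothetical helper, not part of rsdmx.
read_dsd_parts <- function(dsd_file, codelist_id) {
  # Read the structure file from disk rather than from a web service
  dsd <- rsdmx::readSDMX(dsd_file, isURL = FALSE)

  # The DSD is a composite object: pull out its codelists container
  cls <- methods::slot(dsd, "codelists")

  # List the ids of all codelists found in the DSD
  ids <- sapply(methods::slot(cls, "codelists"),
                function(x) methods::slot(x, "id"))

  # Convert one codelist, and all concepts, to data.frames
  codelist <- as.data.frame(cls, codelistId = codelist_id)
  concepts <- as.data.frame(methods::slot(dsd, "concepts"))

  list(codelist_ids = ids, codelist = codelist, concepts = concepts)
}
```

Calling the helper once with any codelist id and inspecting `codelist_ids` is one way to find out which ids the file actually contains.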
@eblondel Thank you! I see the codelists now. I am still struggling with the concept of how to apply them to the data file. For example, I'd like a data frame that would display the code value, (eg: "Female" instead of "2" in the Sex column)... Sorry, I'm fairly new to this. |
@ghawkins-ott No need to apologize. What you need, code labels instead of code values, is supported by rsdmx in a very easy way by associating the corresponding DSD (data structure definition) with the dataset. In the case of SDMX files downloaded manually (without a proper SDMX web-service), which is the case for Canada Statistics, there is one line of code to write to associate the DSD with the dataset, using the function
Hope this helps |
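To illustrate what mapping code values to labels amounts to, here is a minimal base R sketch that does the lookup by hand. It is not the rsdmx function referred to above. The data, the column names and the "Male" label are hypothetical stand-ins for a data file and a codelist that have already been read into data.frames.

```r
# Hypothetical observations with coded values in the Sex column
obs <- data.frame(
  Sex = c("1", "2", "2", "1"),
  obsValue = c(10, 12, 7, 5),
  stringsAsFactors = FALSE
)

# Hypothetical codelist table: one row per code, with its label
sex_codes <- data.frame(
  id = c("1", "2"),
  label = c("Male", "Female"),
  stringsAsFactors = FALSE
)

# Find each observation's code in the codelist and take the matching label
obs$Sex_label <- sex_codes$label[match(obs$Sex, sex_codes$id)]
```

Codes that are missing from the codelist come out as `NA`, which makes gaps in the mapping easy to spot.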
@eblondel Perfect, thank you so much! |
Testing rsdmx with Canada Statistics. Examples are provided here. These tests aim to support a request sent on the rsdmx mailing list, and to identify and fix potential issues in the code.
Note: Canada Statistics is a useful use case for rsdmx, as it shows that not all data providers offer an SDMX web-service API, and that many SDMX resources may come as downloaded files, hence the added value of rsdmx being able to read local SDMX files.