MemoryError in extracting binary ark file #52
It looks like the scp or ark file is corrupted: the program has read absurdly large row and column counts. Could you provide the scp and ark files so I can debug this?
Please find the scp and ark files attached for your information. Untitled Folder.tar.gz Thanks
I see that your ark files are in the compressed format, which PDNN does not support.
No, actually I could not attach the files to this GitHub thread as-is, so I compressed them; for the experiment itself I am using the original uncompressed files.
Alright, you are right.
You can use either the kaldi_feat.py script in PDNN, or the readArk / readScp functions in the following script. Afterwards, you can arrange the data into any shape TensorFlow requires. Unfortunately, neither script supports the compressed matrix format, so you'll have to dump the features into the BFM (uncompressed binary float matrix) format first.
Thanks for your information. |
This should do the job. If you don't request the text format, the output defaults to the uncompressed binary format.
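The command being referred to was not preserved in this thread; assuming it is Kaldi's copy-feats tool (whose default output is uncompressed binary), a sketch of the conversion, with placeholder file names:

```shell
# Rewrite a (possibly compressed) ark as uncompressed binary features,
# emitting a matching scp index. File names are placeholders.
copy-feats ark:compressed.ark ark,scp:uncompressed.ark,uncompressed.scp

# Adding the ",t" modifier would produce text output instead:
# copy-feats ark:compressed.ark ark,t:features.txt
```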
I have an scp file with a corresponding ark file. Whenever I use the kaldi_feat.py class and call its read_next_utt() function, I get the following error:
MemoryError
I backtraced the error and obtained the following information: