xls-dump.py consumes lot of memory on some file #8

Open
xeyownt opened this issue Sep 15, 2021 · 1 comment


xeyownt commented Sep 15, 2021

Hello,

I'm using xls-dump.py through the indexer "recoll".
The indexer was running out of memory and eventually freezing the machine because it was choking on a specific file named fat-loop.xls. This file is found in the MediaWiki source (at least versions 1.33.4, 1.34.4, 1.35.0 and 1.35.1).

To reproduce (adapt path as necessary):

python3 xls-dump.py --dump-mode=canonical-xml --utf-8 --catch /home/data/www/html/mw1.35.1/tests/phpunit/data/MSCompoundFileReader/fat-loop.xls
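Since the command above can exhaust memory and freeze the machine, it may be safer to reproduce it under a memory cap. The following is a minimal sketch, assuming a Unix system where `RLIMIT_AS` is enforced; the helper name `run_capped` and the 1 GiB and 60 second defaults are illustrative choices, not part of xls-dump.py or recoll.

```python
import resource
import subprocess
import sys


def run_capped(cmd, max_bytes=1 << 30, timeout=60):
    """Run cmd with its address space capped at max_bytes (Unix only).

    Hypothetical helper for reproducing the issue safely; returns the
    child's exit code. subprocess.TimeoutExpired is raised if the child
    runs longer than `timeout` seconds.
    """
    def limit():
        # Executed in the child just before exec: allocations beyond
        # the cap fail instead of exhausting system memory.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    proc = subprocess.run(cmd, preexec_fn=limit, timeout=timeout,
                          capture_output=True, text=True)
    return proc.returncode


if __name__ == "__main__":
    # Pass the full command line to run as arguments to this script.
    print("exit code:", run_capped(sys.argv[1:]))
```

With the cap in place, a runaway allocation should end in a Python `MemoryError` (non-zero exit code) or a timeout rather than a frozen machine.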

I tried xls-dump.py from commit db25622 and can confirm the issue is still present.
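The file name and its location under MSCompoundFileReader suggest that the test file contains a loop in the compound file's FAT sector chain. This is an assumption, not something confirmed here. If so, a reader that follows the chain without tracking visited sectors would never terminate and would keep accumulating data. A minimal sketch of a guarded chain walk, where `follow_chain` and the in-memory `fat` list are hypothetical and not taken from the xls-dump.py source:

```python
ENDOFCHAIN = 0xFFFFFFFE  # end-of-chain marker used by the compound file format


def follow_chain(fat, start):
    """Return the list of sector indices reachable from `start`.

    `fat` is assumed to be a list where fat[i] holds the index of the
    sector following sector i. Raises ValueError on a loop or on an
    out-of-range sector index instead of reading forever.
    """
    seen = set()
    chain = []
    sector = start
    while sector != ENDOFCHAIN:
        if sector >= len(fat):
            raise ValueError(f"sector index {sector} is out of range")
        if sector in seen:
            raise ValueError(f"loop detected at sector {sector}")
        seen.add(sector)
        chain.append(sector)
        sector = fat[sector]
    return chain
```

For example, `follow_chain([1, 2, ENDOFCHAIN], 0)` walks sectors 0, 1 and 2, while `follow_chain([1, 2, 0], 0)` raises `ValueError` as soon as sector 0 is reached a second time.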


xeyownt commented Sep 15, 2021

fat-loop.xls
