Error in Step 4 of processing #98

Open
MughilM opened this issue Jun 22, 2020 · 1 comment

MughilM commented Jun 22, 2020

Issue

Hello,
Due to external constraints, I am unable to download the preprocessed data from Google Drive. However, I do have the raw .story files on hand, so I have been going through the steps to preprocess the data myself. To start, I used 250 story files to make sure the steps work. Step 3 worked like a charm. Step 4 appears to generate the json-line files, but it fails with an error at the very end:

Traceback (most recent call last):
  File "preprocess.py", line 63, in <module>
    eval('data_builder.'+args.mode + '(args)')
  File "<string>", line 1, in <module>
  File "..../BertSum/src/prepro/data_builder.py", line 315, in format_to_lines
    with open(pt_file, 'w') as save:
FileNotFoundError: [Errno 2] No such file or directory: '../json_data/cnndm.train.0.json'

Steps to Reproduce

Follow Steps 1-3 of the README's instructions for preprocessing the data yourself, then run the command given in Step 4, which raises the error at the end:

python preprocess.py -mode format_to_lines -raw_path RAW_PATH -save_path JSON_PATH -map_path MAP_PATH -lower

RAW_PATH is the directory containing tokenized files (../merged_stories_tokenized), JSON_PATH is the target directory to save the generated json files (../json_data/cnndm), and MAP_PATH is the directory containing the urls files (../urls).

Thank you!

@tschomacker

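The error indicates that the directory could not be found. Note that the file being opened is ../json_data/cnndm.train.0.json, so cnndm is used as a file-name prefix and the directory that has to exist is ../json_data. Please check that this folder exists before writing files to it.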
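
For anyone hitting the same thing, here is a minimal sketch of that check in Python. It is not code from the BertSum repository: `ensure_parent_dir` is a hypothetical helper name, and the only assumption is that the output file path is known before `open(..., 'w')` is called, as in the traceback above.

```python
import os
import tempfile


def ensure_parent_dir(file_path):
    """Create the parent directory of file_path if it is missing.

    Hypothetical helper (not part of BertSum). Returns the parent
    directory path, or '' when file_path has no directory component.
    """
    parent = os.path.dirname(file_path)
    if parent:
        # exist_ok=True makes this a no-op when the directory already exists.
        os.makedirs(parent, exist_ok=True)
    return parent


if __name__ == "__main__":
    # Demonstrated in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        pt_file = os.path.join(tmp, "json_data", "cnndm.train.0.json")
        ensure_parent_dir(pt_file)
        # Without the call above, this open() raises FileNotFoundError.
        with open(pt_file, "w") as save:
            save.write("[]")
```

Creating `../json_data` by hand before running Step 4 has the same effect; the helper only matters if you would rather have the script do it.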
