What change would you like to see?
Users should be able to upload a bulk seed list as a text file, as an alternative to entering URLs in the list text box under the URL List option.
The text file can be of any size, with no limit on the number of seeds it specifies (though other crawl limits may still apply).
The text file would be stored in the S3 bucket and mounted as a volume, so that the crawler can read it through its existing --seedList functionality.
Some differences from the list text box:
Validation: Since the upload bypasses the frontend, seeds would not be validated when the crawl workflow is created. However, invalid seeds should appear in the error log soon after the crawl starts running. If failOnFailedSeed is set, invalid seeds should also fail the whole crawl immediately.
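To illustrate the validation behavior proposed here, the following is a minimal sketch in Python, not the crawler's actual implementation. It assumes the uploaded file holds one seed URL per line and that a seed counts as valid when it parses as an http(s) URL with a host. The names `check_seed_list`, `is_valid_seed`, and `InvalidSeedError` are hypothetical, and the `fail_on_failed_seed` parameter only mirrors the intent of the failOnFailedSeed setting.

```python
from urllib.parse import urlsplit


class InvalidSeedError(Exception):
    """Hypothetical error raised when an invalid seed should fail the crawl."""


def is_valid_seed(line: str) -> bool:
    """Assumed rule: a seed is valid if it is an http(s) URL with a host."""
    try:
        parts = urlsplit(line)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. a broken IPv6 host
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_seed_list(text: str, fail_on_failed_seed: bool = False):
    """Split seed-file text into valid seeds and error-log messages.

    Blank lines are skipped. With fail_on_failed_seed set, the first
    invalid seed raises InvalidSeedError instead of being logged.
    """
    valid, errors = [], []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if is_valid_seed(line):
            valid.append(line)
            continue
        message = f"line {number}: invalid seed {line!r}"
        if fail_on_failed_seed:
            raise InvalidSeedError(message)
        errors.append(message)
    return valid, errors
```

Because the check in this sketch runs when the file is read rather than when the workflow is created, it matches the trade-off described above: a bad seed is only discovered once the crawl starts.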
Context
This issue supersedes #1107 and addresses #2312.