Python Crawler

An implementation of crawlers and their manager, written in Python. The crawlers collect all text, structured data, and links from a given list of web pages.
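As a rough illustration of what collecting text and links from a page can look like, here is a minimal sketch that uses only the Python standard library. It is not the code from this repository: the class `PageExtractor` and the function `extract` are hypothetical names, and structured-data extraction is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageExtractor(HTMLParser):
    """Hypothetical helper: collects visible text and absolute links from one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script>/<style>.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract(html, base_url):
    """Return (text, links) for a single HTML document given as a string."""
    parser = PageExtractor(base_url)
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links
```

A crawler built this way would typically push the returned links back onto a queue until the maximum depth is reached; how this project actually schedules that work is not shown here.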

Before you start

Install the required Python libraries with:

pip install --no-cache-dir -r requirements.txt

Usage (without Docker):

cd python-crawler
python main.py "in file" "max depth" "number of threads" "concurrent_tasks" "max_queue_size" "max_cycles" "delay"
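The command above takes seven positional arguments. The sketch below shows one way such a command line could be parsed with `argparse`. It is an assumption for illustration, not the parsing done in `main.py`: the function name `parse_args`, the underscore spellings of the argument names, and the chosen types (integers for counts, a float for the delay) are all guesses.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical parser mirroring the seven positional arguments in the usage line."""
    parser = argparse.ArgumentParser(
        description="Illustrative argument parsing for the crawler command line"
    )
    # Types below are assumptions; the real main.py may treat them differently.
    parser.add_argument("in_file", help="input file (assumed: list of start pages)")
    parser.add_argument("max_depth", type=int)
    parser.add_argument("number_of_threads", type=int)
    parser.add_argument("concurrent_tasks", type=int)
    parser.add_argument("max_queue_size", type=int)
    parser.add_argument("max_cycles", type=int)
    parser.add_argument("delay", type=float)
    return parser.parse_args(argv)
```

For example, `parse_args(["urls.txt", "2", "4", "8", "100", "10", "0.5"])` yields a namespace with `max_depth == 2` and `delay == 0.5`; the file name and values here are made up.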

Usage (with Docker):

To build the container:

./build.sh