URL Dataset

Disclaimer: This repository is developed and released for educational purposes. Use at your own risk.

This repository crawls the 100 most visited websites and extracts unique URLs to build a dataset of real-world URL examples. The script creates an out.txt file in which each line contains a different URL.

This project uses Node.js. We recommend running it with Node 18 or later.

  • To install dependencies, run npm install
  • To execute the script, run npm start; the output is written to the out.txt file.
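
As a rough illustration of the extraction step described above, the sketch below collects unique absolute URLs from a page's HTML and formats them one per line, the way out.txt is laid out. It is not taken from this repository's script: the function names extractUniqueUrls and toOutputLines, the sample HTML, and the regex-based href matching are all assumptions made for the example, and a real crawler would more likely use a proper HTML parser.

```javascript
// Hypothetical helper, not part of this repository's script.
// Collects the unique absolute http(s) URLs found in href attributes.
// Regex matching is a simplification; it will miss unquoted attributes.
function extractUniqueUrls(html, baseUrl) {
  const seen = new Set();
  const hrefPattern = /href\s*=\s*["']([^"']+)["']/gi;
  for (const match of html.matchAll(hrefPattern)) {
    try {
      // Resolve relative links against the page they were found on.
      const url = new URL(match[1], baseUrl);
      if (url.protocol === "http:" || url.protocol === "https:") {
        seen.add(url.href);
      }
    } catch {
      // Skip values that cannot be parsed as a URL.
    }
  }
  return [...seen];
}

// Hypothetical helper: one URL per line, as in out.txt.
function toOutputLines(urls) {
  return urls.join("\n") + "\n";
}

// Made-up sample input: a relative link, a duplicate, a non-http link.
const sampleHtml = `
  <a href="/about">About</a>
  <a href="https://example.com/about">About again</a>
  <a href="mailto:hi@example.com">Mail</a>
  <a href="https://example.org/?q=1#top">Other</a>
`;
const urls = extractUniqueUrls(sampleHtml, "https://example.com/");
console.log(toOutputLines(urls));
```

Using a Set keeps the first occurrence of each URL and preserves insertion order, so the output stays stable across runs on the same input. Writing the result to disk could then be done with Node's fs.writeFileSync, which is left out here to keep the sketch free of side effects.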