Web crawler implementation for first homework for a course at FRI, UL.

jonchisko/ieps_dn1

Web information extraction and retrieval - homework #1

This is a web crawler implementation for the first homework for a course at FRI, UL.

Instructions

Prerequisites

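The crawler imports the following Python modules. Most of them ship with the standard library; selenium, bs4 (installed as beautifulsoup4), requests, and psycopg2 are third-party packages and must be installed separately: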

  1. os
  2. threading
  3. urllib.request
  4. selenium
  5. bs4
  6. multiprocessing
  7. urllib.parse
  8. time
  9. requests
  10. hashlib
  11. psycopg2
  12. datetime
  13. re
  14. posixpath
  15. urllib.robotparser
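
Before running the crawler, it can help to confirm that every module in the list above is importable. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the check only locates each module without importing the third-party ones.

```python
import importlib.util

# Module names copied from the list above. Most are standard library;
# selenium, bs4, requests, and psycopg2 need a separate install.
REQUIRED_MODULES = [
    "os", "threading", "urllib.request", "selenium", "bs4",
    "multiprocessing", "urllib.parse", "time", "requests", "hashlib",
    "psycopg2", "datetime", "re", "posixpath", "urllib.robotparser",
]


def missing_modules(names):
    """Return the names that cannot be located in the current environment.

    Hypothetical helper for illustration; it is not part of the crawler.
    """
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Any name reported as missing should belong to one of the four third-party packages, since the rest ship with Python.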

You also need to install ChromeDriver, which Selenium uses to control the Chrome browser.
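
One quick way to confirm the driver is installed is to look for it on the PATH. This is a sketch under assumptions: it assumes the executable is named `chromedriver` and is expected on the PATH, which may not match how the crawler itself locates the driver. The helper name `find_driver` is hypothetical.

```python
import shutil


def find_driver(executable="chromedriver"):
    """Return the full path of the driver executable on PATH, or None.

    Hypothetical helper; the executable name is an assumption.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver was not found on PATH.")
    else:
        print("chromedriver found at:", path)
```

A result of None only means the driver is not on the PATH; it may still be installed somewhere the crawler is configured to look.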

Running

Run Crawler.py, located in the crawler directory.
