- React + Flask + MongoDB/MySQL -- database visualization and CRUD API
- React + Tornado + ES -- Elasticsearch query/filter/search
- Scheduler -- periodically publishes tasks from MongoDB into the Redis queue
- Redis queues -- duplicate_db / request_queue / start_queue
- Scrapy workers -- one subscribes to start_queue, the others subscribe to request_queue; both kinds are connected to duplicate_db and publish items into Kafka
- Kafka queue -- receives items and hands them to data cleaning and saving
- NLP worker -- consumes the Kafka queue, processes the items, then saves them into ES
- Elasticsearch -- final data
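
The crawl flow listed above (scheduler, start_queue, request_queue, duplicate_db, Kafka) can be sketched roughly as follows. This is only an in-memory illustration, not the project's actual code: `deque`, `set`, and a plain list stand in for Redis and Kafka, and `FAKE_WEB`, `schedule`, `enqueue_once`, `start_worker`, and `request_worker` are hypothetical names introduced for the example.

```python
from collections import deque

# In-memory stand-ins (assumptions): the real project uses Redis and Kafka.
start_queue = deque()      # seed URLs published by the scheduler
request_queue = deque()    # follow-up URLs discovered while crawling
duplicate_db = set()       # URLs already scheduled, shared by all workers
kafka_topic = []           # crawled items awaiting cleaning / NLP

# Hypothetical link graph used instead of real HTTP requests.
FAKE_WEB = {
    "site/index": ["site/a", "site/b"],
    "site/a": ["site/b"],
    "site/b": [],
}


def schedule(seeds):
    """Scheduler: publish seed tasks into start_queue."""
    for url in seeds:
        start_queue.append(url)


def enqueue_once(url):
    """Push a URL into request_queue unless duplicate_db has seen it."""
    if url in duplicate_db:
        return False
    duplicate_db.add(url)
    request_queue.append(url)
    return True


def start_worker():
    """Worker subscribed to start_queue: turns seeds into requests."""
    while start_queue:
        enqueue_once(start_queue.popleft())


def request_worker():
    """Worker subscribed to request_queue: emit an item, then follow links."""
    while request_queue:
        url = request_queue.popleft()
        kafka_topic.append({"url": url})       # "publish" the item
        for link in FAKE_WEB.get(url, []):
            enqueue_once(link)                 # duplicates are skipped


schedule(["site/index"])
start_worker()
request_worker()
print([item["url"] for item in kafka_topic])
```

Because every worker checks the shared duplicate_db before enqueueing, `site/b` is crawled once even though two pages link to it. In a real deployment the same check would presumably be an atomic set operation in Redis so that concurrent workers cannot both schedule the same URL.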
RomiVu/miniNews
About
Distributed news crawler and analysis