diff --git a/README.md b/README.md
index 8558eda..0bd7017 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,29 @@
-For now, the wkhtmltopdf package has to be installed on Debian
\ No newline at end of file
+# Headline
+Monitor how article titles change over time on news websites.
+
+___
+This tool is probably not production ready because it was written in two afternoons by an amateur (I'm not a professional programmer). If you want to run it, at least put a reverse proxy between it and the public network, or run it locally.
+
+I didn't do any research on the legality of analysing RSS feeds, and it's possible you could get into legal trouble by presenting the results publicly.
+___
+
+## Architecture
+The "processor" script fetches the RSS feeds configured in `processor/config.yaml` every 5 minutes (the interval is set in `processor/crontab`), stores the articles in Redis and compares new and old articles to find changes in the title.
+When a change is found, it generates a nice visual diff and stores it with other information (detection time, article link, new/old title, etc.) in a permanent database (sqlite3 for now).
+
+The "view" script reads data from the permanent database (sqlite3) and presents it to the user.
+
+
+## Installation
+
+Run `docker-compose up -d` and everything should start. You can edit `./processor/config.yaml` to change the RSS sources.
+After the first start, you have to wait about 5 minutes for the "processor" to create the first empty database. The webserver will throw an error until then.
+
+
+
+## to-do
+* Collect the creation time of the original/new article, write it to permanent storage (sqlite3 for now) and display it.
+* Write a better readme and a few more docs.
+* Create a view with more info and stats (list of feeds, articles in Redis, etc.)
+* Create a routine to clear old articles from Redis (otherwise it will just fill up the disk space at some point...)
+* IDEA: Figure out how to monitor changes in article descriptions (maybe just compare hashes?) and how to present them. (Right now, the code can store descriptions in Redis, but nothing else.)
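
Note on the Architecture section of the README above: the compare-and-diff step it describes could look roughly like the sketch below. This is only an illustration, not the project's actual code, which this diff does not show. The function names `detect_title_change` and `word_diff` are hypothetical, a plain `dict` stands in for Redis, the `[-removed-]` / `{+added+}` markup is an assumed output format, and the standard-library `difflib` is used in place of whatever the processor really uses to render its visual diff.

```python
import difflib
from datetime import datetime, timezone


def word_diff(old, new):
    """Word-level diff: removed words as [-...-], added words as {+...+}."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(old_words[i1:i2])
            continue
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)


def detect_title_change(cache, article_id, new_title, link, now=None):
    """Return a change record if the cached title differs, otherwise None.

    `cache` is any dict-like store (a stand-in for Redis in this sketch).
    """
    old_title = cache.get(article_id)
    cache[article_id] = new_title  # always remember the latest title
    if old_title is None or old_title == new_title:
        return None  # first sighting, or nothing changed
    return {
        "detected_at": (now or datetime.now(timezone.utc)).isoformat(),
        "link": link,
        "old_title": old_title,
        "new_title": new_title,
        "diff": word_diff(old_title, new_title),
    }
```

Calling `detect_title_change` once per feed entry on every cron run would yield one record per changed title, which could then be written to the permanent database. Diffing whole words rather than characters is a design choice made here because it tends to read better for short headlines.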