mirror of
https://git.nolog.cz/NoLog.cz/headline.git
synced 2025-01-31 11:53:35 +01:00
28 lines
1.7 KiB
Markdown
# Headline

Monitor how article titles are changed over time on news websites.

___
This tool is probably not production-ready because it was written in two afternoons by an amateur (I'm not a professional programmer). If you want to run it, at least put a reverse proxy between it and the public network, or run it locally.

I didn't do any research on the legality of analysing RSS feeds, and it's possible you could get into legal trouble by presenting the results publicly.
___

## Architecture
The "processor" script fetches the RSS feeds configured in `processor/config.yaml` every 5 minutes (the interval is set in `processor/crontab`), stores the articles in Redis, and compares new and old articles to find changes in titles.

When a change is found, it generates a visual diff and stores it, along with other information (detection time, article link, new and old title, etc.), in a permanent database (sqlite3 for now).

The "view" script reads data from the permanent database (sqlite3) and presents it to the user.
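
The flow described above can be illustrated with a rough sketch. This is not the project's actual code: a plain `dict` stands in for Redis, feed fetching is left out, and every name here (`open_db`, `process_entry`, `latest_changes`, the `changes` table and its columns, the `<del>`/`<ins>` diff markup) is a hypothetical choice made only for illustration.

```python
import difflib
import html
import sqlite3
from datetime import datetime, timezone


def html_diff(old: str, new: str) -> str:
    """Build a word-level diff, marking removals with <del> and additions with <ins>."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(html.escape(" ".join(old_words[i1:i2])))
            continue
        if tag in ("delete", "replace"):
            parts.append("<del>" + html.escape(" ".join(old_words[i1:i2])) + "</del>")
        if tag in ("insert", "replace"):
            parts.append("<ins>" + html.escape(" ".join(new_words[j1:j2])) + "</ins>")
    return " ".join(parts)


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the permanent database and create the (assumed) table if it is missing."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS changes "
        "(detected TEXT, link TEXT, old_title TEXT, new_title TEXT, diff TEXT)"
    )
    return db


def process_entry(cache: dict, db: sqlite3.Connection, link: str, title: str) -> bool:
    """Compare a fetched title with the cached one and record a change if there is one."""
    old_title = cache.get(link)
    cache[link] = title  # assumption: the real processor keeps this state in Redis
    if old_title is None or old_title == title:
        return False  # first sighting or unchanged title
    db.execute(
        "INSERT INTO changes (detected, link, old_title, new_title, diff) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            link,
            old_title,
            title,
            html_diff(old_title, title),
        ),
    )
    db.commit()
    return True


def latest_changes(db: sqlite3.Connection, limit: int = 10) -> list:
    """What a "view" might query: the most recently detected changes."""
    return db.execute(
        "SELECT detected, link, old_title, new_title, diff "
        "FROM changes ORDER BY detected DESC LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    db, cache = open_db(), {}
    process_entry(cache, db, "https://example.org/a", "Mayor denies report")
    process_entry(cache, db, "https://example.org/a", "Mayor confirms report")
    for row in latest_changes(db):
        print(row)
```

Keeping the short-lived "last seen" state apart from the permanent change log mirrors the Redis/sqlite3 split described above: only detected changes end up in the database that the view reads.
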
## Installation

Run `docker-compose up -d` and everything should start. You can edit `./processor/config.yaml` to change the RSS sources.

After the first start, you have to wait about 5 minutes for the "processor" to create the first empty database. The web server will throw an error until then.
## To-do

* Collect the creation time of the original and new article, write it to permanent storage (sqlite3 for now), and display it.
* Write a better readme and a few more docs.
* Create a view with more info and stats (list of feeds, articles in Redis, etc.).
* IDEA: Figure out how to monitor changes in article descriptions (maybe just compare hashes?) and how to present them. (Right now, the code can store descriptions in Redis, but nothing else.)
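
The "compare hashes" idea from the last item could look roughly like the sketch below. It is only an illustration of the idea, not existing code: the function names and the whitespace normalisation step are assumptions.

```python
import hashlib
from typing import Optional


def description_digest(description: str) -> str:
    """Hash a description after collapsing whitespace, so reflowed text is not a change."""
    normalized = " ".join(description.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def description_changed(stored_digest: Optional[str], description: str) -> bool:
    """Return True only if a digest was stored earlier and the description no longer matches it."""
    if stored_digest is None:
        return False  # first time this article is seen
    return stored_digest != description_digest(description)
```

A digest only shows that a description changed, not what changed, so presenting a diff would still require keeping the old text somewhere.
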