A BlueSky bot that publishes posts from scraped websites. https://blog.nuculabs.dev

BlueSky Bot

A simple project that scrapes websites and publishes posts on BlueSky.

⚠️ Work In Progress ⚠️

Architecture

architecture diagram

The architecture is composed of the following elements:

  1. The Scrapper

It scrapes data from one or more websites and publishes the results as JSON to Redis Streams.

It is configured via CLI arguments:

Usage: scrapper [OPTIONS] --redis-connection-string <REDIS_CONNECTION_STRING> --redis-stream-name <REDIS_STREAM_NAME>

Options:
  -r, --redis-connection-string <REDIS_CONNECTION_STRING>
          Redis host
  -t, --redis-stream-name <REDIS_STREAM_NAME>
          Redis stream name
  -s, --scrape-interval-minutes <SCRAPE_INTERVAL_MINUTES>
          The scraping interval in minutes [default: 60]
  -h, --help
          Print help
  -V, --version
          Print version
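
As a rough illustration of how the options above might map onto a configuration value, here is a minimal standard-library sketch. The `ScrapperConfig` struct and `parse_args` function are hypothetical names, not taken from the project, and the real binary most likely relies on an argument-parsing crate instead of hand-written parsing. The sketch ignores `-h` and `-V`.

```rust
use std::time::Duration;

/// Hypothetical configuration mirroring the documented options.
#[derive(Debug, PartialEq)]
struct ScrapperConfig {
    redis_connection_string: String,
    redis_stream_name: String,
    scrape_interval: Duration,
}

/// Parses `<flag> <value>` pairs. Returns an error message for unknown
/// flags, missing values, or missing required options.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<ScrapperConfig, String> {
    let mut conn = None;
    let mut stream = None;
    let mut minutes: u64 = 60; // documented default

    let mut iter = args.into_iter();
    while let Some(flag) = iter.next() {
        // Every option handled in this sketch takes exactly one value.
        let value = iter
            .next()
            .ok_or_else(|| format!("missing value for {flag}"))?;
        match flag.as_str() {
            "-r" | "--redis-connection-string" => conn = Some(value),
            "-t" | "--redis-stream-name" => stream = Some(value),
            "-s" | "--scrape-interval-minutes" => {
                minutes = value
                    .parse::<u64>()
                    .map_err(|_| format!("invalid interval: {value}"))?;
            }
            other => return Err(format!("unknown option: {other}")),
        }
    }

    Ok(ScrapperConfig {
        redis_connection_string: conn.ok_or("missing --redis-connection-string")?,
        redis_stream_name: stream.ok_or("missing --redis-stream-name")?,
        scrape_interval: Duration::from_secs(minutes.saturating_mul(60)),
    })
}

fn main() {
    // Skip argv[0], which holds the program name.
    match parse_args(std::env::args().skip(1)) {
        Ok(cfg) => println!("{cfg:?}"),
        Err(e) => eprintln!("error: {e}"),
    }
}
```

The two Redis options are required, so omitting either one yields an error, while the scrape interval falls back to 60 minutes.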

  2. Redis

Redis is an in-memory key-value store that also provides richer data structures, including streams. It was chosen to keep the setup simple and for its feature set and flexibility [1].
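
To make the hand-off through Redis Streams more concrete, the sketch below shows what appending a JSON payload with `XADD` looks like on the wire, using only the standard library. It encodes the command in RESP, the Redis serialization protocol, as an array of bulk strings. The stream name `posts`, the field name `data`, and the JSON shape are assumptions for illustration, and the project itself presumably uses a Redis client library and not raw encoding.

```rust
/// Encodes a Redis command as a RESP array of bulk strings.
/// Bulk string lengths are byte lengths, not character counts.
fn encode_command(args: &[&str]) -> Vec<u8> {
    let mut out = format!("*{}\r\n", args.len()).into_bytes();
    for arg in args {
        out.extend_from_slice(format!("${}\r\n", arg.len()).as_bytes());
        out.extend_from_slice(arg.as_bytes());
        out.extend_from_slice(b"\r\n");
    }
    out
}

/// Builds `XADD <stream> * <field> <json>`. The `*` asks Redis to
/// generate the entry ID itself.
fn xadd(stream: &str, field: &str, json: &str) -> Vec<u8> {
    encode_command(&["XADD", stream, "*", field, json])
}

fn main() {
    // Hypothetical payload; the real JSON schema is not documented here.
    let json = r#"{"title":"Hello","url":"https://example.com/post"}"#;
    let bytes = xadd("posts", "data", json);
    print!("{}", String::from_utf8_lossy(&bytes));
}
```

Writing these bytes to a TCP connection to a Redis server would append one entry to the stream, and the server would reply with the generated entry ID.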

  3. BlueSky Bot

The BlueSky bot reads data from Redis Streams and publishes it to BlueSky.
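
A stream consumer typically remembers the ID of the last entry it processed and asks only for newer ones. Redis stream entry IDs have the form `<milliseconds>-<sequence>` and are ordered numerically by both parts. The sketch below models that ordering with the standard library only; `StreamId` and `unseen` are hypothetical names, the sample entries are made up, and how the bot actually tracks its position is not described in this README.

```rust
use std::str::FromStr;

/// A Redis stream entry ID: `<milliseconds>-<sequence>`.
/// The derived ordering compares `ms` first, then `seq`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct StreamId {
    ms: u64,
    seq: u64,
}

impl FromStr for StreamId {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let bad = || format!("bad stream id: {s}");
        let (ms, seq) = s.split_once('-').ok_or_else(bad)?;
        Ok(StreamId {
            ms: ms.parse::<u64>().map_err(|_| bad())?,
            seq: seq.parse::<u64>().map_err(|_| bad())?,
        })
    }
}

/// Yields the entries strictly newer than `last_seen`, which is the
/// set a consumer still has to process.
fn unseen<'a>(
    entries: &'a [(StreamId, String)],
    last_seen: StreamId,
) -> impl Iterator<Item = &'a (StreamId, String)> {
    entries.iter().filter(move |(id, _)| *id > last_seen)
}

fn main() {
    // Made-up entries standing in for what the stream might contain.
    let entries: Vec<(StreamId, String)> = [
        ("1735500000000-0", r#"{"title":"first"}"#),
        ("1735500000000-1", r#"{"title":"second"}"#),
        ("1735503600000-0", r#"{"title":"third"}"#),
    ]
    .iter()
    .map(|(id, json)| (id.parse().expect("valid id"), json.to_string()))
    .collect();

    let last_seen: StreamId = "1735500000000-0".parse().expect("valid id");
    for (id, json) in unseen(&entries, last_seen) {
        println!("{}-{} {}", id.ms, id.seq, json);
    }
}
```

Comparing IDs as parsed numbers matters because plain string comparison would order `9-7` after `10-0`.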

[1] - https://redis.io/about/