Julien Tayon: The true cost and code of parsing the entirety of (French-speaking) Bluesky ATProto in Python
The author challenges the notion that running a Bluesky server is prohibitively expensive and complex. They detail their experience running a real-time scan of the entire Bluesky network from a modest home PC. The bot, written in Python, consumes few resources: 25% CPU, less than a third of a home connection's bandwidth, and only 640MB of memory. The author explicitly states that one does not need to spend $300 per month to run a Bluesky AppView, as others have suggested. They explain that while intensive API requests such as get_post are rate-limited, scanning the firehose itself is free. The bot processes only post events, which make up a small fraction of total network traffic. The author also discusses filtering spam and NSFW content, reporting that a blacklist based on tags is highly effective. They describe their coding approach, including the use of multiprocessing and a simple database structure. The project includes a web interface for content classification and a spam detection module. Finally, the author encourages others to experiment with the ATProto/Bluesky API, emphasizing that their "toy code" demonstrates feasibility on ordinary hardware.
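The filtering idea described in the summary (keep only post events, then drop anything carrying a blacklisted tag) can be sketched in a few lines. This is a minimal illustration, not the author's code: the flat event dictionaries are a simplified, hypothetical stand-in for decoded firehose messages, the blacklist contents are invented, and hashtags are pulled from the post text with a regular expression rather than from the record's rich-text facets. Only the collection name `app.bsky.feed.post` comes from the ATProto lexicon.

```python
import re

# Real ATProto collection name for posts; everything else here is illustrative.
POST_COLLECTION = "app.bsky.feed.post"

# Assumption: hashtags are read from the raw text with a simple regex.
HASHTAG_RE = re.compile(r"#(\w+)")

# Hypothetical blacklist; the author's actual tag list is not given in the summary.
BLACKLISTED_TAGS = {"nsfw", "crypto"}


def keep_event(event, wanted_lang="fr", blacklist=BLACKLISTED_TAGS):
    """Return True for a newly created post in the wanted language
    that carries no blacklisted hashtag."""
    # Most firehose traffic is likes, follows, reposts, etc.; skip it early.
    if event.get("collection") != POST_COLLECTION:
        return False
    if event.get("operation") != "create":
        return False
    record = event.get("record") or {}
    if wanted_lang not in record.get("langs", []):
        return False
    tags = {tag.lower() for tag in HASHTAG_RE.findall(record.get("text", ""))}
    return tags.isdisjoint(blacklist)


# Hand-written sample events in the simplified (hypothetical) shape used above.
sample_events = [
    {"collection": "app.bsky.feed.like", "operation": "create", "record": {}},
    {"collection": POST_COLLECTION, "operation": "create",
     "record": {"text": "Bonjour #python", "langs": ["fr"]}},
    {"collection": POST_COLLECTION, "operation": "create",
     "record": {"text": "Regardez ça #NSFW", "langs": ["fr"]}},
    {"collection": POST_COLLECTION, "operation": "create",
     "record": {"text": "Hello world", "langs": ["en"]}},
    {"collection": POST_COLLECTION, "operation": "delete"},
]

kept = [event for event in sample_events if keep_event(event)]
print(len(kept), "event(s) kept")
```

Rejecting non-post events before touching the record keeps the per-message cost low, which is consistent with the summary's point that posts are only a small share of the traffic. In a real pipeline the same predicate could run in worker processes fed by a queue, in line with the multiprocessing approach the author mentions.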